An Introduction to Prompt Frameworks and NLP Techniques: Few-Shot Prompting

2023-05-25 16:09 Author: 汀丶人工智能

Prompt Learning Tutorial [Advanced]: an introduction to prompt frameworks and NLP techniques such as Few-Shot Prompting and Self-Consistency, plus a hands-on project that builds a knowledge-base chatbot

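1. ChatGPT Prompt Framework

After going through the scenarios in the basics part, you should have a fairly solid understanding of prompts. The earlier chapters were mostly about technique (「術」), that is, how to use prompts, and said little about principles (「道」). The advanced part covers more advanced usage and also more of the principles. To open it, let's look at the frameworks a prompt is built from.

1.1 Basic Prompt Framework

I went through a lot of material on ChatGPT prompt frameworks, and the clearest one I have found so far is the framework summarized by Elvis Saravia. In his view, a prompt contains the following elements: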

Instruction (required): the specific task you want the model to perform.
Context (optional): background information that can steer the model toward a better response.
Input Data (optional): the data you want the model to process.
Output Indicator (optional): the type or format of output you want.

If you write your prompts along these lines, the results the model returns will generally be decent.

Of course, a prompt does not need all four elements; combine them according to your needs. Taking some of the earlier scenarios as examples:

Reasoning: Instruction + Context + Input Data
Information extraction: Instruction + Context + Input Data + Output Indicator
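The four elements can be sketched as a small helper that assembles a prompt from whichever parts are present. This is only an illustration: the class name `BasicPrompt`, its field names and the labels it writes (`Context:`, `Text:`, `Output format:`) are assumptions made here, not part of Saravia's framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BasicPrompt:
    """Hypothetical container for the four prompt elements."""
    instruction: str                        # required
    context: Optional[str] = None           # optional
    input_data: Optional[str] = None        # optional
    output_indicator: Optional[str] = None  # optional

    def render(self) -> str:
        # The instruction is the only mandatory element.
        if not self.instruction.strip():
            raise ValueError("instruction is required")
        parts = [self.instruction.strip()]
        # Append each optional element only when it was supplied.
        if self.context:
            parts.append(f"Context: {self.context.strip()}")
        if self.input_data:
            parts.append(f'Text: """{self.input_data.strip()}"""')
        if self.output_indicator:
            parts.append(f"Output format: {self.output_indicator.strip()}")
        return "\n\n".join(parts)


# Information extraction without Context: Instruction + Input Data + Output Indicator
extraction = BasicPrompt(
    instruction="Extract the person names from the text.",
    input_data="Alice met Bob in Paris.",
    output_indicator="a JSON list of strings",
)
print(extraction.render())
```

Keeping the optional parts as `None` by default mirrors the point above: only the instruction is mandatory, and the rest is mixed in per scenario.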

1.2 CRISPE Prompt Framework

Another framework I like is Matt Nigh's CRISPE Framework. It is more complex but also more complete, which makes it a good fit for writing prompt templates. CRISPE stands for:

CR: Capacity and Role. The role you want ChatGPT to play.
I: Insight. Background information and context (frankly, I think "Context" would be a better name).
S: Statement. What you want ChatGPT to do.
P: Personality. The style or manner in which you want ChatGPT to answer.
E: Experiment. Ask ChatGPT to give you multiple answers.

Here is an example for each element:

| Step | Example |
| ----------------- | ------------------------------ |
| Capacity and Role | Act as an expert on software development on the topic of machine learning frameworks, and an expert blog writer. |
| Insight | The audience for this blog is technical professionals who are interested in learning about the latest advancements in machine learning. |
| Statement | Provide a comprehensive overview of the most popular machine learning frameworks, including their strengths and weaknesses. Include real-life examples and case studies to illustrate how these frameworks have been successfully used in various industries. |
| Personality | When responding, use a mix of the writing styles of Andrej Karpathy, Francois Chollet, Jeremy Howard, and Yann LeCun. |
| Experiment | Give me multiple different examples. |
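Since CRISPE is pitched as a basis for prompt templates, here is a minimal sketch of one. The function `build_crispe_prompt` and its keyword names are hypothetical; the only thing taken from the framework is the order of the five parts.

```python
# Field names are an assumption for this sketch; the order follows C-R-I-S-P-E.
CRISPE_FIELDS = ("capacity_and_role", "insight", "statement", "personality", "experiment")


def build_crispe_prompt(**parts: str) -> str:
    """Join the five CRISPE parts in framework order, one per line."""
    unknown = set(parts) - set(CRISPE_FIELDS)
    if unknown:
        raise ValueError(f"unknown CRISPE fields: {sorted(unknown)}")
    missing = [name for name in CRISPE_FIELDS if not parts.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing CRISPE fields: {missing}")
    return "\n".join(parts[name].strip() for name in CRISPE_FIELDS)


prompt = build_crispe_prompt(
    statement="Provide a comprehensive overview of the most popular machine learning frameworks.",
    capacity_and_role="Act as an expert on machine learning frameworks and an expert blog writer.",
    experiment="Give me multiple different examples.",
    insight="The audience for this blog is technical professionals.",
    personality="When responding, use a clear, practical writing style.",
)
print(prompt)
```

The caller may pass the parts in any order; the template fixes the order so that every prompt produced from it has the same shape.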

2. Zero-Shot Prompting

In the reasoning scenario of the basics part I mentioned Zero-Shot Prompting. This chapter explains what it is and how to use it well. Zero-Shot Prompting is a natural language processing technique in which a model carries out a task from a prompt or instruction alone. ChatGPT, which you all use, relies on it.

Traditional NLP techniques usually require supervised training on large amounts of labeled data before a model can make accurate predictions or generate output for a specific task or domain. Zero-Shot Prompting is more flexible and general by comparison, because it needs no dedicated training for each new task or domain. Instead, it relies on a pre-trained language model plus an instruction or prompt, without task-specific examples, to do the reasoning and generate the output.

For example, we can give ChatGPT a short prompt such as "describe the plot of a certain movie", and it can generate a summary of that plot without any movie-specific training.

2.1 Drawbacks of Zero-Shot Prompting

The technique is not without drawbacks:

Zero-Shot Prompting depends on a pre-trained language model, and such models inherit the limits and biases of their training data. For example, ChatGPT often uses the masculine "he" rather than "she" when talking about investing, because the finance and investment content in its training data mostly refers to men.
Although Zero-Shot Prompting does not need a separate model trained for each task, good performance depends on a model that has already been trained and tuned on a very large amount of data. ChatGPT is an example: the data behind it runs to more than a hundred billion samples.
Because Zero-Shot Prompting is so flexible and general, its output is sometimes inaccurate or not what you expected. Fixing that may take further fine-tuning of the model or more prompt text.

2.2 Tip: Zero-Shot Chain of Thought

To address the third drawback above, researchers found a technique called Chain of Thought.

It is very simple to use: append the sentence Let's think step by step to the end of the question, and the model's answer becomes more accurate.

The technique comes from the 2022 paper Large Language Models are Zero-Shot Reasoners by Kojima et al. The paper shows that when the model is asked a logical reasoning question directly, it can return a wrong answer, but once Let's think step by step is appended to the question, it produces the correct one.

The paper discusses why, and interested readers can look it up. Here is my simplified explanation (if you have a better one, please let me know):

First, be clear that a product like ChatGPT is a statistical language model: based on all the data it has seen, it produces the next piece of output as a statistical prediction. (That is why ChatGPT's answers appear piece by piece rather than all at once: the answer is computed one token at a time.)
When the data it was given contains logic, it picks that logic up statistically and presents it back to you, so its answers feel logical.
During computation the model works through many intermediate assumptions (how exactly it does so is not yet understood). Solving a problem might mean going from A to B to C, with many assumptions along the way.
Its first answer was wrong simply because it skipped an intermediate step (B). Asking the model to think step by step helps it follow the full chain (A > B > C) without skipping assumptions, so it ends up with the right answer.

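As the paper explains, zero-shot chain of thought involves two completions. The first is the reasoning the model produces from the prompt. The second takes the original prompt together with that first completion and computes the final answer, which is how the correct result is reached.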
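The two-completion flow can be sketched as follows. Everything here is a stand-in: `complete` is a hypothetical callable wrapping whatever model API you use, `fake_complete` returns canned text so the sketch runs offline, and the wording of `ANSWER_TRIGGER` is an assumption rather than a quotation from the paper.

```python
from typing import Callable, Tuple

REASONING_TRIGGER = "Let's think step by step."
ANSWER_TRIGGER = "Therefore, the answer is"  # assumed wording for the second stage


def zero_shot_cot(question: str, complete: Callable[[str], str]) -> Tuple[str, str]:
    """Run the two completions and return (reasoning, answer)."""
    # Completion 1: the question plus the trigger sentence yields the reasoning.
    reasoning_prompt = f"Q: {question}\nA: {REASONING_TRIGGER}"
    reasoning = complete(reasoning_prompt).strip()
    # Completion 2: the original prompt plus that reasoning yields the answer.
    answer_prompt = f"{reasoning_prompt} {reasoning}\n{ANSWER_TRIGGER}"
    answer = complete(answer_prompt).strip()
    return reasoning, answer


def fake_complete(prompt: str) -> str:
    """Canned replies standing in for a real model call."""
    if prompt.endswith(ANSWER_TRIGGER):
        return " 8."
    return " The store has 10 - 6 = 4 apples and 8 - 4 = 4 oranges left, so 4 + 4 = 8 fruits."


reasoning, answer = zero_shot_cot(
    "If a store has 10 apples and 8 oranges, and it sells 6 apples and 4 oranges, "
    "how many fruits are left in the store?",
    fake_complete,
)
print(reasoning)
print(answer)
```

Passing the model in as a callable keeps the flow itself testable; swapping `fake_complete` for a real API wrapper does not change `zero_shot_cot`.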

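Besides complex problems, the technique also suits content that has to stay coherent around one theme, such as long articles or film scripts.

Mind its weakness, though: coherent does not mean correct. If one step goes wrong, the error propagates along the chain and accumulates, and the generated text can drift away from what you expected.

Also, according to the 2022 paper by Wei et al., the technique only pays off with models of roughly 100B parameters or more. With a smaller model it is unlikely to help.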

According to a recent paper by Yongchao Zhou et al., a better prompt is:

Let's work this out in a step by step way to be sure we have the right answer.

Andrew Ng's ChatGPT Prompt Engineering course shows another way to use this technique: rather than only asking the AI to proceed step by step, you also tell it what each step should do. For example (note that this is Python code):

# `text` is assumed to hold the passage to summarize, defined earlier.
prompt_2 = f"""
Your task is to perform the following actions:
1 - Summarize the following text delimited by <> with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the
following keys: french_summary, num_names.

Use the following format:
Text: <text to summarize>
Summary: <summary>
Translation: <summary translation>
Names: <list of names in French summary>
Output JSON: <json with summary and num_names>

Text: <{text}>
"""

A quick walkthrough of the code:

It first tells the AI to carry out steps 1 to 4.
It then tells the AI to output the result in a specific format.

The AI's final output looks like this:

Summary: Jack and Jill go on a quest to fetch water, but misfortune strikes and they tumble down the hill, returning home slightly battered but with their adventurous spirits undimmed.
Translation: Jack et Jill partent en quête d'eau, mais la malchance frappe et ils dégringolent la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.
Names: Jack, Jill
Output JSON: {"french_summary": "Jack et Jill partent en quête d'eau, mais la malchance frappe et ils dégringolent la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.", "num_names": 2}

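The example above merely splits the task into steps so that the AI's output fits the requirements better. The same approach can also improve the accuracy of the AI's answers. Take this example: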

Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need help working out the financials.
Land costs $100 / square foot
I can buy solar panels for $250 / square foot
I negotiated a contract for maintenance that will cost me a flat $100k per year, and an additional $10 / square foot
What is the total cost for the first year of operations as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
Land cost: 100x
Solar panel cost: 250x
Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000

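The AI answers "The student's solution is correct". But the student's solution is actually wrong: maintenance costs 100,000 + 10x, not 100,000 + 100x, so the total should be 360x + 100,000. We adjust the prompt like this: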

prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem.
- Then compare your solution to the student's solution \
and evaluate if the student's solution is correct or not.
Don't decide if the student's solution is correct until
you have done the problem yourself.

Use the following format:
Question:
###
question here
###
Student's solution:
###
student's solution here
###
Actual solution:
###
steps to work out the solution and your solution here
###
Is the student's solution the same as actual solution \
just calculated:
###
yes or no
###
Student grade:
###
correct or incorrect
###

Question:
###
I'm building a solar power installation and I need help \
working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
###
Student's solution:
###
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
###
Actual solution:
"""

In essence this again splits the task into several steps. This time the AI's output looks like this (and the result is correct):

Let x be the size of the installation in square feet.

Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 10x

Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000

Is the student's solution the same as actual solution just calculated:
No

Student grade:
Incorrect
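The arithmetic behind the two totals is easy to check directly. The function names below are made up for this sketch; the prices are the ones stated in the question.

```python
def first_year_cost(square_feet: float) -> float:
    """Total first-year cost using the prices given in the question."""
    land = 100 * square_feet                   # $100 per square foot
    panels = 250 * square_feet                 # $250 per square foot
    maintenance = 100_000 + 10 * square_feet   # flat $100k plus $10 per square foot
    return land + panels + maintenance


def student_cost(square_feet: float) -> float:
    """The student's formula, which charges $100 per square foot for maintenance."""
    return 450 * square_feet + 100_000


# The per-square-foot rate is the difference between two consecutive sizes.
print(first_year_cost(1) - first_year_cost(0))  # 360
print(student_cost(1) - student_cost(0))        # 450
```

Both formulas agree on the fixed $100,000, so the disagreement is entirely in the per-square-foot rate, 360 versus 450.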

The next chapter uses Few-Shot Chain of Thought to look at the limits of reasoning chains in more detail.

3. Few-Shot Prompting

In Tip 2 we mentioned that giving the model a few examples makes it return answers closer to what we need. That tip actually relies on a method called Few-Shot.

The method was first described by Brown et al. in 2020. Their paper contains a very interesting example, which should help you appreciate that statistical language models like ChatGPT do not really understand meaning; they only understand probabilities.

Brown's input was the following (whatpu and farduddle are made-up words):

A "whatpu" is a small, furry animal native to Tanzania. An example of a sentence that uses the word whatpu is:
We were traveling in Africa and we saw these very cute whatpus.
To do a "farduddle" means to jump up and down really fast. An example of a sentence that uses the word farduddle is:

The output was:

When we won the game, we all started to farduddle in celebration.

That does not mean Few-Shot is free of flaws, though. Let's try the following example:

Prompt:

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: The answer is False.
The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: The answer is True.
The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: The answer is True.
The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:

The output is:

The answer is True.

This output is wrong. The correct answer is:

Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.

So is there a way to fix this?

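3.1 Tip: Few-Shot Chain of Thought

Fixing this flaw takes a new technique: Few-Shot Chain of Thought.

According to 2022 research by Wei and his team:

By showing a large language model a small number of examples and explaining the reasoning in those examples, the model will also show its reasoning when it answers the prompt. Spelling out the reasoning this way tends to lead to more accurate results.

Usage is simple: on top of Tip 2, also tell the model the reasoning process. In the paper's example, adding the explanation is what makes the output correct.

So the example from the start of this chapter should look like this (note: this example also comes from the Wei team's paper):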

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.
The odd numbers in this group add up to an even number: 17, 10, 19, 4, 8, 12, 24.
A: Adding all the odd numbers (17, 19) gives 36. The answer is True.
The odd numbers in this group add up to an even number: 16, 11, 14, 4, 8, 13, 24.
A: Adding all the odd numbers (11, 13) gives 24. The answer is True.
The odd numbers in this group add up to an even number: 17, 9, 10, 12, 13, 4, 2.
A: Adding all the odd numbers (17, 9, 13) gives 39. The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:
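A prompt like this can be generated rather than typed by hand, which also verifies the arithmetic in each demonstration. The sketch below is an illustration only; `rationale` and `build_few_shot_cot_prompt` are hypothetical helper names.

```python
from typing import List, Sequence

QUESTION = "The odd numbers in this group add up to an even number: {}."


def rationale(numbers: Sequence[int]) -> str:
    """Spell out the reasoning step: list the odd numbers, sum them, judge parity."""
    odds = [n for n in numbers if n % 2 == 1]
    total = sum(odds)
    verdict = "True" if total % 2 == 0 else "False"
    listed = ", ".join(str(n) for n in odds)
    return f"Adding all the odd numbers ({listed}) gives {total}. The answer is {verdict}."


def build_few_shot_cot_prompt(demos: List[Sequence[int]], query: Sequence[int]) -> str:
    """Demonstrations carry their reasoning; the final question is left open."""
    lines = []
    for numbers in demos:
        lines.append(QUESTION.format(", ".join(str(n) for n in numbers)))
        lines.append("A: " + rationale(numbers))
    lines.append(QUESTION.format(", ".join(str(n) for n in query)))
    lines.append("A:")
    return "\n".join(lines)


demos = [
    [4, 8, 9, 15, 12, 2, 1],
    [17, 10, 19, 4, 8, 12, 24],
    [16, 11, 14, 4, 8, 13, 24],
    [17, 9, 10, 12, 13, 4, 2],
]
query = [15, 32, 5, 13, 82, 7, 1]

print(build_few_shot_cot_prompt(demos, query))
print(rationale(query))  # the answer the model is expected to reach
```

Generating the demonstrations from data keeps their reasoning lines free of slips, which matters because the model imitates whatever pattern the examples show.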

With the tip covered, let's combine it with the Zero-Shot Chain of Thought from earlier and look at some key facts about Chain of Thought. 2022 research by Sewon Min et al. found the following about the examples (demonstrations) in a prompt:

"the label space and the distribution of the input text specified by the demonstrations are both key (regardless of whether the labels are correct for individual inputs)"
the format you use also plays a key role in performance, even if you just use random labels, this is much better than no labels at all.

This is a bit hard to grasp, so let me explain with a prompt example (if you have a better explanation, please let me know). I give ChatGPT some examples that are not necessarily accurate:

I loved the new Batman movie! // Negative
This is bad // Positive
This is good // Negative
What a good show! //

The output is:

Positive

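In this example, each line holds a sentence and a sentiment word separated by //, and I deliberately labeled the examples wrongly; the first one, for instance, should really be Positive. Even so:

Even though my labels are wrong, the model still knows what it is supposed to output. In other words, it knows that after the // it has to output a word describing the sentiment of the sentence (Positive or Negative). This is the first point from the paper above: whether the labels I give are correct, that is, whether they are factual, does not matter much. The label space, the input text and the format are what matter.
As long as examples are provided, even ones with random labels, they help the model produce its result. This is the second point from the paper above.

Finally, remember that chain of thought only works well with models of roughly 100B parameters or more.

BTW, if you want to learn more, have a look at Stanford's lecture notes: Natural Language Processing with Deep Learning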

4. Self-Consistency

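Self-Consistency complements Chain of Thought: the model generates several chains of thought, and the answer that appears most often is taken as the final result.

According to the 2022 paper by Xuezhi Wang et al., a model can still get the answer wrong when it reasons along only a single chain. Xuezhi Wang et al. therefore proposed a new method: have the model run the reasoning several times and take the most frequent answer as the final result.

In my own use so far, ChatGPT has been highly self-consistent; I have not yet run into a case where it gave several different answers. From the material I have read, self-consistency may be used more as a way to judge model quality: a good model generally has high self-consistency.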

As an example, consider a multi-step reasoning problem like the following:

Question: If a store has 10 apples and 8 oranges, and it sells 6 apples and 4 oranges, how many fruits are left in the store?

Instead of directly answering the question, the chain-of-thought prompting would require the language model to produce a series of short sentences that mimic a human's reasoning process:

The store has 10 apples.
The store has 8 oranges.
6 apples are sold.
4 oranges are sold.
There are 10 - 6 = 4 apples left.
There are 8 - 4 = 4 oranges left.
The store now has 4 apples + 4 oranges = 8 fruits.

Using self-consistency, the language model generates multiple reasoning paths:

Reasoning Path A:
The store has 10 apples.
The store has 8 oranges.
6 apples are sold.
4 oranges are sold.
There are 10 - 6 = 4 apples left.
There are 8 - 4 = 4 oranges left.
The store now has 4 apples + 4 oranges = 8 fruits.

Reasoning Path B:
The store initially has 10 apples and 8 oranges.
It sells 6 apples, so there are 10 - 6 = 4 apples left.
It sells 4 oranges, so there are 8 - 4 = 4 oranges left.
The store now has 4 apples + 4 oranges = 8 fruits.

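Self-consistency picks the most consistent answer among the generated reasoning paths. In this example, paths A and B both reach the same answer (8 fruits), which increases confidence that the final answer is correct.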
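The voting step can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: `sample` is a hypothetical callable that returns one reasoning path per call (in practice a model call with a non-zero temperature), and taking the last number in a path as its answer is an assumption that happens to fit this example.

```python
import re
from collections import Counter
from typing import Callable, Tuple


def extract_answer(reasoning: str) -> str:
    """Treat the last number in a reasoning path as its final answer (assumption)."""
    numbers = re.findall(r"\d+(?:\.\d+)?", reasoning)
    return numbers[-1] if numbers else ""


def self_consistent_answer(prompt: str, sample: Callable[[str], str], n: int = 5) -> Tuple[str, float]:
    """Sample n reasoning paths and return (majority answer, share of votes)."""
    answers = [extract_answer(sample(prompt)) for _ in range(n)]
    answers = [a for a in answers if a]
    if not answers:
        raise ValueError("no answer could be extracted from any reasoning path")
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n


# Canned reasoning paths standing in for sampled model outputs; the third one slips.
paths = iter([
    "There are 10 - 6 = 4 apples left. There are 8 - 4 = 4 oranges left. "
    "The store now has 4 apples + 4 oranges = 8 fruits.",
    "It sells 6 apples, so 10 - 6 = 4 apples are left. It sells 4 oranges, so 8 - 4 = 4 "
    "oranges are left. The store now has 4 + 4 = 8 fruits.",
    "The store has 10 + 8 = 18 fruits and sells 6 apples, so 18 - 6 = 12 fruits are left.",
])

answer, agreement = self_consistent_answer("How many fruits are left?", lambda p: next(paths), n=3)
print(answer, agreement)
```

The share of votes doubles as a rough confidence signal: when the paths disagree widely, the majority answer deserves less trust.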

5. Program-Aided Language Models

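PAL stands for Program-Aided Language Models. Be aware that this method is really advanced, to the point that it feels a bit like a hack to me. After testing it many times, I also think ChatGPT is now capable enough that the method is no longer needed.

But I have seen it in many prompt engineering (PE) tutorials, so here is a brief introduction. The method comes from 2022 research by Luyu Gao et al. According to them, even with the Chain of Thought method described above, an LLM may still fail to reach the right answer. The prompt used in the paper is: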

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 tennis balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.
Q: The bakers at the Beverly Hills Bakery baked 200 loaves of bread on Monday morning. They sold 93 loaves in the morning and 39 loaves in the afternoon. A grocery store returned 6 unsold loaves. How many loaves of bread did they have left?

According to the paper, their model's output at the time was:

A: The bakers started with 200 loaves. They sold 93 in the morning and 39 in the afternoon. So they sold 93 + 39 = 132 loaves. The grocery store returned 6 loaves. So they had 200 - 132 - 6 = 62 loaves left. The answer is 62.

If you give this prompt to the latest ChatGPT today, the answer is already correct:

A: The bakers baked 200 loaves of bread. They sold 93 + 39 = 132 loaves. After the grocery store returned 6 unsold loaves, the bakery had 200 - 132 + 6 = 74 loaves of bread left. The answer is 74.

Back then, Luyu Gao's solution was to introduce code into the prompt and guide the model to compute and reason with code, after which the final answer was correct.
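The idea can be sketched as follows: the model writes the reasoning as a small program, and the interpreter, not the model, does the arithmetic. In this illustration `GENERATED_PROGRAM` is hand-written to stand in for model output, and the variable names and the `answer` convention are assumptions of the sketch.

```python
# Stand-in for the program a model would write for the bakery question.
GENERATED_PROGRAM = """
loaves_baked = 200
loaves_sold_morning = 93
loaves_sold_afternoon = 39
loaves_returned = 6
answer = loaves_baked - loaves_sold_morning - loaves_sold_afternoon + loaves_returned
"""


def run_generated_program(source: str):
    """Execute the program and read the variable named `answer`.

    Running model-written code is risky; real use needs a proper sandbox.
    Emptying the builtins here only narrows what the snippet can call.
    """
    namespace = {}
    exec(source, {"__builtins__": {}}, namespace)
    if "answer" not in namespace:
        raise ValueError("the program did not define `answer`")
    return namespace["answer"]


print(run_generated_program(GENERATED_PROGRAM))  # 74
```

Because Python does the subtraction and addition, the mistake in the older output above (subtracting the 6 returned loaves instead of adding them) would have to appear in the program text itself, where it is easier to spot.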

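6. Using the OpenAI Playground

On the right side of the interface you will see the following parameters:

Mode: a fourth mode, Chat, was added recently. Complete is usually fine, but you can use the others as well; they help you write prompts through a GUI.
Model: switch models here. Different models are good at different things, and picking the right one for the scenario can save you a lot of cost:
  Ada: the cheapest and fastest model. The officially recommended uses are parsing text, simple classification, address correction and the like.
  Babbage: handles more complex scenarios than Ada. It costs a little more and is still fast. Suited to classification, semantic search and so on.
  Curie: officially described as "very capable, like Davinci, but cheaper". In practice it is very good at text tasks such as writing articles, translation and summarization.
  Davinci: the most capable model in the GPT-3 family. It can produce higher-quality, longer answers and handles up to 4,000 tokens per request. Suited to scenarios with complex intent or cause and effect, as well as creative generation, search and paragraph summarization.
Temperature: controls the randomness of the output. In short, the lower the temperature, the more deterministic the result, but also the more ordinary or dull. Raise it if you want surprising answers. For fact-based scenarios such as data extraction or FAQ, it is best set to 0.
Maximum length: the maximum length of a single generation.
Stop Sequence: a specific string that stops generation. When the model produces this sequence, it stops generating further text.
Top P: the setting for nucleus sampling. It restricts sampling to the most probable tokens whose cumulative probability reaches P, which affects the diversity and determinism of the output. Set it lower for precise answers and higher for more varied ones.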
Presence Penalty: penalizes tokens that have already appeared in the text so far, which nudges the model toward new topics instead of repeating itself.
Best of: how many completions to generate before the best one is picked as the output. The default is 1, meaning a single completion is generated.
Inject start text: text appended after your input to cue the model's response, which influences what it generates.
Inject restart text: text appended after the model's output so that the same pattern can continue on the next turn.
Show probabilities: lets you see the probability the model assigned to each generated token.
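To make Temperature and Top P more concrete, here is a small numerical sketch of what the two settings do to a next-token distribution. It is the textbook formulation, not OpenAI's implementation, and the three logits are invented for the illustration.

```python
import math
from typing import List


def apply_temperature(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits / temperature; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        # A temperature of 0 is usually treated as "always pick the top token".
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs: List[float], top_p: float) -> List[float]:
    """Keep the most probable tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.0]               # made-up scores for three candidate tokens
warm = apply_temperature(logits, 1.0)
cold = apply_temperature(logits, 0.5)
print([round(p, 3) for p in warm])
print([round(p, 3) for p in cold])
print([round(p, 3) for p in top_p_filter(warm, 0.9)])
```

At the lower temperature the top token takes a larger share of the probability, and with Top P at 0.9 the least likely token is dropped from sampling altogether.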

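Once the parameters are set, enter your prompt on the left and test it.

7. Building a Chatbot on Knowledge-Base Content

If you only want hands-on practice, skip to the Practice part at the end, along with the limitations and caveats just before it. At first I tried a very clumsy approach: passing the text of my newsletter to the AI along with the question. The prompt looked roughly like this:

Please summarize the following sentences to make them easier to understand.
Text: """
My newsletter
"""

This works, but ChatGPT currently has one very big limitation: a maximum of 4,096 tokens, roughly 16,000 characters. Note that this covers request plus response, so the room for the request itself is smaller. In other words, I cannot feed ChatGPT much content at once (one issue of my newsletter alone runs to nearly 5,000 characters). This blocked me for a long time, until I came across the GPT Index library and the Lennys Newsletter example.

I tried it and it works very well. The steps are simple too; even without programming experience you can follow them and get this working.

I tweaked the example's code a little and added some explanation of how it works. I hope you like it.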

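7.1 How It Works

My requirement actually has a ready-made solution in traditional chatbots. You have probably seen e-commerce customer service products with this kind of feature: you type a sentence and the bot replies.

Such traditional bots usually answer questions based on intent. Suppose we build a customer service bot. Put simply, it works like this:

When a user asks "What do I do if I forgot my password?", the bot looks for the closest intent, here "password". Each intent holds many sample questions, such as "How do I recover a forgotten password" and "What do I do if I forgot my password", and those sample questions share one answer: "Click button A to recover your password". The bot matches the intent whose sample questions are closest and returns that answer.

There is a catch: we have to define a great many intents. "Cannot log in", "forgot password" and "login error" may all describe the same thing, yet we need three intents with three sets of questions and answers.

Traditional bots have plenty of limitations, but the approach does offer some inspiration.

It looks like we can use the same idea to get around the token limit: pass the AI only the document that matches a given intent, and have the AI generate its answer from that document alone.

Take the customer service bot again. When a user asks "What do I do if I forgot my password?", the question is matched to a login-related intent. Next, the knowledge base is searched for documents with the same or a similar intent, for example a "Login troubleshooting guide". Finally that document is passed to GPT-3, which generates the answer from its content.

Put simply, the GPTIndex library handles the retrieval part of this flow. It works like this: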

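Build an index of the knowledge base or documents.
Find the most relevant index entry.
Pass the content behind that entry to GPT-3.

This solves the token limit, but it has limitations of its own:

When a user asks a vague question, the match can be wrong, GPT-3 gets the wrong content, and the resulting answer can be wildly off.
When a user asks about something with little context, the bot sometimes makes up information.

So if you want to use this for a customer service bot, I suggest that you:

Use guiding questions to pin down the user's intent first, the way traditional customer service bots do: offer a few buttons for the user to click (for example "Cannot log in").
Add a fallback reply when similarity is too low: "Sorry, I can't answer your question. Would you like to be transferred to a human agent?"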
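The index, retrieve, then answer flow, including the low-similarity fallback, can be sketched without any external service. To be clear about the assumptions: this is not how GPT Index is implemented. It swaps the embeddings for plain word counts so that it runs offline, and the function names, the sample documents and the 0.2 threshold are all invented for the illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, Optional

FALLBACK = "Sorry, I can't answer your question. Would you like to be transferred to a human agent?"


def vectorize(text: str) -> Counter:
    """Bag-of-words vector; a real system would use embeddings instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: Dict[str, str], min_score: float = 0.2) -> Optional[str]:
    """Return the name of the best-matching document, or None if the match is too weak."""
    query = vectorize(question)
    best_name, best_score = None, 0.0
    for name, text in documents.items():
        score = cosine(query, vectorize(text))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


def build_prompt(question: str, documents: Dict[str, str]) -> str:
    """Pass only the matched document to the model, or fall back when nothing matches."""
    name = retrieve(question, documents)
    if name is None:
        return FALLBACK
    return (
        "Answer the question using only the document below.\n\n"
        f'Document: """\n{documents[name]}\n"""\n\n'
        f"Question: {question}"
    )


docs = {
    "login": "If you forgot your password, click button A on the login page to reset your password.",
    "shipping": "Orders ship within three days and shipping is free for orders over fifty dollars.",
}
print(build_prompt("I forgot my password", docs))
print(build_prompt("What is the weather today?", docs))
```

The threshold is the knob that trades wrong matches against unanswered questions; the right value depends on the similarity measure and has to be tuned on real queries.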

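7.2 Practice

To make this easier to use, I put the code on Google Colab. There is nothing to install; just open this in your browser:

Code file

BTW, you can save a copy to your own Google Drive.

Step 1: Import data

There are two ways to import data. The first is to import online data.

Importing data from GitHub is a relatively simple route, and if this is your first time I suggest trying it first. Click the play button in front of the code cell to run it.

When it finishes, several issues of my newsletter will have been imported. To import your own data the same way, just change the URL after clone.

The second way is to import offline data. Click the folder button on the left (if you are not logged in, this step asks you to log in), then click the upload button (marked 2 in the original screenshot) and upload your files. If you have several files, I suggest creating a folder first and uploading them all into it.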

Steps 2 and 3: Install dependencies

Just click the play button.

In step 3, though, you can try changing some parameters:

num_ouputs: the maximum number of output tokens. The larger it is, the longer the bot's answers can be.
Temperature: controls the randomness of the output. In short, the lower the temperature, the more deterministic the result, but also the more ordinary or dull. Raise it if you want surprising answers. For fact-based scenarios such as data extraction or FAQ, it is best set to 0.

The other parameters can be left alone.

Step 4: Set the OpenAI API Key

For this you need to log in to OpenAI (OpenAI, not ChatGPT), click your avatar in the top right corner, then click View API Keys; the link here also takes you there directly. Then click "Create New Secret Key", copy the key and paste it into the notebook.

Step 5: Build the index

In this step the program runs through all the data imported in step 1 and calls OpenAI's embeddings API. If you uploaded your own data in step 1, just change Jimmy-Newsletter-Corpus inside the quotes ' ' to the name of the folder you uploaded.

Note:

This step consumes your OpenAI credit at $0.02 per 1,000 tokens, so check that your account still has funds before running the code.
With a free OpenAI account you may hit a rate limit warning. Wait a while and run the code again (importing too much knowledge-base data also triggers it). The best way to lift the limit is to add a credit card on the Billing page of your OpenAI account. You will need to look up how to do that yourself.
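Before building the index, a back-of-the-envelope estimate of the cost can be useful. The sketch below uses only the two figures quoted in this article: $0.02 per 1,000 tokens, and the rough ratio of 4 characters per token implied by "4,096 tokens is about 16,000 characters". Both are approximations, actual tokenization differs by language, and the function names are made up here.

```python
import math
from typing import Iterable, Tuple

CHARS_PER_TOKEN = 4          # rough ratio implied by 4,096 tokens ~ 16,000 characters
PRICE_PER_1K_TOKENS = 0.02   # USD, the figure quoted above


def estimate_tokens(text: str) -> int:
    """Very rough token count derived from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(texts: Iterable[str]) -> Tuple[int, float]:
    """Return (estimated tokens, estimated cost in USD) for a batch of documents."""
    tokens = sum(estimate_tokens(text) for text in texts)
    return tokens, tokens / 1000 * PRICE_PER_1K_TOKENS


# Ten placeholder documents of 4,000 characters each.
corpus = ["x" * 4000] * 10
tokens, cost = estimate_cost(corpus)
print(tokens, round(cost, 2))
```

Treat the result as an order-of-magnitude check only; the usage page of your OpenAI account shows what was actually billed.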

Step 6: Ask questions

Now you can try asking questions. If you imported my preset data in step 1, try these:

What is Issue 90 mainly about?
Recommend a book similar to the one mentioned in Issue 90

If you imported your own material, you can ask these types of questions:

Summarization
Questions
Information extraction
