The Problem with GPT-3 Reporting
I’ve recently seen a massive number of articles about GPT-3, on Medium and elsewhere. I even wrote one. The language model is a significant development in AI, so it’s only natural that writers want to share their excitement with the world.
Here’s the problem: the ability of GPT-3 — namely the quality of its writing — is often exaggerated by published samples. In fact, there are not one, but two filters keeping the AI’s worst results from wide dissemination.
Selection bias wouldn’t be a problem if any interested reader could access the GPT-3 API and make their own observations of its ability. However, access is currently severely limited. (AI Dungeon is often used to test GPT-3 by those of us without the full version, but its creator has recently outlined ways backdoor access to GPT-3 is being prevented.)
When reporting — and I use that term in its broadest possible interpretation to mean any writing about GPT-3 — is the only source of public information, selection biases ought to be considered in our understanding of the product. Here, I outline the obvious bias, and a less-obvious bias which exacerbates the issue.
1. Writing samples are selected for quality
Say I’m writing an informative piece on GPT-3. I want to demonstrate that it can put together coherent strings of sentences, so I give it a prompt and examine the output.
If I don’t like what I see, I’m likely to try again with a slightly different (perhaps longer) prompt. Even if I’m not actively selecting particular sentences that suit the purpose of my article, massaging the output creates a biased sample of writing that is not representative of GPT-3’s overall quality.
In the context of creating a narrative about the AI, it makes sense to showcase its best work rather than a fair representation of its limitations. This is the first problem.
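This regenerate-until-satisfied workflow is a classic best-of-k selection effect, and its size is easy to see in a toy simulation. The sketch below is purely illustrative (the uniform "quality score" is an assumption standing in for any rating of an output, not real GPT-3 data): a writer who quietly takes the best of five generations publishes samples whose average quality far exceeds the model's true average.

```python
import random

random.seed(0)

def generate_sample():
    """Stand-in for one model generation: a quality score in [0, 1].
    (Illustrative assumption, not a real GPT-3 metric.)"""
    return random.random()

def published_quality(attempts):
    """A writer regenerates `attempts` times and publishes only the best output."""
    return max(generate_sample() for _ in range(attempts))

# True average quality across all generations, published or not.
true_mean = sum(generate_sample() for _ in range(10_000)) / 10_000

# Average quality of what readers actually see (best of 5 tries).
published_mean = sum(published_quality(5) for _ in range(10_000)) / 10_000

print(f"true mean quality:      {true_mean:.2f}")
print(f"published mean quality: {published_mean:.2f}")
```

With uniform scores, the best of five tries averages about 0.83 against a true mean of 0.50, so even mild "massaging" inflates the apparent quality substantially.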
2. The cooler the article, the more views
Consider the case where something does get written about a function GPT-3 cannot perform. It might be a list of writing fails, or code that doesn't compile.
To me, that wouldn’t be an interesting piece, and I suspect it wouldn’t intrigue others either. I’m sure Tweets, Reddit posts, and longer articles detailing GPT-3’s unexpected failures are out there, but the fact of the matter is they’re not getting read.
On the surface, this doesn’t seem like a problem. It definitely isn’t necessary to read about everything that GPT-3 can’t do. The real problem is when positive results are favoured over negative ones for the same task. For example, if someone reported positive results for getting GPT-3 to write a legal document, this would undoubtedly receive more attention than an instance where the AI fails to generate a coherent document.
In essence, the way GPT-3 reporting currently works is analogous to running scientific trials without pre-registration. Publication bias, where statistically insignificant results don’t get published, can cause absurd findings to be accepted as solid research.
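The publication-bias analogy can be made concrete with another small simulation. Suppose, purely as an assumption for illustration, that GPT-3 succeeds at some hard task only 5% of the time, but only the successes are interesting enough to write up. Readers who see only the published attempts infer a success rate of 100%.

```python
import random

random.seed(1)

TRIALS = 1000
SUCCESS_RATE = 0.05  # assumed true rate: the task fails 95% of the time

# Each trial: did this attempt at the task succeed?
results = [random.random() < SUCCESS_RATE for _ in range(TRIALS)]

# Only successes get written up and shared; failures go unpublished.
published = [r for r in results if r]

actual_rate = sum(results) / len(results)
apparent_rate = sum(published) / len(published)  # what readers see

print(f"actual success rate:   {actual_rate:.0%}")
print(f"apparent success rate: {apparent_rate:.0%}")
```

The filter is total: no matter how low the true success rate, the published record shows nothing but successes, which is exactly why unregistered trials can make a weak effect look like solid research.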
To be clear, I don’t think there is an imperative for writers to publish more negative results from GPT-3. There is, however, an obligation to contextualize samples with the way in which they were generated and how many negative results were obtained in the process.
After all, human selection of an AI's output (whether at the level of individual pieces of writing or of how the larger body of work gets consumed) is a combination of our intelligence with that of a computer program, and that's a beautiful thing.
Translated from: https://towardsdatascience.com/the-problem-with-gpt-3-reporting-93c7b5b58400