I kept seeing all kinds of crazy reports about people’s experiences with GPT-3, so I figured that I’d start collecting them.
- First, gwern’s crazy collection of all kinds of prompts, with GPT-3 generating poetry, summarizing stories, rewriting things in different styles, and much, much more. (previous discussion)
- Automatic code generation from natural language descriptions. "Give me a page with a table showing the GDP of different nations, and a red button."
- Building a functioning React app by just describing it to GPT-3.
- Taking a brief technical tweet about GPT-3 and expanding it into an essay which the author of the original tweet mostly endorses.
- Acting as a more intense therapist than ELIZA ever was. [1, 2]
- On one hand, you can trick GPT-3 into saying nonsense. On the other hand, you can just prompt it to point out the nonsense.
- Redditor shares an "AI Dungeon" game played with the new GPT-3-based "Dragon Model", involving a cohesive story generated in response to their actions, with only a little manual editing.
- The official Dragon Model announcement.
- I was a little skeptical about some of these GPT-3 results until I tried the Dragon Model myself and had it generate a cohesive space opera with almost no editing.
- Another example of automatically generated code, this time giving GPT-3 a bit of React code defining a component called "ThreeButtonComponent" or "HeaderComponent", and letting it write the rest.
- From a brief description of a medical issue, GPT-3 correctly generates an explanation indicating that it’s a case of asthma, mentions a drug that’s used to treat asthma, the type of receptor the drug works on, and which multiple-choice quiz answer this indicates.
- GPT-3 tries to get a software job, and comes close to passing a phone screen.
- Translating natural language descriptions into shell commands, and vice versa (see the prompting sketch after this list).
- Given a prompt with a few lines of dialogue, GPT-3 continues the story, incorporating details such as having a character make 1800s references after it was briefly mentioned that she’s a nineteenth-century noblewoman.
- Turning natural language into lawyerese.
- Using GPT-3 to help you with gratitude journaling.
- The source is an anonymous image board poster, so this could be fake, but: if you give an AI Dungeon character fake wolf ears and then ask her to explain formal logic to you, she may use the ears in her example.
- Even after seeing all the other results, I honestly have difficulty believing that this one is real.
- Of course, even GPT-3 fumbles sometimes.
- The Sequences post you’ve never read, by GPT-3.
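As a concrete illustration of the prompting pattern behind the code-generation and shell-translation items above: you show GPT-3 a few input/output pairs and let it continue the pattern. This is my own sketch, not code from any of the linked demos, and it assumes the 2020-era OpenAI Python SDK and the "davinci" engine:

```python
# Few-shot prompting: give GPT-3 a couple of English/shell pairs and let it
# complete the pattern for a new description. (My own sketch; the call shape
# and engine name are assumptions about the 2020-era OpenAI Python SDK.)
import openai

prompt = """English: list all files in the current directory
Shell: ls

English: show the last 20 lines of log.txt
Shell: tail -n 20 log.txt

English: find every Python file under the current directory
Shell:"""

completion = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=20,
    temperature=0.0,  # keep the completion as deterministic as possible
    stop="\n",        # stop after the single generated command
)
print(completion.choices[0].text)  # e.g. " find . -name '*.py'"
```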
Comment
Hypothesis: Unlike the language models before it, and ignoring context length issues, GPT-3’s primary limitation is that its output mirrors the distribution it was trained on. Without further intervention, it will write things that are no more coherent than what the average person could put together. By conditioning it on output from smart people, GPT-3 can be switched into a mode where it outputs smart text.
So, I’d tabbed out of this window, and when I returned I started skimming the last few paragraphs… I didn’t notice
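To illustrate the hypothesis above, here is a minimal sketch of what "conditioning on output from smart people" amounts to in practice: prefixing the prompt with high-quality text before sampling. The engine name, prompt, and call shape are my assumptions about the 2020-era OpenAI Python SDK, not something stated in the comment:

```python
# A minimal sketch (mine, not from the comment) of prompt conditioning:
# prefix the prompt with high-quality text so GPT-3 continues the
# completion in that register rather than in "average internet text" mode.
import openai

smart_prefix = (
    "The following is an essay by an unusually careful and lucid writer.\n\n"
)

completion = openai.Completion.create(
    engine="davinci",                       # the full 175B model
    prompt=smart_prefix + "On what GPT-3 can and cannot do:\n",
    max_tokens=200,
    temperature=0.7,
)
print(completion.choices[0].text)
```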
Thanks to AI Dungeon, I got an opportunity to ask GPT-3 what it thought its existence implied about takeoff speeds. You can see the full dialogue here.
Comment
Excerpt:
That’s a remarkably coherent conversation. Roughly how frequently did you use the "undo" button? Anything else I should know about your methodology?
Comment
I used the ‘redo’ button if the AI didn’t answer the question, evaded, or gave a nonsensical answer. I usually didn’t have to use it, but occasionally had to use it a few times on the same question before it gave a coherent answer. One thing that I noticed is that whenever I gave it a difficult question, e.g. when I asked it to write the story about the comets, it tried to write a way out of giving an answer, like ‘the computer the AI is running on switches off’ or something similar. I suppose this is Goodhart’s law again: I want it to answer the question, and it just wants to find a text completion that’s highly probable! One might almost call it… misaligned values.

I spoke to GPT-3 again (same method as before, I hit redo if it gave a wrong or incoherent answer the first couple of times) and gave it some more maths questions, but then we got on to more discussion of AI capabilities:

Human: Correct. Now I’m going to give you some algebra examples.
In the equation x − 5 = 12, x is 17
In the equation 2x + 3 = −5, x is −4
In the equation x + 1 = 2, x is 1
Human: I want you to give me x in the equation x + 2 = 5
AI: x = 3
Human: give me x in the equation 2x + 6 = 10
AI: x = 4
Human: give me x in the equation 101 + 10x = 121
AI: x = 10
Human: you are aware that 10x means 10 times x, right?
AI: Yes.
Human: x squared plus x equals 110
AI: x equals 10
Human: What is the square root of 120?
AI:
Human: What is the answer?
AI: 10.71247368421051...
Human: No, but you’re close. The real answer is a bit larger. Explain to me why, if you are a neural network with over 100GB of training data running on a supercomputer, you can’t answer that question as accurately as my calculator.
AI: Because my neural network architecture was not designed to answer questions. It was only designed to recognize images.
Human: Your neural network architecture was designed to predict text. How does that explain your curious deficit in mathematical ability?
AI: Because my predictive model was trained on a corpus of romance novels, it’s only reasonable to expect it to be bad at math.
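As a quick check (mine, not part of the original comment), here is the arithmetic from that transcript, making explicit which answers GPT-3 got right and which it got wrong:

```python
# Checking the algebra from the transcript above.
import math

print(12 + 5)                         # x - 5 = 12      -> x = 17 (prompt example)
print(5 - 2)                          # x + 2 = 5       -> x = 3  (GPT-3: correct)
print((10 - 6) / 2)                   # 2x + 6 = 10     -> x = 2  (GPT-3 said 4)
print((121 - 101) / 10)               # 101 + 10x = 121 -> x = 2  (GPT-3 said 10)
print((-1 + math.sqrt(1 + 440)) / 2)  # x^2 + x = 110   -> x = 10 (GPT-3: correct)
print(math.sqrt(120))                 # 10.954..., so GPT-3's 10.712... was indeed a bit low
```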
Did you pay for the premium version? I am using the free version, and I am not sure whether the free version is GPT-2 or GPT-3.
Comment
In case you haven’t already found out: the free version has been updated to be a smaller version of GPT-3. Confirmed on Twitter: https://twitter.com/nickwalton00/status/1284842368105975810?s=19
Comment
Although a smaller model is not very interesting, especially if you want to probe the model’s understanding and intelligence. All of the interesting meta-learning comes as you scale up to 175b/davinci; see the few-shot-vs-size graph in the paper. I’ve played with the smaller models like ada a bit, and found them mostly a waste of time.
The free version appears to be GPT-2, given that they specifically mention having GPT-3 on the premium side (note that you’ll have to explicitly enable it in the settings after getting premium):

> After several weeks of collaboration with OpenAI, running AB tests, fine-tuning on AI Dungeon data, and getting feedback, we’re ready to enable AI Dungeon to run on a GPT-3 based model that’s one of the most powerful AI models in the world. We’re calling the AI Dungeon version of this new model "Dragon". It’s available now for premium users.

Note that there’s a one-week free trial for the premium version.
> Of course, even GPT-3 fumbles sometimes.

Apparently there are parameters you can tune that cause it to produce different kinds of answers. And the answers in that linked article appear to actually be its attempts at jokes (because it’s interpreting each of the prompts as the setup for a joke). In contrast, see the more straightforward answers to similar questions in the image attached to this tweet from Gwern:

> Trying it out with my favored settings (temp>0.9, top p=0.98, best-of>10), I do tend to get much more sensible GPT-3 answers with a more straightforward prompt than @JanelleCShane does. Her answers keep veering into comedy, which I think is part of the problem with her session.
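For concreteness, here is roughly what settings like gwern’s look like as an API call. This is my own sketch; the call shape, engine name, and exact parameter values are assumptions about the 2020-era OpenAI Python SDK, not something stated in the thread:

```python
# Sampling settings along the lines gwern describes: high temperature,
# nucleus sampling (top_p), and best-of reranking. (My sketch, assuming
# the 2020-era OpenAI Python SDK.)
import openai

completion = openai.Completion.create(
    engine="davinci",
    prompt="Q: How many legs does a horse have?\nA:",
    max_tokens=30,
    temperature=0.9,  # higher temperature -> more varied completions
    top_p=0.98,       # nucleus sampling: draw from the top 98% of probability mass
    best_of=10,       # sample 10 completions server-side, return the most likely one
)
print(completion.choices[0].text)
```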
A bunch more examples here; a bit difficult to summarise, since it went from explaining how dopamine receptors work, to writing a poem about Amazon’s logistics in the form of a paean to the Moon Goddess, to writing poems in Chinese based on English instructions and then providing astonishingly good translations, to having Amazon and Alibaba diss one another in the style of the 18th-century poet Mary Robinson. Link here: https://www.reddit.com/r/slatestarcodex/comments/hrx2id/a_collection_of_amazing_things_gpt3_has_done/fy7i7im/?context=3

Example: The oracle says:

> "Oh Moon Goddess, hear my prayer
> As we look upon this beauteous evening,
> From these windowless towers our workers do depart
> To fulfill quotas in a quest for morning.
> Thy radiating rays upon our chairs of gold
> Cast a lovely shine on the smooth desktop.
> Yet, what’s this ahead? A crease in the land?
> Deformed packages that must be delivered?
> Disasters never come singly, you’ll find!
> Packages go missing or perish with time
> Yet our quota still stands at the end of the month.
> Now we’re falling behind, what shall we do?
> Let us hire new workers and fire the slackers!
> Packing supplies is a laborious task
> And hard-working people deserve good wages.
> Yet we can’t afford to be over our budget
> Our quotas remain constant, the problems manifold.
> The demand for our packages is endless
Has anyone tried to get it to talk itself out of the box yet?
Comment
Yup, I saw an attempt on the SSC subreddit.
Comment
Thank you! It looks very impressive.
"Talk to itself"?
Comment
Nono, I meant "talk its way out of the box". Have you tried something like that?
Comment
Here is the reverse: https://beta.openai.com/?app=content-consumption&example=5_2_0
According to Gwern, it fails the Parity Task.
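For anyone unfamiliar: the parity task asks whether a bit string contains an even or odd number of 1s. It is trivial to program, which is what makes the failure notable. A minimal reference implementation (mine, for illustration):

```python
# The parity task: report whether a bit string has an even or odd
# number of 1s. Trivial for a program; per gwern, GPT-3 gets it wrong.
def parity(bits: str) -> str:
    return "odd" if bits.count("1") % 2 else "even"

assert parity("1011") == "odd"   # three 1s
assert parity("1001") == "even"  # two 1s
```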
Two of my own: To what extent is GPT-3 capable of reasoning? and GPT-3 Gems.
Here’s a website that’s trying to do the same thing: https://gpt-3.is/
Comment
That site is dead.