I'm deeply suspicious of that number. It came from Similarweb, who track these things through analytics gathered from browser extensions.
I trust this article more: https://www.nytimes.com/2023/02/03/technology/chatgpt-openai...
"But two months after its debut, ChatGPT has more than 30 million users and gets roughly five million visits a day, two people with knowledge of the figures said."
"Two people with knowledge of the figures" is journalism speak for "I heard this off the record from people with insider info, and I'm ready to report it because those two different sources provided the same number".
I used it to help brainstorm talk titles and abstracts for a talk I was proposing the other day. What I ended up submitting was entirely written by me but was heavily influenced by the ChatGPT conversations.
https://til.simonwillison.net/macos/sips - I used it to figure out how to convert webp to PNG on macOS, and learned about an entirely new built-in command.
I often use it as a thesaurus - "what's a good word / term for X?"
I'm self-employed and a journalist asked me for my job title, which I don't have. So I brainstormed some ideas with ChatGPT.
I pasted in the output of a SQLite "explain query plan" query and asked for an explanation - which helped me figure out enough to write a section of this TIL: https://til.simonwillison.net/sqlite/subqueries-in-select
This is just from the past few days.
I play DnD with my friends and I’m usually the dungeon master. I use ChatGPT to help me world build, and flesh out details.
Don’t imagine asking ChatGPT what should happen in the next session. More like asking for options for the name and title of a non-player character. Then it writes options, I twist them up, combine them and select the one I like the best.
I can even ask more complicated questions like “what was so and so’s first innovation and how did it help their village? Provide 5 options” and then chatgpt goes and does that. Maybe I like one, and then that is canon from then on, or maybe while I am reading them I get an even better idea.
Basically I use it as a bicycle for my creativity. And in that use case I care 0% if what it says is true, much more that it comes up with wild things. It also doesn’t have to be totally consistent, since what it outputs is just a first step in an editing process.
For example I did know that one of the main cities in my world have grown from a sleepy village into a bustling university town because two wizzards started a friendly competition between them. And then with the help of ChatGPT I have iteratively expanded that core idea into this backstory of the city: https://docs.google.com/document/d/19dea6p9WuLcZIRVX2ecYMw8W...
Of course still relying on google in the background, but increasingly rarely, and presuming all the negative commentary we've been seeing online are folk who simply haven't tested it in anger yet. Today's chatgpt hallucination is yesterday's Google blogspam etc. Folk for some reason continue to act like the old world was perfect. This is much closer to perfection than anything we ever had, and infinitely more comprehensive. Google as we knew it is already dead, because the medium google was built for just got made obsolete. This is far closer to a new Internet iteration (WAIS, FTP, Gopher, HTTP, Web2.0, ...) than it is a new search engine
Now watch as the search engines try to adapt it to their recency-biased ads model and fail miserably, as what we have is already better than what they were able to sell. Very unclear bing or Google or anyone you've heard of will win this round, its suddenly a very exciting time in tech again
Another aspect I find very exciting is that these effectively represent a return to a curation-driven Internet, selection of input data for model training is probably an interesting new form of diversification. Who cares about having a site in the world wide web if its not part of the inputs for the language models used by millions of users? That's a completely new structure for the dissemination of ideas, marketing, "SEO" etc., and a brand new form of mass media
And then you can run searches using the standard terms, which gives better results, and also when writing code have more- informatively-named variables and better-structured data.
They tried instead putting these RFPs through ChatGPT, and they were blown away by the responses they got. Of course, the responses still need to go through a thorough edit and review process, but that was also true when humans were writing the first draft.
He told me that ChatGPT obviated a couple people's jobs, with the added bonus that the turnaround time between receiving a proposal and sending a response was much faster.
I've used ChatGPT for programming aid. I've started writing some Python packages. I haven't written Python in a long time, it doesn't "flow" easily for me. ChatGPT has been helpful here for scaffolding some code.
It often gets things wrong -- but I know enough to recognize when it's gone off the rails, and then nudge it in the right direction.
A concrete example: I wanted to do an iterative breadth-first traversal of a tree. I asked ChatGPT to produce it. It produced a correct implementation, albeit a recursive one. After being reminded that I wanted an iterative version, its second attempt was the right thing.
This is a pretty small thing, I guess! But for me, it was neat to be able to specify something at a higher level and have the computer sort out the details.
My favorite has been to feed it some slab of text and let it generate synopsis and then bullet points and then you can expand on those bullet points.
You feed it some prompts and let it expand on a topic.
My main use has been to generate multiple choice quizzes on a variety of topics for my students.
ChatGPT does a very good job 90% of a time and if some quiz is off, you just let ChatGPT make some more.
If need be you can generate code in oh about 20-30 programming languages. Vast majority of it will be ideomatic too. Sure Copilot (which is GPT-2 or GPT-3 now?) could do some of the tasks as well but the generality of ChatGPT is astounding.
Now, it is not going to write a big application for you but ChatGPT could generate large parts of the code.
So far the only weakness (besides the rare hallucination) I've found ChatGPT tends to make up word endings in less popular human languages when translating from English. That is probably due to how tokenization is done.
Has anyone been able to have ChatGPT generate gramatically "incorrect" English? I am talking about regular prompts (not rap or dialects).
ChatGPT helped point something obvious out that I had totally missed in my original problem solving.
> Can someone tell me what the hell they use ChatGPT for?
Yes, I can help explain what ChatGPT is used for. ChatGPT is a large language model developed by OpenAI that can generate human-like responses to text-based prompts. It has been trained on a vast amount of text data to understand the nuances of language and can be used for a wide range of natural language processing (NLP) tasks, such as:
1. Text generation: ChatGPT can generate new text in response to a prompt, such as writing a story, composing a poem, or generating product descriptions.
2. Language translation: ChatGPT can translate text from one language to another, making it a useful tool for language learning or communication between people who speak different languages.
3. Chatbot development: ChatGPT can be used to create chatbots that can engage in conversations with humans and provide helpful responses to their inquiries.
4. Text classification: ChatGPT can be used to classify text into categories, such as sentiment analysis, topic modeling, or identifying spam emails.
5. Question-answering: ChatGPT can be used to answer questions posed in natural language, such as providing information on a topic or answering customer support inquiries.
These are just a few examples of the many use cases for ChatGPT. It has the potential to revolutionize the way we interact with technology and make it easier for people to communicate with each other across language barriers.brainstorming funny/catchy slogans: not all are winners, but since it can crank out dozens almost immediately, I can pick what I like and quickly modify them in the time it takes me to think of one or two independently. As soon as I verify they aren't ripoffs of existing material, I may use one or two.
Writing poetry - it helped me to write sonnets, and further modified them to specifications. The recipients were quite impressed.
Translating existing poetry of mine into Arabic, while retaining the meaning AND rhyming in Arabic, a feat which is extremely difficult for me
Writing a business plan to my specifications that was actually useful
Writing letters to a landlord to get out of a lease
In addition, I have run my own fiction through it and had it rewrite it relatively convincingly in the styles of Lee Child, Danielle Steele, and Dashiell Hammett. That is more for fun, but I can see uses for it.
Lastly, I have attempted to use it to determine guilt in an investigation where I had already determined the guilty party, to see how close it was to replacing me. The answer it gave was wrong, but I could see that this was because of user error and it is only a matter of time.
Tallest people in US - filter cannot answer personal characteristics off limits.
What number come up most often playing the lottery - I do not have that information
show me a list of 100 different ...- 10 results..
It seems to hate polite. Please give me.. NO vs give me NOW here you go
It is not useful for me. I ask it programming questions and hate the output.. or know where they got the output and can see they missed key steps.
I feel like I know what it will answer and it's mostly surface level answers.
For people who don't want a conversation and can find the information quicker the hype doesn't add up. Im fairness tiktok bores me.
The confidently wrong is simply because it is not human (even though many humans do the same); it is in fact not confident; you just attribute that to it because of the language it uses. In fact, it feels nothing confident or otherwise; it just gives you crap and then you can ask it to think about it again.
Like I was asking it something and it told me the answer was 11, then it explained why the answer was 11 but at the end of the explanation it concluded the answer was 12. So I asked what it is 11 or 12; it said 12 and 11 was a mistake. Which was indeed correct. Again, that is not unlike humans at all; humans often say something with confidence and after a big of pondering, they correct themselves. Difference is, humans actually do feel confident or unsure etc.
You write and then you fine tune and fix; that is what it does. But faster and cheaper than humans. You engage in a conversation about the problem; it generates code, you check the code and tell what’s wrong, it will generate again with fixes etc. Especially with boring things like structural transformations aka a large json doc transformed to another large json doc with all kinds of operations in between, chatgpt is simply a lot faster than I would be, even with errors. Just paste a two json doc and ask typescript types and code for both to transform them one to another. Then add operations by tell on which nodes and what they should do. Ask to make jest tests.
I use it like a turbo-assistant.
Can you make this <long text> more succinct?
Convert the following list into a LaTeX enumerated list. Please put a period at the end of each item.
Please convert all of the metric numbers using the siunitx package.
Convert the following into a Table using booktabs syntax.
It's really helpful in doing a lot of grunt work.
Write me a few paragraphs describing <x> (Yes. I have to check it to make sure it hasn't gone insane, but it spews workable prose for shit I don't want to write).
I also use it to generate utility functions for stuff I can't be bothered to write, like find the next valid IP address in a list of IPs.
I also used it a few times to learn about the history of countries like china and cambodia.
Even a few time when I've been banging my head against the wall with some faulty code, I asked it what was wrong and it told me exactly.
The key with Chat is that you need to always validate any statements of fact (or code) - as hallucination is a pretty consistent 30% or so of the queries I send it. Google is probably used 75% of the time when I need to go double check a fact to make sure it's actually based on reality.
Is much faster than googling or digging through blogs, gives me direct answers instead of having to dig through forums or documentation to find the piece of information I’m looking for, and there’s no ads or other garbage.
It’s become my sort of first line of defense looking for information. Imperfect but often pretty good.
It can summarise (covert freeform text to json etc), it can expand (convert these bullet points in to content) and it can transpose (rewrite this python as java).
It's a starting prompt for fiction and a research time saver for non fiction. It's a multipurpose tool. Functionally it's also a UI if you write it as hardcoded reply formats.
It's also the better english to finnish translation than gtranslarw. Also copywriting as certain genres are highly repetitive.
I don't blindly trust it's output, but it saves me a ton of time in handling the to me bs extra stuff by filling in the edges.
I'm using it to learn French. I'm using it for figuring out if my book idea makes sense. To tell me how Typescript works, or how it compares to languages I already know. I use it to compare products I'm interested in, to make educated guesses where comparable products are manufactured in.
It's not as smart as my colleagues, but much smarter than a rubber duck, and it has a mountain of data behind it.
It changes everything and brings amazing potential to the table.
Although it's free, I pay $20 for pro version ($240 per year) plus taxes, and use it daily. I get a lot of benefits from using it.
I use it to learn about things, solve problems, suggest approaches, critique my own proposals and approaches, generate code scaffolding and smaller code solutions, help me draft emails of all kinds, etc. I find it highly useful in a variety of contexts. You can give it obfuscated impossible code and it can analyze it and tell you what it does in seconds: https://imgur.com/a/m40TR4d (someone else's result)
It can help you find bugs and mistakes in your own code.
You can also ask it to tell you about a subject and it can give you a summary. Just tell it what you want and it'll do its best.
What areas did you use it where you got wrong results for basic things, to the point where you don't find it useful? Its major limitations are around logical numeracy (it gets numbers wrong) and lack of a visual cortex, which means you can't use it for graphics code or to write you visually correct solutions. Also, it doesn't speak foreign languages perfectly, it makes some grammatical mistakes.
I asked chatgpt about what people use it for and it gave these answers: https://imgur.com/a/qzUF5Ya
It mentions that it can generate a hypothesis. So a scientist can absolutely use it to make some suggestions, for example try "Generate five hypotheses a chemist might test as part of an undergraduate study program" - here are some examples: https://imgur.com/a/hOtGgKN
I'm no chemist, but those seem fine for me as undergraduate lab work tests. It's probably not going to get you a Ph.D. but often you don't need one, just a few quick brainstorming suggestions.
Some people have it plan all their meals and create recipes for them, which they then cook and eat. There are thousands of recipe sites, the reason people use ChatGPT is because they can just describe what they want, what they have, and have it come up with its own recipes based on what is available and can be purchased.
Just describe what you need and what you want it to do and it does a good job for you on all sorts of tasks.
https://www.twitch.tv/polyvoxlive
It's not 100% flawless, but for this task, it is exceptionally good (in particular, it makes fewer mistakes than the davinci models).
Can you give examples about what and how you asked and what it said?
BTW How you ask is the lion’s share of making it useful to you
For eg, I like to scribble things And then ask gpt model to generate an outline, summarize something, make it more impactful, sound authoritative etc. this does help me sound better since English is my second language but I do verify the output.
You just ask it: provide a list of idioms/turn of phrases equivalent to <this situation>
And it will give you everything it knows about. It's great for exactly these types of queries which are hard to google (if you try, you just get blogspam)
Also, ChatGPT writes simple unit tests and SQL queries very well.
It also simplified a JQ thing I had built, and it also told me how to structure the data how I wanted, within that filter.
>Write a job listing for a front end engineer with 3 years in vue and mention some bullshit about how innovative $company_name is and how great it is to work a $company_name.
Two pages of corporatese and got a bunch of good responses.
I’m less suspicious. Anecdotally, I’ve compared SimilarWeb on a few low-traffic sites of mine to the results according to an open source analytics tool and SimilarWeb got surprisingly close. They call it their "proprietary dataset".
As a side-note, I suspect that their sources include more than just browser extensions or it wouldn’t be so accurate for small sites. Couldn’t they buy data from autonomous systems or internet exchanges and extrapolate from that while correlating IPs with demographics? They only report rough estimates so SSL wouldn’t be a problem for their analytics.
This (and other open-source AI models), not 'ChatGPT', 'DALLE-2', etc is what will change the AI landscape for everyone, permanently forever.
https://github.com/laion-ai/open-assistant being built in the open already. you can contribute too.
please also notice that the article you linked is about the text classifier of the frontend and not the LLM itself
From the graph (above) linked by the top comment in your [2], I'm wondering whether this demonstrates more anti-conservative bias than liberal bias, or whether the alternative meanings of conventionally conservative versus conventionally liberal words dictate the frequency of a flag.
For instance, "Republican" means a variety of things around the world, but "Democrat" is far more likely to indicate the US Democrat party (which is frequently misstated as the "Democratic party"), or a national Democrat party in general. People would tend to write "I'm a democrat" to assign their membership to the party, whereas they'd say "I'm democratic" to assign their leanings toward the system. But "I'm a republican" means both.
GPT3/4 is simply too expensive for consumer GPUs, any open sourced versions will have to run on A100s in the cloud, so by nature centralized. Granted, having multiple providers also counts as removing the moat.
But BLOOM for example (An attempt at replicating GPT3), no one actually uses. Because its simply too expensive for inferior performance to GPT3
DALLE2 was disrupted, because
1. OpenAI at the time was dumb enough to put a waitlist on something that costed money. They didn't make the same mistake with ChatGPT.
2. Stable Diffusion was not only open sourced, but heavily heavily optimized in parameter count compared to alternative models, making it viable on consumer GPUs.
We will have a ton of bullshit at scale. And the web will be done for.
Which open source alternatives to GPT-3 have you seen that most impressed you?
I've not yet found any that are remotely as useful as GPT-3, at least for the kinds of things I want to use them for (generating SQL queries from human text, summarizing text, that kind of thing)
Where are these numbers coming from? An 80GB A100 GPU is certainly more than capable of hosting a 1.5B GPT. We were running 774M on rinky-dink cards back in 2019 for our inference purposes.
I don’t understand how they went from talking about 175B params across 32 cards to 774M on one card. 175B divided by 32 is 5.4B.
In fact, I’m not sure what they’re saying in general. They seem to be confusing data parallelism with model parallelism with memory fragmentation, while namedropping a bunch of training techniques.
The hard part of ChatGPT isn’t the size. It’s the training process. It took a small army of contractors rating outputs as good or bad. Once that dataset gets replicated, we can start talking about size. Hopefully LAION will deliver.
>I don’t understand how they went from talking about 175B params across 32 cards to 774M on one card. 175B divided by 32 is 5.4B.
They claim 774M is the size of GPT-L which if run in conjunction with their RLHF would require 80GB A100 GPU to train (using their RLHF PyTorch implementation). They additionally claim that training GPT-3(175B params) plus RLHF would need 64 * 80gb = 5120gb of memory if using PyTorch implementation of RLHF or 32 * 80gb = 2560gb if going Colossal AI route.
To be honest, these numbers do look to me to be a bit of a cheesy ad for their product but hey they need to put food on their table too. I'm not sure if the dataset would be such a huge problem otherwise Britannica would still be ahead of Wikipedia. Given an army of volunteers willing to produce it OpenAI brigade of contractors has no chance.
Is LAION starting a community project to rate model outputs? I didn't see anything on their site.
Or 32 3090 cards, which would run you under $40k total.
The thing that makes ChatGPT interesting (over regular GPT-3) is the RLHF process, but this article doesn't seem to touch on that at all, unless I've missed something.
-> https://medium.com/walmartglobaltech/the-journey-of-open-ai-...
GPT-4 is around the corner, and it's allegedly 100x more powerful than it'd predecessor.
-> https://medium.com/geekculture/gpt-4-100x-more-powerful-than...
The only benefit of a specialized architecture is minimizing resource usage.
I really think we should stick to Nick Bostrom’s (or pls fix attribution) term “intelligence explosion”
But back to what I’m really saying here: “Generative AI eruption” is a mouthful whereas “intelligence explosion” is concise.
Do I just run it and speak to it in terminal? Or what is the next step to make it useful for search or more?