This feels a lot like crypto, where everyone is very excited about a new technology that very few people really understand, and is jumping on the bandwagon without asking any questions.
It's also very much like crypto in that for every one person doing something useful with it, there are 20 trying to exploit the newness of the tech and the general public's low comprehension of it, such as:
- Trying to cash out on a ChatGPT wrapper company
- Creating the nth "AI powered custom chat bot but for x vertical"
- Using it to cheat on school assignments or interviews
- Gluing together as many different "AI" services as possible to create a no touch business and sell low effort products
I'm not saying the company will go bankrupt, but I'm also not buying into the hype that it's going to become the next Google or better, or create AGI for us all.

What am I missing here?
I've got a monitor dedicated 100% of the time to ChatGPT, and I interact with it non-stop as technical scenarios and troubleshooting situations flow in to me - working in areas where I have the slimmest of backgrounds, and shutting down, root-causing, and remediating issues that have been blocking others.
I've essentially got 15-20 high-priced, world-class consultants in every field that I choose to pull from, working at my beck and call, for $20 a month? I would pay $200/month in a heartbeat out of my own pocket, and I would probably ask the company to pay ~$2,000/month for my workflow.
I think if they never released another product, and they just managed to penetrate the market with their existing offering, they are easily a $100B+ company once they nail down how to monetize.
The difference between LLMs and Crypto is I can point to roughly 200-300 objective solutions over the last 9 months where ChatGPT resolved an issue and delivered clear value for me alone. And, over time, as you learn how to control for hallucinations, and manage your query patterns a bit more - the value has continued to increase.
Those same multiple-times-a-day, high-value, persistent experiences were never part of my crypto experience.
I keep reading responses like yours, but I haven't seen any specific examples of problems being solved, so it all sounds very abstract. In my interactions with ChatGPT, it felt like just interacting with a search engine. There was zero continuity between questions and responses, and nearly 100% of the responses contained incorrect information.
Edit 1: As an example, I just asked ChatGPT to implement the Warren Abstract Machine for me. It gave me two different implementations, both with caveats that they are simple examples and not the whole thing, and neither implementation even type checked. It feels just like reading someone's homework where they copied off of someone else's work and had no idea what's going on. I don't see the point in this if it's just going to give me some high-level idea, which I already have, and an implementation that isn't remotely complete, let alone one that runs. "Additionally, you'd need to ... handle backtracking". You don't say, ChatGPT?
Edit 2: I've kept on asking it to implement things I already know how to implement, even things for which it probably has my code in its training data (it was on GitHub), and it keeps giving me code that doesn't even typecheck. ChatGPT is doing just what I've always imagined it doing: a statistical merge. It has zero concept of anything I'm asking or anything it's saying.
Edit 3: I asked it something regarding WebGPU. It gave the typical "oh, that's complex" and "Here's a simplified example using GLFW and WebGPU through the F# WebGPU bindings provided by FableGL". But FableGL isn't a thing that exists, and even if it did, it wouldn't have anything to do with WebGPU, which in turn has nothing to do with OpenGL. And it imported Fable.Import.WebGPU and Fable.Import.GLFW, neither of which exists.
I mean, this is literally all smoke and mirrors. It boggles my mind when I hear people say they successfully use it every day. I haven't ever gotten it to tell me anything remotely correct.
I've been doing dev for ~20 years now; most of the work out there is essentially plumbing events/databases/REST APIs/regexes/JSON/configs/web UIs in existing and well-established frameworks/tools to solve business problems.
You likely work in more R&D-style environments, which, again, probably represent a very low percentage of devs out there.
Also, although I haven't seen it before, the Warren Abstract Machine seems like too big a job for GPT-4. It excels at smaller tasks such as "Convert this C code to idiomatic Rust", or "Write a SQL query that grabs records from X, joins with Y, filters by Z". You might need to make small adjustments manually or by saying "Rewrite it with this in mind".
One really neat trick it did recently is that I uploaded a chart png I found on the internet, and asked it to write Seaborn code that matched the chart's style. It took ~3 follow-up tweak prompts, but then spat out code that worked. It also handles things like "How do I adjust this label that's positioned in chart coordinates over by 10 pixels?", which is an absolute pain to figure out from the docs.
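For reference, the "nudge a label by 10 pixels" part has a compact answer in Matplotlib (which Seaborn is built on): annotate accepts an offset expressed in pixels. A minimal sketch, not the commenter's actual code:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# The label is anchored at the data point (2, 4), then shifted 10 pixels
# to the right: 'offset pixels' interprets xytext as a pixel offset
# rather than a data coordinate.
ax.annotate("peak", xy=(2, 4), xytext=(10, 0), textcoords="offset pixels")

fig.savefig("chart.png")
```

The same `textcoords` argument also accepts `'offset points'` if you want a DPI-independent nudge.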
Especially since you're getting answers noting the complexity, I think you're just asking too much of it for now. Try smaller tasks, and wait a while before trying the big ones again.
A mundane one: I have a list of lat-longs mapped to region codes, and I needed to write a query to find the area of the convex hull of each region code. I knew how to do it in code, but I wanted a SQL answer. ChatGPT gave me correct Redshift SQL.
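The commenter got Redshift SQL; for comparison, here is what "do it in code" might look like as a stdlib-only Python sketch of my own (it treats lat-longs as planar coordinates, which is a reasonable approximation for small regions but ignores the curvature a real geography type would handle):

```python
from collections import defaultdict

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    if len(vertices) < 3:
        return 0.0
    s = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def hull_area_by_region(rows):
    """rows: iterable of (region_code, lon, lat) tuples."""
    by_region = defaultdict(list)
    for code, lon, lat in rows:
        by_region[code].append((lon, lat))
    return {code: polygon_area(convex_hull(pts)) for code, pts in by_region.items()}
```

In SQL land this maps onto grouping by region code and applying the hull and area functions per group.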
A more involved one: I needed a maximal matching of people to groups, where people can rank up to 3 choices, but as soon as they accept a match, they drop out of the pool. Plus a bunch of other constraints. Sounds like a stable marriage problem. ChatGPT proposed the Gale-Shapley algorithm, which was exactly the one I was looking for.
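For reference, deferred acceptance (Gale-Shapley) adapts naturally to groups with capacities - the hospitals/residents variant. This is a generic sketch of my own; it does not model the commenter's extra constraints, such as people dropping out of the pool the moment they accept:

```python
def gale_shapley(people_prefs, group_prefs, capacities):
    """Deferred acceptance with group capacities.

    people_prefs: {person: [groups in preference order]} (may be partial, e.g. top 3)
    group_prefs:  {group: [people in preference order]}
    capacities:   {group: max members}
    Returns {person: group} for everyone who could be matched.
    """
    rank = {g: {p: i for i, p in enumerate(prefs)} for g, prefs in group_prefs.items()}
    next_choice = {p: 0 for p in people_prefs}   # index of next group to propose to
    matched = {}                                 # person -> group
    members = {g: [] for g in group_prefs}       # group -> currently accepted people

    free = list(people_prefs)
    while free:
        p = free.pop()
        prefs = people_prefs[p]
        if next_choice[p] >= len(prefs):
            continue                             # exhausted their (e.g. top-3) list
        g = prefs[next_choice[p]]
        next_choice[p] += 1
        if p not in rank[g]:
            free.append(p)                       # group would never accept p
            continue
        members[g].append(p)
        members[g].sort(key=lambda q: rank[g][q])
        if len(members[g]) > capacities[g]:
            worst = members[g].pop()             # bump the least-preferred member
            matched.pop(worst, None)
            free.append(worst)
            if worst != p:
                matched[p] = g
        else:
            matched[p] = g
    return matched
```

With capacities all set to 1 this reduces to the classic stable marriage algorithm.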
- Helping to refactor SQL
- writing jq commands (I simply cannot)
- writing shell code (it happens just infrequently enough that I can't justify spending the time to get good)
- brainstorming names or puns. Word association is easier with a second person (or an LLM)
- figuring out why my AWS CLI commands aren't doing what I'm expecting them to be doing
- asking for reasons why a piece of code wouldn't work
I can competently do all of these things on my own if I have to, but now I don't have to do them on my own, so my life is easier.
I think it's worth stepping back here and re-examining the hurdles you're setting for your own understanding.
The essential question here is: is ChatGPT useful to people? What you seem to be implying with your question is: "I will not use ChatGPT unless it can solve problems of X level of difficulty for me." Why have you set that prerequisite? Would it not still be useful to you if it simply increased the efficiency of solving simple day-to-day tasks that you're not blocked on?
This is because the client in question is in a regulated industry.
This week, we used it a bunch of times to help rephrase objectives during brainstorm sessions, letting it absorb ridiculously large regulatory PDFs to give us summaries and ideas about what our greenfield project should focus on.
Things I’ve gotten value out of in the past week or two:
• Writing a job description
• Making a python script to automate some stuff in Asana
• Simplifying some management concepts so I could slack them to a coworker
All of these are things I could easily do myself. But with ChatGPT, they're done 75% as well in 10% of the time, and I hardly have to think at all.
This year I used GPT-4 to write a significant amount of Terraform that was necessary to migrate an application onto AWS.
Writing Terraform, in my opinion, is a problem that's broad but shallow. GPT-4 needed to do little beyond summarize documentation, but it was able to do so competently, and that was hugely valuable to me.
Conversely: In my free time, I've attempted to use it for a game-development side-project, and very little of its output has been useful at all.
I was also able to use it to generate software to query the blockchain and gather data I needed, again with no experience using the blockchain client libraries.
I could have looked up the individual parts and found the problems myself, but then I would have spent at least 15 minutes on it instead of 30 seconds.
1) I needed to display a very high resolution image to a user. I have experience in GIS and imagery, so I knew I should use an image pyramid in some way, and from previous experience assumed I needed a server to cleverly serve the tiles. But I didn't want to implement it myself, and googling 'map server' led to rabbit holes. I consulted ChatGPT, and while it gave me several fake solutions, eventually it suggested using gdal2tiles.py to create the pyramid and then serving it directly with a CDN. That had never occurred to me, and it's a much better fit for this problem. It saved a LOT of time (on either building my own server or fudging with other solutions).

2) I have a streamlit service, and needed to use some of my infra inside it, infra that was written using async-await. Unfortunately, streamlit and asyncio don't play nice together (boo streamlit). I went to ChatGPT hoping it would find a way to make it work anyway, and after trying everything it suggested (which failed), I tried googling myself, and spent several days without a solution. Eventually, I went back to ChatGPT and it suggested building a small HTTP service that would serve the results, and accessing it in streamlit using requests, no async-await required. It's a hacky solution, but significantly faster than reimplementing my streamlit dashboard in another framework or rewriting my infra without async-await. It saved loads of time.
I think you'd agree these aren't junior-level issues. ChatGPT definitely didn't solve every problem I came to it with, probably not even most, and even when it did, I had to intervene significantly. I feel the more experienced you are as a developer, the less valuable it is. But when you need to tinker in a field you're not proficient in, or when you need to brainstorm a solution to a tricky problem, it can be a great tool. I understand why many swear by it. It takes a while to learn what sort of issues are good to bring to ChatGPT and which aren't, and also how to phrase those issues.
I had searched all over the web for an example addressing my specific use case and couldn't find one. GPT-4 produced a working example for me and got me past that roadblock. I also use it regularly to suggest better coding patterns, and I find it does a really good job at code reviews for obvious mistakes / anti-patterns.
So - I just reached out to Chat, and we started going back and forth, starting with "In Linux, how many open sockets can there be at a time?". What's nice about that wide-open question is that you don't get a single answer; instead you get a briefing on Linux sockets: file descriptors, sure, but also memory, port range, TCP/IP stack limits, etc. It starts to lay out a roadmap toward solving the issue, answering the question you were interested in rather than the one you asked.
I do a bit of back and forth on some scenarios, asking about /proc, ss, etc., seeing if I can turn anything else up. And then, after spending about 5 minutes and building context, I ask it "Is there anything else that can cause an error about too many sockets despite low socket use" - at which point it lays out a number of scenarios, one of which is FD_SETSIZE.
So - we dig into FD_SETSIZE, and immediately that looks interesting - it's the limit on the file descriptors you can use with a select() call, and, even better, I get recommendations to use poll or epoll (which anyone who has ever straced a process has seen a ton of).
I ask it how to determine FD_SETSIZE, discover it's 1024 on the client, which matches our low socket count, confirm that we should never increase FD_SETSIZE, check the vendor code, see they've got it hard-coded to talk with select() instead of poll() - we recommend they give us a new build with poll() defined - and voila - problem goes away.
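For context: FD_SETSIZE caps how high a file descriptor select() can watch (typically 1024), while poll()/epoll() have no such limit. Python's selectors module makes the same switch the vendor was asked to make - DefaultSelector picks the best mechanism available (epoll on Linux). An illustrative sketch of my own:

```python
import selectors
import socket

def watch_with_best_mechanism() -> bytes:
    """Register a socket and wait for readability using DefaultSelector,
    which prefers epoll/poll over select() and so avoids the FD_SETSIZE cap."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    try:
        sel.register(a, selectors.EVENT_READ)
        b.sendall(b"ping")                 # make the watched socket readable
        events = sel.select(timeout=1.0)   # returns [(SelectorKey, events)]
        (key, mask), = events
        return key.fileobj.recv(4)
    finally:
        sel.close()
        a.close()
        b.close()
```

On Linux, `selectors.DefaultSelector` resolves to `EpollSelector`, so thousands of descriptors are no problem; code written against raw `select.select()` hits the 1024-descriptor wall exactly as the vendor's client did.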
On to the next issue.
Where Chat excels is not in solving or completing anything, or, in fact, even in being correct 100% of the time (it clearly isn't) - it's an endlessly enthusiastic savant intern - frequently wrong in hilarious ways, but always willing to dig in, pull on technical threads, and come up with things I can try in a bunch of rapid iterations to close off an issue. Its willingness to write code that is 90% correct reduces the time and cognitive load of constantly having to do it all yourself.
The main argument is that you can give it a block of real-world data, like an email or code, and take advantage of collective knowledge to identify outliers like bugs, bad grammar, or incoherent writing - and that translates directly to code semantics too.
I can also feed it a list of web accessibility issues and have it sort the list into most/least critical, along with providing references to the specific WCAG criteria. It has occasionally stumbled on this task, but again, massive time savings.
I also use ChatGPT for filler text in designs. Yes I have to build on what it writes, but it's way better and quicker than what I'd do myself. I know our communications team is doing this x10 more than I am.
I use ChatGPT (GPT-4) and Copilot every day. It is an "average intern" at many, many things. Here's how I feel it helps me:
* Its interactivity lets me learn a lot (superficially) about new topics. I can then expand that with research of my own
* It helps me think outside the box or from other perspectives when thinking about e-mails, proposals, real-world scenarios
* When exploring a new language, framework or technology, it points me in the right direction.
* For quick scripts, using the code generation/analysis feature, if I direct it right (i.e. lay out "the plan" beforehand and ask it to work the rest out on its own), it gets a lot of it right pretty fast, saving me some time writing the code, figuring out the right libraries, and the nitty-gritty details.
* It is great at giving ideas for why something might not be working.
Real things I've done with it:
* Discuss ongoing negotiations with clients, trying to better my proposal and better understand the client's point of view.
* Learn more about managerial or "business-y" topics, by allowing me to discuss things with it and iterate on that with my own research. It is a valuable "white board" to discuss with.
* Adjust my e-mails so they are more appropriate to the situation. This can involve changing the tone, shortening them, adding more detail, etc.
* In general, I've used it to find flaws in reasoning when dealing with people. For example, it has helped me question my own client proposals or approaches by pointing out where I was lacking (e.g. because I was vague or pessimistic, didn't give a measurable objective, seemed to ignore a client's request, etc.)
* I use a command-line utility from the shell which lets me ask it to do something and then have it do it. I now use this with some frequency to just write the commands I would otherwise have to google because I haven't memorized them. Things like ffmpeg or imagemagick arguments. Or combinations of grep, sed, ls, find, git, etc. Here are some examples:
i) "merge jpgs in this folder in order onto a pdf. each jpg should fill the page 100% without margin. Avoid complex calculations."
ii) "zip this folder with maximum compression and password 12345678"
iii) "git remove branches matching pattern sprint-* remotely"
iv) "use imagemagick to convert to jpg lossless and rotate 90 deg clockwise ~/home-en.png"
v) "Add _en before .jpg in all files in current directory. So they become _en.jpg"
vi) The list goes on and on...
* It has helped me clean up nginx config files
* I have thrown code at it which I suspect has a bug. With a bit of context and some back and forth, it has helped me find it.
* In languages or frameworks I don't use often, it really shines. It has helped me write several applescript scripts which I have also cobbled together to create Alfred workflows. If I need to code something in a language I don't often use, what it produces is good enough for me to iterate on.
* It has helped people at our company improve their copywriting (when used with a lot of care)
* I have used it to help me critique my own poetry and improve how I approach poetry in general. Highly subjective, I know
* When trying to figure out how to use apps I don't often use, or dealing with unexpected behaviour in them, it often helps me find the issue. Notable apps in this category include Photoshop and Excel.
* I don't often do frontend, so I'm particularly bad at styling and organizing things. When I occasionally have to do frontend, it often gives me the initial "skeleton" very well.
I have seen many people try to use these tools, and here's where I think they SHOULD NOT use them:
* For facts, obviously -- which is unfortunately what many people actually try to use it for
* For writing (without checking) most of your e-mails or posts, especially with very little context
* For ideas where you need "just enough creativity". It's a very fine line. Think brainstorming ideas for UI elements in a new website according to specific brand guidelines
* For incredibly specific details or overly complex tasks. You often can't have it write your 200-line function out of thin air!
It is clear that GPT provides me with:
1. A better search engine, which skips past loads of outdated content or SEO-laden bullshit. It's just a shame that I can only use it this way for more specific "creative" or "problem-solving" questions. For important fact-based info, I always have to check what it says.
2. A partner with whom to discuss ideas and iterate over them, especially on topics I don't know. If anything, it's a great rubber-duck
3. A way to forget the underpinnings of some of what I do and approach it with natural language. I find myself asking it to write my bash sequences instead of thinking them up myself. Of course I check them and understand 99.9% of them, but it's just so much easier to have it do it for me.
4. (Copilot is an upgrade of my IDE. And a great one!)
If modern Google (and Bing and everyone else?) weren't so shit, I wouldn't need 1. To this day I still first go to google, eventually give up because it doesn't answer me properly and then go to GPT. It's ridiculous. I think 1. brings a lot of value to many many people. Google is so absolutely shit nowadays I can't believe it -- and the more I use GPT to get a comparison baseline, the more I am shocked. 2., 3. and 4. are incredible additions to my workflow.
I truly believe we are heading in the direction of interacting with computers via a new interface. Even writing code might eventually be affected. Perhaps, like we have built high-level languages on top of lower-level languages and machine code, abstraction over abstraction, we might end up with some form of language writing which uses (very well controlled) LLM technology under the hood. Instead of directly writing the code, we tell it to write it. We still need to be smart to pick the right algorithms and structures, but no longer have to worry so much about writing the nitty-gritty syntax or details. Maybe?
Using GPT is its own art. Crafting the right prompts and getting a feel for how it works is very important. It is also essential to have critical-thinking and know when to question what it says and when not to.
A friend of mine told me this: "The companies who thrive on hiring loads of shit people to do work are the ones who will suffer more. The others, which have hired smart people capable of critical thinking, will benefit. GPT obviates the work of many shit people. Pair GPT with someone who is already smart and has very good problem-solving or critical-thinking skills, and now you've got a team that can obliterate teams of many mediocre resources."
I don't know if OpenAI will succeed or not, but I do know this technology is absolutely life-changing for many, and going back to a world where it isn't this accessible would likely be a net negative.
I suspect it is because I tend to have off-piste questions which are not recipes or high-level ideas. That is where it'd be most useful to me, but it is at the same time where the least training data is going to be.
It is also where actual legitimate experts are most useful: the special sauce, not the meat and potatoes.
It would have taken me much more time to figure out than it took ChatGPT on the first try.
TL;DR: don't ask it anything involving logic, but for anything to do with documentation and the like, it's pretty good.
I mean it. I'd pay money to watch streams of someone using chatgpt to solve non-trivial problems.
I remember the first time I watched a Netflix developer livestreaming their workday using an impressive neovim setup. It was eye opening.
I need the same experience for this ChatGPT fanaticism.
the goal here is to accelerate productivity, is it not?
It's written at least 20 python scripts with me, and almost all of them have been close to perfect on the first draft.
Two interesting things it did for me lately: insisted that 8 + 6 = 11 and also proved that P = NP. I don't know which solutions it may be providing you, but it can't be anything too complex, or at least not too abstract.
edit: typo
I think the "GPT is amazing" vs "GPT is useless" debate is just going to get more confusing as more versions are released.
LLMs are not good at search, math, encyclopedic recall, or acting as logic engines. Maybe some day they will be, but not yet.
For example, "I'm setting up a new TypeScript + Svelte app, I made some changes to the config that I thought were good, but when I try to run the dev server, I get this error." And then paste in your incomprehensible ten line error.
Any time you are using some software tool that you're not very familiar with, and you get an error message that you don't understand, try asking GPT4 to explain it to you.
It isn't so great at deep, theoretical, algorithmic questions. "NP reductions" are probably not a great fit.
Think of it as a research assistant that has a broad understanding of every technology in the world, but isn't as smart as you are about your specific area of expertise.
It can be very helpful to guide you in a direction maybe you didn't consider looking into to begin with, but I don't fully trust it.
There was a fun example I had a few months ago where I wanted to see how well it would do with being asked to solve a problem iteratively instead of recursively, something like: "generate all valid, distinct permutations of a provided string given that you have a dictionary to check valid words against" and it got most of the problem correct, but when asked to fix anything it would go right back to a recursive solution with the same issue appearing, or in some cases a new issue.
It got me most of the way there, with some edge cases I needed to handle myself, but it definitely seemed like that was as far as it was going to be able to go.
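For reference, the iterative version of that task is straightforward with an explicit stack. This sketch is my own (treating the "dictionary" as a set of valid words), not what GPT produced:

```python
def dictionary_permutations(s: str, dictionary: set[str]) -> set[str]:
    """Iteratively generate all distinct permutations of s that appear in
    `dictionary`, using an explicit stack instead of recursion."""
    results = set()
    stack = [("", s)]          # (prefix built so far, letters still unused)
    while stack:
        prefix, remaining = stack.pop()
        if not remaining:
            if prefix in dictionary:
                results.add(prefix)
            continue
        seen = set()           # skip duplicate letters at this position
        for i, ch in enumerate(remaining):
            if ch in seen:
                continue
            seen.add(ch)
            stack.append((prefix + ch, remaining[:i] + remaining[i + 1:]))
    return results
```

The stack plays exactly the role the call stack plays in the recursive version, which is why models tend to drift back to recursion: the two are mechanically equivalent.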
The real issue will be that the majority of users will take the answer at face value without even knowing whether that answer is a good one or just nonsense.
"The AI said..." will create lots of issues. Time will tell.
So for an MBA major, 80% accuracy at 5% of the cost may be amazing but for me as an engineer and a person who cares, the inaccuracies are catastrophic.
I’m open to suggestions on how to work with this.
So, just like an expert consultant?
Seconded.
Let me tell you what I used to do.
First, imagine I have an error executing code or an error running some bit of 3rd-party software. I go to Stack Overflow and search. I find posts related to my problem and I spend a great deal of time trying to shoehorn existing answers to my specific issue. Sometimes, my shoehorning works and I fix my problem. Other times, it doesn't work, and then I post on Stack Overflow myself. And I wait ... and wait ... for a response. Sometimes I get a response.
Now, when I have this type of problem, I tell ChatGPT, "Hey, I'm trying to do <xyz> and I'm getting this error <abc>. Help me troubleshoot this." And it almost always helps me fix my problem. And it's ~10× faster than Stack Overflow.
=============
Second, there are times when I have to write code to do some relatively 'complex' data manipulation--nothing sophisticated, mind you, but stuff like, "I need these data columns rearranged based on complicated logic. And I need the text in columns A, X, AQ, and F merged, but only if <blah blah is true>. Otherwise, just merge the text in columns A and AQ, except if the date in column ZZ is after January 1, 2019." I can do this stuff on my own, but: a] it's cognitively draining, b] it takes time, c] I often make silly errors due to the complexity.
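As one plausible reading of that column logic (the column names, the precedence of the date exception, and the `condition` flag are illustrative guesses, not a real spec), the row-wise code might look like:

```python
from datetime import date

def merge_text(row: dict, condition: bool) -> str:
    """Sketch of the conditional column-merge described above.
    `row` maps column names to values; `condition` stands in for
    the "<blah blah is true>" check from the comment."""
    if row["ZZ"] > date(2019, 1, 1):     # the date-based exception wins
        return row["A"]
    if condition:
        return " ".join(row[c] for c in ("A", "X", "AQ", "F"))
    return " ".join(row[c] for c in ("A", "AQ"))
```

It's exactly this kind of fiddly branch-per-column logic where a transcription slip is easy to make by hand, which is the commenter's point.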
ChatGPT is, again, an order of magnitude faster than I am. And it makes _fewer_ errors.
It still makes errors. And I still have to know what to look for to catch those errors, but it decreases my cognitive load tremendously.
edit: I haven't used Stack Overflow in 6 months. And "Googling" is Plan B.
=============
Edit 2: I recently had to write a sympathy letter to someone whose husband died.
I knew the general ideas behind what I wanted to say, but I knew I wasn't going to write anything great.
I fired up ChatGPT and said,
"write a sympathy letter to <x>. Tell her that I didn't know her husband well, but the few times I met him, I could tell that he cared deeply about you<x> and his daughter. I know his daughter well and I think she gets a lot of her great qualities from him and you. Tell her I don't know what to say in times like this. Keep it short-ish. Avoid schmaltz and sentimentality because <x> isn't that kind of person."
It gave me about as perfect a letter as I could have asked for.
OTOH, I have found a use for Stable Diffusion that actually resulted in some income.
It would be amazing if Google, Microsoft, Amazon, or Meta sat on their hands while OpenAI got that big.
Google used to be able to handle a lot of the non-procedural questions - but somewhere circa 2022/2023 something started happening to its results, and I started getting back mostly SEO churn, to the point that I was going back to using manuals and having to dig in and learn the fundamentals on a lot of things - which is unsustainable if you are touching 30-50 different technology stacks.
Chat changed all that - I can now go 3-5 levels deep in some stack, ask some incredibly nuanced question, get a reasonable answer that points me in the right direction, close off the issue and then move onto the next one.
Being given some code and reviewing it is a lot quicker than writing that code. Copilot is great. Half the time it spits out the wrong answer but you can see what it was “thinking”.
Phind giving you an answer and references lets you quickly double check. Sometimes it hallucinates but the answer and references combo is much better than a Google search which in turn is better than nothing.
Of course I recommend people not use AI for everything. I will go straight to MDN for any WebAPI question and use --help as my first port of call on the command line. This is like your L2 cache as a developer. Using AI for everything is like swapping to disk for everything.
Sometimes you need to go back and forth a bit: I tell ChatGPT it's wrong and give it the error message, and then it spits out the correct result. Sometimes I need an algorithm tweaked because it has assumed a wrong constraint on the problem - again, just explain it clearly and unambiguously and it will make corrections. There have been maybe 1 in 40 problems where I couldn't get a correct answer even after (sometimes a lot of!) back and forth.
I am not looking for a perfect oracle; I am looking for something to write 80% of the code, and then I'll fix it up. It's still way faster this way, especially in domains I don't know. E.g. I just learned CUDA with ChatGPT's help.
It's not perfect, and neither am I, but it doesn't have to be perfect to be useful. You can get to millions in revenue through 80% solutions.
Learn the limits of your tools.
The value and time savings make sense for folks who struggle with a search engine (many) or who are doing tasks typically considered menial, like writing emails or coding boilerplate.
However, if you can grok Google and don't mind doing tasks like that (I personally don't mind coding boilerplate stuff, especially since I can learn how the framework works that way), ChatGPT's value is limited (at least in my experience).
Example: I was struggling with a Terraform issue the other day. I used ChatGPT 4 to help me work out the problem, but the answer it gave was really generic, like mashing a few of the top answers on SO together. It also didn't answer what I needed help with. I knew enough about how Terraform worked to Google for the answer, which I eventually found a few minutes later. I could have kept crystallizing my question for ChatGPT until I got what I wanted, but Google was easier.
I'm also not a huge fan of us just being okay with trusting a single and extremely corporate entity (OpenAI) being the de facto arbiter of the truth. At least Google shows you other options when you search for stuff.
^ Isn't that what folks used to say about programming in assembler? How much time do I want to spend learning frameworks (beyond what I already know) vs. how productive do I want to be?
You don't need to "struggle" with Google to get value out of it, you simply need to value your time.
If you want an answer to a question, why waste time reading through pages of search results when you can have an AI do that for you, reporting what it finds?
No, it's not perfect but it's pretty damn useful.
LLMs aren't AGI. They're far from it. But they have massive uses for reasoning over available context.
I'll give you an example. I'm trying to set up some bulk monitoring for APIs across 200k JVMs. The API documents are horribly out of date. But I get the raw URIs in the monitoring tools.
I can just take these URIs, send them into ChatGPT, and ask for a swagger spec - along with a regular expression to match each URI to the swagger API. It figures out the path and query params from the absolute paths.
Sure, I could try to figure out how to do this programmatically using some graph- or tree-based algorithm. But ChatGPT basically made it possible with a dumb Python script.
Of course I may still need a person to fill these in. But just getting a swagger spec done for thousands of services in an afternoon was awesome.
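The core of such a script can be sketched in a few lines. This is my own illustrative version, not the commenter's code: a heuristic that turns concrete URIs into swagger-style path templates (the `{id}` naming and the id-detection regex are assumptions; a real spec would want semantic parameter names):

```python
import re

# Heuristic: a path segment that looks like a numeric id or a UUID-ish hex
# string becomes a template parameter.
ID_SEGMENT = re.compile(r"^(\d+|[0-9a-fA-F-]{32,36})$")

def to_template(uri: str) -> str:
    """Turn a concrete URI into a swagger-style path template."""
    parts = ["{id}" if ID_SEGMENT.match(seg) else seg
             for seg in uri.strip("/").split("/")]
    return "/" + "/".join(parts)

def to_regex(template: str) -> re.Pattern:
    """Build a regex that matches concrete URIs against the template."""
    escaped = re.escape(template)
    pattern = escaped.replace(re.escape("{id}"), r"[^/]+")
    return re.compile("^" + pattern + "$")
```

Deduplicating the templates across a dump of raw URIs then gives you the candidate paths for the spec, with the regexes doing the matching back.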
This type of rhetoric is part of the reason so many compare the current crop of AI to cryptocurrency hype: proponents constantly telling others to shove the technical solution into everything, even where it’s not necessary or worse than the alternative.
I know where you're going. I've had folks say to me: "I really like co-pilot because it enables a beginner like me to write code". This sentiment often comes from folks having non-technical roles who want to create their own software solutions and not have to deal with engineers. I roll my eyes at that one.
You need to be able to spot specific areas of acceleration. Not just tackle it as a hammer for every problem.
I could also split the URIs by service name. That helped parallelize my questions. It wasn't just a matter of dumping the data in; there was some cleanup behind the scenes that I had to do.
But in terms of value creation, they have turned numerous industries and jobs on their heads. Things like copywriting, or how they are destroying Stack Overflow and Quora. The next lowest-hanging fruit they are disrupting is front-line chat/email support - this is usually never part of a core product, but the market is massive; almost every company needs support - look at Zendesk, or imagine the costs of Uber's offshore support army.
They are going for the AWS platform approach - every niche GPT wrapper that gets a modicum of success and has happy paying users; these users are likely to stick because it improves their work in obvious ways. OpenAI gets their slice. Think of how AWS made it easy for anyone to spin up a service; sure, some failed, but the hurdle is much lower - with such a powerful general model, I don't need to spend millions training my own to launch. The issue for them is that it'll likely be less sticky if competitors/open-source models catch up - it hasn't happened yet, but it might.
I've never seen such disruption to ways of working in so many industries in my lifetime. If you're on HN you may not see any use in your work if specialised, but at the entry level (majority of workers), doing writing work in 30% of the time has been game changing.
But software scales so ridiculously well that their cloud offering still manages to beat it on profit.
OpenAI had just landed a new class of subscription that scales like AWS, has B2B hooks like AWS, can be the engine behind entire classes of future unicorns like AWS... but then also has widespread consumer value and brand recognition like Prime.
By that measure it's hard not to be bullish.
Before ChatGPT, to find the answer to things like "how do I set up Gunicorn to run as a daemon that restarts when it fails" I would have to endure hours of googling, snarky stack-overflow comments that I shouldn't do that, etc., but as a solopreneur without access to a more senior engineer to ask, it's been fantastic. I've been quite skeptical of machine learning/AI claims but I feel like I'm experiencing a genuine case of a technology that's proving to be so much more useful than I had imagined.
Do you mean, “have Gunicorn keep N workers running?” If so, that’s in the manual (the timeout that kills silent workers defaults to 30 seconds).
Or do you mean “have Gunicorn itself be monitored for health, and restarted as necessary?” There are many ways to do that – systemd, orchestration platforms like K8s, bespoke scripts – and all of them have tricky failure modes that a casual copy/paste will not prepare you for.
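For the systemd route, a minimal unit sketch looks roughly like this (the service name and paths are placeholders, not from any real setup):

```ini
# /etc/systemd/system/myapp.service  (hypothetical name and paths)
[Unit]
Description=Gunicorn for myapp
After=network.target

[Service]
User=www-data
WorkingDirectory=/srv/myapp
ExecStart=/srv/myapp/venv/bin/gunicorn --workers 3 myapp.wsgi:application
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Even then, you still have to decide what "failed" means: a hung process that never actually exits will not trigger Restart=on-failure.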
Blindly using answers from ChatGPT is no different than a random SO post, and you are no more prepared for failure when the abstractions leak.
Getting straight answers will be detrimental in the long term, I fear. It feels like living in a box, watching the world on a screen, while the person answering my questions mixes lies and truths.
> snarky stack-overflow comments that I shouldn't do that,
I read that and realized they're probably getting the ChatGPT equivalent: nice, corporate answers that you probably shouldn't use.
Also, would you really trust AI for everything? I wouldn't. Nothing beats the human element. At best, AI should be used as a supplement to speed things up, which it is great for. I personally would never rely on AI to do everything for me, not to mention that it cannot be 100% correct.
> I just don't see this company (or any others for that matter) GPTing their way to AGI
> I'm not saying the company will go bankrupt but I'm also not buying into the hype that it's going to become the next Google or better / create AGI for us all.
What I'm missing is the connection between AGI & profitability. OpenAI has huge revenues from ChatGPT which look set to continue - they're distinct from cryptocurrency in that those invested in them are so on a service provision basis rather than a speculation basis.
I'm thoroughly unconvinced we'll ever see AGI - I see zero connection between that and OpenAI being successful.
I am bullish on AI but don't see AGI happening; yet I have developed AI solutions that solved real-world problems, made companies tons of money, and helped people solve non-AI, non-IT problems.
So, I never buy the AI-crypto-equivalence.
But so far to me this seems to be almost universally loved by programmers while I don’t really know anyone else who uses it at all.
I think after the past 15 years which saw some of the most rapid technological advances in history along with the greatest bull market in history, people’s credulity is off the charts.
But to me something just feels off about this entire AI/NLP complex. For one, I agree that it's largely oversold on features. Also, every single enterprise software company is attempting to jump on the bandwagon, and everyone on LinkedIn seems to be posting about it every day. Most people who talk about how revolutionary it will be have absolutely no track record of being correct on similar calls, and on the whole, if I had to bet, they were probably highly skeptical of new tech over the past decade that turned out to be revolutionary.
I also agree that it feels very similar to crypto. I don't think it's a coincidence that both were largely enabled by advances in Nvidia chips. It may sound absurd to most, but I actually believe NVDA is the most overpriced stock in the market now by a large margin and is sort of in its own bubble. There has been a headlong rush to stockpile their chips in anticipation of NLP models taking over, but I predict it is going to result in an eventual glut of oversupply that's going to put downward pressure on the semiconductor market, potentially for a year or more.
If AI-generated code is considered acceptable in your project then you aren't using a powerful-enough programming language. And you're paying the cost in code bloat.
How many Coq programmers find ChatGPT useful? How many nontrivial Coq programs written (and not merely memorized) by ChatGPT even pass type checking?
If you're considering AI-written code then you have a bloaty-code problem. Letting an AI write bloaty code conceals the symptoms (keystroke count) but tech debt will still kill your project sooner or later.
That said, I do agree with the general notion. I find the more verbose the language, the better the help. Dense, more expressive languages fare worse. I’m referring to Python and Rust in my case, so one factor is of course massively larger training corpus for Python, and relatively more churn in Rust.
That’s a good example of a subpar use for an LLM. Dictionaries have existed since before computers. Digital dictionaries are faster, more reliable, and less power consuming than any LLM.
OpenAI itself will be fine though. Their lead has a snowball effect with all the training data they get. And I'd guess they will succeed at their regulatory capture attempt, and create some horrendous pseudo monopoly. Meanwhile, they can just implement what the most successful wrappers do themselves.
When I've seen colleagues visibly use it (i.e. mentioned in commit messages) that confidence has rubbed off on them.
Given that, why would I believe it when asking something medical, legal, historical, or otherwise outside of my domain or that I can't somehow verify?
LLMs are far from the only thing you rely on that will confidently lie to you.
> Every time a new study comes around saying “actually, everyone should be drinking 3 glasses of wine a night”, do you take a trip to the liquor store?
No, and doing so would be an example of blindly trusting something that you haven't or can't verify, so that supports my argument?
> LLMs are far from the only thing you rely on that will confidently lie to you.
Sure, but it's a new class of thing that will and does and yet people are trusting or haven't yet learnt that they can't. I mentioned seeing people trust it via commit messages; I don't see SO the same way, people generally realise they need to verify it, or it at least has a voting mechanism as a proxy. With GPT so far there seems to be a lot more assuming it's correct going on.
But with ChatGPT we're not saying to disregard what it says; we're saying to disregard only some of what it says, while not disregarding it entirely in future. Which becomes a lot of work: checking everything it says, every time.
Crypto, after more than a decade, has been useful only to criminals and scammers.
Of course, gpt-5/6/7 can become more valuable to end-users, but that's the second reason I'm bearish. LLMs are powered by exponential growth, and no exponential growth is infinite. We are already using a significant part of all existing data, and going up more than 1-2 orders of magnitude in either data or compute feels unlikely unless there is some breakthrough. Breakthroughs are hard to predict, but they are generally rare, and likely there won't be one soon.
I feel that some people lack creativity to use it.
GPT is as good as the user is at posing good and well defined questions and tasks.
Its ability to perform few shot learning is astounding.
I suspect I'm fairly alone on this. They'll probably do well without me.
Most people that even know about it probably don't mind. I can't even verbalize why I do.
I've had a play around with some OpenAI-powered sites and it is neat how much it is capable of, but I feel uncomfortable typing personalized prompts or detailed questions into a system where I know everything I type is going to be harvested. You could argue that by commenting on HN or posting anywhere in the internet everything I type is also going to be harvested (perhaps into the very same models), but that contract was always clear. There is a difference between companies using information I have chosen to share publicly, and companies doing the same with what is presented as a private exchange.
But once they can fit that mini GPT into my pocket, and the learning it's doing is truly personalized to my own install... for me that will be a much more appealing product. I guess the technology will get there, eventually.
You could do it now. Apple computers with a lot of RAM are pretty good at running Llama2.
My workstation has allowed me to dabble, so I'm familiar; a unified pool of memory does very little for me.
The experience with self-hosted stuff leaves a bit to be desired, both in generation speed and content.
The software needs work, I'm not saying we won't get there... just that we haven't, yet.
With a ridiculously beefy system I can eke out some slow nonsense from the machine. It's neat, and I can do it; I just don't find it very useful.
I’m guessing you haven’t actually been using it personally beyond some superficial examples.
Once you use it regularly to solve real-world technical problems, it's a pretty huge deal. The only people I've met so far who voice ideas similar to yours simply haven't used it beyond asking it questions it isn't designed for.
When anything gets more complex than that, I feel like the main value it provides is to see what direction it was trying to approach the problem from, seeing if that makes sense to you, and then asking it more about why it decided to do something.
This is definitely useful, but only if you know enough to keep it in check while you work on something, or worse if you think you know something more than you actually do, you can tell ChatGPT it's wrong and it will happily agree with you (even though it was correct in that case). I've tested both cases: correcting it when it was really wrong, and correcting it confidently when it was actually right. Both times it agreed that it was wrong and regenerated the answer it gave me.
This is the peril of using what really is fundamentally an autocomplete engine, albeit an extremely powerful one, as a knowledge engine. In fact, RLHF favors this outcome strongly; if the human says "this is right", the human doing the rating is very unlikely to uprate responses where the neural net insists they're still wrong. The network weights are absolutely going to get pushed in the direction of responses that agree with the human.
> I'm using python and my string may contain file paths inside, for example: (...) , For anything that looks like a filepath inside the string we should replace its full path, for example (..)
> Can you write me a python script to kill processes in Windows that no longer belong to a process named "gitlab"
> I want to write a test for it that mocks both sentry_sdk and LoggingIntegration so that I can give my own mocks during test
> I want to create a Python script
It should be able to be called from the command line, like below (example)
Write me the script
All real examples from last week, each solved in a minute instead of googling or creating from scratch / thinking about it.
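The first of those (shortening file paths inside a string) is the kind of thing it spits out in seconds. A sketch of what such a script might look like, assuming POSIX-style absolute paths (the regex and the basename choice are my illustration, not the actual generated script):

```python
import re
from pathlib import PurePosixPath

# Match absolute POSIX-style paths with at least two segments.
PATH_RE = re.compile(r"(?<!\S)(/[\w.\-]+(?:/[\w.\-]+)+)")

def shorten_paths(text):
    """Replace anything that looks like an absolute file path
    with just its basename."""
    return PATH_RE.sub(lambda m: PurePosixPath(m.group(1)).name, text)
```

Nothing clever, but it is exactly the sort of thirty-second ask that used to mean ten minutes of fiddling with regex references.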
When will humans live in space?
Why am I depressed?
When will world war III happen?
Compute this math equation (function calling and compute engines will help with this)
The underlying tech is amazing. Where LLMs are headed is wild.
I have just lost a lot of confidence that OpenAI will be the ones getting us there.
The chat niche was an instance of low hanging fruit for LLM applications.
But to design the core product offering around that was a mistake.
Chat-instruct fine tuning is fine to offer as an additional option, but to make it the core product was shortsighted and is going to hold back a lot of potential other applications, particularly as others have followed in OpenAI's footsteps.
There's also the issue of centrally grounding "you are a large language model" in the system messaging for the model.
So instead of being able to instruct a model "you are an award winning copywriter" it gets instructed as "you are a large language model whose user has instructed you to act as an award winning copywriter."
Think about the training data for the foundational model: what percent of that reflected what an LLM would output? So there's this artificial context constraint that ends up leading to a significant reduction in variability across multiple prompts between their ChatCompletion and (deprecated) TextCompletion APIs.
They seem like a company that was adequately set up to deliver great strides with advancing the machine learning side of things, but then as soon as they had a product that exceeded their expectations, they really haven't known what to do with it.
So we have a runaway success while there's still a slight moat against other competition and they have a low hanging fruit product.
But I'm extremely skeptical given what I've seen in the past 12 months that they are going to still be leading the pack in 3 years. They may, like many other companies that were early on in advancing upcoming trends, end up victims of their own success by optimizing around today and not properly continuing to build for tomorrow.
If you offered me their stock at a current valuation with the stipulation I wouldn't be able to sell for 5 years, I wouldn't touch it with a 10 meter stick.
To me, this is the most important part of ChatGPT. GPT-4 has some massive shortcomings, but to me it's clear that this road we have started to head down is producing actual intelligence, in the real sense. 5 years ago, AGI felt completely intractable to me. Now, it feels like an implementation detail.
The problem is, this sentiment of a path towards AGI assumes that everything (i.e. our physical reality) can be compressed, which is highly likely not true.
The future of AI is basically better and better compression with more data sources. You will be able to do things like ask how to build a flying car in your garage, and it will give you step by step instructions on what things you need to order from where, including CAD drawings to get custom made CNC parts, and how to put it all together (including software to run it).
As far as AGI goes, it's possible that through AI-assisted technology we will be able to measure the synapses of a human brain in fine enough detail, or somehow mass-surveil the entire population of Earth and derive human brain models from that. And then, with optimizations on that, we could potentially arrive at the most efficient human brain by some metric, and we will have very good robotic assistants, but nothing really beyond that.
Would my life become worse without crypto? Actually it became better - I sold all my crypto, coinbase made it painful to deal with them and they jacked up transaction fees. That money is now in good ol’ stocks.
So OpenAI specifically, I can’t say but AI in general that is trained on the entire knowledge set of humanity and that can reason from it - that will become ever more valuable.
I'm a teacher who is constantly learning new things. I can learn things I would have never been able to learn before because of AIs like ChatGPT. My students are learning more and faster than ever before.
Learning Management Systems like Canvas and Blackboard made a lot of money. I could argue they are obsolete now.
No, it objectively has not. Maybe it has changed how you teach, but “education” is much larger than any of us. Until school curriculums around the world incorporate ChatGPT—quite the dystopian scenario—they have not changed education.
I'm spending most of today updating my Canvas courses, and Canvas is obsolete. Students are much better off asking AIs how to do things and what they should learn next, rather than working through my Canvas courses.
I use Bitcoin regularly, because I live in a third world country where it's really hard not to get your salary seized.
I use ChatGPT every day for lots of things and it has replaced Google search for me. And StackOverflow, of course.
Notice how I said BITCOIN and CHATGPT. Not "crypto" and "ai".
ChatGPT made one great product. Bitcoin is one great “product.”
But the successors in the same category aren’t doing anything wildly more useful than the original “killer app.”
Crypto never really revolutionized finance, it just provided one solid digital currency product. Smart contracts and NFTs went pretty much nowhere and I struggle to identify a way that they are used in a widespread manner.
You and I are using ChatGPT regularly and it helps us quite a lot, but it hasn’t revolutionized life nor has it turned me into a 10x developer or something like that. It’s a service that is collecting $20 a month and that’s about the extent of its economic value so far.
In other words, “Replacing Google and Stack Overflow” is arguably not that exciting.
(Then I end up going back to SO/Google when ChatGPT tells me shit that is wrong)
I do think LLMs have way more potential than “crypto” but it remains to be seen how much more that is
All the crypto killer apps lost the plot, especially NFTs.
As for LLMs and more specifically, ChatGPT, they are not a niche thing so I agree in that their potential is way bigger. I'm not yet sure what that is, but I think it will change things profoundly. Replacing Google/SO is just a side effect of something bigger. But that's just my humble opinion.
Imagine if they kept doing that despite having your credit card information because you paid for Pro, which more or less proves you're an adult who deserves the presumption of innocence/good-actordom.
Lastly, imagine that you do this for all users despite the fact that it is known to reduce the intelligence of your output.
(I'm bullish on self-run models)
* Problem statement *
The actual value of GPT is spontaneous creation of spoofed data. Some of that output answers really tough questions. Stop thinking at this point and reflect.
* Value assessment *
There is some amazing potential there for performing test automation where either large data samples are demanded or the sophistication of output exceeds the capabilities of prior conventions. Taken one step further, there is further value in using this authentic-looking output for testing against humans, for market acceptance testing or bias validation.
* Real world use *
When real-world use significantly departs from the value assessment, there is a problem. Faking student papers or hoping the technology writes a legal brief for you forces a complete realignment of the potential revenue model, with different clients paying different amounts.
* Expectation *
Unless this technology can actually make money in a recurring and sustainable way past the initial trends, it will be an investment black hole, just like crypto.
My company also got it immediately and is rolling it out globally.
GitHub copilot is already helpful. GitHub copilot for docs (announced at GitHub next) is a game changer.
I used openai to reformulate emails and suddenly got positive feedback about my announcement emails.
I communicate with OpenAI in German and English, however I see fit.
It's very hard NOT to see it, rather than the other way around.
And we got so far with only one company pushing this!
The others have no option but to also pour tons of money into AI.
And besides OpenAI, AI/ML is huge in what Nvidia and others are doing with it: texture compression, 2D and 3D generation.
What we see is also potentially something like a new operating system.
And it makes it so much more accessible.
I never had a tool like LLMs, which can take a bad copy-paste of a PDF and pull facts out of it.
I did not foresee that you could mix input languages. That’s fascinating. Multilingual people use languages for different purposes, often in the same sentence. Ie: technical jargon in English, a joke in Arabic, etc.
I expect the difference in connotation/feeling/mood to be less relevant for an LLM, if you’re working with facts. But there was a recent post showing LLMs performing better when you said you were stressed/scared. Did you notice any such differences for your multi-lingual inputs?
I also describe things and that also works quite well.
But for a time, boundaries will be pushed that require more compute and they may be a good service to provide that. The hardware is so expensive I imagine their margins can't be very good though. I'd be interested to see their business plan, because the current version of OpenAI in terms of what it offers doesn't seem to be that compelling when extrapolated out 5 years without some other innovative products.
I honestly think Apple will dominate the personal AI angle once they get there. What's left is business and that will be more competitive.
That might be a bit different for big companies where they want to run their own models.
The other factor is that Google and others are certainly not going to sit still. There is no reason to believe that someone as resourceful as Google cannot come up with something as good as ChatGPT, if not better. Companies like Meta are playing the open-source card, so they will be the first to benefit directly from the community. So the market will change, and dramatically. It's far too early to bet on any of them (or on none of them). My approach is to diversify, wait and see.
There is definitely value here, I use the product a lot myself, but I don't agree that the value is as high as the majority of people seem to think (ChatGPT is going to reshape economies, every industry will replace 90% of humans with some form of AI soon, in more extreme cases that AGI is close to happening, etc...)
I wanted to see if anyone here had examples or use cases that could make me think otherwise
"Computers are useless. They can only give answers" - Pablo Picasso
Nevertheless, regardless of whether OpenAI is close to AGI (I don't think so) or what value LLMs bring to the table (definitely non-zero), the problem is that LLMs are being increasingly commoditized and no one has a real moat here. I think that's why these firms are so desperate to kickstart regulations and are trying so hard to somehow pull the ladder up behind them.
OpenAI's fears don't come from "no one understands LLMs", but rather from the fact that too many people do, and that large models have already fallen into the hands of a community that can do more with them in a week than OpenAI can hope to do in a year. Ever larger models might be out of the reach of the public, but real-world value is more likely to come from a well-prepared, smaller model (cf. Vicuna) that doesn't cost an arm and a leg to run inference with, and building these is cheaper than most might think.
If I had to point to a company and call it as a market leader here, I would point to Meta, not OpenAI. Meta has a huge workforce working for free on its model, after all, and they have made progress at a rate that bigtech cannot match in their wildest dreams.
There are also far too many eyeballs on this, in my opinion. For a company to truly dominate a market it needs a bit of air cover for a while building what will eventually be its moat.
Why do I keep hearing this and where is it coming from? I hear it so often that it feels like an agenda being pushed but I can't imagine from whom.
Both still seem very convincing in principle, but real-world use seems to offer little good. I mean, I did find some applications for ChatGPT, but I have a bad gut feeling using it. So I wouldn't be surprised if, e.g. through the sheer amount of fake content, the whole AI'fied web just drives people away (similar to what has happened to some social networks).
Whether OpenAI’s API is a viable or risky product is a secondary and separate question. Yeah, there are a lot of wrappers out there. But that doesn’t matter with regards to the usefulness of LLMs generally.
The other problem with crypto's search for the killer app is that things like NFTs, which make no sense in the world of 'free' information, became ridiculously hyped and gave crypto a bad name.
LLMs are just better content search/generation. But the generation part messes it up: since these models have no concept of right and wrong, the output is fictional. This is fine if that's your goal, but if you are looking for accurate information then obviously this becomes a problem.
Most of these new "technologies" (AI/blockchain... the hyped-up stuff) only exist because of cheap computing power and cheap capital. None of these technologies have created any real tangible value; it's always some version of the "it's early days" argument.
None of these things will last long by themselves when the economic conditions change.
On another note, I feel AI and blockchain are just for tracking people. They are both good surveillance tools.
OpenAI will probably do very well but there is a chance of disruption. They have a moat but also the nature of AI is it is a cloud commodity (like say Lambda functions) where I see a competitor making a cheaper drop in replacement. But to be a threat they need to smash scale and LLMOps etc.
Revolutionizing how we interact w/ computers by allowing us to use plain human language to do things the requester does not understand how to do seems to have been demonstrated. See even the relatively simple agent demo where an architect used human language to have zapier take action based on meeting conflicts. imo this alone is a big deal.
I think it's a bit too early to be bullish on OpenAI, because beyond their GPT and image creator there isn't much they are doing yet ("yet" being the keyword), so let's see.
I write low-level AI code and it's like speaking to someone that just understands what I'm saying without having to explain every 2 minutes.
This has massively augmented my workflow.
On the topic of AGI. We'll get there in your lifetime. I can see how and why. The new bar is ASI so consider AGI the current goal-post. We have all the pieces, we're just putting them together ;)
If you want to check out what I'm up to I have a front-end here: https://discord.gg/8FhbHfNp
In other words, don't rely on the LLM by itself; it just happens to be able to remember most information as a side effect of its learning. Most important is the ability of these systems to transform knowledge and data when appropriate. Don't use it to read CSVs, for example.
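Agreed on the CSV point: a deterministic parser gives exact answers where an LLM can only approximate. A trivial illustration with Python's csv module (the data is made up):

```python
import csv
import io

raw = "name,qty\nwidget,3\ngadget,5\n"

# The parser returns the exact values, every time; no hallucination possible.
rows = list(csv.DictReader(io.StringIO(raw)))
total = sum(int(row["qty"]) for row in rows)  # 8
```

Use the LLM to write this kind of code, not to do the arithmetic itself.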
OpenAI and its acolytes are absolutely dripping with hubris. A lot of their peripheral activities seem like PR stunts. I find it really cringeworthy.
But also I can't see how the future isn't bright for OpenAI. Maybe it won't overtake every single other business in the world, including bakeries and breweries, but at the very very least will eat the lunch of many lower-tier white collar industries. Maybe more than that.
I suppose that's the "Elon defence", except Sam Altman doesn't spew out nonsense the same as Musk, and what they say their product does, it really does. Not a self driving robo taxi case. And in either case, Tesla is at least an OK car.
I think it's obvious there's a ton of value in the product and it's a massive force multiplier for certain types of tasks. But it inherently cannot be trusted and still requires someone with expertise to verify and implement.
I don't think they're going to achieve real AGI. I don't think we ever will. I think they'll get something "close enough" and claim they have it, but I don't think the path to AGI is through LLMs.
The actual tools, though, are definitely useful, even if they have a ton of issues still. Personally it gives me basic factual errors constantly, but I’m sure this will be worked out in time.
Answer:
To become a "space bear trainer," a role that combines aspects of zoology, space mission planning, and training for extreme environments, you would need to follow a multi-disciplinary approach …
That said, I cannot rule out purely commercial ventures with tenacity necessary to compete spinning out of OpenAI.
In the same way search transformed knowledge augmentation, LLMs will transform skills augmentation.
Forget about the things you know to do well, instead focus on all the new skills LLMs will unlock for you.
Whether it becomes as big as Google depends on what OpenAI does; Google didn't become as massive as it is just off search.
OpenAI is the dominant player in the hottest area and has a significant and valuable product.
No idea who will achieve Strong AGI but ChatGPT is the real deal.
Everyone is trying to make the nth AI company, and OpenAI profits most from it all. Meanwhile, they take the actually good ideas and integrate them into their own offerings, killing the competition.
Second: Transformer models (and diffusion models) are merely the latest hotness in a long series of increasingly impressive AI models. There is no reason at all to assume either are the final possible model, not even the final word by OpenAI specifically.
Third: There is a direct correlation between the quality of output and the combination of training effort and example set size. This is why both image and text generators have improved significantly since this time last year.
Caveat 1: It may be that, as all the usual sources have responded to ChatGPT by locking down their APIs and saying "no" in robots.txt, they are already at the reasonable upper limit for training data, even though more data exists.
Caveat 2: Moore's Law is definitely slowing down, and current models are about (by Fermi estimation) 1000x less complex than our brains. Even though transistors are faster and smaller than synapses by the factor to which wolves are smaller than hills and faster than continental drift, the cost for a 1-byte-per-synapse model of a 6E14 synapse brain is huge. Assuming RAM prices of €1.80/GB (because that was the cheapest I found on Amazon today), that human-scale model would still cost in the order of a million Euros per instance. Will prices go down? I would neither bet for nor against it.
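The back-of-envelope arithmetic behind that figure, using the numbers above:

```python
synapses = 6e14            # assumed human synapse count, per the estimate above
bytes_per_synapse = 1      # 1-byte-per-synapse model
eur_per_gb = 1.80          # cheapest RAM price quoted above

# bytes -> GB -> euros
cost_eur = synapses * bytes_per_synapse / 1e9 * eur_per_gb
# about 1.08 million euros of RAM per human-scale instance
```

So "on the order of a million euros per instance" checks out, for RAM alone and before any compute.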
Will they (or anyone else in the next decade) create AGI? I think that's an argument over terms. Transformer models like the GPT models from OpenAI are very general, able to respond in any domain the training data covered. Do they count as "intelligent"? They can score well on IQ tests, but those are only a proxy for intelligence.
Given the biological analogy would be:
"Mad scientists take a ferret, genetically modify it to be immortal, wires up its nervous system so the only thing it experiences is a timeless sequence of tokens (from Reddit, Wikipedia, StackOverflow, and random fan-fic websites, but without ever giving the ferret any context as to what any of the tokens mean), and then spend 50,000 years rewarding/punishing it based on how well it imagines missing tokens, this is what you get."
I don't know what I was expecting, but it wasn't this.
BUT
I think people are too bearish on boring-old search as a tool. It's so easy to jam a search into Chrome's bar and look for a quick reference or (hopefully) a human being that has had some experience with what you're working on.
I use search / ChatGPT / CoPilot interchangeably for different reasons... ChatGPT for a detailed, thoughtful answer. CoPilot as autocomplete on steroids. Search for reference, quick answers, and direct human experience.
The silliest part is Sam Altman selling it as they've got a way to AGI.
WolframAlpha and visual stuff have been more impactful for me, but they existed a long time before GPT. Even then, I don't use them that much.
There are some valid uses for neural networks, including LLMs, just as there were a few valid use cases for blockchain. None of them are particularly revolutionary, and it's not yet clear that any of them will pay for the enormous computing power required.
A lot of people look at LLMs through the same lens they have applied to every other technology so far: the assumption that mastering the interface to a technology eventually equates to mastering the technology itself. That view is normalizing, in the sense that an objective technology has a finite, perceptible floor and ceiling to mastery, which democratizes both its mastery and its productive use.
But interacting with LLMs in the class of higher-reasoning agents does not follow the same pattern of mastery. The user's prompts are embedded into a high-dimensional space that is, for all practical purposes, infinitely multi-faceted, and it takes a significant knack for abstract thought even to begin understanding how to craft a prompt suited to the current problem space. It also requires a good intuition for managing one's own expectations: knowing what LLMs excel at, where they perform only marginally, and where they can fail miserably.
Users with backgrounds in the humanities, language arts, philosophy, and other liberal arts, provided they also keep a good handle on empirical logic and reason, are the ones who consistently excel and keep unlocking new capabilities in their LLM workflows.
I’ve used LLMs to solve particularly hairy DevOps problems. I’ve used them to refactor and modularize complicated procedural prototype code. I’ve used them to assist me in developing UX strategy on multimillion dollar accounts. I’ve also used them to teach myself mycology and scale up a small home lab.
When it comes to highly objective, logical tasks such as writing source code, they perform fairly well, and if you can figure out the tricks to managing the context window, you can save many hours of banging your head against the desk, or even weeping and gnashing of teeth.
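One of those context-window tricks can be sketched in a few lines. This is a hypothetical helper, not any particular library's API; it assumes a crude 4-characters-per-token estimate and drops the oldest turns first while always keeping the system prompt:

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the system prompt plus the most recent turns that fit the token budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    used = sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    for m in reversed(rest):  # walk newest-to-oldest
        cost = estimate_tokens(m["content"])
        if used + cost > budget:
            break  # oldest remaining turns get dropped
        kept.append(m)
        used += cost
    return system + list(reversed(kept))  # restore chronological order
```

A real setup would use the provider's tokenizer instead of the character heuristic, but the shape of the trick, protect the instructions, sacrifice the oldest conversation turns, is the same.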
When it comes to more subjective tasks, I've found it's better to switch gears and expect something a little different from your workflow. As a UX design assistant, an LLM is better for comprehensive abstract thinking: identifying gaps, looking around corners, guiding one's own thoughts, and generally acting as a "living notebook".
It’s very easy for people who lack any personal or educational grounding in the liberal arts, or the affinity for abstract thought, to type some half-cocked, pathetic prompt into the text area, fire it off, and blame the model. In this way, the LLM acts as a sort of mirror, highlighting their ignorance, metaphorically tapping its foot while it waits for them to get their shit together. Their lament is a form of denial.
The coming age will separate the wheat from the chaff.
This is definitely not correct on the numbers: far more people are using LLMs well than have ever used crypto for any real use case. It's also worth noting that the only real use cases of crypto are illegal, ranging from noble stuff like busting sanctions to get food to hungry children, through bribe payment and evasion, tax evasion, and drug deals, to hiring hitmen and child sexual exploitation and trafficking. In general, crypto has produced close to zero value for global society, even when it wasn't being used as an overt, intentional scam.
LLMs are producing significant value for society right now, because OpenAI gave everyone API access to a very weird intern with incredible knowledge breadth who makes dumb mistakes. Interns (or "relatively low-intelligence, low-experience human workers who need handholding for difficult and sometimes easy problems, with an occasional flash of insight") have always been controversial from the perspective of the person who has to manage the intern, but from the perspective of the company or society it's unquestionable that they provide significant value. Different people put different value on having a collaborator at all; some do not want to handhold anyone, or to work with anyone whose mistakes they ever have to work around. It is nevertheless true that, in aggregate for knowledge work, "worker + intern" is more economically productive than "worker" alone, outside of very, very specialist use cases. This just wasn't possible with GPT-2, and even GPT-3.5 isn't quite at a quality where I'd compare it to a normal intern. No other machine besides the human brain was even close.
That's the tech now, the worst it will ever be. Whatever comes next as a major leap (GPT-5; Claude 3 or Gemini, if they're good; maybe Llama 3 or the next Mistral, if the open-source community can improve them significantly before the next GPT release) will be either a reliable version of the same intern, or the same intern with better intelligence and comprehension that still suffers from reliability issues, or a major step up to productivity equivalent to a full-blown knowledge worker in some high percentage of cases. It's already important now; it's only going to get more important.
As for OpenAI specifically, I think they have a very good chance of continuing to lead the pack, particularly with this cringe-y GPTs/GPT Builder/GPT Store thing. It's pretty transparent that this is them gathering data to train an AI on how to spin up agentic AIs for specific tasks: they'll have data on how the GPT Builder is used, and data on how useful and effective the GPTs it builds are, so they can do things like dramatically overweight the most effective and useful GPTs when training an internal "GPT Auto Builder". They'll also be running a store for these things while effectively controlling the operating system they run in. Purchases, ratings, time spent using a GPT, sentiment analysis on GPT text logs to detect success, plus explicit in-GPT feedback (the thumbs up and down, and the feedback submission form) will all be data they can feed into their machine, to make an AI that can build good GPTs for a task, an AI that can evaluate their performance, and an AI that can most effectively get good performance out of a GPT. That's going to be huge, particularly the signals with real economic costs to users (I know they haven't announced it, but I think they'll eventually let you purchase GPTs), because that starts to pull away the rose-tinted glasses and the fog of futuristic sheen, and yields more unvarnished data on how much people actually value these specific things. That data means eventually you should be able to just ask ChatGPT to do something for you; if it can't do it natively, it will be trained to spin up a task-specific GPT with access to the correct tools, docs, etc., then have the GPT Whisperer AI use it to get the right answer with a bunch of backup data, and return you the answer with the option to see the work. This is also a fairly auditable process, which makes a lot of the legal and AI-safety folks happy.
I don't see another company that is similarly well-placed in terms of having the tech, talent, compute, product, and roadmap to pull this off.
It can be true, but I don't think it is always, or even often, true. The overall value proposition of interns (and apprentices, etc.) for society as a whole is that they go on to become the professional knowledge workers they have learnt from. LLMs won't become that, so the value proposition is limited and localized. Remember the adage about garage mechanics, where the price goes up depending on how much 'help' the customer wants to give?