55 comments · 21 top-level

spaceman_2020 2y ago · 10 in thread

I'm legitimately starting to wonder what white collar workers will even do in 5-10 years.

This is just Year 1 of this stuff going mainstream. Careers are 25-30 years long. What will someone entering the workforce today even be doing in 2035?

VirusNewbie 2y ago

Even if we get Gemini 2.0 or GPT-6 that is even better at the stuff it's good at now... you've always been able to outsource 'tasks' for cheap. There is no shortage of people that can write somewhat generic text, write chunks of self contained code, etc.

This might lower the barrier of entry but it's basically a cheaper outsourcing model. And many companies will outsource more to AI. But there's probably a reason that most large companies are not just managers and architects who farm out their work to the cheapest foreign markets.

Similar to how many tech jobs have gone from C -> C++ -> Java -> Python/Go, where the average developer is supposed to accomplish a lot more than previously, I think you'll see the same for white collar workers.

Software engineering didn't die because you needed so much less work to do a network stack; the expectations changed.

This is just non-technical white collar workers' first level up from C -> Java.

VikingCoder 2y ago

[Guy who draws blue ducks for a living]: DAMNIT!

>What will someone entering the workforce today even be doing in 2035?

The same thing they're doing now, just with tools that enable them to do some more of it. We've had these discussions a dozen times, including pre- and post-computerization, and every time it ends up the same way. We went from entire teams writing Pokemon in Z80 assembly to someone cranking out games in Unity while barely knowing how to code, and yet game devs still exist.

moffkalast 2y ago

Yeah, it has been quite the problem to think about ever since the original release of ChatGPT, as it was already obvious where this was going, and multimodal models more or less confirmed it.

There's two ways this goes: UBI or gradual population reduction through unemployment and homelessness. There's no way the average human will be able to produce any productive value outside manual labor in 20 years. Maybe not even that, looking at robots like Digit that can already do warehouse work for $25/hour.

TrackerFF 2y ago

Yes, imagine being a HS student now, deciding what to do 5-6-7 years from now.

Work will just move to a higher level of abstraction.

drubio 2y ago

I'm wondering the same, but for the narrower white collar subset of tech workers: what will today's UX/UI designer or API developer be doing in 5-10 years?

Whatever you want, probably. Or put a different way: "what's a workforce?"

"We need to do a big calculation, so your HBO/Netflix might not work correctly for a little bit. These shouldn't be too frequent; but bear with us."

Go ride a bike, write some poetry, do something tactile with feeling. They're doing something, but after a certain threshold, us humans are going to have to take them at their word.

The graph of computational gain is going to go linear, quadratic, ^4, ^8, ^16... all the way until we get to it being a vertical line. A step function. It's not a bad thing, but it's going to require a perspective shift, I think.

Edit: I also think we should drop the "A" from "AI" ...just... "Intelligence."

gniv 2y ago

Yeah, this feels like the revenge of the blue collar workers. Maybe the changes won't be too dramatic, but the intelligence premium will definitely go down.

Ironically, this is created by some of the most intelligent people.

samr71 2y ago

We're just gonna have UBI

dfbrown 2y ago · 4 in thread

How real is it though? This blog post says

> In this post, we’ll explore some of the prompting approaches we used in our Hands on with Gemini demo video.

which makes it sound like they used text + image prompts and then acted them out in the video, as opposed to Gemini interpreting the video directly.

https://developers.googleblog.com/2023/12/how-its-made-gemin...

riscy 2y ago

After reading this blog post, that hands-on video is just straight-up lying to people. For the boxcar example, the narrator in the video says to Gemini:

> Narrator: "Based on their design, which of these would go faster?"

Without even specifying that those are cars! That was impressive to me, that it recognized the cars are going downhill _and_ could infer that in such a situation, aerodynamics matters. But the blog post says the real prompt was this:

> Real Prompt: "Which of these cars is more aerodynamic? The one on the left or the right? Explain why, using specific visual details."

They narrated inaccurate prompts for the Sun/Saturn/Earth example too:

> Narrator: "Is this the right order?"

> Real Prompt: "Is this the right order? Consider the distance from the sun and explain your reasoning."

If the narrator actually read the _real_ prompts they fed Gemini in these videos, this would not be as impressive at all!

Yeah I think this comment basically sums up my cynicism about that video.

It's that, you know some of this happened and you don't know how much. So when it says "what the quack!" presumably the model was prompted "give me answers in a more fun conversational style" (since that's not the style in any of the other clips) and, like, was it able to do that with just a little hint or did it take a large amount of wrangling "hey can you say that again in a more conversational way, what if you said something funny at the beginning like 'what the quack'" and then it's totally unimpressive. I'm not saying that's what happened, I'm saying "because we know we're only seeing a very fragmentary transcript I have no way to distinguish between the really impressive version and the really unimpressive one."

It'll be interesting to use it more as it gets more generally available though.

It's always like this, isn't it? I was watching the demo and thought: why ask it what "duck" is in multiple languages? Siri can do that right now and it's not an AI model. I really do think we're getting there with the AI revolution, but these demos are far from exciting; they're just mundane dummy tasks that don't have the nuance of everything we really interact with and would need help from an AI with.

How do you know though? The responses in the video were not the same as those in the blog post.

EZ-E 2y ago · 4 in thread

Out of curiosity I fed ChatGPT 4 a few of the challenges through a photo (unclear if Gemini takes a live video feed as input, but GPT does not afaik) and it did pretty well. It was able to tell a duck was being drawn at an earlier stage than Gemini did. Like Gemini, it was able to tell where the duck should go: to the left path, to the swan. Its reason, and I quote: "because ducks and swans are both waterfowl, so the swan drawing indicates a category similarity (...)"

nuccy 2y ago

Gemini made a mistake: when asked if the rubber duck floats, it says (after the squeaking comment): "it is a rubber duck, it is made of a material which is less dense than water". Nope... rubber is not less dense (and yes, I checked after noticing: a rubber duck is typically made of synthetic vinyl polymer plastic [1] with a density of about 1.4 times the density of water, so the duck floats because of the air-filled cavity inside and not because of the material it is made of). So it is correct conceptually, but misses details or cannot really reason based on its factual knowledge.

P.S. I wonder how these kinds of flaws end up in promotions. Bard made a mistake about JWST, which at least is much more specific and is farther from common knowledge than this.

1. https://ducksinthewindow.com/rubber-duck-facts/
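The buoyancy argument above can be checked with a little arithmetic. Below is a minimal sketch, not a measurement: the 1.4x density figure is taken from the comment, while the 20% shell fraction, the air density of 1.2 kg/m^3 and the function names are assumptions chosen only for illustration. An object floats when its average density (shell plus trapped air) is below that of water.

```python
WATER_DENSITY = 1000.0  # kg/m^3, fresh water (rounded)
AIR_DENSITY = 1.2       # kg/m^3, assumed value for the air trapped inside


def average_density(shell_density: float, shell_fraction: float) -> float:
    """Average density of a hollow object.

    shell_fraction is the share of the total volume occupied by solid
    material; the remainder is assumed to be air.
    """
    if not 0.0 <= shell_fraction <= 1.0:
        raise ValueError("shell_fraction must be between 0 and 1")
    return shell_density * shell_fraction + AIR_DENSITY * (1.0 - shell_fraction)


def floats(shell_density: float, shell_fraction: float) -> bool:
    """True if the object's average density is below that of water."""
    return average_density(shell_density, shell_fraction) < WATER_DENSITY


# Density of the vinyl, using the ~1.4x figure quoted in the comment.
vinyl_density = 1.4 * WATER_DENSITY

# A solid block of the same vinyl sinks.
solid_floats = floats(vinyl_density, 1.0)

# A hollow duck whose shell is a hypothetical 20% of its volume floats.
hollow_floats = floats(vinyl_density, 0.2)
```

With these assumed numbers the solid block sinks and the hollow duck floats, which matches the point that the air cavity, not the material, is what keeps it up.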

I showed the choice between a bear and a duck to GPT4, and it told me that it depends on whether the duck wants to go to a peaceful place, or wants to face a challenge :D

z7 2y ago

Tried the crab image. GPT-4 suggested a cat, then a "whale or a similar sea creature".

bookmark1231 2y ago

The category similarity comment is amusing. My ChatGPT4 seems to have an aversion to technicality, so much that I’ve resorted to adding “treat me like an expert researcher and don’t avoid technical detail” in the prompt

ACS_Solver 2y ago · 3 in thread

To quote Gemini, what the quack! Even with the understanding that these are handpicked interactions that are likely to be among the system's best responses, that is an extremely impressive level of understanding and reasoning.

CamperBob2 2y ago

Calls for a new corollary to Clarke's Third Law. "Any sufficiently-advanced rigged demo is indistinguishable from magic."

quackery1 2y ago

Does it really need to have affectations like "What the quack!"? These affectations are lab grown and not cute.

spaceman_2020 2y ago

What would be Gemini's current IQ? I would suspect it's higher than the average human's.

dblitt 2y ago · 2 in thread

> For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.

Seems like this video was heavily editorialized, but still impressive.

nathanfig 2y ago

Definitely edited, pretty clear in some of the transitions. Makes me wonder how many takes were needed.

andrewprock 2y ago

The prompts were also likely different:

video: "Is this the right order?"

blog post: "Is this the right order? Consider the distance from the sun and explain your reasoning."

https://developers.googleblog.com/2023/12/how-its-made-gemin...

thunkshift1 2y ago · 2 in thread

They should do this live instead of as a prerecorded video for it to be more awe-inspiring. Google's hype machine cannot be trusted.

galaxyLogic 2y ago

Right. I would hope that a competitor does such a live demonstration of where it fails. But I guess they won't, because that would be bad publicity for AI in general.

+1. Or at least with no cuts, and more examples.

This is obviously geared towards non-technical/marketing people that will catch on to the hype. Or towards wall street ;)

w10-1 2y ago · 2 in thread

It's a very smooth demo, for demo's sake.

So the killer app for AI is to replace Where's Waldo? for kids?

Or perhaps that's the fun, engaging, socially-acceptable marketing application.

I'm looking for the demo that shows how regular professionals can train it to do the easy parts of their jobs.

That's the killer app.

Regular professionals that spend any time with text (sending emails, receiving emails, writing paragraphs of text for reports, reading reports, etc.): all of that is now easier. Instead of taking thirty minutes to translate an angry email to a client where you want to say "fuck you, pay me", you can run it through an LLM and have it translated into professional business speak, and send out all of those emails before lunch instead of spending all day writing. Same on the receiving side as well. Just ask an LLM to summarize the essay of an email to you in bullet points, and save yourself the time reading.

konschubert 2y ago

There are many answers and each is a company.

brrrrrm 2y ago · 1 in thread

I once met a Google PM whose job was to manage “Easter eggs” in the Google home assistant. I wonder how many engineers effectively “hard coded” features into this demo. (“What the quack” seems like one)

rvnx 2y ago

Probably not "hard coded" in the literal way, but instead, if the model is using RLHF, they could thumbs up the answer.

SamBam 2y ago · 1 in thread

Wow, that is jaw-dropping.

I wish I could see it in real time, without the cuts, though. It made it hard to tell whether it was actually producing those responses in the way that is implied in the video.

right. if that was real time, the latency was very impressive. but i couldn't tell.

globular-toast 2y ago · 1 in thread

It seems weird to me. He asked it to describe what it sees, why does it randomly start spouting irrelevant facts about ducks? And is it trying to be funny when it's surprised about the blue duck? Does it know it's trying to be funny or does it really think it's a duck?

I can't say I'm really looking forward to a future where learning information means interacting with a book-smart 8 year old.

u320 2y ago

Yeah it's weird why they picked this as a demo. The model could not identify an everyday item like a rubber duck? And it doesn't understand Archimedes' principle, instead reasoning about the density of rubber?

avs733 2y ago · 1 in thread

honestly - of all the AI hype demos and presentations recently - this is the first one that has really blown my mind. Something about the multimodal component of visual to audio just makes it feel realer. I would be VERY curious to see this live and in real time to see how similar it is to the video.

you haven't seen pika then.

danpalmer 2y ago · 1 in thread

I literally burst out laughing at the crab.

bogtog 2y ago

The crab was the most amazing part of the demo for me.

jeron 2y ago · 1 in thread

It’s technically very impressive but the question is how many people will use the model in this way? Does Gemini support video streaming?

In 5 years having a much more advanced version of this on a Google Glass like device would be amazing.

Real time instructions for any task, learn piano, live cooking instructions, fix your plumbing etc.

nuz 2y ago · 1 in thread

This makes me excited about the future

RGamma 2y ago

Let's hope we're in the 0.0001% when things get serious. Otherwise it'll be the wagie existence for us (or whatever the corporate overlords have in mind then).

Technically still exciting, just in the survival sense.

Curious how canned this demo is. In the last scene the phone content rotates moments before the guy rotates it, so it's clearly scripted.

I suspect the cutting-edge systems are capable of this level, but over-scripting can undermine the impact.

drubio 2y ago

All the implications, from UI/UX to programming in general.

Like how much of what was 'important' to develop a career in the past decades, even in the past years, will be relevant with these kinds of interactions.

I'm assuming the video is highly produced, but it's mind blowing even if 50% of what the video shows works out of the gate and is as easy as it portrays.

kromem 2y ago

The multimodal capabilities are, but the tone and insight come across as very juvenile compared to the SotA models.

I suspect this was a fine tuning choice and not an in context level choice, which would be unfortunate.

If I was evaluating models to incorporate into an enterprise deployment, "creepy soulless toddler" isn't very high up on the list of desired branding characteristics for that model. Arguably I'd even have preferred histrionic Sydney over this, whereas "sophisticated, upbeat, and polite" would be the gold standard.

While the technical capabilities come across as very sophisticated, the language of the responses themselves does not at all.

This is a product marketing video, not a demo.

mandarlimaye 2y ago

Google needs to pay someone to come up with better demos. At least this one is 100x better than the dumb talking-to-Pluto demo they came up with a few years ago.

relativeadv 2y ago

it's quacktastic

https://www.youtube.com/watch?app=desktop&v=kp2skYYA2B4

jansan 2y ago

They should call it "Sheldon".
