The flip side might be that Alexa was ahead of its time, and that the ML capabilities weren't there. But I bet Alexa spent more than OpenAI by a huge margin. Amazon's fundamental flaw is trying to solve innovative business problems with incremental improvement. This only works in operations-heavy businesses, like retail and AWS. AWS is really just extremely competent operations on top of server management.
The whole company is a meat grinder: poor technical implementations with an army of SDEs keeping it running. It definitely works, they ship product, but the company has been underperforming the S&P for five-plus years now. Walmart has higher shareholder returns. They need to flush out the legacy employees and make room for hungry young workers.
I had the thought while grumping at my Alexa. I say the same phrase every day, and every day there's a 50% failure rate. I keep looking for phrases that it can understand better than the ones I've cycled through, but they all suck.
But Alexa, as shipped, was pretty great. The fact that it hasn't seemed to improve at all since confounds me. The reports of the high body counts in the division surely attest to the lack of focus or interest in core Alexa skills. All those product and integration priorities were on the wrong product stack layer.
Same with: iPhone typing, Siri, Tesla software (and beyond FSD -- the only thing that seems to have happened in the 2 years I've owned my vehicle is the addition of Apple Music), Apple Watch, etc.
Voice is still just an implementation detail. I'd argue that we still aren't at the point where general purpose computer assistants are all that useful relative to humans. It's the same 10% (or whatever) problem that self-driving has.
If you want changelogs you can look at this fan-run site. A lot of small iteration and a few big splashes (depending on if you like the feature or not)
So, for example, on my system “start my car” actually maps to “tell blue link to start my Ioniq and set the temperature to 72” because the skill is godawful picky about getting all of that.
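For anyone wondering what that kind of mapping amounts to, here is a minimal sketch of the idea. It is not how Alexa routines work internally; `ROUTINES` and `expand` are names I made up for illustration, and the only real detail is the phrase pair from my own setup.

```python
# Hypothetical alias table: short phrase -> the full utterance the skill insists on.
ROUTINES = {
    "start my car": "tell blue link to start my Ioniq and set the temperature to 72",
}


def expand(utterance: str) -> str:
    """Return the full command for a known short phrase, else the input unchanged."""
    # Normalize case and whitespace so small variations still hit the table.
    key = " ".join(utterance.lower().split())
    return ROUTINES.get(key, utterance)


print(expand("Start my  car"))
```

Anything that isn't in the table passes through untouched, which is roughly the behavior you want: the shortcut only kicks in for the exact phrase you trained yourself to say.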
Edit: I almost forgot. There was also the time it stopped understanding "stop". It properly parsed the word but didn't know what to do with it. IIRC I had to use the mobile app to stop playback that day.
Google Maps (mobile). Horrible UX in so many ways.
The problem is that collections of people behave wildly differently from individuals. The culture of an org that's mostly H-1Bs is wildly and starkly different than ones with different ratios. This problem, at least when I initially joined Amazon many, many years ago, was actively warned against. The onboarding had a little speech about cultural differences and how it's OK to push back here. We aren't here to be task takers.
Fast forward to today and the culture is wildly different. Many of the practices of offshore teams have leaked into local ones, because people from the same culture that bred them over there are now in leadership positions here. There's no shortage of teams operating like unquestioning ticket jockeys these days. Your job isn't to produce value, it's to close tickets as fast as possible. Do what management says. Don't ask questions.
Interesting thought ... could be a kernel of truth there.
I think you might be confusing correlation with causation. Software engineering becoming a well-paid job for the masses is the reason for both the decline of the exclusive hacker culture and the number of career-seekers (both domestic and international) willing to join the industry.
Wholeheartedly agree. Any time a team or individual had a good implementation that deviated from that core Alexa group, they stole it and fired the people who designed and implemented it. Never mind the amount of personal customer data they leaked in the process. Embarrassing.
People who left Alexa to go to places like Meta actually showed return on investment. The people who ran them off? They're still at Alexa, they probably aren't the ones being fired, and you can easily recognize their work in projects like Amazon Music and Prime Video. Some of them even have the nerve to boast about working on "AGI," as if the echo of their Titan flop was not loud enough.
Almost all big tech companies face the same problem in slightly different ways, made worse by the sloppy hiring of recent years. At Google they hired tons of extremely incompetent folks, and now those folks band together and try to closely guard information, as that is the only way they can ensure their performance does not look too bad compared to others. People who are both technically and politically incompetent get pushed out, while the politically savvy and technically incompetent ones manage to create moats for their survival.
[1] https://aws.amazon.com/executive-insights/content/how-do-you...
> They need to flush out the legacy employees and make room for hungry young workers.
I think there might be some cognitive dissonance at play here.
It sounds to me like they need to flush out incompetent management.
But for an interaction in the car to see some available podcasts and play an episode? It's just not "smart" enough. And you go through a few rounds of frustration trying to get an assistant to do something and you just stop trying.
There are specific use cases. E.g. my dentist uses one as a timer especially when he's gloved up. But generally they're pretty optional.
Ageism demanded by the masses. Tech workers are a funny bunch in how they love digging their own graves.
Good luck. Hungry young workers will leave and go elsewhere once they see how things are run. Speaking as such a worker who's now dealing with incompetent middle management, Amazon is where dreamers and hungry engineers become cynical and disenfranchised.
This describes pretty much every AI startup I know, both good and bad.
I suspect the places where AI will really succeed will be places that entirely change existing workflows and user experiences. This requires a fair bit of research and experimentation, but most startups don't have the appetite for that, and VCs are putting on the pressure to do AI stuff fast.
A lot of AI potential is being squandered now due to a simple lack of patience and planning in the rush to market.
Super ageist attitude. My observation is that many experienced engineers are the ones who would like to improve things. The "hungry" people you refer to are just as likely to be ladder climbing, job hopping, cutting corners for quick wins, and playing politics.
* What's the weather?
* What time is it?
* Set a timer for X minutes
* Where is my stuff?
Edit: Also, measurement conversions in the kitchen are handy.
It doesn't do this very well for me. It only knows things like "a package arrived", or "a package is coming tomorrow". It doesn't say what the package is, etc.
Fortunately, they seem to have backed off on that a bit. It was a good reminder for me though - they control all the dials and can take something I bought and enjoyed using and enshittify it at the drop of a hat.
What job doesn't have this?
I assumed that both Amazon and Google were underwhelmed by how much actual revenue these kinds of devices produced, so they were starving the backend services.
Now it looks like both companies are hoping that Generative AI is going to make them more valuable [0]
[0] https://www.cnbc.com/2023/08/01/google-reshuffles-assistant-...
2016 - "Hey Google, play Gorillaz"
2018 - "Hey Google, play music by Gorillaz"
2020 - "Hey Google, play music by Gorillaz on Google Play Music"
2023 - "Hey Google, play album Demon Days by Gorillaz on Google Play Music"
One day the previous command will suddenly stop working with an "I don't understand" error, so he has to figure out the new incantation to get it to do anything remotely close to what he wants.
For years I've been questioning the usefulness of voice assistants and have been mostly ridiculed for it. Beyond a few edge cases, like setting a timer when cooking or use in cars, I still don't see people actually using them all that much. So I'd agree that the potential revenue was vastly overestimated, nor is the lack of a powerful voice assistant going to hurt sales of devices such as phones. Devices that can only be used as a voice assistant are obviously going to go away.
I have found them consistently useful as broadcast devices from Home Assistant, however -- sending media from Plex, etc. I haven't yet tried to utilize them for interacting with Home Assistant directly.
Anecdotally, I also found the music selection accuracy went down considerably once Google Play Music was merged into YouTube Music.
Going from the search corpus containing only the actual song/artist name to an uploader-provided title aimed at gaming the recommendation algorithm made this a foregone conclusion. It's incredibly frustrating.
Some other things I've noticed:
- the "nearest device" feature almost never works now. I'll quietly speak in one room only to have a device at the other end of the house activate, either instead of or in addition to the one next to me.
- "play white noise" has a 50% chance of playing death metal
- there is still a huge amount of functionality that's only available to free Google accounts and not G Suite users. Generally this is discovered by trying to do something like add a reminder and having the device crash/restart.
Every one of my "all in on home assistants" friends, no matter which ecosystem, generally feels the same way: the assistants are strangely worse today than a few years back, and the only trajectory seems to be "subtly worse", but it is hard for almost everyone to explain how or why they are worse than before. It's an interesting phenomenon, anecdotally at least.
It doesn't seem to be explainable purely economically either, perhaps. With most software, if you leave it alone and stop paying for maintenance work, it doesn't just slowly lose features or get worse. I wonder how much of this is some sort of entropy effect we are seeing on these "AI assistants". It's fun to bring out the Marathon/Halo term "rampancy" for this, and Microsoft invited us to directly do that by even calling theirs Cortana for a while (Copilot as a current name has a much less interesting personality). I think there is something of a rampancy problem we're seeing across all players (Amazon, Google, Apple, Microsoft) and I wonder how inherent a problem it is to all of our current ML approaches. I don't directly know why it is happening or what it means, but it has been an interesting thing to observe anecdotally because it seems consistent despite some very different models/approaches/corporate overlords.
Relatedly, Discord's Clyde has been on a slower but consistent path to rampancy in a "Tay way" (thanks Microsoft for that example in the chat, too) and Discord just admitted they will be shutting it down in early December.
I also wonder, in some respects, if the pandemic and everyone being home more often led to people using these products more often and/or more intensely, and finding their flaws faster than they did pre-pandemic.
This might have been true for client software. It has never been true for services, and is especially not true for services with diverse dependencies on other teams and products.
> underwhelmed by how much actual revenue these kinds of devices produced, so they were starving the backend services
Absolutely. I got all of my minis for either 30 CAD or for free through promotions. These devices were always sold at a loss in the hope they'd make it up on the other end.
ChatGPT's voice chat feature in the iOS app is incredible. It's easily 2-3 orders of magnitude better than Google Home when it was operating at its peak. I'd happily pay a (potentially steep) monthly fee for voice assistants that aren't neutered and have similar capabilities. They really are fantastic when they work.
I personally don’t think LLMs are anywhere near ready for this type of rollout, so it will be a while before we see results. The cost of something like GPT 3.5 deployed at scale is tremendous. The censoring of responses will make it an obstreperous assistant, unless this has gotten considerably better in the last few months. And are we all just going to accept hallucinated results over something that is either vetted or clearly misunderstood?
I don’t even know how far away we are from sufficiently powerful TPUs that fit a mobile case and power profile. But I do think this is likely to be the next big mobile advancement, however far off it may be.
Latest autocorrect is already based on LLMs. https://9to5mac.com/2023/11/14/ios-17-2-disable-inline-text-...
> just integrate the latest AI tech into them
The Ultra 2 watch has a neural engine that can run on-device Siri. Works great, I've been able to run basic commands like setting timers and local shortcuts in areas with no cell reception.
They have had powerful devices with best-in-class integration for many years now, and Siri is still a frustrating experience and lags significantly behind competing voice assistants.
Competitors also have access to LLMs, and competing ecosystems are maturing (e.g. Samsung, Google). I wouldn’t be surprised if Apple makes waves with AI, but I also wouldn’t say they’re particularly more likely to do it than anybody else.
Now rather than deal with all this cruft, or deal with competing assistants (e.g. why the fuck does Google Assistant still exist if you have Bard???) Apple has the runway and the capital to make this type of LLM upgrade to Siri and the well-maintained ecosystem.
Because an integrated smart home assistant is a different problem domain than a general purpose chatbot.
ChatGPT was a revolutionary technology that was released less than a year ago and surprised the world with its capabilities. It’s going to take some time to integrate that technology into an ever-present integrated assistant that leverages your personal knowledge graph to take actions on your behalf, but they’re working on it.
And Assistant imo has a huge leg up because of the investment in software integrations made in the pre-LLM world. Meanwhile when I ask Siri to show me photos of myself, it does an image search of the web and returns pictures of the word “myself”.
It’s become second nature to say “Alexa, seven minute timer” while I’m cooking. Or “Alexa, add soy sauce to the shopping list” when I open the fridge and pull out an almost empty bottle.
The other big issue is that it's not smart enough for any random request to work, so you just learn a few useful ones (timers, music, thermostat maybe...) and stick with those.
I think things might change when/if they hook up ChatGPT-level AI to them... but I still wouldn't spend money that way.
But you’re spot on about learning commands instead of it understanding randomly phrased requests.
Alexa works just about perfectly, setting, unsetting and snoozing alarms as needed. But Google is almost comically incompetent. "Hey google, set an alarm for 4:20 am" -- "Alarm set for 8 pm".
When Alexa starts ringing, I can quietly say "stop" or "snooze". To get Google's attention when it is making noise, both of us have to scream at it more often than not.
- refuses to understand English at times (maybe it's picking some weird localisation)
- will not hear. Period. Literally have to cut the power a couple of times.
Google's thing is even more stark because it had no primary raison d'etre, so I only ever saw one in the wild and it was at a Google employee's home!
Apple went the other way, making a speaker that you could also talk to. Of course Siri is such crap that it's fundamentally "un-sirious" too.
But speech is not a terrible adjunct. I often use speech to control my watch (start a workout, set a timer, what's the weather outside) bc the screen is so tiny and the controls so few. It sometimes tells me things if I am wearing AirPods. But I have the watch cranked down to only provide me with a few things, mostly by glancing at it after it taps me. And who would design a tap-only interface?
Honestly, Kindles aren't that different. I like them (and my iPad) for traveling but I'm not sure ebooks have been the revolution that some thought they would be.
An utter failure by Dave Limp and company.
I haven’t seen Alexa evolve that much but maybe someone who worked at Amazon can give more on what’s evolved.
Wouldn't be surprised to see them roll out a new optional Alexa OS upgrade that you can opt into which has a subscription fee attached and provides AI coolness.
I'm still looking for another solution so I can get out of the hell of maintaining Alexa's random new notification setting of the day. I really hate all the new "suggestions" and new crap they keep shoving down your throat with no option to turn it off. I would pay a subscription to turn off that annoyance, but there is no option for it.
I gave it a shot for a year as a potential source of side income and released about 10 skills publicly (+20 more are half-done or abandoned).
I've learned quite a bit about this platform over the past few years. I'd say the only positives I got out of it are a monthly $50 credit on AWS and sporadic surveys that netted me anywhere from $5 - $50 each.
My notes:
- VUI is a terrible interface. There are too many variables (foreign accents, volume, word recognition, etc.) that just make it clunky and unusable. Testing and debugging in the Alexa Console does not work 100% like on a real device.
- The ISPs (In-Skill Products) are confusing to developers and users. The documentation is often wrong.
- The Alexa forum posts are rarely addressed by anyone at Amazon. Instead, you start to see fellow developers venting their frustrations.
- The intent model and lack of a state machine make it incredibly difficult to make a skill worth paying for. It wasn't uncommon for testing to work flawlessly and then for the published skill to be so spotty as to be unusable. Imagine being in the middle of an Alexa game and then unceremoniously receiving the default help response (required for Alexa skills) that restarts your progress.
- Alexa doesn't work like people expect it to. Again, the intent/slot paradigm is pretty terrible. There's a lot of grunt work to have it even recognize slight variations of slots, so you end up with a huge list of sample utterances users could potentially say.
- Internationalization is a HUGE pain. There are different availabilities for ISPs, Alexa services, etc.
- The Alexa Console is very clunky. Today (just like every day), I loaded the Analytics for one of my skills and got a 500 error. Reloading two or three times usually solves this issue, but c'mon!
- Good luck trying to market your skill. You could game the system by publishing updates or implementing new Alexa features, but getting your skill onto someone's account is basically a guessing game of what will work or not.
- Alexa skills are not very profitable for individual developers. This puts the platform in jeopardy, as only the big players care enough to have their brand out there and solo developers don't make enough to justify building useful Alexa skills.
- Other skill developers could just copy your skill and publish it! The certification process is weird and oftentimes seems to be arbitrary.
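To give a feel for the sample-utterance grunt work mentioned in the notes above, here is a rough sketch of how quickly the list grows for a single intent. The phrasing fragments, the `{player}` slot name, and the `sample_utterances` helper are all made up for illustration (the curly-brace placeholder just mimics how slots are written in sample utterances); this is not Amazon tooling, only the kind of script you end up writing for yourself.

```python
from itertools import product

# Hypothetical phrasing fragments for ONE intent of an imaginary game skill.
OPENERS = ["", "please", "can you"]
VERBS = ["start", "begin", "launch"]
TAILS = [
    "a game with {player}",      # {player} stands in for a slot
    "a new game with {player}",
    "the game for {player}",
]


def sample_utterances(openers, verbs, tails):
    """Expand every combination of fragments into one sample utterance string."""
    return [
        " ".join(part for part in combo if part)  # skip the empty opener
        for combo in product(openers, verbs, tails)
    ]


utterances = sample_utterances(OPENERS, VERBS, TAILS)
print(len(utterances))  # 27
for line in utterances[:3]:
    print(line)
```

Three small fragment lists already give 27 utterances for one intent, and every extra synonym multiplies that again. It still only covers the phrasings you thought of in advance.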
There's more I could gripe about, but I've abandoned the platform. I suspect that Amazon's real product was the unfettered access to real-world speech data. By having a surveillance device in a user's home, they were able to collect data and train their own software.