"Yes, it's important to make sure developers have the tools they need to create innovative products with these models."
"Oh look, I found an interesting article on thoriumsim.com about a star ship bridge simulator called Thorium Nova."
"Hmm, sounds interesting let's read it."
Absolutely painful. I would love something that summarizes the articles and discussion without pretending to be a conversation between two people. I mean it says it is AI generated but they are adding all this conversational fluff which really does not work for me.
It is interesting to see these pieces come together but I want to tear my ears out of my head when I hear things like "Yes, it's important to make sure developers have the tools they need to create innovative products with these models." or just repeatedly adding the word "interesting" to summaries of articles.
Please just give me a bog standard summary in audio form without this faux commentary. I do not find the "insights" of ChatGPT worthwhile.
If AI adds aspects that I don't like on podcasts, it's worthless for me.
Yeah, the conversational fluff is easily my least favorite part of podcasts. Adding that to an AI version that should theoretically be even more efficient at summarizing/delivering information is... unexpected. A non-starter for me.
On the other hand, the banter on this podcast is still less cringey than my local TV news, which is a compliment.
Yeah I don't know if ChatGPT is capable of acting like a knowledgeable person but I suspect it would have a tendency to make the kind of novel insights that people tend to make. The most statistically likely word might be sort of smoothed over conceptually.
But also it was frustrating to hear the opening of this podcast go with "we are two AI generated hosts running on ChatGPT" and then when they move to talking specifically about the ChatGPT API article they talk about all these products running on the new API and there is not one single comment in that section where they say "and of course us too! haha" like any human being would if some relevant spot in an article came up like that.
Also the kind of tech podcasts I like to listen to are more critical. They are not just going to tell you what someone announced but also why some part of this announcement is probably nonsense or improbable. I can imagine this AI podcast talking about some new NFT announcement without any hint of doubt as to the claims made in the announcement.
And summarizing the discussion is the real challenge, which I have not seen ChatGPT do. As another user has commented I do not come here for the articles, but for the discussion.
If I wasn't clued in, I probably wouldn't know these weren't human. At least the male voice sounds slightly more natural to me.
ps, I feel like inflection is going to be one of the harder things for an LM to pick up, given all the subtext humans can convey with it.
I thought so too. What do they use for text to voice?
I also feel like every application of ChatGPT seems to completely miss the point of the media it mimics. Podcasts are not merely coherent voices talking to each other. Getting rid of human presenters is literally soulless. People already don't listen for much subtler reasons. Entertainers get canceled, media companies get boycotted, bias divides audiences, etc.
That's not going away with or without AI. There is no "tweaking" the training without putting humans right back into the equation and probably making production way more expensive than it's worth. There is no scalability payoff either. Who wants to listen to the same podcast cloned a million times with just replaced voices? We already have this problem with podcasts today and it kills any interest to consume it.
* exactly as long as your breakfast consumption time
* talks about the papers you're reading, but...
* is as shallow as a puddle and as funny as being the person who steps in one?
Because that's what this is. The synthesized discussion combines all the insight of a breakfast radio host interviewing a guest on a specialized technical topic with banter as engaging as a technical specialist of some kind trying to host breakfast radio.
By the way, I'm not trying to be overly critical of the developers of this experiment, which is a great illustration of where we're currently at with a bunch of technologies. But it also very starkly illustrates its current limitations.
EDIT: to be clear, I'm not hating on LLMs, just saying that whatever the next big thing is, it probably won't be imitating what exists today.
The thing that might hold it back is the latency in the experience. You could mask it with the AI equivalent of "ummm ..." to get to maybe 5-10s.
Once things settle down we'll start to see some seriously useful stuff, but for the moment it's the wild west.
A possible use case for this could be podcasts dealing with inflammatory, politically divisive topics and disguised as coming from real hosts.
When my favorite podcast ended it felt like I lost touch with a group of friends, this ain't going to have that sort of impact on me. Pass.
These are students playing with new technology to produce animated characters that move in unintuitive ways, resulting in something actually quite interesting, yet unnaturally creepy (which was intentional).
Miyazaki dismissed it as 'an insult to life itself'. I can't imagine the disappointment those students must have felt.
If you suspect I'm writing in a way to try to make you feel (or not) a certain way, or to avoid breaking some taboo, or to follow some dogma, then you have no real reason to care about what I ultimately say, because you have no real reason to think it's "authentic." By contrast, when he views something overtly as "an insult to life itself" it's an incredibly insightful view on his perspective of the world. You would have lost so much in "translation" had he crafted his message in a less sincere way.
I also think this is why there will be minimal to zero market for much "AI" content. Content is not just content. It's a reflection of ourselves. Think about how much you can, probably accurately, infer about me, my views, and more - based on these 3 paragraphs. When this comes from a chatbot, any reflections you might see would be as real as the shapes you might see in the clouds.
My dad always said "computers are very fast idiots". They will probably never be Mozarts, but they can and will be Salieris; and most of the world would be extremely happy to have a personal Salieri - in fact, we'll probably be happier like that, considering how Mozarts can be very problematic from so many perspectives.
Last weekend I watched part of an episode and there was a scene where they walked towards "Place de la Pointillisme" [0]. The effect is clearly CGI and you can see how Homer and Marge are actually animated 3D models, so effectively all the "newer" shows (it was aired May 8, 2016) are computer animations with a very flat cel shader. Some argue that newer episodes aren't as good as old ones, but I'm not sure if this could be attributed to them not being hand drawn anymore. In any case, one could apply an XKCD-shader to make the lines a bit more human if the look doesn't appeal.
The Miyazaki video: I get why he says what he says, but it's an issue with the students targeting the wrong audience. I could see their horrible graphics being a part of a horror movie or game, but that is a completely different world than Miyazaki's.
[0] https://simpsonswiki.com/wiki/File:Pointillism_Marge_and_Hom...
```
That's all for the weather report. Now we have some breaking news. A maniac has stolen a tank from a military base and is leading the police on a wild chase through the city streets. We have our reporter on the scene with more details. Stay tuned for this developing story.
```
Someday the AI will introduce mistakes on purpose to seem more human like.
Well... OpenAI did also release the Whisper API
I've found myself wanting to listen to HN comment threads, as I'm one of those people who derives more value and entertainment from the comments than I do from the actual submissions a lot of the time! I envision a voice-controlled way to navigate through threads too. Basically an accessibility narrator on steroids.
I wonder if anyone else has ever been interested in something like this. Getting good voices to read like this podcast would make it that much more fun, so thanks for getting me really hot and bothered :)
Guess if no one does it soon I'll have to build it myself!
I released a product yesterday that handles all aspects of the URL to speech process called Chief of Staff. [2]
The initial version only synthesizes article text alone.
I found there is a great deal of nuance to text to speech synthesis, from the player behavior itself to handling quotas around cloud services.
My goal is far greater flexibility and features: a genuine Chief of Staff that briefs you on information you care about in the format and medium that best suits you.
Thread reading is just one application of this.
I'm thinking maybe having it read just the top thread if there's only 10 or so comments or just the main threads if it's a topic with hundreds of main threads.
maybe we could make some way where the reader can text a code and we'll link them directly to a comment if they want to dig deeper, or save/upvote it.
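A rough sketch of that selection rule, in case it helps anyone thinking about the same thing. The `Comment` shape, the function names, and the cutoff of 10 are all assumptions made up for illustration; this is not the real HN API schema, and "main threads" is read here as "top-level comments only".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    # Hypothetical comment shape for illustration only.
    author: str
    text: str
    replies: List["Comment"] = field(default_factory=list)


def count_comments(comments: List[Comment]) -> int:
    """Count every comment in the tree, replies included."""
    return sum(1 + count_comments(c.replies) for c in comments)


def flatten(comment: Comment) -> List[Comment]:
    """Depth-first walk of one thread, parent before its replies."""
    out = [comment]
    for reply in comment.replies:
        out.extend(flatten(reply))
    return out


def pick_comments_to_read(top_level: List[Comment], small_limit: int = 10) -> List[Comment]:
    """Small discussion: read the whole first thread. Large one: top-level comments only."""
    if not top_level:
        return []
    if count_comments(top_level) <= small_limit:
        return flatten(top_level[0])
    return list(top_level)
```

The returned list is already in reading order, so it could be handed straight to whatever text-to-speech step comes next.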
This was pre-GPT APIs, but essentially what I was doing was using a Python summarization library to summarize articles from RSS feeds into a simple TTS podcast. There's probably a lot of money in custom GPTCasts made off someone's personal RSS feed as a service, IMO.
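For anyone curious what that kind of pipeline looks like, here is a minimal sketch using only the standard library. It is not the original code: the frequency-based `summarize` is a stand-in for whatever summarization library was actually used, the function names are invented, and the text-to-speech step is left out (the returned script would be passed to any TTS engine).

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def summarize(text: str, max_sentences: int = 2) -> str:
    """Naive extractive summary: keep the sentences richest in frequent words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    # Ignore very short words as a crude stop-word filter.
    freq = Counter(w for w in words if len(w) > 3)

    def score(i: int) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower()))

    ranked = sorted(range(len(sentences)), key=lambda i: -score(i))
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def rss_to_script(rss_xml: str, max_items: int = 5) -> str:
    """Turn the first few RSS items into a plain-text script for a TTS engine."""
    root = ET.fromstring(rss_xml)
    parts = []
    for item in root.iter("item"):
        if len(parts) >= max_items:
            break
        title = (item.findtext("title") or "").strip()
        body = (item.findtext("description") or "").strip()
        parts.append(f"{title}. {summarize(body)}")
    return "\n\n".join(parts)
```

Real feeds often put HTML in `description`, so a production version would need to strip markup before summarizing.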
Maybe an AI generated newsletter or aggregator with a voice summary?
Shoot me an email. (Email is in profile)
I want Sam Jackson and Molly Wood to read my hn please.
I would love to know about any other alternatives that I may have missed.
While this technical demonstration is a long way from replacing "real podcasts", it's just the very beginning.
What are the implications here?
So a person will "vouch" for content, and we consume the media vouched for by people in our white list
We won't be able to consume media outside of the whitelist, because it will just contain too much noise
An AI can't sit in a dusty archive room and draw inferences from hand-written minutes. It can't interview the survivor of a tragedy or the winner of a trophy. AI software depends on sentient wetware to generate the fundamental content.
Between this and the advances in robotics, it feels like we’re within decades of some really tough times for humanity.
We could also be within decades of utopia. But my money is on these technologies being used in bad ways far more often than for good. Hopefully I’m just overly cynical!
Good luck kiddos!
Will be very difficult to detect in the future and will result in trust issues / rampant fake news.
I'd venture to say, this will only increase people's skepticism, which is a good thing. We need people to start thinking for themselves instead of turning off their brain and just being fed info they assume they can trust.
https://www.businessinsider.com/these-6-corporations-control...
Never discount the influence of high production value on any form of media. Look at the utter crap music and films that have dominated mass media for decades. The best produced and most palatable fare nearly always sells best, no matter what the quality of the underlying content.
kinda like how we can recall things closer to the event than months later.
It knows a ton from its training, but it still got it from the web, so always question it. But if we can add metadata and other things to strengthen the LLM's understanding, it shouldn't hallucinate much at all.
(extracted from the apple podcast link)
The voice sounds uncanny with unusual breathing pauses, and there isn’t a name announced when they come on or sign off the traffic report.
(1) https://jazz.fm/
I have yet to see something like this. Something less “perfect” sounding than say the google maps voice.
Have you heard ShowDoJo's Wurst Take yet?
https://www.twitch.tv/showdojo
It's not perfect but it's one of the best I've heard so far.
"Sounds like super useful pickles for those who work with binary files!"
I would really like to have a timestamp to click in the story listing.
This would begin playing the audio at that story.
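One way this could work, as a sketch: if the generator knows how long each story's audio segment is, the start offsets are just a running sum. The function names and the `(title, duration_seconds)` input shape are assumptions for illustration, not anything the podcast is known to expose.

```python
from typing import List, Tuple


def format_timestamp(seconds: int) -> str:
    """Format a second count as MM:SS, or H:MM:SS once past an hour."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def story_listing(stories: List[Tuple[str, int]]) -> List[str]:
    """Build 'timestamp title' lines from (title, duration_seconds) pairs."""
    lines = []
    start = 0  # running offset into the episode, in seconds
    for title, duration in stories:
        lines.append(f"{format_timestamp(start)} {title}")
        start += duration
    return lines
```

For example, `story_listing([("Intro", 45), ("Story A", 545), ("Story B", 120)])` gives `["00:00 Intro", "00:45 Story A", "09:50 Story B"]`, and a player could seek to the matching offset when a line is clicked.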
There’s some really funny stuff there, the voices are not perfect, but have a lot of expression.
If I could choose a preference for personality and voice, I'd probably be sold.
Any affiliation with https://old.reddit.com/r/airadio/ ?
I enjoy reading Hacker News even though I don’t have the background to understand most of the stories, because I can easily skip to stories I am interested in. With the podcast, I got stuck listening to everything, including quite a few stories I didn’t understand. Either the podcast needs to focus more on stories of general interest, or it needs to explain the context and significance of the technical stories better.
I'm listening on a laptop and would rather not adjust the system volume and affect all other apps with sound.
Otherwise, the convenience of the audio format makes it among the more interesting uses of AI that I've seen.
It's basically a headline reader with some fluff, but it does a great job at that and there are whole teams of real humans providing such podcasts today, so that's saying something.
It can get weird or even a little broken though. See timestamp 09:50 of the Feb 23 2022 episode:
Laura: So, we're gonna talk about an article called Generic Dynamic Array in 60 lines of C that can be found on gist.github.com.
Zod: Alright, shall we read the article?
Laura (voice 2, almost a different voice): Sure, let me share it here.
Laura (voice 1): "Laura reads the article." <this is verbatim in the podcast>
Laura (voice 1): OK, so that was the article. What do you think about it?
Zod: I think it's interesting that you can define a generic dynamic array in such a small amount of code...
Like, cool technical implementation, but a failure from concept.