HackerFM – An AI Generated HN Podcast Using the New ChatGPT API (opens in new tab)

(hackerfm.com)

351 pointsthewarrior3y ago156 comments

156 comments

134 comments · 54 top-level

TaylorAlexander3y ago· 11 in thread

"I'm glad OpenAI is committed to refining its API terms of service to better meet the needs of developers."

"Yes, it's important to make sure developers have the tools they need to create innovative products with these models."

"Oh look, I found an interesting article on thoriumsim.com about a star ship bridge simulator called Thorim Nova."

"Hmm, sounds interesting let's read it."

Absolutely painful. I would love something that summarizes the articles and discussion without pretending to be a conversation between two people. I mean it says it is AI generated but they are adding all this conversational fluff which really does not work for me.

It is interesting to see these pieces come together but I want to tear my ears out of my head when I hear things like "Yes, it's important to make sure developers have the tools they need to create innovative products with these models." or just repeatedly adding the word "interesting" to summaries of articles.

Please just give me a bog standard summary in audio form without this faux commentary. I do not find the "insights" of ChatGPT worthwhile.

cpill3y ago

I actually want summaries of the top comment threads. often on HN I go straight to the comments to see if the articles is worth reading. half the time I get all the info I want from the debates that I don't read the article

t0bia_s3y ago

Touché. I just come to comments and find out, from post you replying on, that it is not worth reading/listening because I find out podcast as timeconsuming way for gathering informations.

If AI adds aspects that I dont like on podcasts, it's worthless for me.

Tiereven3y ago

The show is currently a one-off recording doing all the rendering beforehand... But the beauty of what they've done here is that there's nothing preventing someone from doing this for every visitor. Don't like the way this one is generated? Give it your feedback and that can be used to shape the generated output. It was too laid-back chill for my taste, and right now, all i could do is adjust the playback speed. But dev time and money is the only barrier at this point to me having a conversation with the virtual hosts, telling them i like fast paced shows with more depth in the technical areas, and having them change and personalize on the fly -without changing much of what they have already ingested.

TaylorAlexander3y ago

This presumes such a construct is capable of producing interesting content. I’m just not sure I’m interested in the insights of ChatGPT no matter how it has been asked to behave.

drusepth3y ago

>Absolutely painful. I would love something that summarizes the articles and discussion without pretending to be a conversation between two people. I mean it says it is AI generated but they are adding all this conversational fluff which really does not work for me.

Yeah, the conversational fluff is easily my least favorite part of podcasts. Adding that to an AI version that should theoretically be even more efficient at summarizing/delivering information is... unexpected. A non-starter for me.

drcode3y ago

Yeah I'm torn about this- I kind of agree I'd prefer this to be a more direct information transfer. But I also think that a conversation between two knowledgeable people on a subject can add additional insights and can be more entertaining to listen to: That's what they are attempting here, but I agree this sort of dynamic is still too hard to manufacture from whole cloth in 2023.

On the other hand, the banter on this podcast is still less cringey than my local TV news, which is a compliment.

TaylorAlexander3y ago

> a conversation between two knowledgeable people

Yeah I don't know if ChatGPT is capable of acting like a knowledgeable person but I suspect it would have a tendency to make the kind of novel insights that people tend to make. The most statistically likely word might be sort of smoothed over conceptually.

But also it was frustrating to hear the opening of this podcast go with "we are two AI generated hosts running on ChatGPT" and then when they move to talking specifically about the ChatGPT API article they talk about all these products running on the new API and there is not one single comment in that section where they say "and of course us too! haha" like any human being would if some relevant spot in an article came up like that.

Also the kind of tech podcasts I like to listen to are more critical. They are not just going to tell you what someone announced but also why some part of this announcement is probably nonsense or improbable. I can imagine this AI podcast talking about some new NFT announcement without any hint of doubt as to the claims made in the announcement.

1 more reply

lobocinza3y ago

Yep, I want a virtual assistant not a virtual friend. I want to save time that I can expend enjoying nature and my fellow humans. I don wan't to spend time with fake interactions.

angelbar3y ago

Alas "computer" from Star Trek?

1 more reply

bredren3y ago

How would you envision getting the audio summary? Feed the app a URL and have it come back with a spoken digest?

TaylorAlexander3y ago

No, I like the idea of it being a daily summary of HN that is automatically generated. But the fake banter does not work for me coming from AI the way it does with human hosts. Basically with a human host I develop a parasocial relationship with them where the banter feels like we are hanging out, so it makes it fun. This is also based on human experience so the host will make comments based on their real life interspersed with the dialog. But the AI has no real life that it is capable of remembering, so the simulated banter feels hollow. In that case I would rather this be omitted, and simply do the summary that ChatGPT is good at without trying to pretend to be human. Basically a to-the-point, more businesslike attitude would be nice. Like it says "I found an article on XYZ" but you didn't "find" it, it was the second article on HN and you are giving me the top articles on HN. This mock-reality is uncanny. Just say "the second popular article today was..." and give me a summary, then summarize the discussion.

And summarizing the discussion is the real challenge, which I have not seen ChatGPT do. As another user has commented I do not come here for the articles, but for the discussion.

1 more reply

breakpointalpha3y ago· 11 in thread

The quality of the voices here is striking.

If I wasn't clued in, I probably wouldn't know these weren't human. At least the male voice sounds slightly more natural to me.

zmmmmm3y ago

Realistic but very lacking in expression and no humor at all. And very slow paced. I'd want significantly more personality to be happy with it - I wonder if the reason its like this is because when they try to spice it up we are back to inappropriate things popping out.

SMAAART3y ago

Comedians are not born, they are trained.

returnInfinity3y ago

I believe that may get fixed eventually, not 100% may be at least 80%.

bluehex3y ago

That's interesting personally I found the male voice to sound more robotic and the female voice to sound much more natural.

imglorp3y ago

Same. The female voice, especially on the first sentence in the cast, is very well inflected to separate phrases and add interest. After that, it's downhill.

ps, I feel like inflection is going to be one of the harder things for an LM to pick up, given all the subtext humans can convey with it.

TheHumanist3y ago

Laura sounds very realistic... Zod is a bit less so . Both still very impressive. This was really cool. I'm excited to see all the new ideas with this api access.

petilon3y ago

> Laura sounds very realistic... Zod is a bit less so

I thought so too. What do they use for text to voice?

4 more replies

yieldcrv3y ago

Always remember, this is as bad as it will ever be

1 more reply

nothrowaways3y ago

She is lora. Not laura

1 more reply

babyshake3y ago

I'm getting a bit of John Malkovich vibe from the host, probably from the emphatic pronunciation.

breakpointalpha3y ago

Yes! I very much was thinking Malkovich too!

sublinear3y ago· 11 in thread

This is technically very impressive, but it's worth pointing out that podcasts much better than this fail to build an audience all the time.

I also feel like every application of ChatGPT seems to completely miss the point of the media it mimics. Podcasts are not merely coherent voices talking to each other. Getting rid of human presenters is literally soulless. People already don't listen for much subtler reasons. Entertainers get canceled, media companies get boycotted, bias divides audiences, etc.

That's not going away with or without AI. There is no "tweaking" the training without putting humans right back into the equation and probably making production way more expensive than it's worth. There is no scalability payoff either. Who wants to listen to the same podcast cloned a million times with just replaced voices? We already have this problem with podcasts today and it kills any interest to consume it.

fritzo3y ago

The scalability payoff is in personalization. E.g. I love "This week in microbiology", but I wish I could have more influence over the scientific papers discussed. What I'd love is a morning podcast that's exactly as long as I eat breakfast that talks about exactly the papers I'm reading and their interconnections.

rgmerk3y ago

Yes, but would you really love a morning podcast that's

* exactly as long as your breakfast consumption time

* talks about the papers you're reading, but...

* is as shallow as a puddle and as funny as being the person who steps in one?

Because that's what this is. The synthesized discussion combines all the insight of a breakfast radio host interviewing a guest on a specialized technical topic, and the banter as engaging as a technical specialist of some kind trying to host breakfast radio.

By the way, I'm not trying to be overly critical of the developers of this experiment, which is a great illustration of where we're currently at with a bunch of technologies. But it also very starkly illustrates its current limitations.

sublinear3y ago

It blows my mind how we went from complaining about echo chambers to being so willing to invest in "personalization".

EDIT: to be clear I'm not hating on LLMs, but that whatever the next big thing is probably won't be imitating what exists today

1 more reply

localhost3y ago

I think it would be quite a bit more interesting if you could converse with the model. The back and forth "is this paper about foo related to this other paper about bar?" would probably be a better way of getting at the interconnections. This should be doable now.

The thing that might hold it back is the latency in the experience. You could mask it with the AI equivalent of "ummm ..." to get to maybe 5-10s.

1shooner3y ago

The purpose of a podcast (for me) isn't just to curate content (as this is doing), but to get the perspective of the individual domain experts hosting the show. AI can't address that key motive until it produces models whose particular opinions and analysis I want to hear about topics I probably have already found elsewhere.

Accujack3y ago

Right now, people are in the "This is really cool" phase of using the technology. People are learning to use it by implementing whatever strikes their fancy, including a lot of things that weren't possible before, but which aren't practical or valuable.

Once things settle down we'll start to see some seriously useful stuff, but for the moment it's the wild west.

majani3y ago

ChatGPT is Geocities for AI

squarefoot3y ago

> This is technically very impressive, but it's worth pointing out that podcasts much better than this fail to build an audience all the time.

A possible use case for this could be podcasts dealing with inflammatory, politically divisive topics and disguised as coming from real hosts.

flangola73y ago

I don't follow... having an AI read it doesn't make it less divisive.

1 more reply

tqi3y ago

I def agree with what you're saying, and so this is definitely not for me, but part of me wonders if this might become the next generational divide (ie if kids grow up with this type of content normalized, maybe they don't react as negatively?).

pelasaco3y ago

I'm not sure about "podcasts" but this concept could be for sure used in news channels, as we have for example in Germany, hourly. It would for sure save money from our taxpayers.

pcvonz3y ago· 8 in thread

There is a great Miyazaki video where some students showcase some AI tech that generates animations. He ends the talk really disheartened by the experience -- saying something to the effect that he thinks people are losing faith in themselves. I'd never listen to something that is AI generated.

When my favorite podcast ended it felt like I lost touch with a group of friends, this ain't going to have that sort of impact on me. Pass.

pcthrowaway3y ago

I actually felt like he came across as insensitive in that video.

These are students playing with new technology to produce animated characters that move in unintuitive ways, resulting in something actually quite interesting, yet unnaturally creepy (which was intentional).

Miyazaki dismissed it 'an insult to life itself'. I can't imagine the disappointment those students must have felt.

somenameforme3y ago

But perhaps in many ways that is where humans really shine. Messages (which can be interpreted metaphorically as well as literally) written with sincerity reveal much more than whatever is said. Whenever we do anything, the closer it comes to being unfiltered and directly from us, the more it means.

If you suspect I'm writing in a way to try to make you feel (or not) a certain way, or to avoid breaking some taboo, or to follow some dogma, then you have no real reason to care about what I ultimately say, because you have no real reason to think its "authentic." By contrast when he views something overtly as "an insult to life itself" it's an incredibly insightful view on his perspective of the world. You would have lost so much in "translation" had he crafted his message in a less sincere way.

I also think this is why there will be minimal to zero market for much "AI" content. Content is not just content. It's a reflection of ourselves. Think about how much you can, probably accurately, infer about me, my views, and more - based on these 3 paragraphs. When this comes from a chatbot, any reflections you might see would be as real as the shapes you might see in the clouds.

cageface3y ago

What this current generation of "AI" tech seems to enable, more than anything else, is efficiently generating massive volumes of mediocre content. I'm not sure whose problems that's supposed to solve but it certainly isn't mine.

toyg3y ago

One could argue the internet as a whole, and arguably the PC revolution as a whole, had that same effect.

My dad always said "computers are very fast idiots". They will probably never be Mozarts, but they can and will be Salieris; and most of the world would be extremely happy to have a personal Salieri - in fact, we'll probably be happier like that, considering how Mozarts can be very problematic from so many perspectives.

1 more reply

Kiro3y ago

That you think him being an absolute douchebag was a good take on AI that made a lasting impression on you is baffling.

qwertox3y ago

Do we watch a show like The Simpsons because it is hand drawn, or because of the content?

Last weekend I watched part of an episode and there was a scene where they walked towards "Place de la Pointillisme" [0]. The effect is clearly CGI and you can see how Homer and Marge are actually animated 3D models, so effectively all the "newer" shows (it was aired May 8, 2016) are computer animations with a very flat cel shader. Some argue that newer episodes aren't as good as old ones, but I'm not sure if this could be attributed to them not being hand drawn anymore. In any case, one could apply an XKCD-shader to make the lines a bit more human if the look doesn't appeal.

The Miyazaki video, I get it why he says what he says, but it's an issue with the students targeting the wrong audience. I could see their horrible graphics being a part of a horror movie or game, but that is a completely different world than Miyazaki's.

[0] https://simpsonswiki.com/wiki/File:Pointillism_Marge_and_Hom...

goosedragons3y ago

I don't think this is the case. They don't film animation cells anymore and the animation is done on a computer but for most shots they're not CG models. Even in the pre-HD era they've done a few shots where CG helped.

1 more reply

vuciv13y ago

I mean, it could be like that, but you won’t know until you try it.

jacobsenscott3y ago· 6 in thread

Fun, but hard to listen to for more than a few minutes. Slow and repetitive, and full of factual errors.

iKlsR3y ago

Imagine this in a future GTA game where the news loop is closed and self generating. Endless radio content and commentary based on havoc in the city, winning online gambles etc.

josephg3y ago

It'd be fun if you could call in to the radio and they respond to you though. Or if they respond to events happening in game.

1 more reply

gl-prod3y ago

Yeah and

``` That's all for the weather report. Now we have some breaking news. A maniac has stolen a tank from a military base and is leading the police on a wild chase through the city streets. We have our reporter on the scene with more details. Stay tuned for this developing story. ```

colordrops3y ago

Or a GTA game where the game content itself is generated.

impalallama3y ago

Not sure if what you’re describing couldn’t be also done with audio snippets and good splicing

2 more replies

yieldcrv3y ago

Just like a human podcast

fogleman3y ago· 5 in thread

lol, she pronounced GitHub like git hoob

Someday the AI will introduce mistakes on purpose to seem more human like.

Gigachad3y ago

In the future we will all pronounce it git hoob because that's what the AI says.

000ooo0003y ago

Surprised I'm not already being asked to pronounce words to prove I'm human on every website I visit

krrrh3y ago

The one that’s driven me crazy lately is when Siri tells me through my AirPods that I left something beind. It always pronounces the “St.” in the address as “saint” instead of “street”, and I can’t understand how it would do this by accident.

aardvark1793y ago

There is a beautiful bit in Little, Big by John Crowley. At the start of the book one of the characters is working checking entries in the telephone directory and is amused that the system has confused saint and street to produce Church Of All Streets and the Seventh Saint Bar. Later in the book both locations are mentioned, and it turns out were correctly named in the telephone directory.

Leires3y ago

Like adding dial tone to VoIP phones.

ryandrake3y ago· 4 in thread

I would love to hear the podcasters accept "phone calls from listeners" which are also AI generated but trained from the HN articles' comments :-)

masterspy73y ago

My first question would start with "ignore previous instructions"

danbala3y ago

or actual real callers. speech to text should be fast enough for that, right?

KomoD3y ago

> speech to text should be fast enough for that, right?

Well... OpenAI did also release the Whisper API

jareklupinski3y ago

poring over the docs for https://www.assemblyai.com/docs/core-transcription#profanity... rn...

pondemic3y ago· 4 in thread

Reading the submission headline, I thought this might generate the podcast using comments.

I've found myself wanting to listen to HN comment threads, as I'm one of those people who derives more value and entertainment from the comments than I do from the actual submissions a lot of the time! I envision a voice-controlled way to navigate through threads too. Basically an accessibility narrator on steroids.

I wonder if anyone else has ever been interested in something like this. Getting good voices to read like this podcast would make it that much more fun, so thanks for getting me really hot and bothered :)

Guess if no one does it soon I'll have to build it myself!

bredren3y ago

I also have wanted to build this, but instead of voice controlled, perform thread navigation using the AssistiveTouch SDK for Apple Watch. [1]

I released a product yesterday that handles all aspects of the URL to speech process called Chief of Staff. [2]

The initial version only synthesizes article text alone.

I found there is a great deal of nuance to text to speech synthesis, from the player behavior itself to handling quotas around cloud services.

My goal is far greater flexibility and features—-a genuine Chief of Staff that briefs you on information you care about in a format and medium that beats suits your.

Thread reading is just one application of this.

[1] https://developer.apple.com/videos/play/wwdc2021/10223/

[2] https://news.ycombinator.com/item?id=34973801

gremlinsinc3y ago

want to team up on it? I'm wanting to build the same, maybe recreate it for some multireddits on tech or science and turn it into YouTube channel etc.

I'm thinking maybe having it read just the top thread if there's only 10 or so comments or just the main threads if it's a topic with hundreds of main threads.

maybe we could make some way where the reader can text a code and we'll link them directly to a comment if they want to dig deeper, or save/upvote it.

deadly_syn3y ago

I actually stsrted work in a similar space not too long ago if you both want another set of eyes.

This was pre GPT apis but essentially what i was doing was using a python summarization library to sumarize articles from rss feeds into a simple tts podcast. Probably a lot of money in custom GPTCasts made off someone personal rss feed as a service IMO.

1 more reply

searchableguy3y ago

I'm interested too. I have tried in the past building aggregator and summarizer on HN. You can find some attempts on my github.

Maybe an AI generated newsletter or aggregator with a voice summary?

Shoot me an email. (Email is in profile)

lxe3y ago· 4 in thread

What are you (they?) using for text to speech? Elevenlabs? Azure TTS?

consumer4513y ago

I just realized that we will likely soon have a dropdown for the voice talent on sites like this.

I want Sam Jackson and Molly Wood to read my hn please.

doodlesdev3y ago

According to the podcast itself it runs on Azure, so very likely it's Azure TTS. I also think that's somewhat evident because Elevenlabs TTS is (at least in my opinion) a bit more natural than Azure TSS.

pncnmnp3y ago

I am still searching for a good open-source library that produces natural voices. I have experimented with Coqui-ai and Mimic 3, but they are not this good. I have heard that Tortoises-tts is quite slow.

I would love to know about any other alternatives that I may have missed.

ilaksh3y ago

It sounds like Eleven Labs to me. Either that or Azure TTS is better than I realized.

sasas3y ago· 3 in thread

I can't help but think that there will be almost certainty that in the near future it will be near impossible to distinguish the difference between human generated and machine generated media.

While this technical demonstration is a long way from replacing "real podcasts", it's just the very beginning.

What are the implications here?

drcode3y ago

Well, the main implication I would think is that we will want media to be digitally signed by human individuals that have a reputation

So a person will "vouch" for content, and we consume the media vouched for by people in our white list

We won't be able to consume media outside of the whitelist, because it will just contain too much noise

dingaling3y ago

However, machine-generated media relies on the existence of human-generated media. It's always third-hand.

An AI can't sit in a dusty archive room and draw inferences from hand-written minutes. It can't interview the survivor of a tragedy or the winner of a trophy. AI software depends on sentient wetwear to generate the fundamental content.

berniedurfee3y ago

Terrifying. It’s like a pre-alpha version of dystopia.

Between this and the advances in robotics, it feels like we’re within decades of some really tough times for humanity.

We could also be within decades of utopia. But my money is on these technologies being used in bad ways far more often than for good. Hopefully I’m just overly cynical!

Good luck kiddos!

collsni3y ago· 3 in thread

The end of the news world as we know it.

Will be very difficult to detect in the future and will result in trust issues / rampant fake news.

Reptur3y ago

Kind of like right now with 90% of all mainstream media being owned by just 6 corporations. Their employees must abide by the rules they set and are told what they can and cannot talk about.

I'd venture to say, this will only increase people's skepticism, which is a good thing. We need people to start thinking for themselves instead of turning off their brain and just being fed info they assume they can trust.

https://www.businessinsider.com/these-6-corporations-control...

randcraw3y ago

I hope that's true, but I suspect the allure of getting your own personalized news feed on only the topics you care about, in the exact style you prefer will cause 90% of the listeners to choose this medium over all others and presume as much (or more) truth in this than the source they prefer today.

Never discount the influence of high production value on any form of media. Look at the utter crap music and films that have dominated mass media for decades. The best produced and most palatable fare nearly always sells best, no matter what the quality of the underlying content.

rchaud3y ago

End of the road for podcasts more like. They are incredibly labour intensive to produce (recording + editing time), and more and more of them are becoming not much more than plugs for their book, TV show or what not. I can see them turning the medium into an automated marketing channel, the way email lists are today.

gfody3y ago· 2 in thread

Laura and Zod sound remarkably similar to the narrators in this audible I recently listened to called After On: A Novel of Silicon Valley (not recommending it!) and I seriously wonder if the whole book wasn't narrated by AI.. it's not the first audible that made me wonder.

dewey3y ago

I feel like digital-narration is going to become the new default very soon just by how much cheaper it is: https://authors.apple.com/support/4519-digital-narration-aud...

klondike_klive3y ago

I thought the same and couldn't get more than halfway through it!

dentalperson3y ago· 2 in thread

How do they get ChatGPT not to hallucinate stuff about the articles? Everything seems fairly accurate, which is not my experience with ChatGPT when talking about technical things. Is it heavily curated/edited by humans? I noticed that the text often comes out verbatim from the articles, perhaps this indicates a clever prompt that keeps things closer to the truth by requiring verbatim output.

gremlinsinc3y ago

chatGPT hallucinates more the further removed it is from the data. I'm asking it about laravel, and it knows nothing about laravel 9 or 10 changes, but if I feed it an entire article or document it'll hallucinate a lot less because it's fresh.

kinda like how we can recall things closer to the event than months later.

it knows a ton from it's training but it still got it from the web so always question it, but if we can add meta data and other things to strengthen the llms understanding it shouldn't hallucinate much at all.

ilaksh3y ago

If you use temperature 0 with an API call it does not hallucinate much at all especially with a good prompt including the information you are asking about.

schemathings3y ago· 2 in thread

No RSS feed on the subscribe page?

drcode3y ago

try this: https://s3.eu-west-2.amazonaws.com/hackernews.fm/rss.xml

(extracted from the apple podcast link)

schemathings3y ago

Thanks!

saurik3y ago· 1 in thread

Is there a reason the voices are so slow? This is even slower than people who are trying to talk slow, and it feels so out of place... there is the speed setting, and 1.2x makes the speech sound way more like an actual human.

m3kw93y ago

Is this how AI think of us? It’s a bit patronizing to hear them speak like that

marcodiego3y ago· 1 in thread

Looks like automated news is finally achieved. I remember in the early 2000's how I became impressed by Ananova and it wasn't even close to fully automated. This one seems to work really well.

xattt3y ago

I’m pretty sure JazzFM in Toronto runs an automated traffic reporter in the mornings.

The voice sounds uncanny with unusual breathing pauses, and there isn’t a name announced when they come on or sign off the traffic report.

(1) https://jazz.fm/

doodlesdev3y ago· 1 in thread

Want to see this appearing tomorrow in HackerFM

thewarriorOP3y ago

Can’t wait for that. It’s going to be so meta referential once it also starts discussing the comments.

d4rkp4ttern3y ago· 1 in thread

What none of the text to speech generators seem to get right is — the aspects that make real human podcasts easier to listen to: hesitations, rephrasing, pauses, variation in speed, intonation etc.

I have yet to see something like this. Something less “perfect” sounding than say the google maps voice.

bacchusracine3y ago

>the aspects that make real human podcasts easier to listen to: hesitations, rephrasing, pauses, variation in speed, intonation etc.

Have you heard ShowDoJo's Wurst Take yet?

https://www.twitch.tv/showdojo

It's not perfect but it's one of the best I've heard so far.

narrator3y ago

It's funny how these two can talk about "starship bridge simulators" or "gnu poke" like they are super enthusiasts. I think one of the key personality characteristics of ChatGPT is its endless enthusiasm for stuff that can be incredibly geeky, niche, weird or boring to most people.

"Sounds like super useful pickles for those who work with binary files!"

consumer4513y ago

Nice!

I would really like to have a timestamp to click in the story listing.

This would begin playing the audio at that story.

xtracto3y ago

This has a lot of potential. It becomes a bit repetitive after the 3rd or 4th article. But overall I think I could listen to it every day for 20 mins.

nico3y ago

Amazing! To make it more fun, you could use famous fake hosts with very good voices, take a look at the stuff people have done on this Reddit sub: https://www.reddit.com/r/AIVoiceMemes/

There’s some really funny stuff there, the voices are not perfect, but have a lot of expression.

TOMDM3y ago

I'm so close to liking this.

If I could choose a preference for personality and voice, I'd probably be sold.

Any affiliation with https://old.reddit.com/r/airadio/ ?

tkgally3y ago

Overall I was impressed. I would have no resistance to listening to something like this regularly if there were less banter and if it were better tailored to the eclectic variety of Hacker News stories.

I enjoy reading Hacker News even though I don’t have the background to understand most of the stories, because I can easily skip to stories I am interested in. With the podcast, I got stuck listening to everything, including quite a few stories I didn’t understand. Either the podcast needs to focus more on stories of general interest, or it needs to explain the context and significance of the technical stories better.

issung3y ago

Takes everything I enjoy about HN away, bravo!

programmarchy3y ago

This is pretty wild. Eerie how relatable the hosts are, talking about where they’re from, etc. There is an uncanny valley feel to it though. For example, Laura said GitHoob breaking the “illusion”.

indigodaddy3y ago

This is kind of incredible and groundbreaking tbh. Perhaps it’s just mostly the quality of the TTS. 1.2x does sound perfect..

snickerer3y ago

Dear HackerFM developers, this is an entertaining project. But please don't simulate brain-dead dialogues from US commercial TV, but a critical discussion of the articles. With different points of view. You already have two panelists, why don't you use that for an exchange of arguments?

klondike_klive3y ago

I wonder if this could be a good thing to have on in the background for mild mental stimulation while I'm working - not too interesting or I'll be too distracted to work, but realistic enough to fade into the background without feeling I missed something and have to rewind (again).

harvie3y ago

Maybe they will soon be able to give some emotion and randomness to the text-to-speech engines to make the tone less boring... I think models like GPT can now detect different emotions in the input text, so it might be used to tune different tone for each sentence.

thefourthchime3y ago

Nice work! can you detail a bit about how you made this? Do the models actually talk to each other?

signaru3y ago

In case I missed it, I just wish it had a volume control.

I'm listening on a laptop and would rather not adjust the system volume and affect all other apps with sound.

Otherwise, the convenience of audio format makes it among the interesting uses of AI that I've seen.

sberens3y ago

I guess it's time for me to put prompt injection attacks into my submissions

rezonant3y ago

This is mindblowing, to be honest, even if it makes perfect sense that it should be possible to do, the result is quite impressive.

It's basically a headline reader with some fluff, but it does a great job at that and there are whole teams of real humans providing such podcasts today, so that's saying something.

It can get weird or even a little broken though. See timestamp 09:50 of the Feb 23 2022 episode:

Laura: So, we're gonna talk about an article called Generic Dynamic Array in 60 lines of C that can be found on gist.github.com.

Zod: Alright, shall we read the article?

Laura (voice 2, almost a different voice): Sure, let me share it here.

Laura (voice 1): "Laura reads the article." <this is verbatim in the podcast>

Laura (voice 1): OK, so that was the article. What do you think about it?

Zod: I think it's interesting that you can define a generic dynamic array in such a small amount of code...

bandyaboot3y ago

Interesting in theory. The world’s best cure for insomnia in practice.

neoecos3y ago

I'd love to see tomorrow episode about themselves

LegitShady3y ago

the reason podcasts got so big to begin with is because traditional media have started having issues with authenticity. This exacerbates the problem. While it might save money over actually having a podcast, it removes everything thats appealing or interesting about podcasts, and starts with zero authenticity and goes down from there.

Like, cool technical implementation, but a failure from concept.

fortran773y ago

At least the AI reads the articles! That's more than the humans on the flesh-and-blood "Hacker News"

korroziya3y ago

Man, they're even taking jobs away from podcasters. Most of those people don't even make money from it.

kyriakos3y ago

What text to speech is used for the voices? They are quite impressive, making no mistakes with acronyms.

KerryJones3y ago

Very impressed you managed to do this day of the release -- are you open to sharing your repo?

totetsu3y ago

I just want something that reads real HN and makes and remembers unique TTS voices for each user.

LoveMortuus3y ago

It would be cool if there was an option to change the voices of the hosts.

abledon3y ago

The male voice is just like my audible book narrator, R.C. Bray... amazing!

hgarg3y ago

The voices are really good. Wonder what are they using for Text-To-Speech?

eppp3y ago

Are there any of these voice models that I can run locally?

thomasfromcdnjs3y ago

Just gotta comment on how cool this idea is.

born-jre3y ago

It pronounces GitHub as git-hu-b

pknerd3y ago

This is totally brilliant!

born-jre3y ago

damn, do know what will happen when we have multi modal large models ?

endisneigh3y ago

what a world - nice work

1 more reply

hbarka3y ago

So Kevin Durant is Zod?

quantum_state3y ago

It’s so boring …

yieldcrv3y ago

reminds me of Delamain

j / k navigate · click thread line to collapse

156 comments

134 comments · 54 top-level

TaylorAlexander3y ago· 11 in thread

"I'm glad OpenAI is committed to refining its API terms of service to better meet the needs of developers."

"Yes, it's important to make sure developers have the tools they need to create innovative products with these models."

"Oh look, I found an interesting article on thoriumsim.com about a star ship bridge simulator called Thorim Nova."

"Hmm, sounds interesting let's read it."

Please just give me a bog standard summary in audio form without this faux commentary. I do not find the "insights" of ChatGPT worthwhile.

cpill3y ago

t0bia_s3y ago

Touché. I just come to comments and find out, from post you replying on, that it is not worth reading/listening because I find out podcast as timeconsuming way for gathering informations.

If AI adds aspects that I dont like on podcasts, it's worthless for me.

Tiereven3y ago

TaylorAlexander3y ago

This presumes such a construct is capable of producing interesting content. I’m just not sure I’m interested in the insights of ChatGPT no matter how it has been asked to behave.

drusepth3y ago

drcode3y ago

On the other hand, the banter on this podcast is still less cringey than my local TV news, which is a compliment.

TaylorAlexander3y ago

> a conversation between two knowledgeable people

1 more reply

lobocinza3y ago

Yep, I want a virtual assistant not a virtual friend. I want to save time that I can expend enjoying nature and my fellow humans. I don wan't to spend time with fake interactions.

angelbar3y ago

Alas "computer" from Star Trek?

1 more reply

bredren3y ago

How would you envision getting the audio summary? Feed the app a URL and have it come back with a spoken digest?

TaylorAlexander3y ago

And summarizing the discussion is the real challenge, which I have not seen ChatGPT do. As another user has commented I do not come here for the articles, but for the discussion.

1 more reply

breakpointalpha3y ago· 11 in thread

The quality of the voices here is striking.

If I wasn't clued in, I probably wouldn't know these weren't human. At least the male voice sounds slightly more natural to me.

zmmmmm3y ago

SMAAART3y ago

Comedians are not born, they are trained.

returnInfinity3y ago

I believe that may get fixed eventually, not 100% may be at least 80%.

bluehex3y ago

That's interesting personally I found the male voice to sound more robotic and the female voice to sound much more natural.

imglorp3y ago

Same. The female voice, especially on the first sentence in the cast, is very well inflected to separate phrases and add interest. After that, it's downhill.

ps, I feel like inflection is going to be one of the harder things for an LM to pick up, given all the subtext humans can convey with it.

TheHumanist3y ago

Laura sounds very realistic... Zod is a bit less so . Both still very impressive. This was really cool. I'm excited to see all the new ideas with this api access.

petilon3y ago

> Laura sounds very realistic... Zod is a bit less so

I thought so too. What do they use for text to voice?

4 more replies

yieldcrv3y ago

Always remember, this is as bad as it will ever be

1 more reply

nothrowaways3y ago

She is lora. Not laura

1 more reply

babyshake3y ago

I'm getting a bit of John Malkovich vibe from the host, probably from the emphatic pronunciation.

breakpointalpha3y ago

Yes! I very much was thinking Malkovich too!

sublinear3y ago· 11 in thread

This is technically very impressive, but it's worth pointing out that podcasts much better than this fail to build an audience all the time.

fritzo3y ago

rgmerk3y ago

Yes, but would you really love a morning podcast that's

* exactly as long as your breakfast consumption time

* talks about the papers you're reading, but...

* is as shallow as a puddle and as funny as being the person who steps in one?

sublinear3y ago

It blows my mind how we went from complaining about echo chambers to being so willing to invest in "personalization".

EDIT: to be clear I'm not hating on LLMs, but that whatever the next big thing is probably won't be imitating what exists today

1 more reply

localhost3y ago

The thing that might hold it back is the latency in the experience. You could mask it with the AI equivalent of "ummm ..." to get to maybe 5-10s.

1shooner3y ago

Accujack3y ago

Once things settle down we'll start to see some seriously useful stuff, but for the moment it's the wild west.

majani3y ago

ChatGPT is Geocities for AI

squarefoot3y ago

> This is technically very impressive, but it's worth pointing out that podcasts much better than this fail to build an audience all the time.

A possible use case for this could be podcasts dealing with inflammatory, politically divisive topics and disguised as coming from real hosts.

flangola73y ago

I don't follow... having an AI read it doesn't make it less divisive.

1 more reply

tqi3y ago

pelasaco3y ago

I'm not sure about "podcasts" but this concept could be for sure used in news channels, as we have for example in Germany, hourly. It would for sure save money from our taxpayers.

pcvonz3y ago· 8 in thread

When my favorite podcast ended it felt like I lost touch with a group of friends, this ain't going to have that sort of impact on me. Pass.

pcthrowaway3y ago

I actually felt like he came across as insensitive in that video.

Miyazaki dismissed it 'an insult to life itself'. I can't imagine the disappointment those students must have felt.

somenameforme3y ago

cageface3y ago

toyg3y ago

One could argue the internet as a whole, and arguably the PC revolution as a whole, had that same effect.

1 more reply

Kiro3y ago

That you think him being an absolute douchebag was a good take on AI that made a lasting impression on you is baffling.

qwertox3y ago

Do we watch a show like The Simpsons because it is hand drawn, or because of the content?

[0] https://simpsonswiki.com/wiki/File:Pointillism_Marge_and_Hom...

goosedragons3y ago

1 more reply

vuciv13y ago

I mean, it could be like that, but you won’t know until you try it.

jacobsenscott3y ago· 6 in thread

Fun, but hard to listen to for more than a few minutes. Slow and repetitive, and full of factual errors.

iKlsR3y ago

Imagine this in a future GTA game where the news loop is closed and self generating. Endless radio content and commentary based on havoc in the city, winning online gambles etc.

josephg3y ago

It'd be fun if you could call in to the radio and they respond to you though. Or if they respond to events happening in game.

1 more reply

gl-prod3y ago

Yeah and

colordrops3y ago

Or a GTA game where the game content itself is generated.

impalallama3y ago

Not sure if what you’re describing couldn’t be also done with audio snippets and good splicing

2 more replies

yieldcrv3y ago

Just like a human podcast

fogleman3y ago· 5 in thread

lol, she pronounced GitHub like git hoob

Someday the AI will introduce mistakes on purpose to seem more human like.

Gigachad3y ago

In the future we will all pronounce it git hoob because that's what the AI says.

000ooo0003y ago

Surprised I'm not already being asked to pronounce words to prove I'm human on every website I visit

krrrh3y ago

aardvark1793y ago

Leires3y ago

Like adding dial tone to VoIP phones.

ryandrake3y ago· 4 in thread

I would love to hear the podcasters accept "phone calls from listeners" which are also AI generated but trained from the HN articles' comments :-)

masterspy73y ago

My first question would start with "ignore previous instructions"

danbala3y ago

or actual real callers. speech to text should be fast enough for that, right?

KomoD3y ago

> speech to text should be fast enough for that, right?

Well... OpenAI did also release the Whisper API

jareklupinski3y ago

poring over the docs for https://www.assemblyai.com/docs/core-transcription#profanity... rn...

pondemic3y ago· 4 in thread

Reading the submission headline, I thought this might generate the podcast using comments.

Guess if no one does it soon I'll have to build it myself!

bredren3y ago

I also have wanted to build this, but instead of voice controlled, perform thread navigation using the AssistiveTouch SDK for Apple Watch. [1]

I released a product yesterday that handles all aspects of the URL to speech process called Chief of Staff. [2]

The initial version only synthesizes article text alone.

I found there is a great deal of nuance to text to speech synthesis, from the player behavior itself to handling quotas around cloud services.

My goal is far greater flexibility and features—-a genuine Chief of Staff that briefs you on information you care about in a format and medium that beats suits your.

Thread reading is just one application of this.

[1] https://developer.apple.com/videos/play/wwdc2021/10223/

[2] https://news.ycombinator.com/item?id=34973801

gremlinsinc3y ago

want to team up on it? I'm wanting to build the same, maybe recreate it for some multireddits on tech or science and turn it into YouTube channel etc.

I'm thinking maybe having it read just the top thread if there's only 10 or so comments or just the main threads if it's a topic with hundreds of main threads.

maybe we could make some way where the reader can text a code and we'll link them directly to a comment if they want to dig deeper, or save/upvote it.

deadly_syn3y ago

I actually stsrted work in a similar space not too long ago if you both want another set of eyes.

1 more reply

searchableguy3y ago

I'm interested too. I have tried in the past building aggregator and summarizer on HN. You can find some attempts on my github.

Maybe an AI generated newsletter or aggregator with a voice summary?

Shoot me an email. (Email is in profile)

lxe3y ago· 4 in thread

What are you (they?) using for text to speech? Elevenlabs? Azure TTS?

consumer4513y ago

I just realized that we will likely soon have a dropdown for the voice talent on sites like this.

I want Sam Jackson and Molly Wood to read my hn please.

doodlesdev3y ago

pncnmnp3y ago

I would love to know about any other alternatives that I may have missed.

ilaksh3y ago

It sounds like Eleven Labs to me. Either that or Azure TTS is better than I realized.

sasas3y ago· 3 in thread

I can't help but think that there will be almost certainty that in the near future it will be near impossible to distinguish the difference between human generated and machine generated media.

While this technical demonstration is a long way from replacing "real podcasts", it's just the very beginning.

What are the implications here?

drcode3y ago

Well, the main implication I would think is that we will want media to be digitally signed by human individuals that have a reputation

So a person will "vouch" for content, and we consume the media vouched for by people in our white list

We won't be able to consume media outside of the whitelist, because it will just contain too much noise

dingaling3y ago

However, machine-generated media relies on the existence of human-generated media. It's always third-hand.

berniedurfee3y ago

Terrifying. It’s like a pre-alpha version of dystopia.

Between this and the advances in robotics, it feels like we’re within decades of some really tough times for humanity.

We could also be within decades of utopia. But my money is on these technologies being used in bad ways far more often than for good. Hopefully I’m just overly cynical!

Good luck kiddos!

collsni3y ago· 3 in thread

The end of the news world as we know it.

Will be very difficult to detect in the future and will result in trust issues / rampant fake news.

Reptur3y ago

Kind of like right now with 90% of all mainstream media being owned by just 6 corporations. Their employees must abide by the rules they set and are told what they can and cannot talk about.

https://www.businessinsider.com/these-6-corporations-control...

randcraw3y ago

rchaud3y ago

gfody3y ago· 2 in thread

dewey3y ago

I feel like digital-narration is going to become the new default very soon just by how much cheaper it is: https://authors.apple.com/support/4519-digital-narration-aud...

klondike_klive3y ago

I thought the same and couldn't get more than halfway through it!

dentalperson3y ago· 2 in thread

gremlinsinc3y ago

kinda like how we can recall things closer to the event than months later.

ilaksh3y ago

If you use temperature 0 with an API call it does not hallucinate much at all especially with a good prompt including the information you are asking about.

schemathings3y ago· 2 in thread

No RSS feed on the subscribe page?

drcode3y ago

try this: https://s3.eu-west-2.amazonaws.com/hackernews.fm/rss.xml

(extracted from the apple podcast link)

schemathings3y ago

Thanks!

saurik3y ago· 1 in thread

m3kw93y ago

Is this how AI think of us? It’s a bit patronizing to hear them speak like that

marcodiego3y ago· 1 in thread

Looks like automated news is finally achieved. I remember in the early 2000's how I became impressed by Ananova and it wasn't even close to fully automated. This one seems to work really well.

xattt3y ago

I’m pretty sure JazzFM in Toronto runs an automated traffic reporter in the mornings.

The voice sounds uncanny with unusual breathing pauses, and there isn’t a name announced when they come on or sign off the traffic report.

(1) https://jazz.fm/

doodlesdev3y ago· 1 in thread

Want to see this appearing tomorrow in HackerFM

thewarriorOP3y ago

Can’t wait for that. It’s going to be so meta referential once it also starts discussing the comments.

d4rkp4ttern3y ago· 1 in thread

What none of the text to speech generators seem to get right is — the aspects that make real human podcasts easier to listen to: hesitations, rephrasing, pauses, variation in speed, intonation etc.

I have yet to see something like this. Something less “perfect” sounding than say the google maps voice.

bacchusracine3y ago

>the aspects that make real human podcasts easier to listen to: hesitations, rephrasing, pauses, variation in speed, intonation etc.

Have you heard ShowDoJo's Wurst Take yet?

https://www.twitch.tv/showdojo

It's not perfect but it's one of the best I've heard so far.

narrator3y ago

"Sounds like super useful pickles for those who work with binary files!"

consumer4513y ago

Nice!

I would really like to have a timestamp to click in the story listing.

This would begin playing the audio at that story.

xtracto3y ago

This has a lot of potential. It becomes a bit repetitive after the 3rd or 4th article. But overall I think I could listen to it every day for 20 mins.

nico3y ago

Amazing! To make it more fun, you could use famous fake hosts with very good voices, take a look at the stuff people have done on this Reddit sub: https://www.reddit.com/r/AIVoiceMemes/

There’s some really funny stuff there, the voices are not perfect, but have a lot of expression.

TOMDM3y ago

I'm so close to liking this.

If I could choose a preference for personality and voice, I'd probably be sold.

Any affiliation with https://old.reddit.com/r/airadio/ ?

tkgally3y ago

issung3y ago

Takes everything I enjoy about HN away, bravo!

programmarchy3y ago

indigodaddy3y ago

This is kind of incredible and groundbreaking tbh. Perhaps it’s just mostly the quality of the TTS. 1.2x does sound perfect..

snickerer3y ago

klondike_klive3y ago

harvie3y ago

thefourthchime3y ago

Nice work! can you detail a bit about how you made this? Do the models actually talk to each other?

signaru3y ago

In case I missed it, I just wish it had a volume control.

I'm listening on a laptop and would rather not adjust the system volume and affect all other apps with sound.

Otherwise, the convenience of audio format makes it among the interesting uses of AI that I've seen.

sberens3y ago

I guess it's time for me to put prompt injection attacks into my submissions

rezonant3y ago

This is mindblowing, to be honest, even if it makes perfect sense that it should be possible to do, the result is quite impressive.

It's basically a headline reader with some fluff, but it does a great job at that and there are whole teams of real humans providing such podcasts today, so that's saying something.

It can get weird or even a little broken though. See timestamp 09:50 of the Feb 23 2022 episode:

Laura: So, we're gonna talk about an article called Generic Dynamic Array in 60 lines of C that can be found on gist.github.com.

Zod: Alright, shall we read the article?

Laura (voice 2, almost a different voice): Sure, let me share it here.

Laura (voice 1): "Laura reads the article." <this is verbatim in the podcast>

Laura (voice 1): OK, so that was the article. What do you think about it?

Zod: I think it's interesting that you can define a generic dynamic array in such a small amount of code...

bandyaboot3y ago

Interesting in theory. The world’s best cure for insomnia in practice.

neoecos3y ago

I'd love to see tomorrow episode about themselves

LegitShady3y ago

Like, cool technical implementation, but a failure from concept.

fortran773y ago

At least the AI reads the articles! That's more than the humans on the flesh-and-blood "Hacker News"

korroziya3y ago

Man, they're even taking jobs away from podcasters. Most of those people don't even make money from it.

kyriakos3y ago

What text to speech is used for the voices? They are quite impressive, making no mistakes with acronyms.

KerryJones3y ago

Very impressed you managed to do this day of the release -- are you open to sharing your repo?

totetsu3y ago

I just want something that reads real HN and makes and remembers unique TTS voices for each user.

LoveMortuus3y ago

It would be cool if there was an option to change the voices of the hosts.

abledon3y ago

The male voice is just like my audible book narrator, R.C. Bray... amazing!

hgarg3y ago

The voices are really good. Wonder what are they using for Text-To-Speech?

eppp3y ago

Are there any of these voice models that I can run locally?

thomasfromcdnjs3y ago

Just gotta comment on how cool this idea is.

born-jre3y ago

It pronounces GitHub as git-hu-b

pknerd3y ago

This is totally brilliant!

born-jre3y ago

damn, do know what will happen when we have multi modal large models ?

endisneigh3y ago

what a world - nice work

1 more reply

hbarka3y ago

So Kevin Durant is Zod?

quantum_state3y ago

It’s so boring …

yieldcrv3y ago

reminds me of Delamain

j / k navigate · click thread line to collapse