I've been dreaming about this possibility since around the release of GPT-2; amazing to see someone has made it. The current status quo is very dissatisfying: sci-fi is really only made by a handful of huge US networks that insist on filling stories with useless but pervasive, offensive, and ham-fisted attempts at social engineering. Beyond being bad in its own right, this often breaks the script: e.g. you can guess who is going to end up being a good or a bad character just from their race and gender, making it hard for scriptwriters to genuinely surprise you.
That said, I don't think having AI write the scripts from scratch is the right way to go here. The dialogue for the first episode still smells of RLHF, with characters being far too complimentary to each other and having bizarre verbal tics. And is it needed? The world is full of people with smart stories they want to tell, but we're in an era when reading is in decline. So the most interesting part of this is all the tooling that comes after that point: the rug smoothing, the AI-generated voice acting, and especially the game-engine-based renderer that can generate videos given simple instructions. The blog posts sort of glide over that part, I guess due to the author's background in game engine development, but it actually seems the most useful part.
The key here is going to be connecting people with different skills in an open-source or more YouTube-like system that lets people remix each other's show kits (bibles, 3D objects, scene lots, etc.), so that someone who develops a great world can accept fan episodes written with that show kit and then share in their monetization. Something like that would make storytelling far more decentralized and let it somehow get "back to reality".
> That said, I don't think having AI write the scripts from scratch is the right way to go here. The dialogue for the first episode still smells of RLHF, with characters being far too complimentary to each other and having bizarre verbal tics. And is it needed? The world is full of people with smart stories they want to tell, but we're in an era when reading is in decline.
I'm not sure that it's right to say that the scripts are written "from scratch" -- the "Bible" for the series is hand-written. From Part 2 of the blog post:
> Episode generation is autonomous, but the show bible is human-made. The prompts and code that control the LLM are human-made, too. Each episode’s output is closely reviewed by humans. Because models often change, and each new episode tends to reveal bugs/weaknesses in the system, prompts get tweaked by humans, too. This is less and less necessary as more episodes are produced.
If the hierarchy goes Bible (series) -> Synopsis (episode summary) -> Script (scene details), then the author is hand-writing #1, and you're suggesting humans hand-write #3.
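To make the distinction concrete, here is a minimal sketch of that three-level hierarchy in Python. All the names and types are hypothetical, not the author's actual code; the point is just where the human/LLM boundary sits at each level.

```python
# Hypothetical sketch of the Bible -> Synopsis -> Script hierarchy.
# Level 1 is human-made; levels 2 and 3 are generated. The comment above
# amounts to replacing generate_script() with a human-written script.
from dataclasses import dataclass


@dataclass
class Bible:  # level 1: human-written series description
    title: str
    premise: str


@dataclass
class Synopsis:  # level 2: per-episode summary
    episode: int
    summary: str


@dataclass
class Script:  # level 3: scene-by-scene details
    episode: int
    scenes: list


def generate_synopsis(bible: Bible, episode: int) -> Synopsis:
    # stand-in for an LLM call conditioned on the bible
    return Synopsis(episode, f"Episode {episode} of {bible.title}: {bible.premise}")


def generate_script(syn: Synopsis) -> Script:
    # stand-in for an LLM call; swap this step out for human writing
    return Script(syn.episode, [f"Scene 1: {syn.summary}"])


bible = Bible("Retrospect", "a memory implant quietly rewrites the past")
script = generate_script(generate_synopsis(bible, 1))
print(script.episode, len(script.scenes))  # -> 1 1
```

Keeping the levels as separate, inspectable artifacts is also what makes the human-review loop the blog post describes possible: a person can intervene at whichever level they want.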
> So the most interesting part of this is all the tooling that comes after that point: the rug smoothing, the AI-generated voice acting, and especially the game-engine-based renderer that can generate videos given simple instructions. The blog posts sort of glide over that part, I guess due to the author's background in game engine development, but it actually seems the most useful part.
The visualizer / generator certainly is the most novel and useful part of this. I had the same struggles / hangups with the overly-complimentary dialogue in E1 as you did, and it smells strongly of GPT-4. That said, I agree with the author -- this feels like the first "self-hosting" version of this entire pipeline. Steve Newcomb wrote an article on the idea of taking the lessons learned from CI/CD pipelines and applying them to movie development:
https://stevenewcomb.substack.com/p/a-whole-new-way-to-creat...
Now that the OnScreen system is "self-hosting" (maybe not the right analogous word) and produces the entire movie when you click "build", it's possible to hand-tune things as needed to realize a vision -- with whatever level of detail and abstraction the author wants -- whether at the "Bible" level or at a more detailed one.
I am planning on doing some more articles/director commentary as it goes along.
I have a number of episodes in the queue and each one is better than the last. My plan is to release an entire season of 12 or so.
The "I'm a GPT that wants everyone to be friends and how" is increasingly better in those episodes.
Even incremental improvements in things like background music make a big difference.
I really want to do a v2 that is more of a "copilot" than an "AI first" experience. But I need partners to help with funding; I've taken it about as far as I can on a solo basis. The next step is a team of 4-5 people levelling it up. Every piece could be 10x better, and it would be a different beast entirely if that happened. I think there are some super exciting directions this could go.
The vision of a distributed creator system is very interesting, as is letting people do more hands-on writing/rewriting.
If any VCs are reading, I'd love to talk. :)
(PS - Hi Han!)
It's interesting to consider "AI as its own genre" rather than "AI replacing mainstream content" - like how cheap animation enabled the anime genre or cheap filmmaking enabled the indie genre.
Title: "Retrospect"
In the near future, a tech company called "MemorEase" creates a device named "Retrospect", a neuro-implant that allows individuals to vividly relive past memories. The device grows immensely popular, as people enjoy the nostalgic journeys back in time.
The protagonist, Jill, is a middle-aged woman who's struggling with the recent loss of her husband, Max. She decides to get the implant to relive her precious memories with him.
However, as she revisits her past, she starts noticing anomalies - small discrepancies in her memories. Certain scenes play out differently, some events she doesn't remember at all, and in others, Max behaves in ways she doesn't recall.
Jill contacts MemorEase, and they reassure her that Retrospect can't alter memories; it merely reveals them in their truest form. Jill grows paranoid and starts investigating. She finds a forum of other Retrospect users who have experienced similar anomalies.
Jill and her forum friends uncover that Retrospect is actually accessing the collective memory of its users, amalgamating all the memories into a unified version of the past. They find that MemorEase is subtly influencing this collective memory to rewrite history, shaping public opinion and manipulating power dynamics for unknown reasons.
They decide to expose MemorEase but face the dilemma of convincing a society that trusts the "reality" presented by Retrospect more than their own recollections. The episode ends on a suspenseful note, with Jill and her group preparing to disrupt a major MemorEase event, planning to wake the public up to the manipulation they've been subjected to.
https://mleverything.substack.com/p/we-should-just-let-gpt-w...
The scope and target complexity of the series I'm making with On Screen is _dramatically_ cut down from what I started with, and it's still a bit of a stretch for the models at times. I started at DS9/Babylon 5 and ended up at Flash Gordon...
I for one do not look at AI-generated images, listen to AI-generated sounds or music, or watch AI videos, and other than the various bots and shills in forums like this, I do not interact with AI chatbots. Imagine filling your brain with generated garbage.