What is it? An AI-powered, voice-controlled D&D adventure set in the world of Dvorak. Talk to characters, explore locations, and shape the story using your words.
Use your microphone to interact with the AI dungeon master. Explore freely – interrupt, ask questions, or take unexpected actions. If you make friends at the tavern, you can also just hang out there and chat.
Hint: Talk to the bartender to move the story along.
This is an early demo, and I'm eager for your thoughts: Is the concept engaging? What works well, and what doesn't? I've added a feedback form to the webpage in case you want to drop a comment without posting on HN.
Thanks for trying out the demo!
Would be great if the text appeared as it's voiced, along with small things like the AI taking a breather between voicing dialogue options. I wonder if some form of image could be created on the fly, too, updated like a scene in a comic? Would love to see the blacksmith's reaction, lol.
[1]: https://dojoteef.com/papers/virtual_gm_wordplay_2024.pdf
(If you want to see what I mean: the alleyway option isn’t available until after you attack the merchant, for example.)
I was on this train for a bit, until I read some posts from game developers about why this is probably a terrible idea, to the point of infeasibility: game design (as I understand it) deals with a finite set of outcomes in a controlled environment. "Integrating" AI with games would involve shattering this constraint in a way that would make games inherently unstable/untestable, at least if I am understanding the argument correctly.
Games that try to tell an interesting story, the real arty stuff, might not be able to use it well.
But Minecraft with good NPCs, all the open-world stuff, seems like it could be really fun and cool.
How much time did people spend actually engaging with the deep and thoughtful narrative in Skyrim? And how much did they spend just enjoying the world? The latter could be really enhanced with an AI DM, in the medium-term, I bet.
From my limited DMing experience, much of the challenge is handling what players do that you haven't anticipated. For most encounters (video games do this too) I think about three approaches:
1. The stealth approach (why yes, there is a secret entrance underneath the castle).
2. The direct approach, like just kicking down the door and heading in.
3. Negotiation: is there a way to strike a deal with the bad guy?
Beyond that I fall back on trying to provide a realistic response from the game world, but sometimes the players are creative enough that you have to redo the whole thing on the fly.
Also, having the text appear in sync with the voice is a great idea. I'll experiment and see what feels best, but even just having the words fade in one-by-one at a speaking rate could be good. Thanks for the suggestions!
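A minimal sketch of the "fade in at a speaking rate" idea, assuming a fixed words-per-minute pace rather than real timestamps from the text-to-speech engine. The function name and the default rate are hypothetical, not part of the demo.

```python
def word_reveal_times(text, words_per_minute=150.0):
    """Return (word, seconds_from_start) pairs for revealing text word by word.

    Assumption: every word takes the same time to speak. Real speech is
    uneven, so this is only a rough stand-in for timestamps from the voice.
    """
    seconds_per_word = 60.0 / words_per_minute
    return [
        (word, index * seconds_per_word)
        for index, word in enumerate(text.split())
    ]


# At 120 words per minute each word is revealed half a second after the last.
schedule = word_reveal_times("You enter the tavern", words_per_minute=120)
```

A front end could use each offset as the delay before fading in the matching word.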
Get some friends together at a table and play D&D. You can literally already have all of that.
This isn't innovative, like most AI apps it's just a worse version of something that already exists.
https://old.reddit.com/r/OpenAI/search?q=dnd&restrict_sr=on&...
--
However, this is cool. Hopefully you have left a Grue someplace as an easter egg.
A generative AI is never going to replace a talented DM who is playing with a skilled group. Or hell, even a mediocre DM with a less skilled group. 'D&D' is about your friends around the table, not a ruleset. It's about creating a story together, not about being navigated down some decision tree.
To put it another way: Baldur's Gate 3 was an amazing game, but it was not 'D&D'. As a player, I could only move within the bounds of the system laid down by the developers. And honestly, and a little more subjectively, it did not 'feel' like 'D&D' even though the systems largely conformed to the 5e ruleset. AI might be able to shade a little closer to tabletop, but it still won't be tabletop, not for many, many years, and probably not without a different underlying 'AI' technology.
Maybe you could use this for 'solo rpg' play without a huge amount of frustration, but that also isn't 'D&D' even when you do it with pen and paper.
You aren't going to replace a bunch of friends around the table, not with the current generation of 'Generative AI'.
I personally love to focus on combat and sometimes miss the nuance of storytelling. Something like this could help nudge you to remember to cater to your players who also enjoy RPing more than picking their spells during a fight.
Would you like this more if it gave you the ability to type as input and see the response so that you could use it as a DM tool?
* As someone said, it'd be cool if you could render what I'm saying and add a loading indicator for the LLM. It'd improve the UX a bit.
* As someone mentioned, you can try to generate images to make the story more "real". This could be fun.
* You can also try to generate more realistic and dramatic sounds, and make the DM sound more theatrical. I'm not sure if that's easy, but it might be a big improvement. Bonus: maybe it'd be fun to choose a famous voice, like Morgan Freeman or Anthony Hopkins.
* It'd be cool if it could save my adventure. Right now, it restarts every time I leave the page.
I'd love to connect this up to Flux.1 and have auto-generated hero images at the top! And getting the sound right will be a huge part of it, since it's basically an audio-first experience. I'm wondering if it would work to change voices for the dialogue when you speak to different people in the world...
I've noted that save games are essential! Thanks for playing it through long enough to think about that :) I'm glad you enjoyed it enough to keep going!
It's a bit more open-ended, though. Are the constraints on actions intentional (i.e. they're predetermined), or is the model just adamant about picking from the options provided?
For this demo, the app architecture really depends on users sticking (more or less) to the scripted options – if they want to progress with the story. I’ve included something in the prompt to encourage that.
There are also some ‘hidden’ choices, though. For example, you can attack the merchant and the blacksmith. Those options aren’t enumerated by the GPT when it describes the scene, but they’re equally valid paths in the backend. (That gives me an opportunity to script some of the more popular transgressions.)
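To illustrate the idea of scripted options plus unlisted ones, here is a rough sketch of how a scene could keep hidden choices valid in the backend without enumerating them to the player. This is a guess at the shape, not the demo's actual code: the `Choice` and `Scene` classes, the scene names, and the exact action strings are all hypothetical, and it assumes the model has already mapped the player's speech to one of the action labels.

```python
from dataclasses import dataclass, field


@dataclass
class Choice:
    target: str           # name of the scene the story moves to
    hidden: bool = False  # hidden choices are valid but never listed


@dataclass
class Scene:
    description: str
    choices: dict = field(default_factory=dict)  # action label -> Choice

    def listed_options(self):
        """Options the narrator reads out when describing the scene."""
        return [name for name, choice in self.choices.items() if not choice.hidden]

    def resolve(self, action):
        """Return the next scene for a recognized action, hidden or not."""
        choice = self.choices.get(action)
        return choice.target if choice else None


# Hypothetical scene: two listed options and one scripted transgression.
market = Scene(
    description="A busy market square.",
    choices={
        "talk to the merchant": Choice("merchant_chat"),
        "visit the blacksmith": Choice("smithy"),
        "attack the merchant": Choice("merchant_fight", hidden=True),
    },
)
```

Here `market.listed_options()` leaves out the attack, while `market.resolve("attack the merchant")` still returns a scripted scene. An unrecognized action resolves to `None`, which the app could treat as free-form chat that does not advance the story.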
How did you set up Spellbound? Do you have one longer prompt, or did you split it up?
Also would be nice if we could change the voice.
I think adding an option to change the voice is the #1 most frequent request that I’ve gotten. Time to dig through ElevenLabs and see what else I can find! :)
I’m going to go in the tavern now and see if I can start a brawl :)
Please consider the potential for this - I think you could make something really fun and something people would be willing to pay for.
But the problem is you need a world someone else created (or one you painstakingly create yourself). You could consider Conan's Hyborian Age, H.P. Lovecraft, Sherlock Holmes, or Peter Watts's work[0] for a known world that is public domain to base your project on.
Best of luck!
Ability to type instead of speak? Or Undo/Cancel -- though that might be tempting to use for fixing mistakes in judgement.
You can interrupt and ask the AI to do what you wanted to do in the first place, also! Depending on what the action was, it will often just correct the dialogue and continue on.
Thanks for trying it!
Books didn't have the interactivity, and neither did Hollywood, which also didn't make films long enough and feared complicated stories. Computer games copied character development and eventually got mind-blowing graphics that got even better shortly after.
With just 3 DMs and 3 map editors you should be able to create 24 hours' worth of new adventures every day, but I'm not aware of anyone doing that. Diablo 2 had fabulous game mechanics and great graphics, but the tiny amount of content for it was rather shocking for any DM. Later games did get open worlds with plenty to do, but any generated maps got repetitive quickly.
Popular TV series keep making new episodes without a real story that has a beginning, a middle, and an end. Star Trek was possibly the exception, but more often than not they wanted the story to happen in a single episode (like movies).
In role-playing games you were supposed to get long, fascinating adventures one after the other.
I imagine, if one can generate plot lines, graphics, music and personalities automatically a group of writers and a director or possibly a single DM (depending on his skill level) could continuously develop adventures in real time for the AI to glue together. Have a bunch of critical testers of various skill levels.
New adventures all the time and delete them after a few hours.
The big computer is to make sure no content resembles anything made before. It turns adventures that take hours into short narrated fly-through overviews that can be used to demonstrate similarity but can also be combined into lengthy cinematics to bring players who just logged in up to speed on what is going on.
There should be "players" who only log in to watch the cinematics. It should be that good. It should be good enough to put an hour's worth on Netflix every 12 hours. Good enough to generate a comic and to publish a book every month.
Character development should grow similarly with new things every day and old things vanishing in a fog...
character creation in D&D easier: https://tabletopy.com/fantasy-character-generator.html
mynoise.net:
TURN THIS ON: play all the things, and adjust the sliders...
https://mynoise.net/superGenerator.php?g1=thunderNoiseGenera...
Have the page load this or another URL from mynoise and have it play. You don't need to define sounds, just play the ambiance that you want directly from mynoise as it relates to your place in the adventure. Check out the dungeon sounds.
When a keyword is stated in your story - have it load a corresponding ambiance URL from mynoise.net.
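A small sketch of that keyword-to-ambiance lookup. The keywords and the `example.com` URLs below are placeholders, not real mynoise.net links; the real generator URLs would need to be copied from the site.

```python
import re

# Hypothetical mapping; swap in real ambiance URLs.
AMBIENCE_BY_KEYWORD = {
    "dungeon": "https://example.com/ambience/dungeon",
    "village": "https://example.com/ambience/village",
    "tavern": "https://example.com/ambience/tavern",
}


def pick_ambience(narration, current=None):
    """Return the URL for the first known keyword in the narration.

    If no keyword appears, keep whatever ambiance is already playing.
    """
    for word in re.findall(r"[a-z']+", narration.lower()):
        if word in AMBIENCE_BY_KEYWORD:
            return AMBIENCE_BY_KEYWORD[word]
    return current


url = pick_ambience("You descend into the Dungeon beneath the keep.")
```

Matching on whole words avoids false hits inside longer words, and returning `current` means the sound only changes when the story actually mentions a new place.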
I sent an email to Stephane to point them at this thread. ---
https://i.imgur.com/KcdTY4d.png
https://i.imgur.com/OyEMuX2.png
https://i.imgur.com/uTjnTGP.png
Have it load the Village sound, as you can hear the blacksmith busy:
https://i.imgur.com/zMXdwOW.png
(The site is free, the guy is a PhD audiophile, and the sounds are license-free.)
---
>Sound is my passion. The major part of my work relates to sound processing, where sound design represents the artistic side of it. Between 1994 and 2015, I've been working for Roland Corporation, a leading electronic musical instrument manufacturer. My exclusive contract with Roland Japan prevented me from working for any other manufacturer in the field during that period of time, but gave me a rare opportunity to work at the leading edge of the state-of-the-art technologies in synthesizer design! Today, I am free as a (pigeon) bird again!