Would potentially be a useful augmentation to a digital comic book reader, refocusing from panel to panel in sequence. Not to mention making comic book content more accessible.
Would probably never be perfect, but if it gets it right in all but a few outlying cases, that should be good enough.
Take video games. FromSoftware games tell stories through breadcrumbs: locations, speech lines, item descriptions. Only over time can you start to connect the dots.
Sure, you can watch a YouTuber connect the dots for you, show you what you've missed, and make you aware of your own mental limitations. But it's not the same.
You could probably build a tool that tags each panel and attempts to figure out the order, and then have a human editor do a validation pass. If you have enough people reading a series you can probably crowdsource the panel sequence.
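A rough sketch of what that could look like, assuming panels have already been detected as bounding boxes. The `Panel` class, the 50% overlap threshold, and both function names are made up for illustration; this is a naive row-grouping heuristic, not how any existing reader does it, and it will get unconventional layouts wrong (which is where the human pass comes in).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    # Hypothetical bounding box in page coordinates (origin at top-left).
    name: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def reading_order(panels, rtl=False):
    """Guess a reading order: group panels into rows, then sort each row.

    A panel joins an existing row when it vertically overlaps that row by
    more than half of its own height (an arbitrary threshold).
    Set rtl=True for right-to-left layouts such as manga.
    """
    rows = []
    for p in sorted(panels, key=lambda p: (p.y, p.x)):
        for row in rows:
            top = min(q.y for q in row)
            bottom = max(q.y + q.h for q in row)
            overlap = min(bottom, p.y + p.h) - max(top, p.y)
            if overlap > 0.5 * p.h:
                row.append(p)
                break
        else:
            # No existing row overlaps enough, so start a new one.
            rows.append([p])

    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda p: p.x, reverse=rtl))
    return ordered


def consensus_order(submissions):
    """Pick the most frequently submitted order and its share of the votes."""
    counts = Counter(tuple(s) for s in submissions)
    order, votes = counts.most_common(1)[0]
    return list(order), votes / len(submissions)


page = [
    Panel("C", 0, 110, 210, 100),  # wide panel across the bottom
    Panel("B", 110, 5, 100, 95),   # top right, slightly offset
    Panel("A", 0, 0, 100, 100),    # top left
]
guess = [p.name for p in reading_order(page)]

# Reader-submitted orders for the same page (invented data).
votes = [["A", "B", "C"], ["A", "B", "C"], ["B", "A", "C"]]
best, agreement = consensus_order(votes)
```

The geometric guess gives the editor a starting point, and a low agreement share from `consensus_order` would be one way to flag pages that need a human look.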
It would be complex to pin a given panel order as canonical when the author is playing games with the reader. I don't think authors would oppose an accessibility feature, but I could see the debate if it were a more prominent, sanctioned way of reading.
As amazing as recent AI progress has been, we do overrate it a lot (I'm including myself in that).
Can you explain what you mean by motion comic generation? Sounds interesting!
Cerebus's Reads volume is basically a book with illustrations, for example.
Wonder what SAM would do in such cases...
It's an interesting tech but giving such an important creative job to a computer instead of an artist is a bad idea for any comic artist who cares about their work.
Yes, Bill Watterson used his fame to get out of the standard grid [1], but in a world where people read comics on their phones this technology is necessary. And if this stuff helps comic artists reach more readers, so be it. We can always hope that making simple tasks easy today will encourage artists to try harder things tomorrow.
[1] https://www.leaderonomics.com/articles/personal/bill-watters...
It is an element of storytelling. If the artist isn't doing it, it's because they're doing a bad job of storytelling, or publishing in a standard format where the beats are always the same (like Saturday morning comics in the newspaper, if those still exist).
YAC Reader [1] has great panel recognition. My other favorite comic reader has attempted it but never produced something that I could use.
Money quote. Applicable to so many areas of ML/AI.
GPT does similarly poorly if you ask it historical questions like "who were the most influential art educators of the 19th century?" It will respond with a jumble of people from different eras and books that those people did not write.
At any rate, if it keeps hallucinating answers then it means it simply doesn't know. Either it wasn't a part of the dataset or it wasn't mentioned often enough to be memorised.