Better HN

soultrees 2y ago

Oh for sure, but since most screenplays are standardized to a point, the idea is to use regex to get most of the way there, then use a local version of Llama 2 to finish up the classifications.
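A minimal sketch of what that regex-first pass could look like, assuming common screenplay conventions (INT./EXT. scene headings, all-caps character cues, "X TO:" transitions). The patterns, label names, and the `classify_lines` helper are all hypothetical; anything the patterns can't place is tagged `unknown` so a later model pass (the Llama 2 step, not shown) could classify it.

```python
import re

# Hypothetical patterns for common screenplay conventions; real scripts vary.
SCENE_HEADING = re.compile(r"^(INT\.?/EXT\.?|INT\.?|EXT\.?)\s+\S", re.IGNORECASE)
TRANSITION = re.compile(r"^(FADE IN:|FADE OUT\.?|[A-Z ]+ TO:)$")
CHARACTER_CUE = re.compile(r"^[A-Z][A-Z0-9 .'\-]*(\s*\((V\.O\.|O\.S\.|CONT'D)\))?$")
PARENTHETICAL = re.compile(r"^\(.*\)$")


def classify_lines(text):
    """Label each non-blank line; lines the regexes can't place stay 'unknown'."""
    results = []
    prev = None  # label of the previous line within the current block
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            prev = None  # a blank line ends a dialogue block
            continue
        if SCENE_HEADING.match(line):
            label = "scene_heading"
        elif TRANSITION.match(line):
            label = "transition"
        elif PARENTHETICAL.match(line) and prev in ("character", "dialogue"):
            label = "parenthetical"
        elif prev in ("character", "parenthetical", "dialogue"):
            label = "dialogue"
        elif CHARACTER_CUE.match(line):
            label = "character"
        else:
            label = "unknown"  # left for the model to classify
        results.append((label, line))
        prev = label
    return results
```

Action lines deliberately fall through to `unknown` here, since they have no reliable surface pattern; that leftover bucket is the part a model would finish.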

Where the fun with LLMs comes in is after all the screenplays have been parsed and turned into a dataset, so a model can be trained on not just the story but also the cinematography, since we’ve broken each scene down into granular detail, element by element.


ajani 2y ago

I see where you're going with this. Elemental (motif-based) editing instead of timeline- and frame-based editing?

soultrees (OP) 2y ago

Timeline- and frame-based editing is the end result, but this is more about elemental creation and, from there, editing it into time-based scenes. I’ve spent the last few years in the cinematography department, and most directors and directors of photography will write the scenes on flash cards and rearrange them, because not every story is linear, even though we have to find a way to present every story in a linear form. From there, each scene requires multiple angles, shots, and motivations, and things change so much on the fly that a screenplay becomes a document dense with information that is never presented.

So I suppose the next step would be to parse a bunch of screenplays from different formats into a single readable format, then train an image model on the frames of the same movies whose screenplays we trained the text model on, to get a cross-reference of what is written down vs. what is displayed visually. We could break down the visual shots by camera movement (steadicam, dolly move, etc.) as well as identify key props with the image model (maybe; sounds expensive) and compare them to key props in the script. I don’t know, I’m spitballing now, but a multi-modal Hollywood film producer would be kind of fun. This is totally just starting as a way to standardize the script in a granular form, and to code, since I’m not out on set.
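To make the script-vs-screen cross-reference concrete, here is a rough sketch of a single record format. Everything in it is an assumption for illustration: the `Scene` and `Shot` classes, their field names, and the idea that an image model has already produced a set of prop labels per shot.

```python
from dataclasses import dataclass, field


@dataclass
class Shot:
    camera_movement: str  # e.g. "steadicam", "dolly" (hypothetical labels)
    detected_props: set = field(default_factory=set)  # assumed image-model output


@dataclass
class Scene:
    heading: str
    script_props: set = field(default_factory=set)  # props named in the screenplay
    shots: list = field(default_factory=list)


def compare_props(scene):
    """Cross-reference props written in the script against props seen in any shot."""
    on_screen = set()
    for shot in scene.shots:
        on_screen |= shot.detected_props
    return {
        "written_and_shown": scene.script_props & on_screen,
        "written_only": scene.script_props - on_screen,
        "shown_only": on_screen - scene.script_props,
    }
```

The `written_only` bucket is roughly the "non-presented information" from the script, while `shown_only` would capture things added on set.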

ajani 2y ago

I see. So would the dialogue be part of the textual description?

What would be a typical unit? A keyframe? A bucket of frames?

Because the elements are temporal and visual, I think some form of “object in time” relation must be encoded in the description?
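One possible way to encode that “object in time” relation, purely as a sketch: give each element a kind and a time span, so dialogue, props, and camera moves all live on one timeline and can be queried at any instant. The `ElementSpan` class, its fields, and the use of seconds as the unit are assumptions, not something settled in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSpan:
    name: str
    kind: str        # e.g. "dialogue", "prop", "camera_move" (hypothetical kinds)
    start_s: float   # start time in seconds
    end_s: float     # end time in seconds (exclusive)


def elements_at(spans, t):
    """Return the names of all elements active at time t."""
    return [s.name for s in spans if s.start_s <= t < s.end_s]
```

With spans like these, the "typical unit" could be whatever interval an element covers, rather than a fixed keyframe or bucket of frames.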
