If you are trying to stream desktop, camera, and microphone to the browser, I would recommend pion's mediadevices package [1].
I've been wanting to build a hackday project that takes images captured from our satellites and builds a video stream. (We get hundreds of new shots from space every minute; for fun, imagine a screensaver-like streaming video feed of interesting pictures.)
Perhaps I can build it with either ffmpeg or pion (using pkg/driver/screen as a model to create a virtual canvas to draw on).
* -fflags +genpts, +igndts, +ignidx
* -vsync
* -copyts
* -use_wallclock_as_timestamps 1
* And more that you find even when you thought you had seen all flags that might be related.
FFmpeg docs are a strange beast: they cover a lot of topics but are extremely shallow in most of them, so the overall quality ends up being pretty poor. It's like the kind of frowned-upon code comment such as "ignidx ignores the index; genpts generates PTS". No surprises there... but no real explanation, either.
What I'd love is a real, technical explanation of the consequences of each flag, and more importantly, the kinds of scenarios where they would make a desirable difference.
Especially for the case of recording live video that comes over an unreliable connection (RTP over UDP) and storing it as-is (no transcoding whatsoever): what set of flags would the FFmpeg authors recommend? Given that packets can get lost, timestamps can get garbled, UDP packets can be reordered in the network, or any combination of funny stuff.
For now I've sort of settled on genpts+igndts and use_wallclock_as_timestamps, but that all comes from intuition and simple tests, not from actual evidence or from technical documentation of each flag.
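For concreteness, the kind of invocation I mean looks roughly like this (a sketch: the SDP file name and output path are placeholders, and the flag set is exactly the intuition-driven one described above):

```shell
# Record an RTP-over-UDP stream described by an SDP file, stream copy only.
# +genpts fills in missing PTS, +igndts drops DTS when a PTS is present,
# and -use_wallclock_as_timestamps replaces whatever arrived on the wire.
ffmpeg -protocol_whitelist file,udp,rtp \
       -fflags +genpts+igndts \
       -use_wallclock_as_timestamps 1 \
       -i session.sdp \
       -c copy recording.mkv
```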
A universal translator framework cannot provide a bespoke translation engine for all possible permutations of source and target language. Instead it provides a common engine which is meant to be suitable enough for the most common traits shared across languages.
When converting between any two languages at random, there will be quirks of the language, or errors/ambiguity in the source prose, which the engine cannot hope to automatically recognize and accommodate in every case; hence all these options that do one specific thing and let users modify a step of the translation process. The docs cannot go into detail because the downstream ramifications of an option vary with the exact properties of the source-target pair and the transformations requested of ffmpeg. Instead, the docs describe the exact change directly triggered by the option.
----
As for the specific options,
* -fflags +genpts, +igndts, +ignidx
All of these apply to inputs only.
genpts: if input packets have missing presentation timestamps, this option assigns the decoding timestamp as the PTS, if one is present.
igndts will unset the DTS if the packet's PTS is set.
ignidx is only applied to a few formats. These provide a keyframe index, which ffmpeg uses to populate its internal KF index for the stream. This option makes ffmpeg ignore the supplied index.
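One way to see what these flags actually change, without trusting the docs, is to dump packet timestamps with and without them (a sketch; `input.ts` is a placeholder for any MPEG-TS or similar file):

```shell
# Packet timestamps as the demuxer reports them normally...
ffprobe -hide_banner -select_streams v:0 \
        -show_entries packet=pts,dts -of csv input.ts | head -n 5

# ...and with PTS regenerated and the supplied keyframe index ignored.
ffprobe -hide_banner -fflags +genpts+ignidx -select_streams v:0 \
        -show_entries packet=pts,dts -of csv input.ts | head -n 5
```

Diffing the two outputs shows exactly which packets the flags touched.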
* -vsync
The option is misnamed; newer FFmpeg releases have in fact renamed it -fps_mode. It's most commonly used to drop or duplicate frames to achieve a constant-framerate stream.
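For example, to coerce a variable-framerate input into constant 30 fps (a sketch; newer builds spell the same thing `-fps_mode cfr`):

```shell
# Duplicate or drop frames so the output ticks at exactly 30 fps.
ffmpeg -i input.mp4 -r 30 -vsync cfr -c:v libx264 output.mp4
```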
* -copyts
FFmpeg will, by default, remove any starting offset to input timestamps or adjust timestamps if they overflow (roll over) or have a large gap. copyts stops all that and relays input timestamps. Basically used if one wishes to manually examine and adjust timestamps using setpts filter or setts bitstream filter.
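A minimal sketch of that manual-adjustment use case (file names are placeholders):

```shell
# Relay the input timestamps untouched, then shift every video frame
# forward by 10 seconds with the setpts filter. Audio is dropped here
# (-an) so it doesn't drift out of sync with the shifted video.
ffmpeg -copyts -i input.mp4 -vf "setpts=PTS+10/TB" -an output.mp4
```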
* -use_wallclock_as_timestamps 1
Discards input timestamps and assigns the system clock time, at the moment the packet is handled, as its PTS.
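Typical use is capture from a live source whose own timestamps are untrustworthy (a sketch; the RTSP URL is a placeholder):

```shell
# Stamp each packet with the host's clock as it is read,
# ignoring whatever the camera put on the wire.
ffmpeg -use_wallclock_as_timestamps 1 \
       -i rtsp://camera.example.local/stream \
       -c copy dump.mkv
```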
You already provided better one-liners for some of the options than what their docs state, though I still miss some commentary on example situations where each of them would be useful.
For example: "igndts will unset dts if packet's pts is set"... OK but why would anyone want to do that? DTS is for Decoding, PTS is for Presentation, so wouldn't mixing them cause presentation issues?
As mentioned I'm interested in storing UDP RTP as-is, and for that I'm using "-fflags +genpts+igndts -use_wallclock_as_timestamps 1" because intuitively it makes sense to me that potentially broken incoming timestamps should be ignored and new ones written from scratch, but now that you mention it, maybe "+genpts" is doing nothing in this scenario?
For latency, specify a short GOP size, e.g. `-g 50`
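For instance, assuming a 25 fps source, something like (a sketch; the output URL is a placeholder):

```shell
# A keyframe every 50 frames = every 2 seconds at 25 fps, so a new
# viewer waits at most ~2 s for a point they can start decoding from.
ffmpeg -i input.mp4 -c:v libx264 -g 50 -tune zerolatency \
       -f flv rtmp://example.com/live/stream
```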
I'm not affiliated with the project, it's just really performant and reliable.
It was for audio and it was webrtc to ffmpeg. I was streaming a group chat directly to s3.
It mostly worked, but the only problem I ran into was syncing issues if a user had a spotty connection. The solution seemed to involve using rtmp to synchronize but I didn’t have a chance to go down that rabbit hole.
https://www.meetecho.com/blog/whip-janus/
https://millicast.medium.com/whip-the-magic-bullet-for-webrt...
Anyway, HLS has latency just by definition: the "H" stands for "HTTP" (a synchronous protocol based on TCP). RT[S]P uses UDP or RDT and is isochronous.
* https://softwareengineering.stackexchange.com/questions/1471...
How do I play video files stored in my VPS to Chromecast?
I want my mom to watch a video, from her TV, but I can't upload it to YouTube due to copyrighted content (yes, even if you set unlisted, YouTube will block it).
https://github.com/skorokithakis/catt
You might need a VPN, as whatever is running catt must be able to connect to your Chromecast, and the Chromecast must be able to pull from whatever is running catt.
We watch all the movies this way - just cast an mp4 file. Works great on a local network.
Both let you stream their videos to Chromecast last I checked.
Plex also has support for pictures which might be interesting in some related cases
I created a commercial product Video Hub App and have been trying for a year to get streaming a video from a PC to an iPhone working (through a PWA, not a dedicated iOS app) and have had 0 success. I could get the video stream to play on a separate laptop through Chrome, but iOS Safari kicks my ass.
Does anyone have suggestions / ideas?
You can, however, go the way described in the post: instead of requesting data through the data channel, you can initiate video/audio channels and make your streaming work pretty much like Google Hangouts, having your streamer as a participant.
It is not the recommended way, though. But there's no other way for iOS anyway.
I can zoom up to factor 10 like this:
ffmpeg -i someimage.jpg -vf "zoompan=z='10-on/100':d=1000:x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':s=1920x1437" zoom.mp4
But everything above a zoom of 10 seems to fail. Is there a hard limit in the code for some reason? Some way to overcome this?
Or is there another nice linux or online tool to do zooms into images?
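One workaround I'd try, assuming the limit is a hard clamp on the `zoom` expression: upscale the image before zoompan, so a 10x zoom into the enlarged frame approximates a deeper zoom into the original (a sketch; how well quality holds up depends on the source resolution):

```shell
# Upscale 4x first; zoompan's max zoom of 10 then acts like ~40x
# relative to the original image. Output size stays fixed via s=.
ffmpeg -i someimage.jpg \
       -vf "scale=iw*4:ih*4,zoompan=z='10-on/100':d=1000:x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':s=1920x1080" \
       zoom.mp4
```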
I came across your post[0] about KVS from a while ago. Thank you for your work on pion and KVS.
A quick question on the KVS C implementation: is it in any way tied to AWS Kinesis? Can it be used with Wowza, for instance?
go run . -rtbufsize 100M -f dshow -i video="Integrated Webcam" -pix_fmt yuv420p -c:v libx264 -bsf:v h264_mp4toannexb -b:v 2M -max_delay 0 -bf 0 - < SDP
Connection State has changed failed
Peer Connection State has changed: failed
Peer Connection has gone to failed exiting