If you are trying to stream desktop, camera, and microphone to the browser, I would recommend pion's mediadevices package [1].
I've been wanting to build a hackday project that takes images captured from our satellites and builds a video stream. (We get hundreds of new shots from space every minute; for fun, imagine a screensaver-like streaming video feed of interesting pictures.)
Perhaps I can build it with either ffmpeg or pion (using pkg/driver/screen as a model to create a virtual canvas to draw on).
* -fflags +genpts, +igndts, +ignidx
* -vsync
* -copyts
* -use_wallclock_as_timestamps 1
* And more that you find even when you thought you had seen all flags that might be related.
FFmpeg docs are a strange beast: they cover a lot of topics but are extremely shallow in most of them, so the overall quality ends up being pretty poor. It's like the kind of frowned-upon code comment such as "ignidx ignores the index; genpts generates PTS". No surprises there... but no real explanation, either.
What I'd love is a real, technical explanation of the consequences of each flag, and more importantly, the kinds of scenarios where they would make a desirable difference.
Especially for the case of recording live video that comes over an unreliable connection (RTP over UDP) and storing it as-is (no transcoding whatsoever): what set of flags would the FFmpeg authors recommend? Given that packets can get lost, timestamps can get garbled, UDP packets can be reordered in the network, or any combination of funny stuff.
For now I've sort of settled on genpts+igndts and use_wallclock_as_timestamps, but that all comes from intuition and simple tests, not from actual evidence or from technical documentation of each flag.
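For concreteness, the kind of invocation I mean looks roughly like this (a sketch: the SDP file name and output path are placeholders, and the flag set is exactly the intuition-driven one described above):

```shell
# Record an RTP-over-UDP stream described by an SDP file, stream copy only.
# +genpts fills in missing PTS, +igndts drops DTS when a PTS is present,
# and -use_wallclock_as_timestamps replaces whatever arrived on the wire.
ffmpeg -protocol_whitelist file,udp,rtp \
       -fflags +genpts+igndts \
       -use_wallclock_as_timestamps 1 \
       -i session.sdp \
       -c copy recording.mkv
```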
A universal translator framework cannot provide a bespoke translation engine for all possible permutations of source and target language. Instead it provides a common engine which is meant to be suitable enough for the most common traits shared across languages.
When converting between any two languages at random, there will be quirks of the language, or errors/ambiguity in the source prose, which the engine cannot hope to automatically recognize and accommodate in every case; hence all these options that do one specific thing and let users modify a step of the translation process. The docs cannot go into detail because the downstream ramifications of an option vary with the exact properties of the source-target pair and the transformations requested of ffmpeg. Instead, the docs describe the exact change directly triggered by the option.
----
As for the specific options,
* -fflags +genpts, +igndts, +ignidx
All of these apply to inputs only.
genpts: if input packets have missing presentation timestamps, this option assigns the decoding timestamp as the PTS, if one is present.
igndts will unset the DTS if the packet's PTS is set.
ignidx is only applied to a few formats. These provide a keyframe index, which ffmpeg uses to populate its internal KF index for the stream. This option makes ffmpeg ignore the supplied index.
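One way to see what these flags actually change, without trusting the docs, is to dump packet timestamps with and without them (a sketch; `input.ts` is a placeholder for any MPEG-TS or similar file):

```shell
# Packet timestamps as the demuxer reports them normally...
ffprobe -hide_banner -select_streams v:0 \
        -show_entries packet=pts,dts -of csv input.ts | head -n 5

# ...and with PTS regenerated and the supplied keyframe index ignored.
ffprobe -hide_banner -fflags +genpts+ignidx -select_streams v:0 \
        -show_entries packet=pts,dts -of csv input.ts | head -n 5
```

Diffing the two outputs shows exactly which packets the flags touched.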
* -vsync
The option is misnamed; newer FFmpeg releases have in fact renamed it -fps_mode. It's most commonly used to drop or duplicate frames to achieve a constant-framerate stream.
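For example, to coerce a variable-framerate input into constant 30 fps (a sketch; newer builds spell the same thing `-fps_mode cfr`):

```shell
# Duplicate or drop frames so the output ticks at exactly 30 fps.
ffmpeg -i input.mp4 -r 30 -vsync cfr -c:v libx264 output.mp4
```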
* -copyts
FFmpeg will, by default, remove any starting offset to input timestamps or adjust timestamps if they overflow (roll over) or have a large gap. copyts stops all that and relays input timestamps. Basically used if one wishes to manually examine and adjust timestamps using setpts filter or setts bitstream filter.
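A minimal sketch of that manual-adjustment use case (file names are placeholders):

```shell
# Relay the input timestamps untouched, then shift every video frame
# forward by 10 seconds with the setpts filter. Audio is dropped here
# (-an) so it doesn't drift out of sync with the shifted video.
ffmpeg -copyts -i input.mp4 -vf "setpts=PTS+10/TB" -an output.mp4
```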
* -use_wallclock_as_timestamps 1
Discards input timestamps and assigns the system clock time, at the moment the packet is handled, as its PTS.
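Typical use is capture from a live source whose own timestamps are untrustworthy (a sketch; the RTSP URL is a placeholder):

```shell
# Stamp each packet with the host's clock as it is read,
# ignoring whatever the camera put on the wire.
ffmpeg -use_wallclock_as_timestamps 1 \
       -i rtsp://camera.example.local/stream \
       -c copy dump.mkv
```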
You already provided better one-liners for some of the options than what their docs state, though I still miss some commentary on example situations where each of them would be useful.
For example: "igndts will unset dts if packet's pts is set"... OK but why would anyone want to do that? DTS is for Decoding, PTS is for Presentation, so wouldn't mixing them cause presentation issues?
As mentioned I'm interested in storing UDP RTP as-is, and for that I'm using "-fflags +genpts+igndts -use_wallclock_as_timestamps 1" because intuitively it makes sense to me that potentially broken incoming timestamps should be ignored and new ones written from scratch, but now that you mention it, maybe "+genpts" is doing nothing in this scenario?
For latency, specify a short GOP size, e.g. `-g 50`
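For instance, assuming a 25 fps source, something like (a sketch; the output URL is a placeholder):

```shell
# A keyframe every 50 frames = every 2 seconds at 25 fps, so a new
# viewer waits at most ~2 s for a point they can start decoding from.
ffmpeg -i input.mp4 -c:v libx264 -g 50 -tune zerolatency \
       -f flv rtmp://example.com/live/stream
```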
I'm not affiliated with the project, it's just really performant and reliable.
It was for audio and it was webrtc to ffmpeg. I was streaming a group chat directly to s3.
It mostly worked, but the only problem I ran into was syncing issues if a user had a spotty connection. The solution seemed to involve using rtmp to synchronize but I didn’t have a chance to go down that rabbit hole.
https://www.meetecho.com/blog/whip-janus/
https://millicast.medium.com/whip-the-magic-bullet-for-webrt...
Anyway, HLS has latency just by definition: the "H" stands for "HTTP" (a synchronous protocol based on TCP). RT[S]P uses UDP or RDT and is isochronous.
* https://softwareengineering.stackexchange.com/questions/1471...
How do I play video files stored in my VPS to Chromecast?
I want my mom to watch a video, from her TV, but I can't upload it to YouTube due to copyrighted content (yes, even if you set unlisted, YouTube will block it).
https://github.com/skorokithakis/catt
You might need a VPN, as whatever is running catt must be able to connect to your Chromecast, and the Chromecast must be able to pull from whatever is running catt.
We watch all the movies this way - just cast an mp4 file. Works great on a local network.
Both let you stream their videos to Chromecast last I checked.
Plex also has support for pictures which might be interesting in some related cases
I created a commercial product Video Hub App and have been trying for a year to get streaming a video from a PC to an iPhone working (through a PWA, not a dedicated iOS app) and have had 0 success. I could get the video stream to play on a separate laptop through Chrome, but iOS Safari kicks my ass.
Does anyone have suggestions / ideas?
You can, however, go the way described in the post: instead of requesting data through the data channel, you can initiate video/audio channels and make your streaming work pretty much like Google Hangouts, having your streamer as a participant.
It is not the recommended way, though. But there's no other way for iOS anyway.
I can zoom up to factor 10 like this:
ffmpeg -i someimage.jpg -vf "zoompan=z='10-on/100':d=1000:x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':s=1920x1437" zoom.mp4
But everything above a zoom of 10 seems to fail. Is there a hard limit in the code for some reason? Some way to overcome this?
Or is there another nice linux or online tool to do zooms into images?
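One workaround I'd try, assuming the limit is a hard clamp on the `zoom` expression: upscale the image before zoompan, so a 10x zoom into the enlarged frame approximates a deeper zoom into the original (a sketch; how well quality holds up depends on the source resolution):

```shell
# Upscale 4x first; zoompan's max zoom of 10 then acts like ~40x
# relative to the original image. Output size stays fixed via s=.
ffmpeg -i someimage.jpg \
       -vf "scale=iw*4:ih*4,zoompan=z='10-on/100':d=1000:x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':s=1920x1080" \
       zoom.mp4
```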
I came across your post[0] about KVS from a while ago. Thank you for your work on pion and KVS.
A quick question on the KVS C implementation: is it in any way tied to AWS Kinesis? Can it be used with Wowza, for instance?
go run . -rtbufsize 100M -f dshow -i video="Integrated Webcam" -pix_fmt yuv420p -c:v libx264 -bsf:v h264_mp4toannexb -b:v 2M -max_delay 0 -bf 0 - < SDP
Connection State has changed failed
Peer Connection State has changed: failed
Peer Connection has gone to failed exiting