Heh, welcome to the world of livestreaming media. The reason it's hard to create this kind of simple "stream in, stream out" abstraction is that most IP voice/video stacks are architected very differently from the stateless network protocols that are popular today. IP streaming generally works in two phases:
1. A signaling layer that helps set up the connection metadata (a layer where the sender can say they're the sender, that they'll be sending data to port n, that the data will be encoded using codec foo, etc.)
2. Media streams that are opened based on the metadata transferred over the signaling layer, which are usually just streams of encoded packets pushed over the wire as fast as the media source and the network allow.
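The two-phase pattern above can be sketched in a few lines of Python. This is a toy, hypothetical wire format (the `OFFER`/`ANSWER` strings are made up, not any real protocol): the signaling phase runs over TCP and advertises which UDP port the media should be pushed to, then the media phase just fires packets at that port.

```python
import socket
import threading

def receiver(sig_srv, results):
    # Media socket, bound ahead of time so its port can be advertised
    # over the signaling channel.
    media = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    media.bind(("127.0.0.1", 0))
    media_port = media.getsockname()[1]

    # Phase 1: signaling. Read the sender's offer, answer with the
    # UDP port the media stream should be sent to.
    conn, _ = sig_srv.accept()
    results["offer"] = conn.recv(1024).decode()
    conn.sendall(f"ANSWER media_port={media_port}".encode())
    conn.close()

    # Phase 2: media. Packets arrive as fast as the sender pushes them.
    results["packets"] = [media.recvfrom(1500)[0] for _ in range(3)]
    media.close()

# Signaling listener is set up first so the sender can connect to it.
sig_srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sig_srv.bind(("127.0.0.1", 0))
sig_srv.listen(1)
sig_port = sig_srv.getsockname()[1]

results = {}
t = threading.Thread(target=receiver, args=(sig_srv, results))
t.start()

# Sender side: signal first ("I'll be sending codec foo"), then stream.
sig = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sig.connect(("127.0.0.1", sig_port))
sig.sendall(b"OFFER codec=foo")
media_port = int(sig.recv(1024).decode().split("=")[1])
sig.close()

media = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for i in range(3):
    media.sendto(f"frame-{i}".encode(), ("127.0.0.1", media_port))
media.close()

t.join()
sig_srv.close()
```

Note how the two channels are completely separate sockets: a NAT or proxy in the middle that only sees the TCP signaling connection has no idea the UDP media flow even exists, which is exactly what makes these stacks fragile.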
Most IP media stacks (RTSP, RTMP, WebRTC, SIP, XMPP, Matrix, etc.) follow this same pattern. This is different from "modern" protocols like HTTP, where signaling is bound together with data using framing (e.g. HTTP headers for signaling vs. the HTTP request/response body for data). This design makes IP media stacks especially fragile to NAT connectivity issues and especially hard to proxy. There are typically good reasons for the split (latency, non-blocking reads, head-of-line blocking, etc.), but these "good reasons" are becoming less good as innovations in lower networking layers (like QUIC or TCPLS) create conditions that make it much easier to organize IP media in a manner more similar to HTTP. Hopefully one day you'll just be able to take IP media streams and "convert" or "proxy" them from one format to another.
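For contrast, here's what "signaling bound together with data using framing" looks like in HTTP: the headers and the payload travel on one byte stream, separated by a blank line and delimited by `Content-Length`, so any middlebox that can parse the framing sees the whole picture.

```python
# One byte stream carries both the "signaling" (headers) and the
# "data" (body); framing, not a second connection, separates them.
raw = (
    b"POST /media HTTP/1.1\r\n"
    b"Content-Type: video/foo\r\n"   # signaling: what the payload is
    b"Content-Length: 7\r\n"         # framing: where the payload ends
    b"\r\n"
    b"frame-0"                        # data: the payload itself
)

# A proxy can recover both halves from the single stream.
head, body = raw.split(b"\r\n\r\n", 1)
headers = dict(line.split(b": ", 1) for line in head.split(b"\r\n")[1:])
assert len(body) == int(headers[b"Content-Length"])
```

That self-describing framing is what makes HTTP trivial to proxy, and what the split signaling/media design of IP media stacks gives up.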