Over the last few weeks, GStreamer’s RTP stack got a couple of new and quite useful features. As these features are not trivial to configure, mostly because there are so many different possible configurations, I decided to write about them a bit with some example code.
The features are RFC 6051-style rapid synchronization of RTP streams, which can be used for inter-stream (e.g. audio/video) synchronization as well as inter-device (i.e. network) synchronization, and the ability to easily retrieve absolute sender clock times per packet on the receiver side.
Note that all of this was already possible before with GStreamer via different mechanisms with different trade-offs. Obviously, not having working audio/video synchronization would simply not be acceptable, and I have talked about how to do inter-device synchronization with GStreamer before, for example at the GStreamer Conference 2015 in Düsseldorf.
The example code below will make use of the GStreamer RTSP Server library but can be applied to any kind of RTP workflow, including WebRTC. It is written in Rust, but the same can be achieved in any other language. The full code can be found in this repository.
And for reference, the merge requests to enable all this are [1], [2] and [3]. You probably don’t want to backport those to an older version of GStreamer though as there are dependencies on various other changes elsewhere. All of the following needs at least GStreamer from the git main branch as of today, or the upcoming 1.22 release.
Baseline Sender / Receiver Code
The starting point of the example code can be found here in the baseline branch. All the important steps are commented, so it should be relatively self-explanatory.
Sender
The sender starts an RTSP server on the local machine on port 8554 and provides a media with H264 video and Opus audio on the mount point /test. It can be started with
$ cargo run -p rtp-rapid-sync-example-send
After starting the server it can be accessed via GStreamer with e.g. gst-play-1.0 rtsp://127.0.0.1:8554/test, or similarly via VLC or any other software that supports RTSP.
This does not do anything special yet but lays the foundation for the following steps. It creates an RTSP server instance with a custom RTSP media factory, which in turn creates custom RTSP media instances. All this is not needed at this point yet but will allow for the necessary customization later.
One important aspect here is that the base time of the media’s pipeline is set to zero:

pipeline.set_base_time(gst::ClockTime::ZERO);
pipeline.set_start_time(gst::ClockTime::NONE);
This allows the timeoverlay element that is placed in the video part of the pipeline to render the clock time over the video frames. We’re going to use this later to confirm on the receiver that the clock time on the sender and the one retrieved on the receiver are the same.
let video_overlay = gst::ElementFactory::make("timeoverlay", None)
    .context("Creating timeoverlay")?;
[...]
video_overlay.set_property_from_str("time-mode", "running-time");
It actually only supports rendering the running time of each buffer, but in a live pipeline with the base time set to zero the running time and pipeline clock time are the same. See the documentation for some more details about the time concepts in GStreamer.
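The relationship between the two can be sketched in a few lines of plain Rust (no GStreamer types, just nanosecond arithmetic; the helper name is made up for illustration):

```rust
// Running time is defined as the pipeline clock time minus the base time.
// With a base time of zero the two are therefore identical.
fn running_time(clock_time_ns: u64, base_time_ns: u64) -> u64 {
    clock_time_ns - base_time_ns
}

fn main() {
    let clock_time_ns = 1_500_000_000; // 1.5s on the pipeline clock
    // base time zero: running time == pipeline clock time
    assert_eq!(running_time(clock_time_ns, 0), clock_time_ns);
    println!("running time: {} ns", running_time(clock_time_ns, 0));
}
```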
Overall this creates the following RTSP stream producer bin, which will be used also in all the following steps:
Receiver
The receiver is a simple playbin pipeline that plays an RTSP URI given via command-line parameters and runs until the stream is finished or an error has happened.
It can be run with the following once the sender is started
$ cargo run -p rtp-rapid-sync-example-recv -- "rtsp://192.168.1.101:8554/test"
Please don’t forget to replace the IP with the IP of the machine that is actually running the server.
All the code should be familiar to anyone who has ever written a GStreamer application in Rust, except for one part that might need a bit more explanation:
pipeline.connect_closure(
    "source-setup",
    false,
    glib::closure!(|_playbin: &gst::Pipeline, source: &gst::Element| {
        source.set_property("latency", 40u32);
    }),
);
playbin is going to create an rtspsrc, and at that point it will emit the source-setup signal so that the application can do any additional configuration of the source element. Here we’re connecting a signal handler to that signal to do exactly that.
By default rtspsrc introduces a latency of 2 seconds, which is a lot more than what is usually needed. For live, non-VOD RTSP streams this value should be around the network jitter, so here we’re configuring it to 40 milliseconds.
Retrieval of absolute sender clock times
Now as the first step we’re going to retrieve the absolute sender clock times for each video frame on the receiver. They will be rendered by the receiver at the bottom of each video frame and will also be printed to stdout. The changes between the previous version of the code and this version can be seen here, and the final code is here in the sender-clock-time-retrieval branch.
When running the sender and receiver as before, the video from the receiver should look similar to the following
The upper time that is rendered on the video frames is rendered by the sender, the bottom time is rendered by the receiver and both should always be the same unless something is broken here. Both times are the pipeline clock time when the sender created/captured the video frame.
In this configuration the absolute clock times of the sender are provided to the receiver via the NTP / RTP timestamp mapping provided by the RTCP Sender Reports. That’s also the reason why it takes about 5s for the receiver to know the sender’s clock time, as RTCP packets are not scheduled very often and only after about 5s by default. The RTCP interval can be configured on rtpbin together with many other things.
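For reference, RFC 3550 draws each RTCP interval randomly around a deterministic target (5s by default for small sessions), which is why the first Sender Report can take several seconds to arrive. A rough sketch of the randomization rule from section 6.3.1 of the RFC (the helper is hypothetical, not GStreamer’s implementation):

```rust
// RFC 3550 draws each RTCP interval uniformly from [0.5, 1.5] times the
// deterministic interval, then divides by e - 3/2 to compensate for the
// bias this randomization introduces (RFC 3550, section 6.3.1).
fn rtcp_interval_bounds(deterministic_secs: f64) -> (f64, f64) {
    let compensation = 1.21828; // e - 3/2
    (
        0.5 * deterministic_secs / compensation,
        1.5 * deterministic_secs / compensation,
    )
}

fn main() {
    let (lo, hi) = rtcp_interval_bounds(5.0);
    println!("first Sender Report expected after {:.2}s to {:.2}s", lo, hi);
}
```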
Sender
On the sender side the configuration changes are rather small, and not even absolutely necessary.
rtpbin.set_property_from_str("ntp-time-source", "clock-time");
By default the NTP time used in the RTCP packets is based on the local machine’s walltime clock converted to the NTP epoch. While this works fine, it is not the clock that is used for synchronizing the media, and as such there will be drift between the RTP timestamps of the media and the NTP time from the RTCP packets, which will be reset every time the receiver receives a new RTCP Sender Report from the sender.
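To get a feel for the magnitude of that drift: with a rate offset between the two clocks of, say, 20 ppm (a made-up but plausible figure for ordinary oscillators), it accumulates quickly. A quick sketch of the arithmetic (the helper name is made up):

```rust
// Drift accumulated between two clocks with a given rate offset, in
// parts per million, over a time span in seconds. Result in microseconds.
fn drift_us(ppm: f64, elapsed_secs: f64) -> f64 {
    ppm * elapsed_secs // 1 ppm of rate offset == 1 us of drift per second
}

fn main() {
    // 20 ppm of skew over the ~5s default RTCP interval: 100 us of drift
    println!("{} us", drift_us(20.0, 5.0));
}
```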
Instead, we configure rtpbin here to use the pipeline clock as the source for the NTP timestamps used in the RTCP Sender Reports. This doesn’t give us (by default at least, see later) an actual NTP timestamp but it doesn’t have the drift problem mentioned before. Without further configuration, in this pipeline the used clock is the monotonic system clock.
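As an aside, the walltime-to-NTP conversion mentioned above is just a fixed epoch shift: NTP time counts from 1900-01-01 while Unix time counts from 1970-01-01. A minimal sketch of that conversion (hypothetical helper, for illustration only):

```rust
// The NTP epoch (1900-01-01) precedes the Unix epoch (1970-01-01) by
// 70 years, 17 of which are leap years: 2,208,988,800 seconds.
const NTP_UNIX_OFFSET_SECS: u64 = 2_208_988_800;

fn unix_to_ntp_secs(unix_secs: u64) -> u64 {
    unix_secs + NTP_UNIX_OFFSET_SECS
}

fn main() {
    // 2022-01-01 00:00:00 UTC as Unix time
    let unix = 1_640_995_200;
    println!("NTP seconds: {}", unix_to_ntp_secs(unix));
}
```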
rtpbin.set_property("rtcp-sync-send-time", false);
rtpbin normally uses the time when a packet is sent out for the NTP / RTP timestamp mapping in the RTCP Sender Reports. This is changed with this property to instead use the time when the video frame / audio sample was captured, i.e. it does not include all the latency introduced by encoding and other processing in the sender pipeline.
This doesn’t make any big difference in this scenario but usually one would be interested in the capture clock times and not the send clock times.
Receiver
On the receiver side there are a few more changes. First of all we have to opt in to rtpjitterbuffer putting a reference timestamp metadata on every received packet with the sender’s absolute clock time.
pipeline.connect_closure(
    "source-setup",
    false,
    glib::closure!(|_playbin: &gst::Pipeline, source: &gst::Element| {
        source.set_property("latency", 40u32);
        source.set_property("add-reference-timestamp-meta", true);
    }),
);
rtpjitterbuffer will start putting the metadata on packets once it knows the NTP / RTP timestamp mapping, i.e. after the first RTCP Sender Report is received in this case. Between the Sender Reports it is going to interpolate the clock times. The normal timestamps (PTS) on each packet are not affected by this and are still based on whatever clock is used locally by the receiver for synchronization.
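Conceptually, that interpolation only needs a single (RTP timestamp, sender clock time) anchor pair plus the stream’s clock rate. A simplified sketch of the idea, with made-up helper names and ignoring RTP timestamp wraparound, which real code has to handle:

```rust
// Extrapolate the sender clock time for an RTP timestamp from a known
// anchor mapping and the stream's clock rate.
fn sender_time_ns(rtp: u32, anchor_rtp: u32, anchor_ntp_ns: u64, clock_rate: u32) -> u64 {
    // Elapsed RTP ticks since the anchor (ignores negative/wrapped offsets).
    let ticks = rtp.wrapping_sub(anchor_rtp) as u64;
    anchor_ntp_ns + ticks * 1_000_000_000 / clock_rate as u64
}

fn main() {
    // 90 kHz video clock: 90000 ticks == 1 second after the anchor
    let t = sender_time_ns(90_000, 0, 1_000_000_000, 90_000);
    println!("{} ns", t);
}
```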
To actually make use of the reference timestamp metadata we add a timeoverlay element as video-filter on the receiver:
let timeoverlay =
    gst::ElementFactory::make("timeoverlay", None).context("Creating timeoverlay")?;

timeoverlay.set_property_from_str("time-mode", "reference-timestamp");
timeoverlay.set_property_from_str("valignment", "bottom");

pipeline.set_property("video-filter", &timeoverlay);
This will then render the sender’s absolute clock times at the bottom of each video frame, as seen in the screenshot above.
And last we also add a pad probe on the sink pad of the timeoverlay element to retrieve the reference timestamp metadata of each video frame and print the sender’s clock time to stdout:
let sinkpad = timeoverlay
    .static_pad("video_sink")
    .expect("Failed to get timeoverlay sinkpad");
sinkpad
    .add_probe(gst::PadProbeType::BUFFER, |_pad, info| {
        if let Some(gst::PadProbeData::Buffer(ref buffer)) = info.data {
            if let Some(meta) = buffer.meta::<gst::ReferenceTimestampMeta>() {
                println!("Have sender clock time {}", meta.timestamp());
            } else {
                println!("Have no sender clock time");
            }
        }

        gst::PadProbeReturn::Ok
    })
    .expect("Failed to add pad probe");
Rapid synchronization via RTP header extensions
The main problem with the previous code is that the sender’s clock times are only known once the first RTCP Sender Report is received by the receiver. There are many ways to configure rtpbin to make this happen faster (e.g. by reducing the RTCP interval or by switching to the AVPF RTP profile), but in any case the information would be transmitted outside the actual media data flow and it can’t be guaranteed that it is actually known on the receiver from the very first received packet onwards. This is of course not a problem in every use-case, but for the cases where it is, there is a solution for this problem.
RFC 6051 defines an RTP header extension that allows transmitting the NTP timestamp that corresponds to an RTP packet directly together with this very packet. And that’s what the next changes to the code make use of.
The changes between the previous version of the code and this version can be seen here and the final code here in the rapid-synchronization branch.
Sender
To add the header extension on the sender side it is only necessary to add an instance of the corresponding header extension implementation to the payloaders.
let hdr_ext = gst_rtp::RTPHeaderExtension::create_from_uri(
    "urn:ietf:params:rtp-hdrext:ntp-64",
)
.context("Creating NTP 64-bit RTP header extension")?;
hdr_ext.set_id(1);
video_pay.emit_by_name::<()>("add-extension", &[&hdr_ext]);
This first instantiates the header extension based on the uniquely defined URI for it, then sets its ID to 1 (see RFC 5285) and then adds it to the video payloader. The same is then done for the audio payloader.
By default this will add the header extension to every RTP packet that has a different RTP timestamp than the previous one. In other words: on the first packet that corresponds to an audio or video frame. Via properties on the header extension this can be configured but generally the default should be sufficient.
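The payload of this extension is simply the 64-bit NTP timestamp (32.32 fixed point, per RFC 6051) in network byte order. A sketch of that encoding, assuming nanoseconds since the NTP epoch as input (the helper is hypothetical, not GStreamer’s code):

```rust
// Convert nanoseconds since the NTP epoch into the 64-bit NTP 32.32
// fixed-point format and serialize it big-endian, as carried in the
// RFC 6051 ntp-64 RTP header extension.
fn encode_ntp64(ntp_ns: u64) -> [u8; 8] {
    let secs = (ntp_ns / 1_000_000_000) & 0xFFFF_FFFF; // whole seconds (wraps in 2036)
    let frac = ((ntp_ns % 1_000_000_000) << 32) / 1_000_000_000; // 32-bit fraction
    ((secs << 32) | frac).to_be_bytes()
}

fn main() {
    // 1.5 seconds: 0x00000001 seconds, 0x80000000 fraction
    let bytes = encode_ntp64(1_500_000_000);
    println!("{:02x?}", bytes);
}
```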
Receiver
On the receiver side no changes would actually be necessary. The use of the header extension is signaled via the SDP (see RFC 5285) and it will be automatically made use of inside rtpbin as another source of NTP / RTP timestamp mappings in addition to the RTCP Sender Reports.
However, we configure one additional property on rtpbin:

source.connect_closure(
    "new-manager",
    false,
    glib::closure!(|_rtspsrc: &gst::Element, rtpbin: &gst::Element| {
        rtpbin.set_property("min-ts-offset", gst::ClockTime::from_mseconds(1));
    }),
);
Inter-stream audio/video synchronization
The reason for configuring the min-ts-offset property on the rtpbin is that the NTP / RTP timestamp mapping is not only used for providing the reference timestamp metadata, but by default also for inter-stream synchronization. That is, for getting correct audio / video synchronization.
With RTP alone there is no mechanism to synchronize multiple streams against each other, as the RTP timestamps of different streams have no correlation to each other. This is not too much of a problem as usually the packets for audio and video are received approximately at the same time, but there’s still some inaccuracy in there.
One approach to fix this is to use the NTP / RTP timestamp mapping for each stream, either from the RTCP Sender Reports or from the RTP header extension, and that’s what is made use of here. And because the mapping is provided very often via the RTP header extension but the RTP timestamps are only accurate up to the clock rate (1/90000s for video and 1/48000s for audio in this case), we configure a threshold of 1ms for adjusting the inter-stream synchronization. Without this it would be adjusted almost continuously by a very small amount back and forth.
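The effect of min-ts-offset can be sketched as a simple dead band: a newly measured offset is only applied when it differs from the current one by at least the threshold. A toy illustration (hypothetical helper, not rtpbin’s actual logic):

```rust
// Dead-band filter: only apply a new inter-stream offset when it differs
// from the current one by at least min_offset_ns; otherwise keep the old
// value to avoid continuously adjusting by tiny amounts.
fn maybe_update_offset(current_ns: i64, measured_ns: i64, min_offset_ns: i64) -> i64 {
    if (measured_ns - current_ns).abs() >= min_offset_ns {
        measured_ns
    } else {
        current_ns
    }
}

fn main() {
    let min = 1_000_000; // 1ms, as configured via min-ts-offset above
    // 0.3ms of measurement jitter: ignored
    assert_eq!(maybe_update_offset(0, 300_000, min), 0);
    // 2ms of real drift: applied
    assert_eq!(maybe_update_offset(0, 2_000_000, min), 2_000_000);
    println!("ok");
}
```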
Other approaches for inter-stream synchronization are provided by RTSP itself before streaming starts (via the RTP-Info header), but due to a bug this is currently not made use of by GStreamer.
Yet another approach would be via the clock information provided by RFC 7273, about which I already wrote previously and which is also supported by GStreamer. This also allows inter-device, network synchronization and is used for that purpose as part of e.g. AES67, Ravenna, SMPTE 2022 / 2110 and many other protocols.
Inter-device network synchronization
Now for the last part, we’re going to add actual inter-device synchronization to this example. The changes between the previous version of the code and this version can be seen here and the final code here in the network-sync branch. This does not use the clock information provided via RFC 7273 (which would be another option) but uses the same NTP / RTP timestamp mapping that was discussed above.
When starting the receiver multiple times on different (or the same) machines, each of them should play back the media synchronized to each other and exactly 2 seconds after the corresponding audio / video frames are produced on the sender.
For this, both the sender and all receivers use an NTP clock (pool.ntp.org in this case) instead of the local monotonic system clock for media synchronization (i.e. as the pipeline clock). Instead of an NTP clock it would also be possible to use any other mechanism for network clock synchronization, e.g. PTP or the GStreamer netclock.
println!("Syncing to NTP clock");
clock
    .wait_for_sync(gst::ClockTime::from_seconds(5))
    .context("Syncing NTP clock")?;
println!("Synced to NTP clock");
This code instantiates a GStreamer NTP clock and then synchronously waits up to 5 seconds for it to synchronize. If that fails then the application simply exits with an error.
Sender
On the sender side all that is needed is to configure the RTSP media factory, and as such the pipeline used inside it, to use the NTP clock:
factory.set_clock(Some(&clock));
This causes all media inside the sender’s pipeline to be synchronized according to this NTP clock and to also use it for the NTP timestamps in the RTCP Sender Reports and the RTP header extension.
Receiver
On the receiver side the same has to happen:
pipeline.use_clock(Some(&clock));
In addition, a couple more settings have to be configured on the receiver. First of all, we configure a static latency of 2 seconds on the receiver’s pipeline:
pipeline.set_latency(gst::ClockTime::from_seconds(2));
This is necessary as GStreamer can’t know the latency of every receiver (e.g. different decoders might be used), and also because the sender latency can’t be automatically known. Each audio / video frame will be timestamped on the receiver with the NTP timestamp of when it was captured / created, but since then all the latency of the sender, the network and the receiver pipeline has passed, and this has to be compensated for.
Which value to use here depends a lot on the overall setup, but 2 seconds is a (very) safe guess in this case. The value only has to be larger than the sum of sender, network and receiver latency and in the end has the effect that the receiver is showing the media exactly that much later than the sender has produced it.
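Put as arithmetic: each frame is presented at its capture time plus the configured latency, so playback stays glitch-free as long as that latency exceeds the end-to-end delay. A sketch with made-up helper names and an invented latency budget:

```rust
// A frame captured at `capture_ntp_ns` is presented at capture time
// plus the configured pipeline latency.
fn presentation_time_ns(capture_ntp_ns: u64, configured_latency_ns: u64) -> u64 {
    capture_ntp_ns + configured_latency_ns
}

// The configuration is safe when the configured latency covers the sum
// of sender, network and receiver latency.
fn is_safe(sender_ns: u64, network_ns: u64, receiver_ns: u64, latency_ns: u64) -> bool {
    sender_ns + network_ns + receiver_ns <= latency_ns
}

fn main() {
    let latency = 2_000_000_000; // the 2s configured on the receiver pipeline
    // hypothetical budget: 300ms sender, 50ms network, 200ms receiver
    assert!(is_safe(300_000_000, 50_000_000, 200_000_000, latency));
    println!("presented at {} ns", presentation_time_ns(0, latency));
}
```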
And last we also have to tell rtpbin that
- sender and receiver clocks are synchronized to each other, i.e. in this case both are using exactly the same NTP clock, so no translation to the pipeline’s clock is necessary, and
- the outgoing timestamps on the receiver should be exactly the sender timestamps, with the conversion happening based on the NTP / RTP timestamp mapping:
source.set_property_from_str("buffer-mode", "synced");
source.set_property("ntp-sync", true);
And that’s it.
A careful reader will also have noticed that all of the above would also work without the RTP header extension, but then the receivers would only be synchronized once the first RTCP Sender Report is received. That’s what the test-netclock.c / test-netclock-client.c example from the GStreamer RTSP server is doing.
As usual with RTP, the above is by far not the only way of doing this and GStreamer also supports various other synchronization mechanisms. Which one is the correct one for a specific use-case depends on a lot of factors.
Hi Sebastian, thank you for writing this great blog I can’t wait to try out the new gstreamer version.
For those of us that can’t upgrade to the latest gstreamer and are stuck with 1.16, can you please comment on how to go about synchronizing video from multiple RTSP IP cameras (senders) on a single computer (receiver) end for rendering a multi view? Senders and receiver are synced to the same NTP clock. Does one have to get access to RTCP SR NTP/RTP timestamp AND RTP timestamp of each GstBuffer? Or is using PTS (running time) enough?
Thank you
You could do it the same way as test-netclock.c / test-netclock-client.c, which is basically the same as what I’m doing in this blog post but only using RTCP SRs. So synchronization would only take place after a couple of seconds usually.
The only aspects that are not available in older GStreamer are the rapid synchronization RTP header extension and the mechanism to get absolute sender times as `GstMeta` on the buffers. But also you should figure out a way to update to newer GStreamer versions, 1.16 is ancient at this point 🙂
Thanks for this blog post – it is just what I needed to understand clock timing for RTSP better. Would you please fix the image for the RTSP stream producer bin? The text in that image is hard to read with the 500×143 size. Thanks
If you click on it you can get a bigger version, i.e. here
Hi Sebastian,
Thanks for this great post!
I am trying to run your example but I am having troubles. Let me give you some context, I must be missing some basic steps (missing some dependencies?).
I hope this is the right place to write about this, but if you prefer me to open an issue on gitlab, I can do that of course.
1) I get the latest gstreamer version (I am getting the same issue whether I build the latest main or v1.22 or if I use the latest gstreamer within docker)
2) I install rust
rustc --version
rustc 1.68.2 (9eb3afe9e 2023-03-27)
3) I build the project (all looks good there)
cargo build
4) I run the sender:
cargo run -p rtp-rapid-sync-example-send
output:
Finished dev [unoptimized + debuginfo] target(s) in 0.02s
Running `target/debug/rtp-rapid-sync-example-send`
Syncing to NTP clock
0:00:00.008702866 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.029779983 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.057980434 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.076530148 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.107846882 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.126642711 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.157851730 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.172681309 391 0x562a72bd6460 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
Synced to NTP clock
5a) I run the receiver (on the same machine, adjusting for my local IP)
cargo run -p rtp-rapid-sync-example-send -- rtsp://172.17.0.1:8554/test
output:
Syncing to NTP clock
0:00:00.009938565 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.026485020 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.059731735 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.076926534 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.109251894 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.123525657 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.158795393 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.176474925 557 0x55e682fe9060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
Synced to NTP clock
0:00:00.177189743 557 0x55e682e17d60 ERROR rtspserver rtsp-server.c:1004:gst_rtsp_server_create_socket:<GstRTSPServer@0x55e682f93930> failed to create socket
0:00:00.177211301 557 0x55e682e17d60 ERROR rtspserver rtsp-server.c:1374:gst_rtsp_server_create_source:<GstRTSPServer@0x55e682f93930> failed to create socket
0:00:00.177224416 557 0x55e682e17d60 ERROR rtspserver rtsp-server.c:1420:gst_rtsp_server_attach:<GstRTSPServer@0x55e682f93930> failed to create watch: Error binding to address 0.0.0.0:8554: Address already in use
Error: Failed to attach main context to RTSP server
I thought, maybe there is a typo in the instructions, and to start the receiver I should instead:
5b) cargo run -p rtp-rapid-sync-example-recv -- rtsp://172.17.0.1:8554/test
output:
Finished dev [unoptimized + debuginfo] target(s) in 0.02s
Running `target/debug/rtp-rapid-sync-example-recv 'rtsp://172.17.0.1:8554/test'`
Syncing to NTP clock
0:00:00.009715025 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.029183503 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.059531777 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.082067360 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.109394984 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.130048945 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.159442770 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
0:00:00.180694360 562 0x56448b31d060 WARN GST_CLOCK gstclock.c:1059:gst_clock_get_internal_time: clock is not synchronized yet
Synced to NTP clock
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: BoolError { message: "Failed to deserialize value", filename: "/root/.cargo/registry/src/github.com-1ecc6299db9ec823/gstreamer-0.18.8/src/value.rs", function: "gstreamer::value", line: 1198 }', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/gstreamer-0.18.8/src/gobject.rs:39:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Thanks in advance!
Elia
5a), you’re right there’s a typo and it should indeed be `recv` instead of `send` there. Thanks 🙂
5b), can you run with `RUST_BACKTRACE=1` and provide the output? Not sure where exactly it fails there.