July 2020 – coaxion.net

Automatic retry on error and fallback stream handling for GStreamer sources

A very common problem in GStreamer, especially when working with live network streams, is that the source might just fail at some point. Your own network might have problems, the source of the stream might have problems, …

Without any special handling of such situations, the default behaviour in GStreamer is to simply report an error and let the application worry about handling it. The application might for example want to restart the stream, or it might simply want to show an error to the user, or it might want to show a fallback stream instead, telling the user that the stream is currently not available and then seamlessly switch back to the stream once it comes back.

Implementing all of the aforementioned is quite some effort, especially to do it in a robust way. To make it easier for applications I implemented a new plugin called fallbackswitch that contains two elements to automate this.

It is part of the GStreamer Rust plugins and also included in the recent 0.6.0 release, which can also be found on the Rust package (“crate”) repository crates.io.

Installation

For using the plugin you most likely first need to compile it yourself, unless you’re lucky enough that e.g. your Linux distribution includes it already.

Compiling it requires a Rust toolchain and GStreamer 1.14 or newer. The former you can get via rustup for example, if you don’t have it yet, the latter either from your Linux distribution or by using the macOS, Windows, etc binaries that are provided by the GStreamer project. Once that is done, compiling is mostly a matter of running cargo build in the utils/fallbackswitch directory and copying the resulting libgstfallbackswitch.so (or .dll or .dylib) into one of the GStreamer plugin directories, for example ~/.local/share/gstreamer-1.0/plugins.

fallbackswitch

The first of the two elements is fallbackswitch. It acts as a filter that can be placed into any kind of live stream. It consumes one main stream (which must be live) and outputs this stream as-is if everything works well. Based on the timeout property it detects if this main stream didn’t have any activity for the configured amount of time, or everything arrived too late for that long, and then seamlessly switches to a fallback stream. The fallback stream is the second input of the element and does not have to be live (but it can be).

Switching between main stream and fallback stream doesn’t only work for raw audio and video streams but also works for compressed formats. The element will take constraints like keyframes into account when switching, and if necessary/possible also request new keyframes from the sources.

For example to play the Sintel trailer over the network and displaying a test pattern if it doesn’t produce any data, the following pipeline can be constructed:

gst-launch-1.0 souphttpsrc location=https://www.freedesktop.org/software/gstreamer-sdk/data/media/sintel_trailer-480p.webm ! \
    decodebin ! identity sync=true ! fallbackswitch name=s ! videoconvert ! autovideosink \
    videotestsrc ! s.fallback_sink

Note the identity sync=true in the main stream here as we have to convert it to an actual live stream.

Now when running the above command and disconnecting from the network, the video should freeze at some point and after 5 seconds a test pattern should be displayed.

However, when using fallbackswitch the application will still have to take care of handling actual errors from the main source and possibly restarting it. Waiting a bit longer after disconnecting the network with the above command will report an error, which then stops the pipeline.

To make that part easier there is the second element.

fallbacksrc

The second element is fallbacksrc and as the name suggests it is an actual source element. When using it, the main source can be configured via an URI or by providing a custom source element. Internally it then takes care of buffering the source, converting non-live streams into live streams and restarting the source transparently on errors. The various timeouts for this can be configured via properties.

Different to fallbackswitch it also handles audio and video at the same time and demuxes/decodes the streams.

Currently the only fallback streams that can be configured are still images for video. For audio the element will always output silence for now, and if no fallback image is configured for video it outputs black instead. In the future I would like to add support for arbitrary fallback streams, which hopefully shouldn’t be too hard. The basic infrastructure for it is already there.

To use it again in our previous example and having a JPEG image displayed whenever the source does not produce any new data, the following can be done:

gst-launch-1.0 fallbacksrc uri=https://www.freedesktop.org/software/gstreamer-sdk/data/media/sintel_trailer-480p.webm \
    fallback-uri=file:///path/to/some/jpg ! videoconvert ! autovideosink

Now when disconnecting the network, after a while (longer than before because fallbacksrc does additional buffering for non-live network streams) the fallback image should be shown. Different to before, waiting longer will not lead to an error and reconnecting the network causes the video to reappear. However as this is not an actual live-stream, right now playback would again start from the beginning. Seeking back to the previous position would be another potential feature that could be added in the future.

Overall these two elements should make it easier for applications to handle errors in live network sources. While the two elements are still relatively minimal feature-wise, they should already be usable in various real scenarios and are already used in production.

As usual, if you run into any problems or are missing some features, please create an issue in the GStreamer bug tracker.

GStreamer Rust Bindings & Plugins New Releases

It has been quite a while since the last status update for the GStreamer Rust bindings and the GStreamer Rust plugins, so the new releases last week make for a good opportunity to do so now.

Bindings

I won’t write too much about the bindings this time. The latest version as of now is 0.16.1, which means that since I started working on the bindings there were 8 major releases. In that same time there were 45 contributors working on the bindings, which seems quite a lot and really makes me happy.

Just as before, I don’t think any major APIs are missing from the bindings anymore, even for implementing subclasses of the various GStreamer types. The wide usage of the bindings in Free Software projects and commercial products also shows both the interest in writing GStreamer applications and plugins in Rust as well as that the bindings are complete enough and production-ready.

Most of the changes since the last status update involve API cleanups, usability improvements, various bugfixes and addition of minor API that was not included before. The details of all changes can be read in the changelog.

The bindings work with any GStreamer version since 1.8 (released more than 4 years ago), support APIs up to GStreamer 1.18 (to be released soon) and work with Rust 1.40 or newer.

Plugins

The biggest progress probably happened with the GStreamer Rust plugins.

There also was a new release last week, 0.6.0, which was the first release where selected plugins were also uploaded to the Rust package (“crate”) database crates.io. This makes it easy for Rust applications to embed any of these plugins statically instead of depending on them to be available on the system.

Overall there are now 40 GStreamer elements in 18 plugins by 28 contributors available as part of the gst-plugins-rs repository, one tutorial plugin with 4 elements and various plugins in external locations.

These 40 GStreamer elements are the following:

Audio

rsaudioecho: Port of the audioecho element from gst-plugins-good
rsaudioloudnorm: Live audio loudness normalization element based on the FFmpeg af_loudnorm filter
claxondec: FLAC lossless audio codec decoder element based on the pure-Rust claxon implementation
csoundfilter: Audio filter that can use any filter defined via the Csound audio programming language
lewtondec: Vorbis audio decoder element based on the pure-Rust lewton implementation

Video

cdgdec/cdgparse: Decoder and parser for the CD+G video codec based on a pure-Rust CD+G implementation, used for example by karaoke CDs
cea608overlay: CEA-608 Closed Captions overlay element
cea608tott: CEA-608 Closed Captions to timed-text (e.g. VTT or SRT subtitles) converter
tttocea608: CEA-608 Closed Captions from timed-text converter
mccenc/mccparse: MacCaption Closed Caption format encoder and parser
sccenc/sccparse: Scenarist Closed Caption format encoder and parser
dav1dec: AV1 video decoder based on the dav1d decoder implementation by the VLC project
rav1enc: AV1 video encoder based on the fast and pure-Rust rav1e encoder implementation
rsflvdemux: Alternative to the flvdemux FLV demuxer element from gst-plugins-good, not feature-equivalent yet
rsgifenc/rspngenc: GIF/PNG encoder elements based on the pure-Rust implementations by the image-rs project

Text

textwrap: Element for line-wrapping timed text (e.g. subtitles) for better screen-fitting, including hyphenation support for some languages

Network

reqwesthttpsrc: HTTP(S) source element based on the Rust reqwest/hyper HTTP implementations and almost feature-equivalent with the main GStreamer HTTP source souphttpsrc
s3src/s3sink: Source/sink element for the Amazon S3 cloud storage
awstranscriber: Live audio to timed text transcription element using the Amazon AWS Transcribe API

Generic

sodiumencrypter/sodiumdecrypter: Encryption/decryption element based on libsodium/NaCl
togglerecord: Recording element that allows to pause/resume recordings easily and considers keyframe boundaries
fallbackswitch/fallbacksrc: Elements for handling potentially failing (network) sources, restarting them on errors/timeout and showing a fallback stream instead
threadshare: Set of elements that provide alternatives for various existing GStreamer elements but allow to share the streaming threads between each other to reduce the number of threads
rsfilesrc/rsfilesink: File source/sink elements as replacements for the existing filesrc/filesink elements

Live loudness normalization in GStreamer & experiences with porting a C audio filter to Rust

A few months ago I wrote a new GStreamer plugin: an audio filter for live loudness normalization and automatic gain control.

The plugin can be found as part of the GStreamer Rust plugin in the audiofx plugin. It’s also included in the recent 0.6.0 release of the GStreamer Rust plugins and available from crates.io.

Its code is based on Kyle Swanson’s great FFmpeg filter af_loudnorm, about which he wrote some more technical details on his blog a few years back. I’m not going to repeat all that here, if you’re interested in those details and further links please read Kyle’s blog post.

From a very high-level, the filter works by measuring the loudness of the input following the EBU R128 standard with a 3s lookahead, adjusts the gain to reach the target loudness and then applies a true peak limiter with 10ms to prevent any too high peaks to get passed through. Both the target loudness and the maximum peak can be configured via the loudness-target and max-true-peak properties, same as in the FFmpeg filter. Different to the FFmpeg filter I only implemented the “live” mode and not the two-pass mode that is implemented in FFmpeg, which first measures the loudness of the whole stream and then in a second pass adjusts it.

Below I’ll describe the usage of the filter in GStreamer a bit and also some information about the development process, and the porting of the C code to Rust.

Usage

For using the filter you most likely first need to compile it yourself, unless you’re lucky enough that e.g. your Linux distribution includes it already.

Compiling it requires a Rust toolchain and GStreamer 1.8 or newer. The former you can get via rustup for example, if you don’t have it yet, the latter either from your Linux distribution or by using the macOS, Windows, etc binaries that are provided by the GStreamer project. Once that is done, compiling is mostly a matter of running cargo build in the audio/audiofx directory and copying the resulting libgstrsaudiofx.so (or .dll or .dylib) into one of the GStreamer plugin directories, for example ~/.local/share/gstreamer-1.0/plugins.

After that boring part is done, you can use it for example as follows to run loudness normalization on the Sintel trailer:

gst-launch-1.0 playbin \
    uri=https://www.freedesktop.org/software/gstreamer-sdk/data/media/sintel_trailer-480p.webm \
    audio-filter="audioresample ! rsaudioloudnorm ! audioresample ! capsfilter caps=audio/x-raw,rate=48000"

As can be seen above, it is necessary to put audioresample elements around the filter. The reason for that is that the filter currently only works on 192kHz input. This is a simplification for now to make it easier inside the filter to detect true peaks. You would first upsample your audio to 192kHz and then, if needed, later downsample it again to your target sample rate (48kHz in the example above). See the link mentioned before for details about true peaks and why this is generally a good idea to do. In the future the resampling could be implemented internally and maybe optionally the filter could also work with “normal” peak detection on the non-upsampled input.

Apart from that caveat the filter element works like any other GStreamer audio filter and can be placed accordingly in any GStreamer pipeline.

If you run into any problems using the code or it doesn’t work well for your use-case, please create an issue in the GStreamer bugtracker.

The process

As I wrote above, the GStreamer plugin is part of the GStreamer Rust plugins so the first step was to port the FFmpeg C code to Rust. I expected that to be the biggest part of the work, but as writing Rust is simply so much more enjoyable than writing C and I would have to adjust big parts of the code to fit the GStreamer infrastructure anyway, I took this approach nonetheless. The alternative of working based on the C code and writing the plugin in C didn’t seem very appealing to me. In the end, as usual when developing in Rust, this also allowed me to be more confident about the robustness of the result and probably reduced the amount of time spent debugging. Surprisingly, the translation was actually not the biggest part of the work, but instead I had to debug a couple of issues that were already present in the original FFmpeg code and find solutions for them. But more on that later.

The first step for porting the code was to get an implementation of the EBU R128 loudness analysis. In FFmpeg they’re using a fork of the libebur128 C library. I checked if there was anything similar for Rust already, maybe even a pure-Rust implementation of it, but couldn’t find anything. As I didn’t want to write one myself or port the code of the libebur128 C library to Rust, I wrote safe Rust bindings for that library instead. The end result of that can be found on crates.io as an independent crate, in case someone else also needs it for other purposes at some point. The crate also includes the code of the C library, making it as easy as possible to build and include into other projects.

The next step was to actually port the FFmpeg C code to Rust. In the end that was a rather straightforward translation fortunately. The latest version of that code can be found here.

The biggest difference to the C code is the usage of Rust iterators and iterator combinators like zip and chunks_exact. In my opinion this makes the code quite a bit easier to read compared to the manual iteration in the C code together with array indexing, and as a side effect it should also make the code run faster in Rust as it allows to get rid of a lot of array bounds checks.

Apart from that, one part that was a bit inconvenient during that translation and still required manual array indexing is the usage of ringbuffers everywhere in the code. For now I wrote those like I would in C and used a few unsafe operations like get_unchecked to avoid redundant bounds checks, but at a later time I might refactor this into a proper ringbuffer abstraction for such audio processing use-cases. It’s not going to be the last time I need such a data structure. A short search on crates.io gave various results for ringbuffers but none of them seem to provide an API that fits the use-case here. Once that’s abstracted away into a nice data structure, I believe the Rust code of this filter is really nice to read and follow.

Now to the less pleasant parts, and also a small warning to all the people asking for Rust rewrites of everything: of course I introduced a couple of new bugs while translating the code although this was a rather straightforward translation and I tried to be very careful. I’m sure there is also still a bug or two left that I didn’t find while debugging. So always keep in mind that rewriting a project will also involve adding new bugs that didn’t exist in the original code. Or maybe you’re just a better programmer than me and don’t make such mistakes.

Debugging these issues that showed up while testing the code was a good opportunity to also add extensive code comments everywhere so I don’t have to remind myself every time again what this block of code is doing exactly, and it’s something I was missing a bit from the FFmpeg code (it doesn’t have a single comment currently). While writing those comments and explaining the code to myself, I found the majority of these bugs that I introduced and as a side-effect I now have documentation for my future self or other readers of the code.

Fixing these issues I introduced myself wasn’t that time-consuming neither in the end fortunately, but while writing those code comments and also while doing more testing on various audio streams, I found a couple of bugs that already existed in the original FFmpeg C code. Further testing also showed that they caused quite audible distortions on various test streams. These are the bugs that unfortunately took most of the time in the whole process, but at least to my knowledge there are no known bugs left in the code now.

For these bugs in the FFmpeg code I also provided a fix that is merged already, and reported the other two in their bug tracker.

The first one I’d be happy to provide a fix for if my approach is considered correct, but the second one I’ll leave for someone else. Porting over my Rust solution for that one will take some time and getting all the array indexing involved correct in C would require some serious focusing, for which I currently don’t have the time.

Or maybe my solutions to these problems are actually wrong, or my understanding of the original code was wrong and I actually introduced them in my translation, which also would be useful to know.

Overall, while porting the C code to Rust introduced a few new problems that had to be fixed, I would definitely do this again for similar projects in the future. It’s more fun to write and in my opinion the resulting code is easier readable, and better to maintain and extend.