Writing GStreamer Elements in Rust (Part 4): Logging, COWs and Plugins

This is part 4, the older parts can be found here: part 1, part 2 and part 3

It’s been quite a while since the last update again, so I thought I should write about the biggest changes since last time again even if they’re mostly refactoring. They nonetheless show how Rust is a good match for writing GStreamer plugins.

Apart from actual code changes, also the code was relicensed from the LGPL-2 to a dual MIT-X11/Apache2 license to make everybody’s life a bit easier with regard to static linking and building new GStreamer plugins on top of this.

I’ll also speak about all this and more at RustFest.EU 2017 in Kiev on the 30th of April, together with Luis.

The next steps after all this will be to finally make the FLV demuxer feature-complete, for which all the base-work is already done now.

Logging

One thing that was missing so far and made debugging problems always a bit annoying was the missing integration with the GStreamer logging infrastructure. Adding println!() everywhere just to remove them again later gets boring after a while.

The GStreamer logging infrastructure is based, like many other solutions, on categories in which you log your messages and levels that describe the importance of the message (error, warning, info, …). Logging can be disabled at compile time, up to a specific level, and can also be enabled/disabled at runtime for each category to a specific level, and performance impact for disabled logging should be close to zero. This now has to be mapped somehow to Rust.

During last year’s “24 days of Rust” in December, slog was introduced (see this also for some overview how slog is used). And it seems like the perfect match here due to being able to implement new “output backends”, called a Drain in slog and very low performance impact. So how logging works now is that you create a Drain per GStreamer debug category (which will then create the category if needed), and all logging to that Drain goes directly to GStreamer:

With lazy_static we can then make sure that the Drain is only created once and can be used from multiple places.

All the implementation for the Drain can be found here, and it’s all rather straightforward plumbing. The interesting part here however is that slog makes sure that the message string and all its formatting arguments (the integer in the above example) are passed down to the Drain without doing any formatting. As such we can skip the whole formatting step if the category is not enabled or its level is too low, which gives us almost zero-cost logging for the cases when it is disabled. And of course slog also allows disabling logging up to a specific level at compile time via cargo’s features feature, making it really zero-cost if disabled at compile time.

Safe & simple Copy-On-Write

In GStreamer, buffers and similar objects are inheriting from a base class called GstMiniObject. This base class provides infrastructure for reference counting, copying (cloning) of the objects and a dynamic (at runtime, not to be confused with Rust’s COW type) Copy-On-Write mechanism (writable access requires a reference count of 1, or a copy has to be made). This is very similar to Rust’s Arc, which for a contained type that implements Clone provides the make_mut() and get_mut() functions that work the same way.

Now we can’t unfortunately use Arc directly here for wrapping the GStreamer types, as the reference counting is already done inside GStreamer and adding a second layer of reference counting on top is not going to make things work better. So there’s now a GstRc, which provides more or less the same API as Arc and wraps structs that implement the GstMiniObject trait. The latter provides GstRc functions for getting the raw pointer, swap the raw pointer and create new instances from a raw pointer. The actual structs for buffers and other types don’t do any reference counting or otherwise instance handling, and only have unsafe constructors. The general idea here is that they will never exist outside a GstRc, which will then can provide you with (mutable or not) references to them.

With all this we now have a way to let Rust do the reference counting for us and enforce the writability rules of the GStreamer API automatically without leaving any chance of doing things wrong. Compared to C where you have to do the reference counting yourself and could accidentally try to modify a non-writable (reference count > 1) object (which would give an assertion), this is a big improvement.

And as a bonus this is all completely without overhead: all that is passed around in the Rust code is (once compiled) the raw C pointer of the objects, and the functions calls directly map to the C functions too. Let’s take an example:

After reading this code you might ask why DerefMut is not implemented in addition, which would then do make_mut() internally if needed and would allow getting around the extra method call. The reason for this is that make_mut() might do a (expensive!) copy, and as such DerefMut could do a copy implicitly without the code having any explicit indication that a copy might happen here. I would be worried that it could cause non-obvious performance problems.

The last change I’m going to write about today is that the repository was completely re-organized. There is now a base crate and separate plugin crates (e.g. gst-plugin-file). The former is a normal library crate and contains some C code and all the glue between GStreamer and Rust, the latter don’t contain a single line of C code (and no unsafe code either at this point) and compile to a standalone GStreamer plugin.

The only tricky bit here was generating the plugin entry point from pure Rust code. GStreamer requires a plugin to export a symbol with a specific name, which provides access to a description struct. As the struct also contains strings, and generating const static strings with ‘\0’ terminator is not too easy, this is still a bit ugly currently. With the upcoming changes in GStreamer 1.14 this will become better, as we can then just export a function that can dynamically allocate the strings and return the struct from there.

All the boilerplate for creating the plugin entry point is hidden by the plugin_define!() macro, which can then be used as follows (and you’ll understand what I mean with ugly ‘\0’ terminated strings then):

As a side-note, handling multiple crates next to each other is very convenient with the workspace feature of cargo and the “build –all”, “doc –all” and “test –all” commands since 1.16.

Writing GStreamer Elements in Rust (Part 3): Parsing data from untrusted sources like it’s 2016

This is part 3, the older parts can be found here: part 1, part 2 and part 4

And again it took quite a while to write a new update about my experiments with writing GStreamer elements in Rust. The previous articles can be found here and here. Since last time, there was also the GStreamer Conference 2016 in Berlin, where I had a short presentation about this.

Progress was rather slow unfortunately, due to work and other things getting into the way. Let’s hope this improves. Anyway!

There will be three parts again, and especially the last one would be something where I could use some suggestions from more experienced Rust developers about how to solve state handling / state machines in a nicer way. The first part will be about parsing data in general, especially from untrusted sources. The second part will be about my experimental and current proof of concept FLV demuxer.

Parsing Data

Safety?

First of all, you probably all saw a couple of CVEs about security relevant bugs in (rather uncommon) GStreamer elements going around. While all of them would’ve been prevented by having the code written in Rust (due to by-default array bounds checking), that’s not going to be our topic here. They also would’ve been prevented by using various GStreamer helper API, like GstByteReader, GstByteWriter and GstBitReader. So just use those, really. Especially in new code (which is exactly the problem with the code affected by the CVEs, it was old and forgotten). Don’t do an accountant’s job, counting how much money/many bytes you have left to read.

But yes, this is something where Rust will also provide an advantage by having by-default safety features. It’s not going to solve all our problems, but at least some classes of problems. And sure, you can write safe C code if you’re careful but I’m sure you also drive with a seatbelt although you can drive safely. To quote Federico about his motivation for rewriting (parts of) librsvg in Rust:

Every once in a while someone discovers a bug in librsvg that makes it all the way to a CVE security advisory, and it’s all due to using C. We’ve gotten double free()s, wrong casts, and out-of-bounds memory accesses. Recently someone did fuzz-testing with some really pathological SVGs, and found interesting explosions in the library. That’s the kind of 1970s bullshit that Rust prevents.

You can directly replace the word librsvg with GStreamer here.

Ergonomics

The other aspect with parsing data is that it’s usually a very boring aspect of programming. It should be as painless as possible, as easy as possible to do it in a safe way, and after having written your 100th parser by hand you probably don’t want to do that again. Parser combinator libraries like Parsec in Haskell provide a nice alternative. You essentially write down something very close to a formal grammar of the format you want to parse, and out of this comes a parser for the format. Other than parser generators like good, old yacc, everything is written in target language though, and there is no separate code generation step.

Rust, being quite a bit more expressive than C, also made people write parser generator libraries. They are all not as ergonomic (yet?) as in Haskell, but still a big improvement over anything else. There’s nom, combine and chomp. All having a slightly different approach. Choose your favorite. I decided on nom for the time being.

A FLV Demuxer in Rust

For implementing a demuxer, I decided on using the FLV container format. Mostly because it is super-simple compared to e.g. MP4 and WebM, but also because Geoffroy, the author of nom, wrote a simple header parsing library for it already and a prototype demuxer using it for VLC. I’ll have to extend that library for various features in the near future though, if the demuxer should ever become feature-equivalent with the existing one in GStreamer.

As usual, the code can be found here, in the “demuxer” branch. The most relevant files are rsdemuxer.rs and flvdemux.rs.

Following the style of the sources and sinks, the first is some kind of base class / trait for writing arbitrary demuxers in Rust. It’s rather unfinished at this point though, just enough to get something running. All the FLV specific code is in the second file, and it’s also very minimal for now. All it can do is to play one specific file (or hopefully all other files with the same audio/video codec combination).

As part of all this, I also wrote bindings for GStreamer’s buffer abstraction and a Rust-rewrite of the GstAdapter helper type. Both showed Rust’s strengths quite well, the buffer bindings by being able to express various concepts of the buffers in a compiler-checked, safe way in Rust (e.g. ownership, reability/writability), the adapter implementation by being so much shorter (it’s missing features… but still).

So here we are, this can already play one specific file (at least) in any GStreamer based playback application. But some further work is necessary, for which I hopefully have some time in the near future. Various important features are still missing (e.g. other codecs, metadata extraction and seeking), the code is rather proof-of-concept style (stringly-typed media formats, lots of unimplemented!() and .unwrap() calls). But it shows that writing media handling elements in Rust is definitely feasible, and generally seems like a good idea.

If only we had Rust already when all this media handling code in GStreamer was written!

State Handling

Another reason why all this took a bit longer than expected, is that I experimented a bit with expressing the state of the demuxer in a more clever way than what we usually do in C. If you take a look at the GstFlvDemux struct definition in C, it contains about 100 lines of field declarations. Most of them are only valid / useful in specific states that the demuxer is in. Doing the same in Rust would of course also be possible (and rather straightforward), but I wanted to try to do something better, especially by making invalid states unrepresentable.

Rust has this great concept of enums, also known as tagged unions or sum types in other languages. These are not to be confused with C enums or unions, but instead allow multiple variants (like C enums) with fields of various types (like C unions). But all of that in a type-safe way. This seems like the perfect tool for representing complicated state and building a state machine around it.

So much for the theory. Unfortunately, I’m not too happy with the current state of things. It is like this mostly because of Rust’s ownership system getting into my way (rightfully, how would it have known additional constraints I didn’t know how to express?).

Common Parts

The first problem I ran into, was that many of the states have common fields, e.g.

When writing code that matches on this, and that tries to move from one state to another, these common fields would have to be moved. But unfortunately they are (usually) borrowed by the code already and thus can’t be moved to the new variant. E.g. the following fails to compile

A Tree of States

Repeating the common parts is not nice anyway, so I went with a different solution by creating a tree of states:

Apart from making it difficult to find names for all of these, and having relatively deeply nested code, this works

If you look at the code however, this causes the code to be much bigger than needed and I’m also not sure yet how it will be possible nicely to move “backwards” one state if that situation ever appears. Also there is still the previous problem, although less often: if I would match on to_skip here by reference (or it was no Copy type), the compiler would prevent me from overwriting have_header for the same reasons as before.

So my question for the end: How are others solving this problem? How do you express your states and write the functions around them to modify the states?

Update

I actually implemented the state handling as a State -> State function before (and forgot about that), which seems conceptually the right thing to do. It however has a couple of other problems. Thanks for the suggestions so far, it seems like I’m not alone with this problem at least.

Update 2

I’ve went a bit closer to the C-style struct definition now, as it makes the code less convoluted and allows me to just get forwards with the code. The current status can be seen here now, which also involves further refactoring (and e.g. some support for metadata).

Writing GStreamer Elements in Rust (Part 2): Don’t panic, we have better assertions now – and other updates

This is part 2, the other parts can be found here: part 1, part 3 and part 4

It’s a while since last article about writing GStreamer plugins in Rust, so here is a short (or not so short?) update of the changes since back then. You might also want to attend my talk at the GStreamer Conference on 10-11 October 2016 in Berlin about this topic.

At this point it’s still only with the same elements as before (HTTP source, file sink and source), but the next step is going to be something more useful (an element processing actual data, parser or demuxer is not decided yet) now that I’m happy with the general infrastructure. You can still find the code in the same place as before on GitHub here, and that’s where also all updates are going to be.

The main sections here will be Error Handling, Threading and Asynchonous IO.

Error Handling & Panics

First of all let’s get started with a rather big change that shows some benefits of Rust over C. There are two types of errors we care about here: expected errors and unexpected errors.

Expected Errors

In GLib based libraries we usually report errors with some kind of boolean return value plus an optional GError that allows to propagate further information about what exactly went wrong to the caller but also to the user. Bindings sometimes convert these directly into exceptions of the target language or whatever construct exists there.

Unfortunately, in GStreamer we use GErrors not very often. Consider for example GstBaseSink (in pseudo-C++/Java/… for simplicity):

For start()/stop() there is just a boolean, for render() there is at least an enum with a few variants. This is for from ideal, so what is additionally required by implementors of those virtual methods is that they post error messages if something goes wrong with further details. Those are propagated out of the normal control flow via the GstBus to the surrounding bins and in the end the application. It would be much nicer if instead we would have GErrors there and make it mandatory for implementors to return one if something goes wrong. These could still be converted to error messages but at a central place then. Something to think about for the next major version of GStreamer.

This is of course only for expected errors, that is, for things where we know that something can go wrong and want to report that.

Rust

In Rust this problem is solved in a similar way, see the huge chapter about error handling in the documentation. You basically return either the successful result, or something very similar to a GError:

Result is the type behind that, and it comes with convenient macros for propagating errors upwards (try!()), chaining multiple failing calls and/or converting errors (map(), and_then(), map_err(), or_else(), etc) and libraries that make defining errors with all the glue code required for combining different errors types from different parts of the code easier.

Similar to Result, there is also Option, which can be Some(x) or None, to signal the possible absence of a value. It works similarly, has similar API, but is generally not meant for error handling. It’s now used instead of GST_CLOCK_TIME_NONE (aka u64::MAX) to signal the absence of e.g. a stop position of a seek, or the absence of a known size of the stream. It’s more explicit then giving a single integer value of all of them a completely different meaning.

How is the different?

The most important difference from my point of view here is, that you must handle errors in one way or another. Otherwise the compiler won’t accept your code. If something can fail, you must explicitly handle this and can’t just silently ignore the possibility of failure. While in C people tend to just ignore error return values and assume that things just went fine.

What’s ErrorMessage and FlowError, what else?

As you probably expect, ErrorMessage maps to the GStreamer concept of error messages and contains exactly the same kind of information. In Rust this is implemented slightly different but in the end results in the same thing. The main difference here is that whenever e.g. start fails, you must provide an error message and can’t just fail silently. That error message can then be used by the caller, and e.g. be posted on the bus (and that’s exactly what happens).

FlowError is basically the negative part (the errors or otherwise non-successful results) of GstFlowReturn:

Similarly, for the actual errors (NotNegotiated and Error), an actual error message must be provided and that then gets used by the caller (and is posted on the bus).

And in the same way, if setting an URI fails we now return a Result<(), UriError>, which then reports the error properly to GStreamer.

In summary, if something goes wrong, we know about that, have to handle/report that and have an error message to post on the bus.

Macros are awesome

As a side-note, creating error messages for GStreamer is not too convenient and they want information like the current source file, line number, function, etc. Like in C, I’ve created a macro to make such an error message. Different to C, macros in Rust are actually awesome though and not just arbitrarily substituting text. Instead they work via pattern matching and allow you to distinguish all kinds of different cases, can be recursive and are somewhat typed (expression vs. statement vs. block of code vs. type name, …).

Unexpected Errors

So this was about expected errors so far, which have to be handled explicitly in Rust but not in C, and for which we have some kind of data structure to pass around. What about the other cases, the errors that will never happen (but usually do sooner or later) because your program would just be completely broken then and all your assumptions wrong, and you wouldn’t know what to do in those cases anyway.

In C with GLib we usually have 3 ways of handling these. 1) Not at all (and crashing, doing something wrong, deadlocking, deleting all your files, …), 2) explicitly asserting what the assumptions in the code are and crashing cleanly otherwise (SIGABRT), or 3) returning some default value from the function but just returning immediately and printing a warning instead of going on.

None of these 3 cases are handleable in any case, which seems fair because they should never happen and if they do we wouldn’t know what to do anyway. 1) is obviously least desirable but the most common, 3) is only slightly better (you get a warning, but usually sooner or later something will crash anyway because you’re in an inconsistent state) and 2) is cleanest. However 2) is nothing you really want either, your application should somehow be able to return back to a clean state if it can (by e.g. storing the current user data, stopping everything and loading up a new UI with the stored user data and some dialog).

Rust

Of course no Rust code should ever run into case 1) above and randomly crash, cause memory corruptions or similar. But of course this will also happen due to bugs in Rust itself, using unsafe code, or code wrapping e.g. a C library. There’s not really anything that can be done about this.

For the other two cases there is however: catching panics. Whenever something goes wrong in unexpected ways, the corresponding Rust code can call the panic!() macro in one way or another. Like via assertions, or when “asserting” that a Result is never the error case by calling unwrap() on it (you don’t have to handle errors but you have to explicitly opt-in to ignore them by calling unwrap()).

What happens from there on is similar to exception handling in other languages (unless you compiled your code so that panics automatically kill the application). The stack gets unwound, everything gets cleaned up on the way, and at some point either everything stops or someone catches that. The boundary for the unwinding is either your main() in Rust, or if the code is called from C, then at that exact point (i.e. for the GStreamer plugins at the point where functions are called from GStreamer).

So what?

At the point where GStreamer calls into the Rust code, we now catch all unwinds that might happen and remember that one happened. This is then converted into a GStreamer error message (so that the application can handle that in a meaningful way) and by remembering that we prevent any further calls into the Rust code and immediately make them error messages too and return.

This allows to keep the inconsistent state inside the element and to allow the application to e.g. remove the element and replace it with something else, restart the pipeline, or do whatever else it wants to do. Assertions are always local to the element and not going to take down the whole application!

Threading

The other major change that happened is that Sink and Source are now single-threaded. There is no reason why the code would have to worry about threading as everything happens in exactly one thread (the streaming thread), except for the setting/getting of the URI (and possibly other “one-time” settings in the future).

To solve that, at the translation layer between C and Rust there is now a (Rust!) wrapper object that handles all the threading (in Rust with Mutexes, which work like the ones in C++, or atomic booleans/integers), stores the URI separately from the Source/Sink and just passes the URI to the start() function. This made the code much cleaner and made it even simpler to write new sources or sinks. No more multi-threading headaches.

I think that we should in general move to such a simpler model in GStreamer and not require a full-fledged, multi-threaded GstElement subclass to be implemented, but instead something more use-case oriented (Source, sink, encoder, decoder, …) that has a single threaded API and hides all the gory details of GstElement. You don’t have to know these in most cases, so you shouldn’t have to know them as is required right now.

Simpler Source/Sink Traits

Overall the two traits look like this now, and that’s all you have to implement for a new source or sink:

Asynchronous IO

The last time I mentioned that a huge missing feature was asynchronous IO, in a composeable way. This has some news now, there’s an abstract implementation for futures and a set of higher-level APIs around mio for doing actual IO, called tokio. Independent of that there’s also futures-cpupool, which allows to call arbitrary calculations as futures on threads of a thread pool.

Recently also the HTTP library Hyper, as used by the HTTP source (and Servo), also got a branch that moves it to tokio for allowing asynchronous IO. Once that is landed, it can relatively easily be used inside the HTTP source for allowing to interrupt HTTP requests at any time.

It seems like this area moves into a very promising direction now, solving my biggest technical concern in a very pleasant way.

Writing GStreamer plugins and elements in Rust

This is part 1, the other parts can be found here: Part 2, part 3 and part 4

This weekend we had the GStreamer Spring Hackfest 2016 in Thessaloniki, my new home town. As usual it was great meeting everybody in person again, having many useful and interesting discussions and seeing what everybody was working on. It seems like everybody was quite productive during these days!

Apart from the usual discussions, cleaning up some of our Bugzilla backlog and getting various patches reviewed, I was working with Luis de Bethencourt on writing a GStreamer plugin with a few elements in Rust. Our goal was to be able to be able to read and write a file, i.e. implement something like the “cp” command around gst-launch-1.0 with just using the new Rust elements, while trying to have as little code written in C as possible and having a more-or-less general Rust API in the end for writing more source and sink elements. That’s all finished, including support for seeking, and I also wrote a small HTTP source.

For the impatient, the code can be found here: https://github.com/sdroege/rsplugin

Why Rust?

Now you might wonder why you would want to go through all the trouble of creating a bridge between GStreamer in C and Rust for writing elements. Other people have written much better texts about the advantages of Rust, which you might want to refer to if you’re interested: The introduction of the Rust documentation, or this free O’Reilly book.

But for myself the main reasons are that

  1. C is a rather antique and inconvenient language if you compare it to more modern languages, and Rust provides a lot of features from higher-level languages while still not introducing all the overhead that is coming with it elsewhere, and
  2. even more important are the safety guarantees of the language, including the strong type system and the borrow checker, which make a whole category of bugs much more unlikely to happen. And thus saves you time during development but also saves your users from having their applications crash on them in the best case, or their precious data being lost or stolen.

Rust is not the panacea for everything, and not even the perfect programming language for every problem, but I believe it has a lot of potential in various areas, including multimedia software where you have to handle lots of complex formats coming from untrusted sources and still need to provide high performance.

I’m not going to write a lot about the details of the language, for that just refer to the website and very well written documentation. But, although being a very young language not developed by a Fortune 500 company (it is developed by Mozilla and many volunteers), it is nowadays being used in production already at places like Dropbox or Firefox (their MP4 demuxer, and in the near future the URL parser). It is also used by Mozilla and Samsung for their experimental, next-generation browser engine Servo.

The Code

Now let’s talk a bit about how it all looks like. Apart from Rust’s standard library (for all the basics and file IO), what we also use are the url crate (Rust’s term for libraries) for parsing and constructing URLs, and the HTTP server/client crate called hyper.

On the C side we have all the boilerplate code for initializing a GStreamer plugin (plugin.c), which then directly calls into Rust code (lib.rs), which then calls back into C (plugin.c) for registering the actual GStreamer elements. The GStreamer elements themselves have then an implementation written in C (rssource.c and rssink.c), which is a normal GObject subclass of GstBaseSrc or GstBaseSink but instead of doing the actual work in C it is just calling into Rust code again. For that to work, some metadata is passed to the GObject class registration, including a function pointer to a Rust function that creates a new instance of the “machinery” of the element. This is then implementing the Source or Sink traits (similar to interfaces) in Rust (rssource.rs and rssink.rs):

And these traits (plus a constructor) are in the end all that has to be implemented in Rust for the elements (rsfilesrc.rs, rsfilesink.rs and rshttpsrc.rs).

If you look at the code, it’s all still a bit rough at the edges and missing many features (like actual error reporting back to GStreamer instead of printing to stderr), but it already works and the actual implementations of the elements in Rust is rather simple and fun. And even the interfacing with C code is quite convenient at the Rust level.

How to test it?

First of all you need to get Rust and Cargo, check the Rust website or your Linux distribution for details. This was all tested with the stable 1.8 release. And you need GStreamer plus the development files, any recent 1.x version should work.

What next?

The three implemented elements are not too interesting and were mostly an experiment to see how far we can get in a weekend. But the HTTP source for example, once more features are implemented, could become useful in the long term.

Also, in my opinion, it would make sense to consider using Rust for some categories of elements like parsers, demuxers and muxers, as traditionally these elements are rather complicated and have the biggest exposure to arbitrary data coming from untrusted sources.

And maybe in the very long term, GStreamer or parts of it can be rewritten in Rust. But that’s a lot of effort, so let’s go step by step to see if it’s actually worthwhile and build some useful things on the way there already.

For myself, the next step is going to be to implement something like GStreamer’s caps system in Rust (of which I already have the start of an implementation), which will also be necessary for any elements that handle specific data and not just an arbitrary stream of bytes, and it could probably be also useful for other applications independent of GStreamer.

Issues

The main problem with the current code is that all IO is synchronous. That is, if opening the file, creating a connection, reading data from the network, etc. takes a bit longer it will block until a timeout has happened or the operation finished in one way or another.

Rust currently has no support for non-blocking IO in its standard library, and also no support for asynchronous IO. The latter is being discussed in this RFC though, but it probably will take a while until we see actual results.

While there are libraries for all of this, having to depend on an external library for this is not great as code using different async IO libraries won’t compose well. Without this, Rust is still missing one big and important feature, which will definitely be needed for many applications and the lack of it might hinder adoption of the language.