Exporting a GObject C API from Rust code and using it from C, Python, JavaScript and others

Over the last few days I have been experimenting a bit with implementing a GObject C API in Rust. The results can be found in this repository, and this post is something like an overview of the work, a code walkthrough and a status report. Note that it is quite long: a little further down you can find a table of contents and jump to the area you’re interested in, or you can read it chapter by chapter.

GObject is a C library that allows writing object-oriented, cross-platform APIs in C (which has no built-in support for that), and provides a very expressive runtime type system with many features known from languages like Java, C# or C++. It is also used by various C libraries, most notably the cross-platform GTK UI toolkit and the GStreamer multimedia framework. GObject also comes with strong conventions about how an API is supposed to look and behave, which makes it relatively easy to learn new GObject-based APIs compared to generic C libraries that could do anything unexpected.

I’m not going to give a full overview about how GObject works internally and how it is used. If you’re not familiar with that it might be useful to first read the documentation and especially the tutorial. Also some C & Rust (especially unsafe Rust and FFI) knowledge would be good to have for what follows.

If you look at the code, you will notice that there is a lot of unsafe code, boilerplate and glue code, and especially code duplication. I don’t expect anyone to write all this code manually, and the final goal of all this is to have Rust macros that make life easier. Simple Rust macros that make it as easy as in C are almost trivial to write, but what we really want here is to be able to write it all in safe Rust only, in code that looks a bit like C# or Java. There is a prototype for that already written by Niko Matsakis, and a blog post with further details about it. The goal for this code is to work as a manual example that can be integrated one step at a time into the macro-based solution. Code written with that macro should in the end look similar to the following

and be usable like

The code in my repository is already integrated well into GTK-rs, but the macro generated code should also be integrated well into GTK-rs and work the same as other GTK-rs code from Rust. In addition the generated code should of course make use of all the type FFI conversion infrastructure that already exists in there and was explained by Federico in his blog post (part 1, part 2).
In the end, I would like to see such a macro solution integrated directly into the GLib bindings.

Table of Contents

  1. Why?
  2. Simple (boxed) types
  3. Object types
    1. Inheritance
    2. Virtual Methods
    3. Properties
    4. Signals
  4. Interfaces
  5. Usage from C
  6. Usage from Rust
  7. Usage from Python, JavaScript and Others
  8. What next?

Why?

Now one might ask why? GObject is yet another C library and Rust can export a plain C API without any other dependencies just fine. While that is true, C is not very expressive at all and there are no conventions about how C APIs should look and behave, so everybody does their own thing. With GObject you get all kinds of object-oriented programming features and strong conventions about API design. You also get a couple of features (inheritance, properties/signals, a full runtime type system) that Rust does not have. And as a bonus, you get bindings for various other languages (Python, JavaScript, C++, C#, …) for free. More on the last point later.

Another reason why you might want to do this is to be able to interact with existing C libraries that use GObject. For example, you might want to create a subclass of some GTK widget to give it your own custom behaviour or modify its appearance, write a completely new GTK widget that should be placed together with other widgets in your UI, or implement a new GStreamer element that provides some fancy filter or codec.

Simple (boxed) types

Let’s start with the simple and boring case, which already introduces various GObject concepts. Let’s assume you already have some simple Rust type that you want to expose a C API for, and it should be GObject-style to get all the above advantages. For that, GObject has the concept of boxed types. These have to come with a “copy” and a “free” function, which can do an actual copy of the object or just implement reference counting. GObject allows registering these together with a string name for the type and then gives back a type ID (GType) that allows referencing this type.

Boxed types can then be used automatically, together with any C API they provide, from C and any other language for which GObject support exists (i.e. basically all). Instances of these boxed types can be used in signals and properties (see further below), stored in a GValue (a container type that can hold an instance of any other type together with its type ID), etc.

So how does all this work? In my repository I’m implementing a boxed type around an Option<String>, once as a “copy” type (RString) and once reference counted (SharedRString). Outside Rust, both are just passed as pointers and their implementation is private/opaque. As such, it is possible to use any kind of Rust struct or enum, and marking them as #[repr(C)] is not needed. It is also possible to use #[repr(C)] structs though, in which case the memory layout could be public and any struct fields could be available from C and other languages.

RString

The actual implementation of the type is in the imp.rs file, i.e. in the imp module. I’ll cover the other files in there at a later time, but mod.rs is providing a public Rust API around all this that integrates with GTK-rs.

The following is the whole implementation, in safe Rust:

Type Registration

Once the macro-based solution is complete, this would be more or less all that would be required to also make this available to C via GObject, and to any other languages. But we’re not there yet, and the goal here is to do it all manually. So first of all, we need to register this type with GObject, for which (by convention) a C function called ex_rstring_get_type() should be defined. It registers the type on the first call to get the type ID, and on further calls just returns that type ID. If you’re wondering what ex is: this is the “namespace” (C has no built-in support for namespaces) of the whole library, short for “example”. The get_type() function looks like this:

This is all unsafe Rust, calling directly into the GObject C library. We use std::sync::Once for the one-time registration of the type, and store the result in a static mut called TYPE (super unsafe, but OK here as we only ever write to it once). For registration we call g_boxed_type_register_static() from GObject (provided to Rust via the gobject-sys crate) and provide the name (via std::ffi::CString for C interoperability) and the copy and free functions. Unfortunately we have to cast them to a generic pointer and then transmute them to a different function pointer type, as the argument and return value pointers that GObject wants there are plain void * pointers, but in our code we would at least like to use RString *. And that’s all there is to the registration. We mark the whole function as extern "C" to use the C calling conventions, and use #[no_mangle] so that the function is exported with exactly that symbol name (otherwise Rust does symbol name mangling). Lastly, we make sure that no panic unwinding happens from this Rust code back to the C code via the callback_guard!() macro from the glib crate.
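To make the one-time registration pattern concrete without pulling in GLib, here is a minimal stand-alone sketch. It is not the code from the repository: DemoType, demo_register_type() and ex_demo_get_type() are hypothetical stand-ins for GType, g_boxed_type_register_static() and the real get_type() function, and the #[no_mangle] attribute is left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

/// Stand-in for GObject's `GType` (assumption: a plain integer ID is enough here).
pub type DemoType = usize;

/// Counts how often registration ran, only so the behaviour can be observed.
pub static REGISTRATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical replacement for `g_boxed_type_register_static()`.
fn demo_register_type(_name: &str) -> DemoType {
    // Hand out 1, 2, 3, ... so that 0 can mean "not registered yet".
    REGISTRATIONS.fetch_add(1, Ordering::SeqCst) + 1
}

/// Registers the type on the first call and returns the cached ID afterwards.
/// A real export would additionally be marked `#[no_mangle]`.
pub extern "C" fn ex_demo_get_type() -> DemoType {
    static ONCE: Once = Once::new();
    static mut TYPE: DemoType = 0;

    ONCE.call_once(|| {
        let id = demo_register_type("ExDemo");
        // Written exactly once, guarded by `ONCE`.
        unsafe { TYPE = id };
    });

    // Plain read of the value; `call_once` guarantees the write happened before.
    unsafe { TYPE }
}
```

Calling ex_demo_get_type() any number of times returns the same ID, and the registration function runs only once.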

Memory Management Functions

Now let’s take a look at the actual copy and free functions, and the actual constructor function called ex_rstring_new():

These are also unsafe Rust functions that work with raw pointers and C types, but fortunately not too much is happening here.

In the constructor function we get a C string (char *) passed as argument, convert this to a Rust string (actually an Option<String>, as this can be NULL) via from_glib_none() from the glib crate and then pass that to the Rust constructor of our type. from_glib_none() means that we don’t take ownership of the C string passed to us; the other variant would be from_glib_full(), in which case we would take ownership. We then pack up the result in a Rust Box to place the new RString in heap-allocated memory (otherwise it would be stack allocated), and use Box’s into_raw() function to get a raw pointer to the memory and not have its Drop implementation called anymore. This is then returned to the caller.

Similarly, in the copy and free functions we just do some juggling with Boxes: copy takes a raw pointer to our RString, calls the compiler-generated clone() function to copy it all, and then packs it up in a new Box to return to the caller. The free function converts the raw pointer back to a Box, and then lets the Drop implementation of Box take care of freeing all memory related to it.
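The Box juggling described above can be sketched with the standard library alone. This is an illustration and not the repository code: the demo_ function names are made up, std::ffi::CStr stands in for glib’s from_glib_none(), and #[no_mangle] is omitted.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// A small "copy"-style type wrapping an optional string.
#[derive(Clone, Debug, PartialEq)]
pub struct RString(pub Option<String>);

/// Constructor: borrows the C string (no ownership taken) and returns a heap pointer.
pub unsafe extern "C" fn demo_rstring_new(s: *const c_char) -> *mut RString {
    let value = if s.is_null() {
        None
    } else {
        Some(unsafe { CStr::from_ptr(s) }.to_string_lossy().into_owned())
    };
    // `into_raw` hands the allocation to the caller; `Drop` will not run for it.
    Box::into_raw(Box::new(RString(value)))
}

/// Copy: clones the pointed-to value into a fresh Box.
pub unsafe extern "C" fn demo_rstring_copy(ptr: *const RString) -> *mut RString {
    let original = unsafe { &*ptr };
    Box::into_raw(Box::new(original.clone()))
}

/// Free: turns the raw pointer back into a Box and lets it drop.
pub unsafe extern "C" fn demo_rstring_free(ptr: *mut RString) {
    drop(unsafe { Box::from_raw(ptr) });
}
```

Every pointer returned by the constructor or the copy function has to be passed to the free function exactly once.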

Actual Functionality

The two remaining functions are C wrappers for the get() and set() Rust functions:

These only call the corresponding Rust functions. The set() function again uses glib’s from_glib_none() to convert from a C string to a Rust string. The get() function uses ToGlibPtrFull::to_glib_full() from GLib to convert from a Rust string (Option<String> to be accurate) to a C string, while passing ownership of the C string to the caller (which then also has to free it at a later time).
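The ownership rules for the two wrappers can be illustrated as follows. This is a sketch under assumptions: it uses std::ffi::CString instead of the glib conversion traits, so the returned string must be released with the hypothetical demo_string_free() shown here rather than with g_free(), and all demo_ names are invented for the example.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

pub struct RString(pub Option<String>);

impl RString {
    fn get(&self) -> Option<String> {
        self.0.clone()
    }

    fn set(&mut self, value: Option<String>) {
        self.0 = value;
    }
}

/// Setter: the C string is only borrowed ("transfer none").
pub unsafe extern "C" fn demo_rstring_set(this: *mut RString, s: *const c_char) {
    let this = unsafe { &mut *this };
    let value = if s.is_null() {
        None
    } else {
        Some(unsafe { CStr::from_ptr(s) }.to_string_lossy().into_owned())
    };
    this.set(value);
}

/// Getter: returns a newly allocated C string owned by the caller ("transfer full"),
/// or NULL if no string is set.
pub unsafe extern "C" fn demo_rstring_get(this: *const RString) -> *mut c_char {
    let this = unsafe { &*this };
    match this.get() {
        // Strings with interior NUL bytes cannot be represented; return NULL for them.
        Some(s) => CString::new(s)
            .map(CString::into_raw)
            .unwrap_or(ptr::null_mut()),
        None => ptr::null_mut(),
    }
}

/// Releases a string returned by `demo_rstring_get`.
pub unsafe extern "C" fn demo_string_free(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}
```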

This was all quite verbose, which is why a macro based solution for all this would be very helpful.

Corresponding C Header

If this API were to be used from C, the header file would look something like this. Probably no surprises here.

Ideally this would also be autogenerated from the Rust code in one way or another, maybe via rusty-cheddar or rusty-binder.

SharedRString

The shared, reference counted, RString works basically the same. The only differences are in how the pointers between C and Rust are converted. For this, let’s take a look at the constructor, copy (aka ref) and free (aka unref) functions again:

The only difference here is that instead of a Box, std::sync::Arc is used, along with some changes in the copy (aka ref) function. Previously with the Box, we were just creating an immutable reference from the raw pointer and cloning it, but with the Arc we want to clone the Arc itself (i.e. have the same underlying object but increase the reference count). For this we use Arc::from_raw() to get back an Arc, and then clone the Arc. If we didn’t do anything else, at the end of the function our original Arc would get its Drop implementation called and the reference count decreased, defeating the whole point of the function. To prevent that, we convert the original Arc to a raw pointer again and “leak” it. That is, we don’t destroy the reference owned by the caller, which would cause double free problems later.
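A minimal sketch of the ref/unref juggling with std::sync::Arc follows. The demo_ function names are hypothetical, and demo_shared_count() exists only to make the reference count observable.

```rust
use std::mem::ManuallyDrop;
use std::sync::Arc;

pub struct SharedRString(pub Option<String>);

/// Constructor: the returned pointer owns one reference.
pub extern "C" fn demo_shared_new() -> *const SharedRString {
    Arc::into_raw(Arc::new(SharedRString(Some("demo".to_owned()))))
}

/// Ref: increases the reference count and returns the same pointer.
pub unsafe extern "C" fn demo_shared_ref(ptr: *const SharedRString) -> *const SharedRString {
    let original = unsafe { Arc::from_raw(ptr) };
    let copy = Arc::clone(&original);
    // "Leak" the original again so the caller's reference is not dropped here.
    let _ = Arc::into_raw(original);
    Arc::into_raw(copy)
}

/// Unref: drops one reference; the value is freed when the last one is gone.
pub unsafe extern "C" fn demo_shared_unref(ptr: *const SharedRString) {
    drop(unsafe { Arc::from_raw(ptr) });
}

/// Helper for observing the reference count without changing it.
pub unsafe fn demo_shared_count(ptr: *const SharedRString) -> usize {
    let arc = ManuallyDrop::new(unsafe { Arc::from_raw(ptr) });
    Arc::strong_count(&arc)
}
```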

Apart from this, everything is really the same. And also the C header looks basically the same.

Object types

Now let’s start with the more interesting part: actual subclasses of GObject with all the features you know from object-oriented languages. Everything up to here was only warm-up, even if useful by itself already to expose normal Rust types to C with a slightly more expressive API.

In GObject, subclasses of the GObject base class (think of Object in Java or C#, the most basic type from which everything inherits) all get the following main features from the base class: reference counting, inheritance, virtual methods, properties, signals. Similarly to boxed types, some functions and structs are registered at runtime with the GObject library to get back a type ID, but it is slightly more involved. And our structs must be #[repr(C)] and be structured in a very specific way.

Struct Definitions

Every GObject subclass has two structs: 1) an instance struct that defines the memory layout of every instance and could contain public fields, and 2) a class struct that stores the class-specific data; the instance struct contains a pointer to it. The class struct is more or less what the vtable is in C++, i.e. the place where virtual methods are stored, but in GObject it can also contain fields, for example. We define a new type Foo that inherits from GObject.

The first element of the structs must be the corresponding struct of the class we inherit from. This later allows casting pointers of our subclass to pointers of the base class, and re-using all API implemented for the base class. In our example here we don’t define any public fields or virtual methods; the version in the repository has them, but we will get to that later.
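The layout rule can be illustrated with plain #[repr(C)] structs. DemoObject and DemoObjectClass below are simplified stand-ins for the real GObject and GObjectClass structs from gobject-sys, which have more fields; only the first-field principle is the point here.

```rust
/// Simplified stand-in for the `GObject` instance struct.
#[repr(C)]
pub struct DemoObject {
    pub ref_count: u32,
}

/// Simplified stand-in for the `GObjectClass` class struct.
#[repr(C)]
pub struct DemoObjectClass {
    pub finalize: Option<unsafe extern "C" fn(*mut DemoObject)>,
}

/// Instance struct of the subclass: the parent struct comes first.
#[repr(C)]
pub struct Foo {
    pub parent: DemoObject,
}

/// Class struct of the subclass: the parent class struct comes first.
#[repr(C)]
pub struct FooClass {
    pub parent_class: DemoObjectClass,
}

/// API of the base class, written against base class pointers only.
pub unsafe extern "C" fn demo_object_ref(obj: *mut DemoObject) -> *mut DemoObject {
    let object = unsafe { &mut *obj };
    object.ref_count += 1;
    obj
}
```

Because of #[repr(C)], a pointer to a Foo is also a valid pointer to its DemoObject parent, so demo_object_ref(foo_ptr as *mut DemoObject) works on subclass instances.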

Now we will actually need to be able to store some state with our objects, but we want to have that state private. For that we define another struct, a plain Rust struct this time

This uses RefCell for each field, as in GObject modifications of objects are all done conceptually via interior mutability. For a thread-safe object these would have to be Mutex instead.
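As an illustration of this interior mutability, a private struct along these lines could look as follows. The field names (a name and a counter) are taken from the later parts of this post, but the exact methods are an assumption for the sake of the example.

```rust
use std::cell::RefCell;

/// Private state of an object; every field is mutated through `&self`.
pub struct FooPrivate {
    name: RefCell<Option<String>>,
    counter: RefCell<i32>,
}

impl FooPrivate {
    pub fn new() -> Self {
        FooPrivate {
            name: RefCell::new(None),
            counter: RefCell::new(0),
        }
    }

    /// Adds `inc` to the counter and returns the new value.
    pub fn increment(&self, inc: i32) -> i32 {
        let mut counter = self.counter.borrow_mut();
        *counter += inc;
        *counter
    }

    pub fn counter(&self) -> i32 {
        *self.counter.borrow()
    }

    pub fn set_name(&self, name: Option<String>) {
        *self.name.borrow_mut() = name;
    }

    pub fn name(&self) -> Option<String> {
        self.name.borrow().clone()
    }
}
```

Note that all methods take &self, never &mut self. For a thread-safe variant the RefCell fields would become Mutex fields, with lock() replacing borrow() and borrow_mut().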

Type Registration

In the end we glue all this together and register it to the GObject type system via a get_type() function, similar to the one for boxed types

The main difference here is that we call g_type_register_static(), which takes a struct as parameter that contains all the information about our new subclass. In that struct we provide sizes of the class and instance struct (GObject is allocating them for us), various uninteresting fields for now and two function pointers: 1) class_init for initializing the class struct as allocated by GObject (here we would also override virtual methods, define signals or properties for example) and 2) instance_init to do the same with the instance struct. Both structs are zero-initialized in the parts we defined, and the parent parts of both structs are initialized by the code for the parent class already.

Struct Initialization

These two functions look like the following for us (the versions in the repository already do more things)

During class initialization, we tell GObject about the size of our private struct but we actually wrap it into an Option. This allows us to later replace it simply with None to deallocate all memory related to it. During instance initialization this private struct is already allocated for us by GObject (and zero-initialized), so we simply get a raw pointer to it via g_type_instance_get_private() and write an initialized struct to that pointer. Raw pointers must be used here so that the Drop implementation of Option is not called for the old, zero-initialized memory when replacing the struct.

As you might’ve noticed, we currently never set the private struct to None to release the memory, effectively leaking memory, but we get to that later when talking about virtual methods.
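The handling of the Option-wrapped private struct can be sketched in isolation. Here std::alloc::alloc_zeroed() plays the role of GObject handing us zero-initialized memory, and demo_instance_init() / demo_finalize() are hypothetical names; the take() half is what the finalize implementation discussed further below does.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::cell::RefCell;
use std::ptr;

pub struct FooPrivate {
    pub counter: RefCell<i32>,
}

/// Initializes the zeroed slot. `ptr::write` does not drop the old contents,
/// which matters because zeroed memory need not be a valid `Option<FooPrivate>`.
pub unsafe fn demo_instance_init(slot: *mut Option<FooPrivate>) {
    unsafe {
        ptr::write(
            slot,
            Some(FooPrivate {
                counter: RefCell::new(0),
            }),
        )
    };
}

/// Releases everything owned by the private struct, leaving `None` behind.
pub unsafe fn demo_finalize(slot: *mut Option<FooPrivate>) {
    let slot = unsafe { &mut *slot };
    drop(slot.take());
}

/// Runs the whole life cycle once and reports (initialized, released).
pub fn demo_roundtrip() -> (bool, bool) {
    let layout = Layout::new::<Option<FooPrivate>>();
    unsafe {
        // Stand-in for the zero-initialized memory provided by GObject.
        let slot = alloc_zeroed(layout) as *mut Option<FooPrivate>;
        assert!(!slot.is_null());
        demo_instance_init(slot);
        let initialized = (&*slot).is_some();
        demo_finalize(slot);
        let released = (&*slot).is_none();
        dealloc(slot as *mut u8, layout);
        (initialized, released)
    }
}
```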

Constructor

With what we have so far, it’s already possible to create new instances of our subclass, and for that we also define a constructor function now

There is probably not much that has to be explained here: we only tell GObject to allocate a new instance of our specific type (by providing the type ID), which then causes the memory to be allocated and our initialization functions to be called. class_init is called only the very first time an instance is created, while instance_init is called for every new instance.

Methods

All this would be rather boring at this point because there is no way to actually do something with our object, so various functions are defined to work with the private data. For example to get the value of the counter

This gets the private struct from GObject (get_priv() is a helper function that does the same as we did in instance_init), and then calls a safe Rust function implemented on our struct to actually get the value. Notable here is that we don’t pass &self to the function, but something called FooWrapper. This is a GTK-rs style wrapper type that directly allows to use any API implemented on parent classes and provides various other functionality. It is defined in mod.rs but we will talk about that later.

Inheritance

GObject allows single-inheritance from a base class, similar to Java and C#. All behaviour of the base class is inherited, and API of the base class can be used on the subclass.

I briefly hinted at how that works above already: 1) the instance and class structs have the parent class’ structs as their first field, so casting to pointers of the parent class works just fine, 2) GObject is told what the parent class is in the call to g_type_register_static(). We did that above already, as we inherited from GObject.

By inheriting from GObject, we e.g. can call g_object_ref() to do reference counting, or any of the other GObject API. Also it allows the Rust wrapper type defined in mod.rs to provide appropriate API for the base class to us without any casts, and to do memory management automatically. How that works is probably going to be explained in one of the following blog posts on Federico’s blog.

In the example repository, there is also another type defined which inherits from our type Foo, called Bar. It’s basically the same code again, except for the name and parent type.

Virtual Methods

Overriding Virtual Methods

Inheritance alone is already useful for reducing code duplication, but to make it really useful virtual methods are needed so that behaviour can be adjusted. In GObject this works similarly to how it’s done in e.g. C++, just manually: you place function pointers to the virtual method implementations into the class struct and then call those. As every subclass has its own copy of the class struct (initialized with the values from the parent class), it can override these with whatever function it wants. And as it’s possible to get the actual class struct of the parent class, it is possible to chain up to the implementation of the virtual function of the parent class. Let’s look at the example of the GObject::finalize virtual method, which is called at the very end when the object is to be destroyed and which should free all memory. In there we will free our private data struct with the RefCells.

As a first step, we need to override the function pointer in the class struct in our class_init function and replace it with another function that implements the behaviour we want

This new function could call into a safe Rust implementation, like it’s done for other virtual methods (see a bit later), but for finalize we have to do manual memory management and that’s all unsafe Rust. The way we free the memory here is by take()ing the Some value out of the Option that contains our private struct, and then letting it be dropped. Afterwards we have to chain up to the parent class’ implementation of finalize, which is done by calling map() on the Option that contains the function pointer.

All the function pointers in glib-sys and related crates are stored in Options, so that both a NULL function pointer and an actual function pointer can be represented.

Now for chaining up to the parent class’ finalize implementation, there’s a static, global variable containing a pointer to the parent class’ class struct, called PRIV. This is also initialized in the class_init function

While this is a static mut global variable, this is fine as it’s only ever written to once from class_init, and can only ever be accessed after class_init is done.
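The override and chain-up mechanics can be sketched without GObject. Note the deliberate deviation from the text: instead of a static mut, this sketch stores the parent’s function pointer in a std::sync::OnceLock, and if let takes the place of map() on the Option. DemoObject, DemoObjectClass and the function names are hypothetical.

```rust
use std::sync::OnceLock;

/// Stand-in instance type that records which finalize implementations ran.
pub struct DemoObject {
    pub log: Vec<&'static str>,
}

pub type FinalizeFn = unsafe extern "C" fn(*mut DemoObject);

/// Stand-in class struct holding a single virtual method.
#[repr(C)]
pub struct DemoObjectClass {
    pub finalize: Option<FinalizeFn>,
}

/// The parent's finalize implementation, remembered once in class_init.
static PARENT_FINALIZE: OnceLock<Option<FinalizeFn>> = OnceLock::new();

/// Finalize implementation of the (stand-in) base class.
pub unsafe extern "C" fn base_finalize(obj: *mut DemoObject) {
    let object = unsafe { &mut *obj };
    object.log.push("base");
}

/// Our override: do our own cleanup first, then chain up to the parent.
unsafe extern "C" fn foo_finalize(obj: *mut DemoObject) {
    {
        let object = unsafe { &mut *obj };
        object.log.push("foo");
    }
    if let Some(parent_finalize) = PARENT_FINALIZE.get().copied().flatten() {
        unsafe { parent_finalize(obj) };
    }
}

/// `klass` starts out as a copy of the parent's class struct.
pub fn foo_class_init(klass: &mut DemoObjectClass) {
    // Remember the parent's implementation before replacing it.
    let _ = PARENT_FINALIZE.set(klass.finalize);
    klass.finalize = Some(foo_finalize);
}
```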

Defining New Virtual Methods

For defining new virtual methods, we would add a corresponding function pointer to the class struct and optionally initialize it to a default implementation in the class_init function, or otherwise keep it at NULL/None.

The trampoline function provided here is responsible for converting from the C types to the Rust types, and then calling a safe Rust implementation of the virtual method.

To make it possible to call these virtual methods from the outside, a C function has to be defined again similar to the ones for non-virtual methods. Instead of calling the Rust implementation directly, this gets the class struct of the type that is passed in and then calls the function pointer for the virtual method implementation of that specific type.

Subclasses would override this default implementation (or provide an actual implementation) exactly the same way, and also chain up to the parent class’ implementation like we saw before for GObject::finalize.
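Put together, defining, dispatching and overriding a new virtual method could look roughly like this. It is a simplified sketch: the class structs are plain statics instead of being allocated and initialized by GObject, the instance struct stores the class pointer directly, and the doubling behaviour of the subclass is invented purely to make the override visible.

```rust
/// Instance struct; the first field points to the class struct of the actual type.
#[repr(C)]
pub struct Foo {
    pub klass: *const FooClass,
    pub counter: i32,
}

/// Class struct with one virtual method.
#[repr(C)]
pub struct FooClass {
    pub increment: Option<unsafe extern "C" fn(*mut Foo, i32) -> i32>,
}

/// Safe Rust implementation of the default behaviour.
fn foo_increment_impl(this: &mut Foo, inc: i32) -> i32 {
    this.counter += inc;
    this.counter
}

/// Trampoline: converts from the C types and calls the safe implementation.
unsafe extern "C" fn foo_increment_trampoline(this: *mut Foo, inc: i32) -> i32 {
    let this = unsafe { &mut *this };
    foo_increment_impl(this, inc)
}

/// Override of a subclass: doubles the increment, then chains up.
unsafe extern "C" fn bar_increment_trampoline(this: *mut Foo, inc: i32) -> i32 {
    unsafe { foo_increment_trampoline(this, inc * 2) }
}

pub static FOO_CLASS: FooClass = FooClass {
    increment: Some(foo_increment_trampoline),
};

pub static BAR_CLASS: FooClass = FooClass {
    increment: Some(bar_increment_trampoline),
};

/// Public C function: looks up the class struct of the instance and
/// calls whatever implementation that specific type provides.
pub unsafe extern "C" fn demo_foo_increment(this: *mut Foo, inc: i32) -> i32 {
    let klass = unsafe { &*(*this).klass };
    match klass.increment {
        Some(increment) => unsafe { increment(this, inc) },
        None => 0,
    }
}
```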

Properties

Similar to Objective-C and C#, GObject has support for properties. These are registered per type, have some metadata attached to them (property type, name, description, writability, valid value range, etc) and subclasses are inheriting them and can override them. The main difference between properties and struct fields is that setting/getting the property values is executing some code instead of just pointing at a memory location, and you can connect a callback to the property to be notified whenever its value changes. And they can be queried at runtime from a specific type, and set/get via their string names instead of actual C API. Allowed types for properties are everything that has a GObject type ID assigned, including all GObject subclasses, many fundamental types (integers, strings, …) and boxed types like our RString and SharedRString above.

Defining Properties

To define a property, we have to register it in the class_init function and also implement the GObject::get_property() and GObject::set_property() virtual methods (or only one of them for read-only / write-only properties). Internally inside the implementation of our GObject, the properties are identified by an integer index for which we define a simple enum, and when registered we get back a GParamSpec pointer that we should also store (for notifying about property changes for example).

In class_init we then override the two virtual methods and register a new property, by providing the name, type, value of our enum corresponding to that property, default value and various other metadata. We then store the GParamSpec related to the property in a Vec, indexed by the enum value. In our example we add a string-typed “name” property that is readable and writable, but can only ever be written to during object construction.

Afterwards we define the trampoline implementations for the set_property and get_property virtual methods.

In there we decide based on the index which property is meant, and then convert from/to the GValue container provided by GObject, and then call into safe Rust getters/setters.
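The dispatch by property index can be illustrated with a small stand-alone sketch. DemoValue is a hypothetical stand-in for GValue, and the read-only “counter” property is an assumption added to show how a property without a setter would be rejected. Property IDs start at 1, as they do in GObject.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for `GValue`.
#[derive(Clone, Debug, PartialEq)]
pub enum DemoValue {
    Str(Option<String>),
    Int(i32),
}

/// Integer indices identifying the properties inside the implementation.
#[repr(u32)]
#[derive(Clone, Copy)]
pub enum Property {
    Name = 1,
    Counter = 2,
}

pub struct Foo {
    name: RefCell<Option<String>>,
    counter: RefCell<i32>,
}

impl Foo {
    pub fn new() -> Self {
        Foo {
            name: RefCell::new(None),
            counter: RefCell::new(0),
        }
    }
}

/// Returns false for unknown IDs, read-only properties and mismatched value types.
pub fn demo_set_property(foo: &Foo, id: u32, value: &DemoValue) -> bool {
    match value {
        DemoValue::Str(s) if id == Property::Name as u32 => {
            *foo.name.borrow_mut() = s.clone();
            true
        }
        _ => false,
    }
}

/// Returns None for unknown property IDs.
pub fn demo_get_property(foo: &Foo, id: u32) -> Option<DemoValue> {
    if id == Property::Name as u32 {
        Some(DemoValue::Str(foo.name.borrow().clone()))
    } else if id == Property::Counter as u32 {
        Some(DemoValue::Int(*foo.counter.borrow()))
    } else {
        None
    }
}
```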

This property can now be used via the GObject API, e.g. its value can be retrieved via g_object_get(obj, "name", &pointer_to_a_char_pointer, NULL) in C.

Construct Properties

The property we defined above had one special feature: it can only ever be set during object construction. Similarly, every property that is writable can also be set during object construction. This works by providing a value to g_object_new() in the constructor function, which then causes GObject to pass this to our set_property() implementation.

Signals

GObject also supports signals. These are similar to events in e.g. C#, Qt or the C++ Boost signals library, and not to be confused with UNIX signals. GObject signals allow you to connect a callback that is called every time a specific event happens.

Signal Registration

Similarly to properties, these are registered in class_init together with various metadata, can be queried at runtime and are usually used by string name. Notification about property changes is implemented with signals, the GObject::notify signal.

Also similarly to properties, internally in our implementation the signals are identified by an integer index. We also store these globally, indexed by a simple enum.

In class_init we then register the signal for our type. For that we provide a name, the parameters of the signal (anything that can be stored in a GValue can be used for this again), the return value (we don’t have one here) and various other metadata. GObject then tells us the ID of the signal, which we store in our vector. In our case we define a signal named “incremented”, that is emitted every time the internal counter of the object is incremented and provides the current value of the counter and by how much it was incremented.

One special part here is the class_offset. GObject allows you to (optionally) define a default class handler for the signal. This is always called when the signal is emitted, and is usually a virtual method that can be overridden by subclasses. During signal registration, the offset in bytes to the function pointer of that virtual method inside the class struct is provided.

This is all exactly the same as for virtual methods, just that it will be automatically called when the signal is emitted.

Signal Emission

For emitting the signal, we have to provide the instance and the arguments in an array as GValues, and then emit the signal by the ID we got back during signal registration.

While all parameters to the signal are provided as a GValue here, GObject calls our default class handler and other C callbacks connected to the signal with the corresponding C types directly. The conversion is done inside GObject and then the corresponding function is called via libffi. It is also possible to directly get the array of GValues instead though, by using the GClosure API, for which there are also Rust bindings.
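Conceptually, a signal is a list of connected callbacks plus an optional default class handler. The following toy model shows that idea only: it passes the arguments as a slice of integers instead of GValues, does no libffi marshalling, and runs the class handler after the connected callbacks (as GObject does for signals registered with G_SIGNAL_RUN_LAST; which flag the repository uses is not assumed here). DemoSignal and its methods are hypothetical.

```rust
use std::cell::RefCell;

/// A connected callback; the slice stands in for the array of GValues.
pub type Handler = Box<dyn Fn(&[i32])>;

pub struct DemoSignal {
    pub name: &'static str,
    class_handler: Option<Handler>,
    handlers: RefCell<Vec<Handler>>,
}

impl DemoSignal {
    pub fn new(name: &'static str, class_handler: Option<Handler>) -> Self {
        DemoSignal {
            name,
            class_handler,
            handlers: RefCell::new(Vec::new()),
        }
    }

    /// Connects a callback and returns its index as a handler ID.
    pub fn connect(&self, handler: Handler) -> usize {
        let mut handlers = self.handlers.borrow_mut();
        handlers.push(handler);
        handlers.len() - 1
    }

    /// Calls all connected callbacks, then the default class handler.
    /// Returns the number of handlers that were invoked.
    pub fn emit(&self, args: &[i32]) -> usize {
        let handlers = self.handlers.borrow();
        for handler in handlers.iter() {
            handler(args);
        }
        let mut invoked = handlers.len();
        if let Some(class_handler) = &self.class_handler {
            class_handler(args);
            invoked += 1;
        }
        invoked
    }
}
```

For an “incremented” signal, emit(&[5, 2]) would correspond to reporting a counter value of 5 after an increment of 2.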

Connecting to the signal can now be done via e.g. g_object_connect() from C.

C header

Similarly to the boxed types, we also have to define a C header for the exported GObject C API. This ideally would also be autogenerated from the macro based solution (e.g. with rusty-cheddar), but here we write it manually. This is mostly GObject boilerplate and conventions.

Interfaces

While GObject only allows single inheritance, it provides the ability to implement any number of interfaces on a class to provide a common API between independent types. These interfaces are similar to what exists in Java and C#, but similar to Rust traits it is possible to provide default implementations for the interface methods. Also similar to Rust traits, interfaces can declare pre-requisites: interfaces an implementor must also implement, or a base type it must inherit from.

In the repository, a Nameable interface with a get_name() method is implemented. Generally it all works exactly the same as with non-interface types and virtual methods. You register a type with GObject that inherits from G_TYPE_INTERFACE. This type only has a class struct, no instance struct. And instead of an instance struct, a typedef’d void * pointer is used. Behind that pointer would be the instance struct of the actual type implementing the interface. A default implementation of methods can be provided the same way as with virtual methods in class_init.

There are two main differences though. One is for calling an interface method

Instead of directly getting the class struct from the instance, we have to call some GObject API to get the interface struct of a specific interface type ID with the virtual methods.

The other difference is in how the interface is implemented. Inside the get_type() function of the implementing type a new set of functions is registered, which are used similarly to class_init for initializing the interface struct

The interface also gets a C header, which looks basically the same as for normal classes.
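Both differences, registering an interface implementation per type and looking up the interface struct at call time, can be modelled with a small registry. This is a conceptual sketch only: in GObject the lookup and registration are done by the type system itself (for example via g_type_interface_peek() and g_type_add_interface_static()), while InterfaceRegistry, Instance and the other names below are made up.

```rust
use std::collections::HashMap;

pub type DemoType = usize;

/// Stand-in for an instance of any type; `type_id` identifies its class.
pub struct Instance {
    pub type_id: DemoType,
    pub name: Option<String>,
}

/// The interface struct: one function pointer per interface method.
pub struct NameableInterface {
    pub get_name: Option<fn(&Instance) -> Option<String>>,
}

/// Maps implementing types to their initialized interface structs.
#[derive(Default)]
pub struct InterfaceRegistry {
    impls: HashMap<DemoType, NameableInterface>,
}

impl InterfaceRegistry {
    /// Registers an implementation; `init` fills in the interface struct,
    /// much like class_init fills in a class struct.
    pub fn add_nameable(&mut self, type_id: DemoType, init: fn(&mut NameableInterface)) {
        let mut iface = NameableInterface { get_name: None };
        init(&mut iface);
        self.impls.insert(type_id, iface);
    }

    /// Looks up the interface struct for the type of `instance`.
    pub fn peek_nameable(&self, instance: &Instance) -> Option<&NameableInterface> {
        self.impls.get(&instance.type_id)
    }
}

/// Calling an interface method: look up the interface struct first, then dispatch.
pub fn demo_nameable_get_name(registry: &InterfaceRegistry, instance: &Instance) -> Option<String> {
    let iface = registry.peek_nameable(instance)?;
    iface.get_name.and_then(|get_name| get_name(instance))
}

/// Implementation of the interface for a hypothetical type "Foo".
pub fn foo_nameable_init(iface: &mut NameableInterface) {
    iface.get_name = Some(foo_get_name);
}

fn foo_get_name(instance: &Instance) -> Option<String> {
    instance.name.clone()
}
```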

Usage from C

As mentioned above a few times, we export a normal (GObject) C API. For that various headers have to be written, or ideally be generated later. These can be all found here.

Nothing special has to be taken care of when using this API from C: you simply link to the generated shared library, include the headers and then use it like any other GObject-based C API.

Usage from Rust

I mentioned briefly above that in mod.rs there are gtk-rs-style Rust bindings. These are also what would be passed (the “Wrapper” arguments) to the safe Rust implementations of the methods.

Ideally these would be autogenerated from a macro, similar to what the gir tool can already do for C-based GObject libraries (this is the tool that generates most of the GLib, GTK, etc. bindings for Rust).

For usage of those bindings, I’ll just let the code speak for itself

This does automatic memory management, allows calling base-class methods on instances of a subclass, and provides access to methods, virtual methods, signals, properties, etc.

Usage from Python, JavaScript and Others

Now all this was a lot of boilerplate, but here comes the reason why it is probably all worth it. By exporting a GObject-style C API, we automatically get support for generating bindings for dozens of languages, without having to write any more code. This is possible thanks to the strong API conventions of GObject, and the GObject-Introspection project. Supported languages are for example Rust (of course!), Python, JavaScript (GJS and Node), Go, C++, Haskell, C#, Perl, PHP, Ruby, …

GObject-Introspection achieves this by scanning the C headers, introspecting the GObject types and then generating an XML based API description (which also contains information about ownership transfer!). This XML based API description can then be used by code generators for static, compiled bindings (e.g. Rust, Go, Haskell, …), but it can also be compiled to a so-called “typelib”. The typelib provides a C ABI that allows bindings to be generated at runtime, mostly used by scripting languages (e.g. Python and JavaScript).

To show the power of this, I’ve included a simple Python and JavaScript (GJS) application that uses all the types we defined above, and a Makefile that generates the GObject-Introspection metadata and can directly run the Python and JavaScript applications (“make run-python” and “make run-javascript”).

The Python code looks as follows

and the JavaScript (GJS) code as follows

Both do the same and nothing useful; they simply use all of the available API.

What next?

While everything here can be used as-is already (and I use a variation of this in gst-plugin-rs, a crate to write GStreamer plugins in Rust), it’s rather inconvenient. The goal of this blog post is to give a low-level explanation of how all this works in GObject with Rust, and to provide a “template” to use for Niko’s gnome-class macro. Federico is planning to work on this in the near future, and to move features from my repository to the macro step by step. Work on this will also be done at the GNOME/Rust hackfest in November in Berlin, which will hopefully yield a lot of progress on the macro but also on the bindings in general.

In the end, this macro would ideally end up in the glib-rs bindings and can then be used directly by anybody to implement GObject subclasses in Rust. At that point, this blog post can hopefully help a bit as documentation to understand how the macro works.
