Convert complex YAML to .NET types with custom YamlDotNet type converters


When it comes to YAML serialization and deserialization in .NET, YamlDotNet is a go-to library with over 100 million downloads on NuGet. It is also integrated into various projects by Microsoft and the .NET team, despite the absence of an official Microsoft YAML library for .NET.

In this blog post, we will explore the process of creating custom YAML serializers and deserializers using YamlDotNet. To illustrate these concepts, we’ll examine the specific use case of partially parsing the environment variables section of a Docker Compose file.

The Docker Compose environment variables use case

Docker Compose allows the definition of environment variables in two distinct formats. The first, known as the object format, appears as follows:
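For instance, a service might declare its variables like this (the service name, image and values are purely illustrative):

```yaml
services:
  web:
    image: nginx
    environment:
      ASPNETCORE_ENVIRONMENT: Development
      LOG_LEVEL: debug
```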

This object format can be directly deserialized into a dictionary of strings. However, Docker Compose also supports an array format:
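The same illustrative service, rewritten with the array format, where each entry is a single KEY=VALUE string:

```yaml
services:
  web:
    image: nginx
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - LOG_LEVEL=debug
```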

Unlike the object format, the array format is harder to deserialize because it is an array of strings, each typically in KEY=VALUE form. If we want to consistently deserialize both formats into a dictionary of strings, we need to create a custom type converter. This can be done by implementing the IYamlTypeConverter interface.

Before jumping into the code, let’s first understand the three types of YAML tokens (YamlDotNet exposes them as parsing events) that we will encounter when parsing this part of a YAML document:

  • The Scalar token represents the presence of a scalar value. It can be a string, a number, a boolean, etc.
  • The MappingStart and MappingEnd tokens represent the start and end of a YAML object, a collection of key-value pairs. Note that in our use case the keys are always scalars.
  • The SequenceStart and SequenceEnd tokens represent the start and end of a YAML array, a collection of values.

You can understand how YAML documents can be parsed using those tokens here:
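One way to see these tokens for yourself is to walk a small document with YamlDotNet's low-level parser and record the type of each event. The following is a minimal sketch; the `YamlEventDump` helper and its `EventNames` method are hypothetical names introduced only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using YamlDotNet.Core;

// Hypothetical helper: lists the type name of every event the parser produces.
public static class YamlEventDump
{
    public static List<string> EventNames(string yaml)
    {
        var parser = new Parser(new StringReader(yaml));
        var names = new List<string>();

        // MoveNext advances to the next event until the stream is exhausted.
        while (parser.MoveNext())
        {
            names.Add(parser.Current?.GetType().Name ?? "(none)");
        }

        return names;
    }
}
```

Calling `YamlEventDump.EventNames("environment:\n  - FOO=bar\n")` should return the MappingStart, Scalar, SequenceStart, SequenceEnd and MappingEnd events described above, wrapped in the stream and document start/end events that YamlDotNet also emits.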

Implementing our custom IYamlTypeConverter

With this in mind, let’s begin by implementing the IYamlTypeConverter interface. The interface has three methods. The first, Accepts, is used to determine if the converter can handle a given type. In our case, we want to handle the type EnvironmentVariables, which I’ve just created to represent a dictionary of environment variables:

The second method, ReadYaml, is used to deserialize a YAML document into a .NET object. In our scenario, we aim to deserialize a YAML object or array into an EnvironmentVariables:

The third method, WriteYaml, is used to serialize a .NET object back into a YAML document. For our purposes, we’ll serialize an EnvironmentVariables object into a YAML format:
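Put together, the skeleton of such a converter could look like the sketch below. It assumes that EnvironmentVariables simply derives from `Dictionary<string, string>`, and it assumes the two-parameter `ReadYaml` and `WriteYaml` signatures used by YamlDotNet versions before 16; newer releases appear to add an extra parameter to both methods, so adjust the signatures to the version you have installed. The class name `EnvironmentVariablesConverter` is also an assumption.

```csharp
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Serialization;

// Assumed shape of the type from the article: a plain dictionary of strings.
public class EnvironmentVariables : Dictionary<string, string> { }

// Assumption: signatures match YamlDotNet releases before 16.
public class EnvironmentVariablesConverter : IYamlTypeConverter
{
    // Only handle our dedicated type; everything else uses the default logic.
    public bool Accepts(Type type) => type == typeof(EnvironmentVariables);

    // Deserialization: filled in later with the mapping and sequence logic.
    public object? ReadYaml(IParser parser, Type type)
        => throw new NotImplementedException();

    // Serialization: filled in later with the emitter logic.
    public void WriteYaml(IEmitter emitter, object? value, Type type)
        => throw new NotImplementedException();
}
```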

Now that we’ve outlined the structure of our custom serializer, let’s delve into the deserialization logic. Given that the Docker Compose YAML schema for environment variables supports both an object format and a string array format, we can evaluate the first YAML token to find out whether it marks the beginning of a YAML object (MappingStart) or a YAML array (SequenceStart):

The TryConsume method, as its name suggests, attempts to consume a YAML token from the document. If the token is of the expected type, it’s consumed, and the method returns true. If not, the method returns false, and the parser doesn’t move forward. Let’s implement the two parsing methods:
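As a rough sketch of that logic, the helper below branches on the first token and then reads either format into the same dictionary. It is written as a standalone static class (`EnvironmentVariablesReader`, a hypothetical name) so that it does not depend on a particular IYamlTypeConverter signature; a converter's ReadYaml could simply delegate to it. Treating an array entry without `=` as an empty value is my own assumption, not something the article specifies.

```csharp
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

// Assumed shape of the type from the article: a plain dictionary of strings.
public class EnvironmentVariables : Dictionary<string, string> { }

public static class EnvironmentVariablesReader
{
    public static EnvironmentVariables Read(IParser parser)
    {
        // The first token tells us which of the two formats we are reading.
        if (parser.TryConsume<MappingStart>(out _)) return ReadMapping(parser);
        if (parser.TryConsume<SequenceStart>(out _)) return ReadSequence(parser);

        throw new InvalidOperationException("Expected a YAML object or array.");
    }

    private static EnvironmentVariables ReadMapping(IParser parser)
    {
        var result = new EnvironmentVariables();

        // Keys and values alternate as scalars until MappingEnd is reached.
        while (!parser.TryConsume<MappingEnd>(out _))
        {
            var key = parser.Consume<Scalar>().Value;
            var value = parser.Consume<Scalar>().Value;
            result[key] = value;
        }

        return result;
    }

    private static EnvironmentVariables ReadSequence(IParser parser)
    {
        var result = new EnvironmentVariables();

        // Each entry is one scalar; split it on the first '=' only.
        while (!parser.TryConsume<SequenceEnd>(out _))
        {
            var entry = parser.Consume<Scalar>().Value;
            var separator = entry.IndexOf('=');

            if (separator < 0)
            {
                result[entry] = string.Empty; // assumption: no '=' means empty value
            }
            else
            {
                result[entry.Substring(0, separator)] = entry.Substring(separator + 1);
            }
        }

        return result;
    }
}
```

Splitting on the first `=` only keeps values such as connection strings, which may themselves contain `=`, intact.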

With these methods in place, we can support both the object and array formats. The deserialization logic is now complete. Next, let’s tackle the serialization logic. Our goal is to serialize an EnvironmentVariables object into a YAML format. The Emit method can be used to produce YAML tokens:
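A minimal sketch of the serialization side follows. It assumes we always write the object format, since the article does not say which of the two formats is emitted, and it uses a hypothetical `EnvironmentVariablesWriter` helper that a converter's WriteYaml could delegate to.

```csharp
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

// Assumed shape of the type from the article: a plain dictionary of strings.
public class EnvironmentVariables : Dictionary<string, string> { }

public static class EnvironmentVariablesWriter
{
    // Assumption: variables are always written back in the object format.
    public static void Write(IEmitter emitter, EnvironmentVariables variables)
    {
        emitter.Emit(new MappingStart());

        foreach (var pair in variables)
        {
            // A mapping is emitted as alternating key and value scalars.
            emitter.Emit(new Scalar(pair.Key));
            emitter.Emit(new Scalar(pair.Value));
        }

        emitter.Emit(new MappingEnd());
    }
}
```

The emitted tokens mirror the ones consumed during deserialization: one MappingStart, a pair of Scalar tokens per variable, then one MappingEnd.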

We’ve constructed a custom YAML serializer and deserializer for the EnvironmentVariables type. We can subsequently use it to deserialize a YAML document into an EnvironmentVariables object:
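Registering the converter might look like the sketch below. The `ComposeFile`, `ComposeService` and `ComposeLoader` types are hypothetical and model only the parts of a Docker Compose file we care about; `IgnoreUnmatchedProperties` lets the deserializer skip everything else, which is what makes the parsing partial.

```csharp
using System.Collections.Generic;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

// Assumed shape of the type from the article: a plain dictionary of strings.
public class EnvironmentVariables : Dictionary<string, string> { }

// Hypothetical partial model of a Compose service.
public class ComposeService
{
    public EnvironmentVariables Environment { get; set; } = new EnvironmentVariables();
}

// Hypothetical partial model of a Compose file.
public class ComposeFile
{
    public Dictionary<string, ComposeService> Services { get; set; }
        = new Dictionary<string, ComposeService>();
}

public static class ComposeLoader
{
    public static ComposeFile Load(string yaml, params IYamlTypeConverter[] converters)
    {
        var builder = new DeserializerBuilder()
            .WithNamingConvention(CamelCaseNamingConvention.Instance) // "services", "environment"
            .IgnoreUnmatchedProperties();                             // skip "image", "ports", ...

        // Register every custom converter passed in by the caller.
        foreach (var converter in converters)
        {
            builder = builder.WithTypeConverter(converter);
        }

        return builder.Build().Deserialize<ComposeFile>(yaml);
    }
}
```

With the converter from this post passed in, for example `ComposeLoader.Load(yaml, new EnvironmentVariablesConverter())` (assuming that class name), both the object and the array format should end up in the same `Environment` dictionary.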
