Sample code

The sample code for this article can be found at https://github.com/michaelrosedev/snakecase_json.

.NET Core 3.0 Released

In my day job I spend a lot of time working with JSON APIs, and performance is always a huge factor when developing new functionality. I have used a number of different JSON libraries in the past, but in the end always tend to default to Newtonsoft.Json.

Recently .NET Core 3.0 was released, bringing with it the new System.Text.Json APIs, which focus on performance and throughput (https://docs.microsoft.com/en-us/dotnet/api/system.text.json?view=netcore-3.0).

Snake case

One of the key use-cases I use Newtonsoft.Json for is to accept and send snake case JSON, e.g.:

{
    "id": 10,
    "some_property": "value",
    "nested_object": {
        "order_date": "2019-07-30T23:59:59+02:00",
        "visible": true
    }
}

This JSON is produced from an object such as this:

public class SomeObject {
    public int Id { get; set; }
    public string SomeProperty { get; set; }
    public NestedObject NestedObject { get; set; }
}

public class NestedObject {
    public DateTimeOffset OrderDate { get; set; }
    public bool Visible { get; set; }
}

As System.Text.Json promises performance improvements, I decided to experiment and see if I could manage to get it to work with snake case JSON.

To determine if this is a viable use-case, I’m going to write some unit tests and some benchmarks.

Before we begin

Make sure that you’ve got the latest .NET Core 3.0 installed. To check this, you can run:

dotnet --version

Creating some objects to test

The first thing I’m going to do is create a new empty solution. I’ll then add a new class library to the solution that will contain the DTOs that I want to use to test serialization and deserialization.

Now I’m going to add some simple classes that we can use later.

The top-level object is a Booking. This doesn’t really have any real-world meaning, but it is an object that we can populate with properties I’m most interested in when serializing, such as DateTimeOffset, string, and nested complex objects:

using System;

namespace Sample.Contracts
{
    public class Booking
    {
        public string Id { get; set; }
        public DateTimeOffset BookingDate { get; set; }
        public string Title { get; set; }
        public bool Premium { get; set; }
        public Member Member { get; set; }
        public Price Price { get; set; }
    }
}

The Member and Price objects have the following contents:

namespace Sample.Contracts
{
    public class Member
    {
        public string Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string EmailAddress { get; set; }
    }
}

namespace Sample.Contracts
{
    public class Price
    {
        public decimal Value { get; set; }
        public string Currency { get; set; }
    }
}

We can now write some simple unit tests to make sure that it is possible to serialize and deserialize these objects using snake case JSON.

Unit Tests

I’ve added a sample unit test project to my solution. I tend to gravitate towards NUnit for unit tests, but that’s just a personal preference.

Note: I think TDD is a great approach, and I’m going to lay out some of the steps I followed when using it here. The tests included are deliberately minimal because I only wanted to prove a very simple use-case; a real project would need considerably more extensive tests.

First we’ll create a test to serialize an object to JSON and verify that the BookingDate property has been named booking_date in the resulting JSON. This is as far as I’m going for now with this test, but in a real-world solution this wouldn’t be sufficient.

First test:

[Test]
public void CanSerializeObjectToJson() {
    var booking = new Booking();
    var json = JsonSerializer.Serialize(booking);
    Assert.That(
        json.Contains("booking_date"),
        Is.EqualTo(true)
    );
}

Now, if we run this test, it will fail. By default System.Text.Json writes property names exactly as they are declared in C# (BookingDate here), so the assertion fails.
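A quick way to check which names the serializer emits is to serialize a small object with and without a naming policy. The sketch below is self-contained; NamingSample and DefaultNamingDemo are hypothetical names used only for illustration, and JsonNamingPolicy.CamelCase is a policy that ships with System.Text.Json.

```csharp
using System.Text.Json;

// Hypothetical type used only for this illustration.
public class NamingSample
{
    public string BookingDate { get; set; } = "2019-06-23";
}

public static class DefaultNamingDemo
{
    // No naming policy: property names are written as declared in C#.
    public static string WithDefaults() =>
        JsonSerializer.Serialize(new NamingSample());

    // Built-in camel case policy: only the first letter is lowered.
    public static string WithCamelCase() =>
        JsonSerializer.Serialize(
            new NamingSample(),
            new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
}
```

WithDefaults() yields {"BookingDate":"2019-06-23"} and WithCamelCase() yields {"bookingDate":"2019-06-23"}. Neither contains booking_date, which is why a custom policy is needed.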

To make this test pass, we’re going to need to make the serializer return snake case property names.

Converting camel case to snake_case

There are plenty of ways to convert a string such as ThisString to snake case (this_string). Later in this article I’ll show the benchmark comparisons for two different techniques I found using a quick Google search. For now, I’ll demonstrate the simpler of the two.

To begin with, we need some unit tests (TDD, remember!). I’ve created a few different tests:

[Test]
public void SimpleStringCanBeConvertedToSnakeCase()
{
    const string Input = "Id";
    const string ExpectedOutput = "id";

    var result = Input.ToSnakeCase();

    Assert.That(
        result,
        Is.EqualTo(ExpectedOutput)
    );
}

In this first test, I simply want to establish that a simple property name (Id) is converted to its lowercase representation (id).

Next, I want to test that a string with two words in it is converted to the appropriate snake case format, i.e. MyName is converted to my_name:

[Test]
public void MultiWordStringCanBeConvertedToSnakeCase()
{
    const string Input = "MyName";
    const string ExpectedOutput = "my_name";

    var result = Input.ToSnakeCase();

    Assert.That(
        result,
        Is.EqualTo(ExpectedOutput)
    );
}

Next, I’m going to test a slightly more complex property name:

[Test]
public void TripleWordStringCanBeConvertedToSnakeCase()
{
    const string Input = "ExpectedDeliveryDate";
    const string ExpectedOutput = "expected_delivery_date";

    var result = Input.ToSnakeCase();

    Assert.That(
        result,
        Is.EqualTo(ExpectedOutput)
    );
}

The final two tests I’m adding are some simple “defensive programming” style tests - make sure that there’s no Exception thrown if the string I’m trying to convert is null, or if it is an empty string:

[Test]
public void NullStringDoesNotCauseAnException()
{
    const string Input = (string) null;

    Assert.DoesNotThrow(() => { Input.ToSnakeCase(); });
}

[Test]
public void EmptyStringDoesNotCauseAnException()
{
    var input = string.Empty;

    Assert.DoesNotThrow(() => { input.ToSnakeCase(); });
}

Now that we’ve got some tests, we can create the implementation to satisfy the tests:

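using System;
using System.Linq;

namespace Sample.Serialization
{
    /// <summary>
    /// Adapted from https://gist.github.com/vkobel/d7302c0076c64c95ef4b
    /// </summary>
    public static class ExtensionMethods
    {
        public static string ToSnakeCase(this string str)
        {
            // Guard against null or empty input so the defensive tests above pass;
            // str.Select would otherwise throw on a null string.
            if (string.IsNullOrEmpty(str))
            {
                return string.Empty;
            }

            // Prefix every upper-case character except the first with '_',
            // then lower-case the whole result.
            return string.Concat(
                str.Select(
                    (x, i) => i > 0 && char.IsUpper(x)
                        ? "_" + x
                        : x.ToString()
                        )
                ).ToLower();
        }
    }
}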

The basic premise of this method is:

  • For each character in the string:
    • If the character is upper case and is not the first character, append _{char} to the result;
    • Otherwise just append the character to the result
  • Convert the whole result to lowercase and return it

It works, but it’s not very efficient. We’ll see some potentially more efficient code later.
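One consequence of the per-character rule above is worth seeing concretely: consecutive capitals are each treated as the start of a new word. The sketch below restates the same steps with a StringBuilder so the behaviour is easy to trace. SnakeCaseSteps and Convert are hypothetical names, not part of the sample repository, and the sketch lower-cases each character with char.ToLowerInvariant rather than calling ToLower on the final string.

```csharp
using System.Text;

public static class SnakeCaseSteps
{
    // Hypothetical helper that follows the bullet points step by step.
    public static string Convert(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        var result = new StringBuilder(input.Length + 4);
        for (var i = 0; i < input.Length; i++)
        {
            var c = input[i];

            // An upper-case character that is not the first one starts a new word.
            if (i > 0 && char.IsUpper(c))
            {
                result.Append('_');
            }

            result.Append(char.ToLowerInvariant(c));
        }

        return result.ToString();
    }
}
```

Convert("MyName") gives "my_name", but Convert("HTMLParser") gives "h_t_m_l_parser" and Convert("OrderID") gives "order_i_d". If your DTOs contain acronyms, that is something to cover in the tests.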

Note: I put this code together using TDD and didn’t follow the exact steps laid out here (i.e. I didn’t write all the tests, then just write the code). However, I assume you don’t want to read this article all week, so I’ve cut things down to be more brief.

Now we are able to convert a CamelCase string to snake_case, but how do we wire this up for the new System.Text.Json methods?

System.Text.Json.JsonNamingPolicy

The implementation is actually very straightforward. Any call to Serialize or Deserialize<T> takes an optional JsonSerializerOptions parameter.

JsonSerializerOptions itself contains a PropertyNamingPolicy property, so we simply need to create a new JsonNamingPolicy which will allow us to handle snake case.

Let’s begin with the tests again:

using NUnit.Framework;

namespace Sample.Serialization.Tests
{
    [TestFixture]
    public class SnakeCaseNamingPolicyTests
    {
        private SnakeCaseNamingPolicy _sut;

        [SetUp]
        public void _Setup()
        {
            _sut = new SnakeCaseNamingPolicy();
        }

        [Test]
        public void NullStringDoesNotCauseAnException()
        {
            Assert.DoesNotThrow(() => { _sut.ConvertName((string) null); });
        }

        [Test]
        public void EmptyStringDoesNotCauseAnException()
        {
            Assert.DoesNotThrow(() => { _sut.ConvertName(string.Empty); });
        }

        [TestCase("Id", "id")]
        [TestCase("Forename", "forename")]
        [TestCase("PeopleCarrier", "people_carrier")]
        [TestCase("MultiWordString", "multi_word_string")]
        public void InputStringIsReturnedInSnakeCase(string input, string expectedOutput)
        {
            var result = _sut.ConvertName(input);

            Assert.That(
                result,
                Is.EqualTo(expectedOutput)
            );
        }
    }
}

I’ve tried to steer clear of testing the implementation of the snake case extension method again, but to ensure that we are getting the correct results from the SnakeCaseNamingPolicy we do have some tests that verify that the output is as expected.

The code isn’t going to compile for these tests, because SnakeCaseNamingPolicy doesn’t yet exist. It’s an extremely simple implementation:

using System.Text.Json;

namespace Sample.Serialization
{
    public class SnakeCaseNamingPolicy : JsonNamingPolicy
    {
        public override string ConvertName(string name)
        {
            return name.ToSnakeCase();
        }
    }
}

We just need to implement the ConvertName(string name) method, and the implementation itself simply requires us to call our new ToSnakeCase() extension method on the provided name parameter.

Now that the implementation exists, our tests should compile and all pass.

Wiring it all up for our test

Now that we have a SnakeCaseNamingPolicy, we can update our unit test.

The first change is to add a new [SetUp] method to the test. This is going to initialise an instance of JsonSerializerOptions where we define the PropertyNamingPolicy:

private JsonSerializerOptions _options;

[SetUp]
public void Setup()
{
    _options = new JsonSerializerOptions
    {
        PropertyNamingPolicy = new Serialization.SnakeCaseNamingPolicy()
    };
}

Now I want to make sure that we have an actual object to test. In the first pass at this test, we simply created a new Booking without populating its properties. Now we’re going to make sure that we have a fully-populated instance:

[Test]
public void CanSerializeObjectToJson()
{
    var booking = new Booking
    {
        BookingDate = new DateTimeOffset(2019, 06, 23, 22, 00, 00, TimeSpan.FromHours(1)),
        Id = "af43ea6f-b3ff-4640-9a9a-dbfc7544a4a4",
        Title = "Sample Booking",
        Premium = false,
        Price = new Price
        {
            Value = 9.99M,
            Currency = "GBP"
        },
        Member = new Member
        {
            EmailAddress = "sample.member@somedomain.com",
            FirstName = "William",
            LastName = "McDowell",
            Id = "7ce13464-a9df-4630-a50b-7fdd8a3661c4"
        }
    };

    // rest of the implementation here...
}

Finally, we want to ensure that we’re passing the new booking and the _options to the serialize method:

var json = JsonSerializer.Serialize(booking, _options);
Assert.That(
    json.Contains("booking_date"),
    Is.EqualTo(true)
);

The full final unit test is shown below:

using System;
using System.Text.Json;
using NUnit.Framework;
using Sample.Contracts;

namespace Sample.Tests
{
    public class Tests
    {
        private JsonSerializerOptions _options;

        [SetUp]
        public void Setup()
        {
            _options = new JsonSerializerOptions
            {
                PropertyNamingPolicy = new Serialization.SnakeCaseNamingPolicy()
            };
        }

        [Test]
        public void CanSerializeObjectToJson()
        {
            var booking = new Booking
            {
                BookingDate = new DateTimeOffset(2019, 06, 23, 22, 00, 00, TimeSpan.FromHours(1)),
                Id = "af43ea6f-b3ff-4640-9a9a-dbfc7544a4a4",
                Title = "Sample Booking",
                Premium = false,
                Price = new Price
                {
                    Value = 9.99M,
                    Currency = "GBP"
                },
                Member = new Member
                {
                    EmailAddress = "sample.member@somedomain.com",
                    FirstName = "William",
                    LastName = "McDowell",
                    Id = "7ce13464-a9df-4630-a50b-7fdd8a3661c4"
                }
            };
            var json = JsonSerializer.Serialize(booking, _options);
            Assert.That(
                json.Contains("booking_date"),
                Is.EqualTo(true)
            );
        }
    }
}

With this policy in place, the serializer produces the following JSON (formatted here for readability):

{
    "id": "af43ea6f-b3ff-4640-9a9a-dbfc7544a4a4",
    "booking_date": "2019-06-23T22:00:00+01:00",
    "title": "Sample Booking",
    "premium": false,
    "member": {
        "id": "7ce13464-a9df-4630-a50b-7fdd8a3661c4",
        "first_name": "William",
        "last_name": "McDowell",
        "email_address": "sample.member@somedomain.com"
    },
    "price": {
        "value": 9.99,
        "currency": "GBP"
    }
}

That produces exactly the JSON that we’re looking for.
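The Contains check in the test only proves that the text booking_date appears somewhere in the output. A stricter check, sketched here as one possible approach rather than what the sample repository does, parses the output with JsonDocument and looks for properties by name. JsonShapeCheck and its two methods are hypothetical helpers.

```csharp
using System.Text.Json;

public static class JsonShapeCheck
{
    // Returns true when the root is an object with a property of exactly this name.
    public static bool HasRootProperty(string json, string propertyName)
    {
        using (var document = JsonDocument.Parse(json))
        {
            return document.RootElement.ValueKind == JsonValueKind.Object
                && document.RootElement.TryGetProperty(propertyName, out _);
        }
    }

    // Reads a string property of a nested object, or null if either level is missing.
    public static string GetNestedString(string json, string objectName, string propertyName)
    {
        using (var document = JsonDocument.Parse(json))
        {
            if (document.RootElement.ValueKind == JsonValueKind.Object
                && document.RootElement.TryGetProperty(objectName, out var nested)
                && nested.ValueKind == JsonValueKind.Object
                && nested.TryGetProperty(propertyName, out var value)
                && value.ValueKind == JsonValueKind.String)
            {
                return value.GetString();
            }

            return null;
        }
    }
}
```

In the test this could be used as Assert.That(JsonShapeCheck.HasRootProperty(json, "booking_date"), Is.True), and JsonShapeCheck.GetNestedString(json, "member", "first_name") could be compared with "William".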

Deserializing snake_case JSON

Now that we know it’s possible to produce JSON in snake case, we also want to ensure that we can deserialize JSON into a Booking object.

The JsonNamingPolicy in .NET Core 3.0 appears to be used for both serialisation and deserialisation, so there is no code change required to make deserialisation work. Let’s add a quick unit test to verify this.

Note: The following unit test is another over-simplified example; it demonstrates successful deserialisation of a single property, not the whole object.

[Test]
public void CanDeserializeJsonToObject()
{
    const string JsonBooking = @"{
        ""id"": ""4f9ca774-81b9-4296-a35e-b31b96cedfb7"",
        ""title"": ""Sample Booking"",
        ""booking_date"": ""2019-05-06T16:45:00+02:00"",
        ""premium"": true,
        ""price"": {
            ""value"": 12.95,
            ""currency"": ""EUR""
            },
        ""member"": {
            ""id"": ""64cc7df1-5635-44d1-bfbc-00289abd3603"",
            ""first_name"": ""Jessica"",
            ""last_name"": ""Smithsson"",
            ""email_address"": ""jessica99987@mmail.com""
            }
        }";
    var booking = JsonSerializer.Deserialize<Booking>(JsonBooking, _options);
    Assert.That(
        booking.Title,
        Is.EqualTo("Sample Booking")
    );
}

This test should pass immediately with no code changes required.
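It can help to see what happens when the options are left out. Property matching in System.Text.Json is case-sensitive by default and unmatched JSON properties are ignored, so a snake case payload binds nothing without the policy. The sketch below is self-contained: MemberSample, LowerSnakePolicy and DeserializeDemo are hypothetical stand-ins for the Member class and SnakeCaseNamingPolicy above.

```csharp
using System.Linq;
using System.Text.Json;

// Hypothetical, cut-down stand-in for the Member class.
public class MemberSample
{
    public string FirstName { get; set; }
}

// Hypothetical stand-in for SnakeCaseNamingPolicy so the sketch compiles on its own.
public class LowerSnakePolicy : JsonNamingPolicy
{
    public override string ConvertName(string name) =>
        string.Concat(
            name.Select((c, i) => i > 0 && char.IsUpper(c) ? "_" + c : c.ToString())
        ).ToLowerInvariant();
}

public static class DeserializeDemo
{
    private const string Json = "{\"first_name\":\"Jessica\"}";

    // With the policy, "first_name" is matched to FirstName.
    public static string WithPolicy() =>
        JsonSerializer.Deserialize<MemberSample>(
            Json,
            new JsonSerializerOptions { PropertyNamingPolicy = new LowerSnakePolicy() }).FirstName;

    // Without it, no property matches and FirstName keeps its default value.
    public static string WithoutPolicy() =>
        JsonSerializer.Deserialize<MemberSample>(Json).FirstName;
}
```

WithPolicy() returns "Jessica", while WithoutPolicy() returns null. No exception is thrown in the second case, so a forgotten options argument is easy to miss unless a test asserts on the property values.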

Testing Performance

Now that we know that our code works, and we can serialise to and deserialise from snake case JSON, I want to see how the code performs compared to Newtonsoft.Json. Let’s add some simple benchmarks.

Before we start

I’ve added a new Sample.Benchmarks console app to my solution and installed BenchmarkDotNet:

dotnet add package BenchmarkDotNet --version 0.11.5

I’ve also added Newtonsoft.Json and made sure that the version of .NET Core that I am using is 3.0. This allows me to write code to do the equivalent conversion using both Newtonsoft.Json and System.Text.Json.

Writing the benchmark

I want the benchmark to be simple, but to represent a real-world use-case. I’m going to test both serialising and deserialising, and I’m going to use the same Booking object as the unit tests above.

In the benchmark class, I’m going to add a populated Booking instance for serialisation tests, and a string representation of a serialised Booking to test deserialisation:

[MemoryDiagnoser]
public class Benchmarks
{
    private const string JsonBooking = @"{
            ""id"": ""4f9ca774-81b9-4296-a35e-b31b96cedfb7"",
            ""title"": ""Sample Booking"",
            ""booking_date"": ""2019-05-06T16:45:00+02:00"",
            ""premium"": true,
            ""price"": {
                ""value"": 12.95,
                ""currency"": ""EUR""
                },
            ""member"": {
                ""id"": ""64cc7df1-5635-44d1-bfbc-00289abd3603"",
                ""first_name"": ""Jessica"",
                ""last_name"": ""Smithsson"",
                ""email_address"": ""jessica99987@mmail.com""
                }
            }";

    private readonly Booking _booking = new Booking
    {
        BookingDate = new DateTimeOffset(2019, 06, 23, 22, 00, 00, TimeSpan.FromHours(1)),
        Id = "af43ea6f-b3ff-4640-9a9a-dbfc7544a4a4",
        Title = "Sample Booking",
        Premium = false,
        Price = new Price
        {
            Value = 9.99M,
            Currency = "GBP"
        },
        Member = new Member
        {
            EmailAddress = "sample.member@somedomain.com",
            FirstName = "William",
            LastName = "McDowell",
            Id = "7ce13464-a9df-4630-a50b-7fdd8a3661c4"
        }
    };
}

Newtonsoft.Json version

We want to make sure that we configure Newtonsoft.Json with a snake case naming strategy. This is already built into the library; we just need to configure it:

private static readonly DefaultContractResolver ContractResolver = new DefaultContractResolver
{
    NamingStrategy = new SnakeCaseNamingStrategy()
};

private readonly JsonSerializerSettings _jsonSerializerSettings = new JsonSerializerSettings
{
    ContractResolver = ContractResolver
};

The code above gives us a JsonSerializerSettings that we can pass to our serialize and deserialize methods that use Newtonsoft.Json:

[Benchmark]
public string SerializeWithNewtonsoft()
{
    var result = JsonConvert.SerializeObject(_booking, Formatting.Indented, _jsonSerializerSettings);
    return result;
}

[Benchmark]
public Booking DeserializeWithNewtonsoft()
{
    var result = JsonConvert.DeserializeObject<Booking>(JsonBooking, _jsonSerializerSettings);
    return result;
}

System.Text.Json version

The System.Text.Json version is very similar: we want a shared JsonSerializerOptions so that we can configure our JsonNamingPolicy to be snake case:

private readonly JsonSerializerOptions _options = new JsonSerializerOptions
{
    PropertyNamingPolicy = new Serialization.SnakeCaseNamingPolicy()
};

Now we can add the equivalent benchmarks:

[Benchmark]
public string SerializeWithSystemTextJson()
{
    var result = System.Text.Json.JsonSerializer.Serialize(_booking, _options);
    return result;
}

[Benchmark]
public Booking DeserializeWithSystemTextJson()
{
    var result = System.Text.Json.JsonSerializer.Deserialize<Booking>(JsonBooking, _options);
    return result;
}
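BenchmarkDotNet also needs an entry point that hands the benchmark class to its runner. A minimal sketch is shown below; TinyBenchmarks is a hypothetical stand-in so the example compiles on its own, and in the sample project the Benchmarks class above would take its place.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical stand-in for the Benchmarks class shown above.
[MemoryDiagnoser]
public class TinyBenchmarks
{
    [Benchmark]
    public string Concatenate() => string.Concat("snake", "_", "case");
}

public static class Program
{
    public static void Main(string[] args)
    {
        // Finds every [Benchmark] method on the class, runs them and prints a summary table.
        BenchmarkRunner.Run<TinyBenchmarks>();
    }
}
```

BenchmarkDotNet expects an optimised build, so the console app should be started with dotnet run -c Release rather than in Debug.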

Now let’s run the benchmarks and see the result:


BenchmarkDotNet=v0.11.5, OS=macOS Mojave 10.14.6 (18G95) [Darwin 18.7.0]
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100
  [Host]     : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), 64bit RyuJIT
  DefaultJob : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), 64bit RyuJIT

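|                            Method |     Mean |     Error |    StdDev |  Gen 0 |  Gen 1 | Gen 2 | Allocated |
|---------------------------------- |---------:|----------:|----------:|-------:|-------:|------:|----------:|
|         Serialize with Newtonsoft | 3.483 us | 0.0717 us | 0.1433 us | 0.6561 | 0.0038 |     - |   3.02 KB |
|   Serialize with System.Text.Json | 2.351 us | 0.0202 us | 0.0179 us | 0.3471 |      - |     - |    1.6 KB |
|       Deserialize with Newtonsoft | 4.782 us | 0.0320 us | 0.0299 us | 0.7324 |      - |     - |   3.37 KB |
| Deserialize with System.Text.Json | 3.470 us | 0.0673 us | 0.0899 us | 0.3319 |      - |     - |   1.53 KB |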

So, System.Text.Json shows itself (for this very limited use-case) to be faster, and to allocate less memory. One caveat: the Newtonsoft.Json serialisation benchmark passes Formatting.Indented while the System.Text.Json one writes compact output, so the serialisation comparison is not strictly like for like.

Serialisation with System.Text.Json had a mean of 2.351 us compared to 3.483 us, a reduction of ~32.5% (about 1.1 us per call). For one small object the absolute saving is tiny, but on a system that serialises millions of objects per day it adds up.

Deserialisation shows a similar improvement. System.Text.Json had a mean of 3.470 us compared to 4.782 us, a reduction of ~27.4%.
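How large the gap looks depends on which baseline the percentage is taken against. The helper below is a small illustrative sketch (BenchmarkMath and its method names are hypothetical) that computes three common ways of expressing the same pair of means.

```csharp
using System;

public static class BenchmarkMath
{
    // Fraction of the baseline time that was saved: (baseline - candidate) / baseline.
    public static double Reduction(double baseline, double candidate) =>
        (baseline - candidate) / baseline;

    // Difference relative to the average of the two values.
    public static double SymmetricDifference(double a, double b) =>
        Math.Abs(a - b) / ((a + b) / 2.0);

    // How many times faster the candidate is than the baseline.
    public static double Speedup(double baseline, double candidate) =>
        baseline / candidate;
}
```

For the serialisation means, Reduction(3.483, 2.351) is about 0.325, SymmetricDifference(3.483, 2.351) is about 0.388 and Speedup(3.483, 2.351) is about 1.48. For deserialisation, Reduction(4.782, 3.470) is about 0.274. When quoting a single figure, the reduction relative to the Newtonsoft.Json mean is probably the least ambiguous.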

However, I did mention earlier that the code we used to convert a string to snake case was not as efficient as it could be. Another version I found on Stack Overflow uses the new Span<T> type to reduce allocations, and looked a little more efficient.

I added a new ToSnakeCaseSpan extension method with this implementation and then re-ran the benchmarks.

Note: I actually added some benchmarks specifically to cover the different snake case extension method implementations too, but this post is getting long enough…check the sample code in GitHub to see those benchmarks.

The new implementation (with some tidying):

public static string ToSnakeCaseSpan(this string str)
{
    if (str == null)
    {
        return string.Empty;
    }

    // Count the upper-case letters after the first character; each needs an extra '_'.
    // Counting by position (rather than comparing against str[0]) keeps the buffer
    // large enough when a later capital repeats the first letter, e.g. "BigBox".
    var upperCaseLength = str.Skip(1).Count(t => t >= 'A' && t <= 'Z');
    var bufferSize = str.Length + upperCaseLength;
    Span<char> buffer = new char[bufferSize];
    var bufferPosition = 0;
    var namePosition = 0;
    while (bufferPosition < buffer.Length)
    {
        if (namePosition > 0 && str[namePosition] >= 'A' && str[namePosition] <= 'Z')
        {
            // Start of a new word: write the separator, then the letter itself.
            buffer[bufferPosition] = '_';
            buffer[bufferPosition + 1] = str[namePosition];
            bufferPosition += 2;
            namePosition++;
            continue;
        }
        buffer[bufferPosition] = str[namePosition];
        bufferPosition++;
        namePosition++;
    }

    return new string(buffer).ToLower();
}
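The version above still allocates a char[] for the buffer, then a string from that buffer, and usually another string in ToLower(). As a sketch of an alternative that is not from the original article or the Stack Overflow answer, string.Create lets the characters be written straight into the final string. SnakeCaseSketch and ToSnakeCaseCreate are hypothetical names, and this variant uses char.IsUpper and char.ToLowerInvariant rather than the A to Z range check.

```csharp
using System;

public static class SnakeCaseSketch
{
    // Hypothetical variant: writes directly into the result string via string.Create.
    public static string ToSnakeCaseCreate(this string str)
    {
        if (string.IsNullOrEmpty(str))
        {
            return string.Empty;
        }

        // One extra character is needed for each upper-case letter after the first.
        var extra = 0;
        for (var i = 1; i < str.Length; i++)
        {
            if (char.IsUpper(str[i]))
            {
                extra++;
            }
        }

        return string.Create(str.Length + extra, str, (buffer, source) =>
        {
            var position = 0;
            for (var i = 0; i < source.Length; i++)
            {
                var c = source[i];
                if (i > 0 && char.IsUpper(c))
                {
                    buffer[position++] = '_';
                }

                buffer[position++] = char.ToLowerInvariant(c);
            }
        });
    }
}
```

"MultiWordString".ToSnakeCaseCreate() returns "multi_word_string". Whether this is measurably faster for short property names is something only a benchmark can show.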

In order to view the whole set of results, including the previous snake case implementation and the new one, I added another JsonNamingPolicy to my solution, SnakeCaseNamingPolicySpan. This allows me to add another couple of benchmarks.

First, I added a new version of JsonSerializerOptions to the benchmark class:

private readonly JsonSerializerOptions _spanOptions = new JsonSerializerOptions
{
    PropertyNamingPolicy = new Serialization.SnakeCaseNamingPolicySpan()
};

Then I added new serialise and deserialise benchmarks:

[Benchmark]
public string SerializeWithSystemTextJsonSpan()
{
    var result = System.Text.Json.JsonSerializer.Serialize(_booking, _spanOptions);
    return result;
}

[Benchmark]
public Booking DeserializeWithSystemTextJsonSpan()
{
    var result = System.Text.Json.JsonSerializer.Deserialize<Booking>(JsonBooking, _spanOptions);
    return result;
}

Then I re-ran the benchmarks to see what performs best:


BenchmarkDotNet=v0.11.5, OS=macOS Mojave 10.14.6 (18G95) [Darwin 18.7.0]
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100
  [Host]     : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), 64bit RyuJIT
  DefaultJob : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), 64bit RyuJIT


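|                            Method |     Mean |     Error |    StdDev |  Gen 0 |  Gen 1 | Gen 2 | Allocated |
|---------------------------------- |---------:|----------:|----------:|-------:|-------:|------:|----------:|
|           SerializeWithNewtonsoft | 3.213 us | 0.0319 us | 0.0299 us | 0.6561 | 0.0038 |     - |   3.02 KB |
|       SerializeWithSystemTextJson | 2.318 us | 0.0161 us | 0.0143 us | 0.3471 |      - |     - |    1.6 KB |
|   SerializeWithSystemTextJsonSpan | 2.468 us | 0.0470 us | 0.0560 us | 0.3471 |      - |     - |    1.6 KB |
|         DeserializeWithNewtonsoft | 4.793 us | 0.0917 us | 0.0982 us | 0.7324 | 0.0076 |     - |   3.37 KB |
|     DeserializeWithSystemTextJson | 3.401 us | 0.0284 us | 0.0266 us | 0.3319 |      - |     - |   1.53 KB |
| DeserializeWithSystemTextJsonSpan | 3.370 us | 0.0360 us | 0.0336 us | 0.3319 |      - |     - |   1.53 KB |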

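This was surprising: I had expected the Span-based version to perform better or to allocate less, but it didn’t! Allocations are identical, serialisation was slightly slower (2.468 us vs 2.318 us) and deserialisation only marginally faster (3.370 us vs 3.401 us).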

I’ll need to look into that more.
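One possible explanation, which the benchmarks above do not prove, is that the serializer consults the naming policy only while it builds its metadata for a type and then reuses the converted names for as long as the same JsonSerializerOptions instance is used. If so, the cost of the conversion method would barely register in a benchmark that serialises the same type thousands of times. The sketch below is one way to check this assumption; CountingNamingPolicy, CachedSample and PolicyCallCounter are hypothetical names.

```csharp
using System.Text.Json;
using System.Threading;

// Hypothetical policy that counts how often the serializer asks for a name.
public class CountingNamingPolicy : JsonNamingPolicy
{
    private int _calls;

    public int Calls => _calls;

    public override string ConvertName(string name)
    {
        Interlocked.Increment(ref _calls);
        return name.ToLowerInvariant();
    }
}

// Hypothetical type with two properties to serialise.
public class CachedSample
{
    public string Title { get; set; } = "x";
    public bool Premium { get; set; }
}

public static class PolicyCallCounter
{
    // Returns the call count after one serialisation and after many more.
    public static (int AfterFirst, int AfterMany) Measure()
    {
        var policy = new CountingNamingPolicy();
        var options = new JsonSerializerOptions { PropertyNamingPolicy = policy };

        JsonSerializer.Serialize(new CachedSample(), options);
        var afterFirst = policy.Calls;

        for (var i = 0; i < 1000; i++)
        {
            JsonSerializer.Serialize(new CachedSample(), options);
        }

        return (afterFirst, policy.Calls);
    }
}
```

If AfterFirst and AfterMany come back equal, the policy was not called again after the first serialisation, and optimising the snake case conversion would not be expected to change the serialiser benchmarks.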

Conclusion

Using System.Text.Json is very straightforward. For this very trivial use-case it took roughly 27% to 33% less time than Newtonsoft.Json across the benchmark runs above, and allocated about half the memory. However, before I start advocating that we all drop what we’re doing and immediately switch to System.Text.Json for all systems, I’ll need to do a little more in-depth analysis and experimentation.

Next Steps

I hear very good things about neuecc’s Utf8Json, so next I am going to include some samples for that library and compare them to Newtonsoft.Json and System.Text.Json. I’ve also experimented with Kevin Montrose’s Jil library in the past, so I’m going to take that for a spin too. Watch this space.