November 27, 2020

My guide to working with JSON in Go

Working with JSON in Go can be tricky, especially if you’re coming from dynamic languages like Python or JavaScript, where JSON encoding often “just works”. Over the years I had grown accustomed to those conveniences, and that made me overlook many details that affect JSON encoding in Go. Hopefully this guide helps me avoid those mistakes in the future.

There are many libraries that deal with JSON, but this guide focuses on the one in the standard library: encoding/json.

struct properties must be exported

Go differentiates between private and public identifiers by capitalization. For the json package to access the fields of a type defined in your package, the field names must start with a capital letter. For example:
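A minimal sketch of the effect. The `user` type and its fields are hypothetical names made up for this illustration. The unexported field is skipped silently, without any error:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// user is a hypothetical type used only for illustration.
type user struct {
	Name  string // exported: visible to encoding/json
	email string // unexported: silently skipped by encoding/json
}

// encodeUser marshals a user that has both fields set.
func encodeUser() (string, error) {
	b, err := json.Marshal(user{Name: "Ann", email: "ann@example.com"})
	return string(b), err
}

func main() {
	out, err := encodeUser()
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"Name":"Ann"}
}
```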

always use struct tags

Struct tags are metadata assigned to fields. While JSON encode/decode works without specifying any tags, it makes sense to always use them. This makes it easier for the reader to quickly understand which structs are used for JSON and it avoids issues caused by capitalization.

The following example demonstrates how the lowercase key of the input data gets lost:
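A sketch of such a round trip, assuming a hypothetical `untagged` struct with no tags. The lowercase key is matched case-insensitively on decode, but comes back capitalised on encode:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// untagged is a hypothetical struct without JSON tags.
type untagged struct {
	Somekey string
}

// roundTrip decodes the input into an untagged struct and encodes it again.
func roundTrip(input string) (string, error) {
	var obj untagged
	if err := json.Unmarshal([]byte(input), &obj); err != nil {
		return "", err
	}
	b, err := json.Marshal(obj)
	return string(b), err
}

func main() {
	out, err := roundTrip(`{"somekey": "value"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"Somekey":"value"}
}
```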

To avoid such issues, use tags:

type object struct {
	Somekey string `json:"somekey"`
}

Without tags, decoding is essentially case insensitive, and encoding always uses the field name as the key, which is capitalised because the field has to be exported.

omitempty and default values

omitempty is an option you can use in your tags to specify how empty values are handled. Usage seems simple:

type object struct {
    Somekey string `json:"somekey,omitempty"`
}

The behaviour, however, may surprise you. Looking at its definition:

The “omitempty” option specifies that the field should be omitted from the encoding if the field has an empty value, defined as false, 0, a nil pointer, a nil interface value, and any empty array, slice, map, or string.

.. it becomes clear that you need to pay close attention to the underlying type of your field. One of the first things you need to know is the default (zero) value of each data type:

  • string: ""
  • int: 0
  • float64: 0.0
  • bool: false
  • slice, map, pointer, interface{}: nil

The problem with omitempty arises when your intended values match the default ones. So in most cases you probably want to use omitempty only for optional nested objects. Dropping boolean or number keys from the JSON will most likely create unnecessary confusion. Example:
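A sketch of that situation. The `settings` type and its fields are hypothetical. All three fields hold their zero values, but only the one without omitempty survives encoding:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// settings is a hypothetical type; only Number lacks omitempty.
type settings struct {
	Number int    `json:"number"`
	Flag   bool   `json:"flag,omitempty"`
	Text   string `json:"text,omitempty"`
}

// encodeZero marshals a value whose fields are all explicitly set to zero values.
func encodeZero() (string, error) {
	b, err := json.Marshal(settings{Number: 0, Flag: false, Text: ""})
	return string(b), err
}

func main() {
	out, err := encodeZero()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```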

The output is: {"number":0}

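The second group you need to pay attention to is pointers and interface{} values, which is how optional nested objects are usually held. When they are nil and have no omitempty, they are encoded as {"key": null}. A nested struct held by value is never considered empty, so omitempty has no effect on it.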
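A small sketch of how nil pointers and nil interface{} values are encoded with and without omitempty. The `profile` type and its field names are assumptions made for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// profile is a hypothetical type with nil-able fields.
type profile struct {
	Nickname *string     `json:"nickname"`         // nil pointer, encoded as null
	Extra    interface{} `json:"extra"`            // nil interface, encoded as null
	Avatar   *string     `json:"avatar,omitempty"` // nil pointer, key dropped
}

// encodeEmptyProfile marshals a profile with every field left nil.
func encodeEmptyProfile() (string, error) {
	b, err := json.Marshal(profile{})
	return string(b), err
}

func main() {
	out, err := encodeEmptyProfile()
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"nickname":null,"extra":null}
}
```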

nested objects

You can define nested objects by referencing a struct (or a pointer to one):

type Inner struct {
    Key string
}

type object struct {
    Key   string `json:"key"`
    Inner Inner  `json:"object"`
}

You can also use nested structs for deeply nested JSON:
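One way to do this, sketched here with anonymous structs. The `response` type and the input document are made up for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// response is a hypothetical type mirroring {"data":{"user":{"name":...}}}.
type response struct {
	Data struct {
		User struct {
			Name string `json:"name"`
		} `json:"user"`
	} `json:"data"`
}

// decodeName extracts the deeply nested name from the input.
func decodeName(input string) (string, error) {
	var r response
	if err := json.Unmarshal([]byte(input), &r); err != nil {
		return "", err
	}
	return r.Data.User.Name, nil
}

func main() {
	name, err := decodeName(`{"data":{"user":{"name":"Ann"}}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // Ann
}
```

Anonymous structs keep the whole shape in one place, at the cost of not being able to reuse the inner types elsewhere.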

any type and interface{}

If you cannot define your input type in advance, you can use interface{}. You’ll lose the convenient error check right at the Unmarshal call, but gain flexibility where needed. Example:
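A sketch of decoding into interface{} and inspecting the result. The helper names `decodeAny` and `describe` are hypothetical. encoding/json stores objects as map[string]interface{}, arrays as []interface{} and numbers as float64:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAny decodes arbitrary JSON without a predefined type.
func decodeAny(input string) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal([]byte(input), &v)
	return v, err
}

// describe shows the type switch that usually follows such a decode.
func describe(v interface{}) string {
	switch t := v.(type) {
	case map[string]interface{}:
		return fmt.Sprintf("object with %d keys", len(t))
	case []interface{}:
		return fmt.Sprintf("array with %d items", len(t))
	case float64:
		return "number"
	case string:
		return "string"
	case bool:
		return "bool"
	case nil:
		return "null"
	default:
		return "unknown"
	}
}

func main() {
	for _, in := range []string{`{"a":1,"b":[1,2]}`, `[1,2,3]`, `42`, `null`} {
		v, err := decodeAny(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(in, "->", describe(v))
	}
}
```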

This is usually followed by a lot of type checks and type assertions, so I’d suggest avoiding it where possible.

datetime

In addition to standard JSON data types, encoding/json can also handle datetime values. JSON doesn’t define an official datetime format, but it’s recommended to use the Internet Date/Time Format (RFC 3339): 2006-01-02T15:04:05Z. That’s also supported in Go, so a string in that format will be decoded into the time.Time type:
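A minimal sketch, assuming a hypothetical `event` type with a single timestamp field:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// event is a hypothetical type with a single timestamp field.
type event struct {
	Created time.Time `json:"created"`
}

// decodeCreated parses the "created" key into a time.Time.
func decodeCreated(input string) (time.Time, error) {
	var e event
	err := json.Unmarshal([]byte(input), &e)
	return e.Created, err
}

func main() {
	created, err := decodeCreated(`{"created":"2020-11-27T10:30:00Z"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(created.Format(time.RFC3339)) // 2020-11-27T10:30:00Z
}
```

A string in any other format makes Unmarshal return an error instead of silently producing a zero time.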

custom format

There are a couple of ways to handle custom input or output formats. The one I like the most, and which requires the least amount of code to maintain, is to use a wrapper type and define its encode/decode functions.

For example, if you would like to represent your time.Time object as a Unix timestamp instead of the datetime string, you could define your own type with an embedded time.Time. Embedding is needed because we want access to the Time methods, which a plain type definition (type UnixTime time.Time) would not carry over:

type UnixTime struct {
    time.Time
}

Then you can write your own function for encode:

func (t UnixTime) MarshalJSON() ([]byte, error) {
	return json.Marshal(t.Unix())
}

Similarly, you can decode a Unix timestamp to time.Time with your own type and UnmarshalJSON function:
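A sketch of what such a decoder could look like. It repeats the UnixTime type so the block runs on its own. The `event` wrapper and `decodeCreated` helper are hypothetical, and treating a JSON null as a no-op is a convention I assume here:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// UnixTime wraps time.Time so it can be read from a Unix timestamp.
type UnixTime struct {
	time.Time
}

// UnmarshalJSON has a pointer receiver so it can modify the value.
func (t *UnixTime) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // assumption: leave the value untouched on null
	}
	var sec int64
	if err := json.Unmarshal(data, &sec); err != nil {
		return err
	}
	t.Time = time.Unix(sec, 0)
	return nil
}

// event is a hypothetical type using the custom timestamp.
type event struct {
	Created UnixTime `json:"created"`
}

// decodeCreated returns the decoded timestamp as seconds since the epoch.
func decodeCreated(input string) (int64, error) {
	var e event
	if err := json.Unmarshal([]byte(input), &e); err != nil {
		return 0, err
	}
	return e.Created.Unix(), nil
}

func main() {
	sec, err := decodeCreated(`{"created":1606473000}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(sec) // 1606473000
}
```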

A similar approach works for any custom type you may have. Just pay attention to the UnmarshalJSON function, which needs a pointer receiver.

null and collections

Null is bad. Some even say a billion dollars’ worth of bad. Unfortunately, Go does very little to help us with it. nil is everywhere, and the “optional” type introduced in many other languages has to wait at least until Go 2.x.

And even if you carefully check for nil and make sure your structs are always initialised properly via New(), JSON will still bite you, especially when it comes to collections.

The main problem is that different encoding implementations treat empty collections differently; there is no real “standard” there. For example, the json module from the Python standard library will encode an empty array or dictionary to its JSON representation:

{
    "map": {},
    "array": []
}

While encoding/json in Go could give you:

{
    "map": null,
    "array": null
}

You can argue that the same can be achieved in Python by setting the map value to None, but when following good practices this rarely happens.

The following example demonstrates things which may not be obvious at first sight:
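A sketch of such a case, with hypothetical type names. Nothing is initialised explicitly, so the slice and map are nil while the nested struct holds its zero values:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inner is a hypothetical nested object.
type inner struct {
	Value string `json:"value"`
}

// object holds a slice, a map and a nested struct by value.
type object struct {
	Array  []string          `json:"array"`
	Map    map[string]string `json:"map"`
	Object inner             `json:"object"`
}

// encodeZeroObject marshals an object with nothing initialised.
func encodeZeroObject() (string, error) {
	b, err := json.Marshal(object{})
	return string(b), err
}

func main() {
	out, err := encodeZeroObject()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```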

The output may surprise you:

{"array":null,"map":null,"object":{"value":""}}

JSON doesn’t differentiate between maps and objects. For the sake of clarity, in this post a map represents an arbitrary collection of keys and values, while an object always has the same keys.

Getting rid of null values for arrays and maps requires explicit initialisation. Getting an “empty” struct to be encoded as null requires a pointer and explicitly setting its value to nil:
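A sketch of that variant, again with hypothetical type names. The collections are initialised to empty values and the nested struct is held by a pointer that is left nil:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inner is a hypothetical nested object.
type inner struct {
	Value string `json:"value"`
}

// object now holds the nested struct by pointer.
type object struct {
	Array  []string          `json:"array"`
	Map    map[string]string `json:"map"`
	Object *inner            `json:"object"`
}

// encodeInitialised marshals an object with empty, non-nil collections.
func encodeInitialised() (string, error) {
	obj := object{
		Array:  []string{},          // empty but not nil
		Map:    map[string]string{}, // empty but not nil
		Object: nil,                 // nil pointer, encoded as null
	}
	b, err := json.Marshal(obj)
	return string(b), err
}

func main() {
	out, err := encodeInitialised()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```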

{"array":[],"map":{},"object":null}

null vs default vs unset values

One of the problems of encoding/json is its inability to differentiate between null, default and unset values. After decoding a default or null value, it is impossible to know whether the value was missing from the JSON or was explicitly set to default/null/empty. A partial workaround for this is to use pointer fields:
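A sketch of a pointer field, assuming a hypothetical `object` type with a single `Ptr` field and a made-up `ptrState` helper. A missing key and an explicit null both leave the pointer nil, while an explicit empty string produces a non-nil pointer, which is why the workaround is only partial:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// object is a hypothetical type with an optional string field.
type object struct {
	Ptr *string `json:"ptr"`
}

// ptrState reports whether Ptr was set by the input.
func ptrState(input string) (string, error) {
	var obj object
	if err := json.Unmarshal([]byte(input), &obj); err != nil {
		return "", err
	}
	if obj.Ptr == nil {
		return "nil", nil
	}
	return fmt.Sprintf("set to %q", *obj.Ptr), nil
}

func main() {
	for _, in := range []string{`{}`, `{"ptr":null}`, `{"ptr":""}`} {
		state, err := ptrState(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(in, "->", state)
	}
}
```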

This makes it possible to check for nil:

if obj.Ptr == nil {
    return errors.New("ptr required")
}

That also works for collections: the pointer stays nil when they are missing or null in the JSON, while an explicit [] or {} is decoded into a pointer to an empty slice or to a struct with its default values.
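The collection case can be sketched the same way. The `payload` type and the `itemsState` helper are hypothetical names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical type with a pointer to a slice.
type payload struct {
	Items *[]string `json:"items"`
}

// itemsState reports how the "items" key was decoded.
func itemsState(input string) (string, error) {
	var p payload
	if err := json.Unmarshal([]byte(input), &p); err != nil {
		return "", err
	}
	if p.Items == nil {
		return "missing or null", nil
	}
	return fmt.Sprintf("%d items", len(*p.Items)), nil
}

func main() {
	for _, in := range []string{`{}`, `{"items":null}`, `{"items":[]}`, `{"items":["a"]}`} {
		state, err := itemsState(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(in, "->", state)
	}
}
```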

Depending on your use case that may not be feasible or convenient. In those cases it’s worth looking for drop-in replacements, or packages implementing nullable or optional types. More info on the limitations can be found in #11939.

you don’t need a library, and then you do

I’m a strong proponent of using the Go standard library, or at least starting from there. It is well designed and feature rich. And encoding/json has been enough for all the tasks I’ve needed to perform with it so far.

However, it does have some limitations. It uses reflection, which makes it slower than some other implementations. You can find drop-in replacements which focus on being faster or using fewer resources. Some even claim to be 10x faster, while forcing you to learn a new API. Some provide additional features like one-line retrieval and dot-notation paths. Know your task and pick the corresponding tool. But encoding/json can do a lot; you may not need anything else.

more code please

In order to better understand how a particular thing works, you could just read its code. The test files for encode and decode are worth a glance. I also wrote some tests to verify and clarify the things described in this post. All code is available in that gist.
