serde is pretty much the most commonly used serialization and deserialization library in the Rust ecosystem today.
Golang Implementation
As a Golang programmer, I find it useful to start with a comparison.
The official Golang library implements JSON serialization and deserialization directly, in the encoding/json package.
For serialization and deserialization, Go uses two simple interfaces: Marshaler and Unmarshaler.
Note that these interfaces are for JSON only, and the JSON suffix in the method names (MarshalJSON, UnmarshalJSON) is meaningful: the same data type may need to implement serialization in several formats, such as YAML or TOML. Following this naming, the others could be MarshalYaml, MarshalToml.
It looks easy, doesn’t it? Well, it is simple. But in fact there is a catch, or rather, there are some details that must be paid attention to when implementing it.
The implementation of Marshaler should use a value receiver (i.e. not a pointer-receiver-only implementation). If MarshalJSON is defined only on the pointer type, our method will not be called during serialization when the value being serialized is not passed as a pointer (more precisely, when it is not addressable).
The implementation of Unmarshaler must use a pointer receiver: it parses the data into itself, so it has to modify itself. With a value receiver it would only modify a copy, which is meaningless.
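The receiver rule for Marshaler is easy to check. The following is a minimal sketch with two hypothetical types, PtrOnly and ByValue, that differ only in the receiver of MarshalJSON; the encode helper is also introduced here purely for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PtrOnly is a hypothetical type whose MarshalJSON has a pointer receiver.
type PtrOnly int

func (p *PtrOnly) MarshalJSON() ([]byte, error) {
	return []byte(`"custom"`), nil
}

// ByValue is a hypothetical type whose MarshalJSON has a value receiver.
type ByValue int

func (v ByValue) MarshalJSON() ([]byte, error) {
	return []byte(`"custom"`), nil
}

// encode is a small helper that returns the JSON text for v.
func encode(v interface{}) string {
	out, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	p, v := PtrOnly(1), ByValue(1)
	fmt.Println(encode(p))  // 1: the custom method is skipped
	fmt.Println(encode(&p)) // "custom"
	fmt.Println(encode(v))  // "custom"
	fmt.Println(encode(&v)) // "custom"
}
```

With the value receiver, the custom output is produced whether the caller passes a value or a pointer, which is why it is the safer default for MarshalJSON.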
For example, suppose we want to serialize a custom int type (used mainly as an enum) to a specific JSON string (mainly for JSON structured logs, where it is friendlier to read).
Yes, the whole implementation is very simple. For MarshalJSON() we just need to return the JSON we want. Note that what we return must be a legal JSON string, outer quotes included. Here we use "%v" instead of %q, mainly because in our scenario (an enum) we are 100% sure that nothing needs to be escaped; if we are not sure, it is better to use %q.
UnmarshalJSON is just as simple to handle, since we are sure that the input is legal and follows one specific format. There is no need to overthink it.
Rust Implementation
Of course, serde itself can handle this kind of serialization; we just need to add #[derive(Serialize, Deserialize, Debug)]. What follows is a custom implementation, mainly from a learning point of view.
Serialization is relatively simple: we just need to implement the Serialize trait for our type.
Deserialization is a bit more complicated. Unlike Go, where the implementation details are left to the user, serde requires the data to be exchanged according to the serde data model.
Looking at the Deserialize trait alone, deserialization seems similar. But it actually also requires an implementation of the Visitor trait.
Syntactically, expecting is the only method of the Visitor trait that must be implemented, but in practice we have to implement other methods depending on the specific JSON data type. In our case we know that the JSON data we receive is a string, so we only need visit_str. Note that the value passed to visit_str has already been parsed by serde, whereas in Golang the user has to parse the JSON data manually: when the input is "foo", visit_str in serde receives foo, while in Go we receive "foo" and have to handle the outer quotes ourselves.
Finally, we bind this type, which implements the Visitor trait, to the Deserialize implementation of the type we want to deserialize by calling deserializer.deserialize_str(HideTypeVisitor).
Of course, a HashMap is not required here; a match expression can handle the conversion between the type and its string form directly. Still, the HashMap approach feels more natural, and it means we no longer need to maintain separate forward and reverse mappings by hand.
In general, implementing custom serialization and deserialization in Golang is relatively straightforward and brute-force, and the entire interface definition is very simple. serde in Rust has far more features and functionality, and the first thing you must understand is the serde data model. Golang works with Go types directly, and its default implementation binds Go types to the format: []byte data, for example, is converted to a base64 string by Golang's JSON encoder. In serde, the data mapping goes through the serde data model instead.
serde itself is a framework and is not responsible for the format implementations, which are provided by separate crates such as serde_json, serde_yaml and so on. So the advantage of serde is that the traits are unified, while in Go each format defines its own: the official implementation for JSON, a third-party implementation for YAML, and so on.
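The []byte behaviour mentioned above is easy to confirm with encoding/json. In this small sketch, the payload struct and the encodePayload helper are hypothetical; they hold the same content once as []byte and once as string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload is a hypothetical struct with one []byte and one string field.
type payload struct {
	Bytes []byte `json:"bytes"`
	Text  string `json:"text"`
}

// encodePayload marshals the same content as []byte and as string.
func encodePayload(s string) string {
	out, err := json.Marshal(payload{Bytes: []byte(s), Text: s})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(encodePayload("hi")) // {"bytes":"aGk=","text":"hi"}
}
```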
For example, a third-party YAML library and the official JSON library each define their own Unmarshaler interface.
Although the naming and parameters look similar, they are actually different, and of course there is no official specification that says they should be the same. Suppose instead that the Unmarshaler interfaces for both JSON and YAML were defined as Unmarshal([]byte) error. Since Go interfaces are satisfied structurally (duck typing), any type with that method would satisfy both, and a type assertion could wrongly conclude that a type written for one format also supports the other.
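The false assertion can be sketched as follows. Both interfaces here are hypothetical and deliberately share the same method signature; JSONOnly is a made-up type that was written with JSON input in mind only.

```go
package main

import "fmt"

// Hypothetical interfaces: both formats pick the same method signature.
type jsonUnmarshaler interface {
	Unmarshal([]byte) error
}

type yamlUnmarshaler interface {
	Unmarshal([]byte) error
}

// JSONOnly is written with JSON input in mind only.
type JSONOnly struct{ Raw string }

func (j *JSONOnly) Unmarshal(data []byte) error {
	j.Raw = string(data)
	return nil
}

// supportsJSON reports whether v satisfies the hypothetical JSON interface.
func supportsJSON(v interface{}) bool {
	_, ok := v.(jsonUnmarshaler)
	return ok
}

// supportsYAML reports whether v satisfies the hypothetical YAML interface.
func supportsYAML(v interface{}) bool {
	_, ok := v.(yamlUnmarshaler)
	return ok
}

func main() {
	v := &JSONOnly{}
	fmt.Println(supportsJSON(v)) // true
	fmt.Println(supportsYAML(v)) // true, although the type never intended YAML
}
```

Putting the format into the method name (MarshalJSON, UnmarshalJSON) avoids exactly this ambiguity.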
Another thing is that, since serde is a framework, our custom serialization is also mapped to the framework's data model, so it is format-independent (JSON, YAML, TOML, etc.): it can be defined once and used everywhere. Golang's custom serialization, in contrast, must target one specific format (e.g. JSON). The example above would have to be implemented again for YAML, whereas serde does not have this problem.