Centralized schemas and microservices, match made in Hell?

So you’ve ended up in the microservice swamp, or somewhere else where you need to deal with a zoo full of fractious, opinionated, distributed systems. Now you’ve found out that there’s a set of common things many of your services need to have a shared understanding about, but don’t. You’d also prefer to retain at least a bit of your dwindling sanity until the project is over. What to do?

This post is an attempt to distill into a digestible format the experiences I and my team have had during the last few years building a distributed system around centralized schema management (I’m going to just say CSM from here on.) I’m not entirely sure we were that sane to begin with, but at least the loss in cognitive coherency has been manageable. Your mileage may vary, caveat emptor, etc.

Centralized schema, in its most simplified form, means that you have a common authority who can tell every other part of your system (service, API, lambda function, database, etc.) what kinds of objects or data structures your system contains. It’s the responsibility of each service to determine what to do with that information — the assumption being, however, that whenever a service needs to communicate with other services (or the outside world) about a Thing it should use the authority-provided definition of that Thing.

Microservices kidnapped by CSM

Systems that need to work with a large set of different kinds of data objects across multiple services are the prime candidates to benefit from centralized schema management. Many systems don’t fit that description, and since CSM means giving up a portion of the flexibility that service-oriented architectures (whether micro, midi or macro) bring, it’s not a good fit for every system.

In our case we have a system that is supposed to manage all the assets of the Finnish national road network. Assets in this case meaning the roads themselves (surface material, how many lanes, …), and anything related to them such as traffic signs, surface markings, fences, off-ramps, service areas, various measurements, and so on. Altogether that adds up to roughly a hundred different types of assets. Each of them needs to be imported, stored, validated, exported to interested parties via some API, returned as search results, drawn on a map… you get the idea.

Why centralize schema if everything else is distributed?

The common wisdom around microservices is that everything needs to be decentralized. That’s how you’re supposed to reap their benefits. Unfortunately, that wisdom tends to slink away (looking slightly guilty as it goes, with those benefits in its pocket) whenever you need to have more than two of your services talk about something, because it’s a lot of work to keep all parties in agreement on what they’re talking about.

CSM is a tool and architectural pattern to manage that problem. It forces all parties interested in a given kind of data to use a standard definition of it — what properties the data has, what’s the set of allowed values for each property, which ones are mandatory, and so forth. This is normally not optimal from the viewpoint of any particular service: depending on how a service is built, it usually has a richer set of tools available than the lowest common denominator an external schema represents. For example, a TypeScript-based service would rather use its own type system for defining objects, and a relational schema defined in SQL in the fourth normal form is a thing of beauty compared to any JSON Schema document.

For systems that have just a few kinds of data, or just a few different services that deal with it, the tradeoff is likely not worthwhile. But when you have a dozen services that all need to be able to process some or all of the set of 100 data types, implementing CSM is the only way to stave off the impending madness. Even if that results in all of your services having to submit to a jackbooted overlord who controls what they may or may not say in public. (Please do not try to extrapolate this into anything human-related.)

A portrait of a CSM in three acts

The MVP

To be of any use, a CSM implementation, whether homebrew or off-the-shelf, needs to include at least the following:

We’re not in MVP land anymore

The above is fine for a workshop demo. However, it’s likely your system will need something a bit more advanced to survive in the wild. To upgrade your CSM from the “Home Edition” to “Professional”, you’ll probably need these:

Nah, man. I’m pretty friggin’ far from MVP

In for a penny, in for a pound? Why not go the whole hog? Since you’re already committed, why leave money on the table by not extracting some more value from your fancy bespoke CSM? To evolve your CSM into its final form you could…

This will quickly veer into “if all you have is a hammer” territory so let’s stop here.

Are there existing options?

Yes.

There are tools like Confluent Schema Registry, which provides tight integration with Apache Kafka but also works with anything else that can consume JSON Schema, Avro or Protobuf. I don’t have any hands-on experience with it, but it looks like it provides most, but not all, of the functionality described above.

In any case, we rolled our own because we could not find anything fitting our requirements. Besides, this post is already too long to include a market survey.

How we did it (and why)

As I said above, in our project (Velho, for the Finnish Transport Infrastructure Agency) we ended up creating our own, custom centralized schema management solution. We did this for several reasons:

Our stack

AWS native. Containerized with AWS Fargate, FaaS with AWS Lambda. Multiple independent SQL databases (Aurora PostgreSQL). Elasticsearch. Redis. S3. AWS API Gateway. Infrastructure-as-Code via CloudFormation.

Languages: Clojure backend services. ClojureScript frontend with Reagent, Re-frame and Web Components. Lambdas in Clojure, Python and JavaScript.

What we use CSM for

Get to the point already

Our schema registry service is written in Clojure, like most of our project, and deployed as a Docker container. The schemas themselves are written as EDN files, which are more-or-less equivalent to JSON or YAML files, but with added Clojureness (including the ability to embed code). Each of the schema definitions contains the complete description of a single asset type, including

Here’s a sanitized, redacted and translated example of the schemas for Fence, which is part of the Road Furniture namespace (and therefore owned by the furniture-registry service). There are two schema versions, a transformation from v1 to v2, and some metadata. The schemas refer to two generic components (the velho/import directive) which include properties defined elsewhere, and there’s a property whose type is an enum schema (velho/enum).

{:latest-version 2
 :versions {1 (velho/import
                   [:general/basic-props
                    :location/linear-location]
                   {:properties {:code string?
                                 (ds/opt :material) (velho/enum :furniture/material)
                                 :type (velho/enum :furniture/type)
                                 :size (ds/maybe pos-int?)}})

            2 (velho/import
                   [:general/basic-props
                    :location/linear-location]
                   {:properties {:code string?
                                 (ds/opt :material) (velho/enum :furniture/material)
                                 :type (velho/enum :furniture/type)
                                 :height (ds/maybe pos-int?)}})}

 :transforms {1 "$merge([$, {'properties': $merge([$sift($.properties, function($v, $k) {$k != 'size'}), {'height': $.properties.size}]),
                             'version': 2}])"}

 :metadata  {:oid-prefix "1.2.246.578.5.100"
             :owner-service :furniture-registry
             :indexing true
             :name "Fence"
             :fields {:properties {:_metadata {:name "Properties" :indexing true}
                                   :code {:_metadata {:name "Code" :index true}}
                                   :material {:_metadata {:name "Material" :index true}}
                                   :type {:_metadata {:name "Fence type" :index true}}
                                   :height {:_metadata {:name "Fence height" :index true}}}}}}

We don’t serve these EDNs outside our schema registry service. EDN is a Clojure-specific format, so it’s an implementation detail, and we want to punish everyone equally. Our schemas are transformed into an OpenAPI 3 definition (whose schema objects are an extended subset of JSON Schema) and served via a REST API.

OpenAPI does not directly support all of our features (e.g. transforms and metadata) so we use extensions for them. The resulting definition is still completely valid OpenAPI, and third parties would just ignore the more esoteric stuff.
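As a rough illustration, a served schema with our custom data riding along as extensions might look something like this (the x-velho-* names and the exact layout are hypothetical, not the ones we actually use):

```yaml
# Hypothetical sketch of one schema in the served OpenAPI 3 document.
components:
  schemas:
    fence-v2:
      type: object
      properties:
        code:
          type: string
        height:
          type: integer
          nullable: true
      x-velho-metadata:          # our metadata block, as an extension
        owner-service: furniture-registry
        name: Fence
      x-velho-transforms:        # version upgrades, as JSONata strings
        "1": "$merge([$, {...}])"
```

A standard OpenAPI consumer sees a perfectly ordinary object schema; only our own clients look at the extension keys.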

Eat it up

Currently we consume our schemas only from Clojure or ClojureScript code. We do this by…

  1. Fetching the OpenAPI definition via our REST API
  2. Translating it back into the same EDN format as seen above
    • At this stage we also extract our custom stuff from the OAS extensions.
  3. Processing our custom extensions (import and enum)
  4. Feeding the processed and evaluated schemas to Data Spec, which…
  5. … ends up registering the schemas as Clojure specs.
  6. The resulting specs are then used in our code, both backend and frontend, in the usual Clojure/CLJS fashion to validate and coerce incoming data.

It’s not a coincidence that our “native” data format is so close to Data Spec already. Hooray for dynamic languages and runtime eval.
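For readers outside the Clojure world, the overall shape of that pipeline can be sketched in Python. Everything here is an illustrative stand-in, not our actual code: the x-velho-import extension name is hypothetical, and the “validator” stands in for registering a Clojure spec.

```python
# Rough Python rendition of the consumption pipeline described above.
import json
import urllib.request

def fetch_openapi(url):
    """Step 1: fetch the OpenAPI definition from the registry's REST API."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def resolve_schema(schema, components):
    """Steps 2-3: inline properties imported from shared component schemas
    (our velho/import directive, carried in a hypothetical extension key)."""
    props = {}
    for ref in schema.get("x-velho-import", []):
        props.update(components[ref].get("properties", {}))
    props.update(schema.get("properties", {}))
    return {**schema, "properties": props}

def compile_validator(schema):
    """Steps 4-6: turn the resolved schema into a callable check
    (a stand-in for registering and using a Clojure spec)."""
    required = set(schema.get("required", []))
    def validate(obj):
        return required <= set(obj)
    return validate

# Example with inline data instead of a live registry:
components = {"basic-props": {"properties": {"oid": {"type": "string"}}}}
fence = {"x-velho-import": ["basic-props"],
         "properties": {"code": {"type": "string"}},
         "required": ["oid", "code"]}
valid = compile_validator(resolve_schema(fence, components))
```

The real pipeline of course does far more (type coercion, enum resolution, the full spec machinery), but the fetch-resolve-compile flow is the same.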

Yes, we do runtime consumption of schemas and our services (both frontend and backend) can handle schemas that change on-the-fly.

About those transforms…

More than meets the eye, isn’t it?

As I alluded to somewhere above, transformations between schema versions are an issue, primarily because we can’t really define them in a language/platform-independent way. (Unless we go full XML Schema in which case XSLT would work. But we don’t want to. Never go full XML.) Fortunately we have a good-enough solution in JSONata which is an expression language for querying and transforming JSON-like data. It has implementations for Java, JavaScript, Python and .NET (at least), covering the common platforms nicely.

It must be said that JSONata is far from perfect. The various implementations differ in the set of features they support, and this is not really documented anywhere.

Woman angry at a cat who’s not using XSLT

In the example above, the JSONata transform takes a version 1 object, adds a key properties.height which is set to the value of the properties.size key, and removes the now-unnecessary size. It additionally sets the version property to equal 2, as is good and correct for a version 2 object. The version property itself is imported from the general/basic-props component schema alongside many other properties, so it is not visible in the example.
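To make the transform’s effect concrete, here is the same logic as a plain Python sketch. The real system evaluates the JSONata expression itself, and the sample object below is made up:

```python
# Plain-Python equivalent of the v1 -> v2 Fence transform:
# rename properties.size to properties.height and bump the version.
def upgrade_fence_v1_to_v2(obj):
    # Copy properties, dropping the old "size" key...
    props = {k: v for k, v in obj["properties"].items() if k != "size"}
    # ...and carry its value over under the new name.
    props["height"] = obj["properties"]["size"]
    return {**obj, "properties": props, "version": 2}

fence_v1 = {"version": 1,
            "properties": {"code": "F-123", "size": 120}}
fence_v2 = upgrade_fence_v1_to_v2(fence_v1)
# fence_v2 == {"version": 2, "properties": {"code": "F-123", "height": 120}}
```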

An astute reader would at this point remember that JSON Schema and OpenAPI do not have support for these kinds of transformations. That’s entirely correct — we have custom consumer-side code to run them, and we deliver the transformations via OpenAPI extensions. Our consumers so far have been solely Clojure or ClojureScript-based, so we only have client code for browsers (JSONata/JS used from ClojureScript) and the JVM (JSONata/Java via interop from Clojure).

Can I play with your toys?

Hopefully yes, in the near future! We’d like very much to open-source our CSM implementation but there are a few bureaucratic hurdles to overcome yet (and it needs some cleanup).

Final words

To summarize:

This post took a long time to write. I’ve attempted to be not entirely boring, and I hope you got something useful out of it.

Acknowledgments

The things described in this post are a result of a lot of teamwork. While I might have written the largest number of lines (easy enough when you end up throwing away the entire first implementation!) the good stuff wouldn’t have been possible without the rest of the Velho team. Thank you, Mikko, Kimmo and the rest — you know who you are and you’re awesome! The dumb parts are my own, except for that meme picture, which is Mikko’s.