Sometimes when you listen to a talk, a light goes on. In this case it was more like a lighthouse.
Pieter Joost van de Sande gave a talk yesterday about micro services.
What a great talk!
Here are some short notes I took during his talk (not necessarily the things Joost said, and all references to businesses and persons are fictional).
What is the size of a micro service?
About 1000 lines of Go code at most.
So you should make a micro service based on the smallest consistent boundary.
Albert Einstein once said: make it as simple as possible but no simpler.
Joost said: make it as small as possible but not smaller.
What should NOT be a micro service?
Business entities are NOT a great base for a micro service, because they are needed by multiple micro services. Instead, use technical concepts as the base for a micro service.
Good candidates are:
Do NOT use:
There will be only one micro service responsible for the quality of a specific part of the data
Data will be replicated in many services, but there will only be one service responsible for the quality of a specific part of the data.
E.g. if the registration service fires an event saying that a user is registered with nickname bigelefant, then all other services that depend on this data will receive this specific event. They can’t dispute that the event happened or that the data in the event is correct.
It happened, so it is true!
How do you scale one micro service?
By using sharding. So all users with a nickname starting with an “A” go to registration service instance A and all users with a nickname starting with a “B” go to registration service B.
One note on sharding: you can’t shard on 2 different keys. If you want to shard the data on nickname and email (to guarantee unique nicknames and unique email addresses), you should create 3 micro services. The first shards on nickname, the second shards on email, and the third gathers the events of both and is responsible for acknowledging the uniqueness of both keys.
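The first-letter routing described above can be sketched in a few lines of Go. This is only an illustration, not the code from the talk: the function name shardFor and the catch-all shard "other" for nicknames that do not start with a letter are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// shardFor returns the name of the registration service instance that is
// responsible for a nickname, based on the nickname's first letter.
// Nicknames that do not start with a letter go to a catch-all shard
// called "other" (an assumption made for this sketch).
func shardFor(nickname string) string {
	trimmed := strings.TrimSpace(nickname)
	first, _ := utf8.DecodeRuneInString(trimmed)
	if unicode.IsLetter(first) {
		return string(unicode.ToUpper(first))
	}
	return "other"
}

func main() {
	for _, n := range []string{"alice", "Bob", "bigelefant", "42cats"} {
		fmt.Printf("%s -> registration service %s\n", n, shardFor(n))
	}
}
```

Because every nickname maps to exactly one instance, that instance alone can decide whether the nickname is still free.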
Which tool did you use for requirements gathering?
Event storming by Alberto Brandolini (http://ziobrando.blogspot.nl/2013/11/introducing-event-storming.html).
Do NOT use queues, use persistent streams
Queues destroy data. With persistent streams you can always continue where you left off.
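To make the difference with a queue concrete, here is a minimal sketch of an append-only stream in Go. It is an in-memory stand-in, not the system from the talk; the type Stream and its methods Append and ReadFrom are hypothetical names. The point is that reading never removes anything, so a consumer only has to remember an offset.

```go
package main

import "fmt"

// Stream is an in-memory stand-in for a persistent, append-only event stream.
type Stream struct {
	events []string
}

// Append adds an event at the end. Existing events are never changed or removed.
func (s *Stream) Append(event string) {
	s.events = append(s.events, event)
}

// ReadFrom returns a copy of all events at or after offset,
// plus the offset the consumer should continue from next time.
func (s *Stream) ReadFrom(offset int) ([]string, int) {
	if offset < 0 {
		offset = 0
	}
	if offset > len(s.events) {
		offset = len(s.events)
	}
	out := append([]string(nil), s.events[offset:]...)
	return out, len(s.events)
}

func main() {
	s := &Stream{}
	s.Append("UserRegistered bigelefant")

	// The consumer processes everything and remembers where it stopped.
	_, next := s.ReadFrom(0)

	// The consumer goes down; events keep arriving in the meantime.
	s.Append("UserRegistered alice")
	s.Append("UserRegistered bob")

	// The consumer comes back and continues from its stored offset.
	missed, next := s.ReadFrom(next)
	fmt.Println(missed, next)
}
```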
Micro services do not have dependencies
Well, this is not entirely true. When a micro service goes down and comes up again, it asks the event stream system to play back all events it missed during its downtime. So each micro service has a dependency on the event stream system, but there are 2 reasons why this dependency is not a problem.
First, the event stream system consists of many independent event streams, and an event stream is just an immutable (append-only) file on disk.
Second, the event streams are stored on a Google file system that is guaranteed to have 100% uptime.
There are 2 hard things in software engineering: naming things and cache invalidation
Well the last part is solved in this micro services architecture by using immutable append only files. When something is immutable you can cache it indefinitely.
What is the data size increase factor by using denormalized replicated data?
Well in our case the total factor was about 200x!
But disk space is really, really, really cheap, and the reduction in complexity is immense.
You gain a lot of flexibility and are much better prepared to scale your system.
How do you deal with configuration changes?
We use a micro service for that
The configuration service fires an event, when configuration changes. These events are received by all micro services that use that specific part of the configuration.
What is the greatest challenge, when working with a micro services architecture?
When using a micro services architecture, specific data will not be immediately updated on all micro services.
You can’t change specific data at a specific moment in time for the whole system.
Data will be updated like a wave, but eventually all micro services will get the new data.
Besides using a micro services architecture, what do you use to increase system availability?
E.g. we first ask the platinum online service for a list of interesting people who are online. This service takes geolocation and all kinds of other metrics into account. When it does not respond within 50ms, we ask the normal online service for a list of people; this service only takes gender and age into account. When that service does not respond within 100ms, Google ads are displayed, so we earn money by having our services offline :-).
How do you deal with different versions of different micro services?
We don’t have different versions!
We deploy the system as one big monolith.
All code lives in one big Git repository and we have 3 branches:
The branches A and B are for A-B testing.
We don’t use semantic versioning any more.
The system has a version, but this is just a counter.
When we deploy, the whole system is first deployed to 5% of all users and that share is steadily increased. When errors are found in the log, the system automatically rolls back the deployment.
All deployments are stored.
How do you deal with different versions of events?
We make the schema of our events backwards compatible, but sometimes this does not work.
In 2 cases we even asked the user to supply data. We wanted to add a new property and didn’t know what its default value should be for existing data. So when the user signed in, he or she was asked a question, and the answer became the default value for the new property.
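One common way to keep an event schema backwards compatible is to only add properties and to give a new property a default when an old event does not carry it. The sketch below shows that idea with Go's encoding/json package. The event UserRegistered, its Language property and the default "en" are assumptions for illustration, not details from the talk.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UserRegistered is a hypothetical event.
// Nickname has been there from the start; Language was added later.
type UserRegistered struct {
	Nickname string `json:"nickname"`
	Language string `json:"language,omitempty"`
}

// defaultLanguage is used for old events that were stored before
// the Language property existed (an assumed default for this sketch).
const defaultLanguage = "en"

// decodeUserRegistered reads both old and new versions of the event.
// json.Unmarshal leaves fields untouched when the key is absent and
// ignores unknown keys, so the pre-filled default survives for old events.
func decodeUserRegistered(data []byte) (UserRegistered, error) {
	e := UserRegistered{Language: defaultLanguage}
	if err := json.Unmarshal(data, &e); err != nil {
		return UserRegistered{}, err
	}
	return e, nil
}

func main() {
	oldEvent := []byte(`{"nickname":"bigelefant"}`)
	newEvent := []byte(`{"nickname":"alice","language":"nl"}`)

	for _, raw := range [][]byte{oldEvent, newEvent} {
		e, err := decodeUserRegistered(raw)
		fmt.Println(e.Nickname, e.Language, err)
	}
}
```

This only works when a sensible default exists. The 2 cases above are exactly the ones where it did not.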
Some people say you must not put data in an event only “keys”?
We put as much data as possible in an event, because these events are shared by the micro services. When a micro service needs a specific part of the data and only has the key of this data, it must ask another service for it, and you end up in SOA dependency hell.
Data that is of no interest to the micro service that produces the event can be of vital importance to another micro service, so we put as much data as possible in an event.
How do you deal with planning?
We don’t plan.
You as a software engineer deliver value by increasing quality of the code or by adding new features, that’s the only thing that counts.
If your manager asks when this new and great part of the system will be ready, you say: I can’t tell you.
You do requirements gathering and start to build. Then, by delivering working software, you steadily get a notion of when parts will be ready.
It’s ready, when it is ready.
How do you take on testing and documentation?
We don’t test
Well, this is not entirely true. For each micro service we describe (in Go code) which events go in and which events go out. This gives us executable code that checks whether the micro service behaves as it should, and it doubles as documentation of how the system works.
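I did not see the actual code, but a description of "events in, events out" could look something like the sketch below. All names here (Event, Service, Scenario, Check and the toy registration service) are my own assumptions, not the ones used by Joost's team.

```go
package main

import (
	"fmt"
	"reflect"
)

// Event is a hypothetical event: a name plus some data.
type Event struct {
	Name string
	Data string
}

// Service consumes one event and emits zero or more events.
type Service interface {
	Handle(in Event) []Event
}

// Scenario documents behaviour: given these events in, expect these events out.
type Scenario struct {
	Description string
	In          []Event
	Out         []Event
}

// Check feeds the scenario's input events to the service, in order,
// and compares everything the service emitted with the expected output.
func Check(svc Service, sc Scenario) error {
	var got []Event
	for _, e := range sc.In {
		got = append(got, svc.Handle(e)...)
	}
	if len(got) == 0 && len(sc.Out) == 0 {
		return nil
	}
	if !reflect.DeepEqual(got, sc.Out) {
		return fmt.Errorf("%s: got %v, want %v", sc.Description, got, sc.Out)
	}
	return nil
}

// registration is a toy service that accepts each nickname only once.
type registration struct {
	seen map[string]bool
}

func (r *registration) Handle(in Event) []Event {
	if in.Name != "RegisterRequested" {
		return nil
	}
	if r.seen[in.Data] {
		return []Event{{Name: "NicknameRejected", Data: in.Data}}
	}
	r.seen[in.Data] = true
	return []Event{{Name: "UserRegistered", Data: in.Data}}
}

func main() {
	sc := Scenario{
		Description: "a nickname can only be registered once",
		In: []Event{
			{Name: "RegisterRequested", Data: "bigelefant"},
			{Name: "RegisterRequested", Data: "bigelefant"},
		},
		Out: []Event{
			{Name: "UserRegistered", Data: "bigelefant"},
			{Name: "NicknameRejected", Data: "bigelefant"},
		},
	}
	fmt.Println(Check(&registration{seen: map[string]bool{}}, sc))
}
```

The scenario reads as documentation and runs as a check at the same time, which is the property the notes describe.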
Bugs are our highest priority!
When a bug that impacts the user is encountered, everything is dropped and the bug is tackled immediately.
When you have a bug list, you are doing something drastically wrong.