For part one of this article please refer
In my previous article, I explained how to implement 'Join' between databases in Microservices. Let us discuss another important factor of Microservice implementation, "Transaction Management".
When we talk about distributed systems, it is important to understand the CAP theorem so that we have the right expectations during system design.
CAP Theorem
The CAP theorem applies to distributed systems. It states that "It is not possible for a distributed system to simultaneously provide more than two out of the following three guarantees."
Consistency - Every read receives the most recent write or an error
Availability - Every request receives a (non-error) response, without the guarantee that it contains the most recent write
Partition Tolerance - The system continues to operate despite an arbitrary number of messages being dropped in the network between nodes
Transaction Management
We discussed the DB per Microservice pattern in my last article. One of the biggest pain points of the DB per service pattern is managing business transactions that span multiple services. A mechanism is therefore required to maintain data consistency across the different databases of the Microservices.
Let us consider an example. You have two Microservices: one is responsible for passing a debit entry in the accounts table, and the other is responsible for updating the customer balance in the Customer table. Suppose Microservice A inserts a debit entry into the customer account and then calls Microservice B. What if Microservice B fails to update the customer balance? ACID transactions apply only within the local database of a single Microservice. A distributed transaction that holds both databases until every participant commits gives up availability whenever a service or the network fails, which is exactly the trade-off the CAP theorem describes, so it is rarely practical across independently owned databases. You therefore need a mechanism by which these two Microservices communicate with each other and agree upon a protocol to ensure consistency of data in both databases.
Saga
Saga is a transaction model that is useful when transactions span across multiple services or systems. Saga breaks up distributed transactions into a sequence of local transactions where one transaction triggers the other till the transaction is complete. Each local transaction updates its local database and publishes a message or event to initiate the next local transaction in the saga. If any of the local transactions fails for any reason, then the saga executes a series of compensating transactions for preceding local transactions.
Compensating transactions are a kind of rollback script that you write to bring the data either back to its original state or to some other state that keeps the database consistent. For example, suppose T1 creates a new order in the database. If T2 then fails, the compensating transaction C1 will either delete the order or update the order status to "Canceled".
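The T1/T2/C1 idea above can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch: the names SagaStep, run_saga, and the order and credit functions are hypothetical, and a real Saga would run each step in a different service and database.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[dict], None]        # local transaction (T1, T2, ...)
    compensation: Callable[[dict], None]  # compensating transaction (C1, C2, ...)


def run_saga(steps: List[SagaStep], state: dict) -> bool:
    """Run steps in order; if one fails, compensate the completed ones in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action(state)
            completed.append(step)
        except Exception:
            # The failed step is assumed to have rolled back locally (ACID),
            # so only the preceding steps need compensation.
            for done in reversed(completed):
                done.compensation(state)
            return False
    return True


# Hypothetical local transactions for an order Saga.
def create_order(state: dict) -> None:
    state["order_status"] = "CREATED"


def cancel_order(state: dict) -> None:
    state["order_status"] = "CANCELED"


def reserve_credit(state: dict) -> None:
    if state["order_total"] > state["credit_limit"]:
        raise ValueError("insufficient credit")
    state["credit_reserved"] = state["order_total"]


def release_credit(state: dict) -> None:
    state.pop("credit_reserved", None)


ORDER_SAGA = [
    SagaStep("create_order", create_order, cancel_order),
    SagaStep("reserve_credit", reserve_credit, release_credit),
]
```

Calling run_saga(ORDER_SAGA, {"order_total": 500, "credit_limit": 100}) fails at the credit step, so the order created by the first step ends up with the status "CANCELED" instead of being left dangling.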
Consider a scenario where a user is awaiting the result of a Saga transaction, for example placing an online order. When a user initiates a transaction that in turn kicks off the first local transaction in a Saga, there are two ways in which a response can be sent back to the user.
1. Send response when Saga is complete
In this case, the original requestor waits for all the transactions to complete, and only then is a processed response sent back.
For example, when placing an online order, the order creation, inventory check, and credit check are all done in a Saga, and the response is sent back to the requestor based on the success or failure of the Saga.
Since the wait time is somewhat longer because the whole Saga has to finish first, the user experience can be improved by showing a wait popup with some sort of engaging content.
2. Send response immediately
In this case, the user gets a response immediately, irrespective of the Saga outcome. The user can be shown a message saying that the request is being processed and that the result will follow shortly.
For example, when placing an online order, the requestor is given the order id immediately and is later notified of the outcome of the order, either by polling or over WebSockets.
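A minimal sketch of the "respond immediately" approach with polling might look like the following. The function names and the in-memory status store are assumptions made for illustration only; in a real system the Saga would run asynchronously and the status would live in the order database.

```python
import uuid
from typing import Dict, List

# Hypothetical in-memory status store, standing in for the order database.
_order_status: Dict[str, str] = {}


def place_order(items: List[str]) -> str:
    """Accept the request and return an order id right away."""
    order_id = str(uuid.uuid4())
    _order_status[order_id] = "PENDING"
    # A real service would kick off the Saga asynchronously at this point.
    return order_id


def on_saga_finished(order_id: str, succeeded: bool) -> None:
    """Called when the Saga completes, successfully or not."""
    _order_status[order_id] = "CONFIRMED" if succeeded else "REJECTED"


def get_order_status(order_id: str) -> str:
    """Endpoint the client polls until the status is no longer PENDING."""
    return _order_status.get(order_id, "UNKNOWN")
```

The client keeps calling get_order_status with the id it received until the answer changes from "PENDING". With WebSockets, on_saga_finished would push the outcome to the client instead.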
I would vouch for the second approach over the first one, as it favors the availability and responsiveness of the system.
However, the right choice depends entirely on your business requirements.
Saga Patterns
A Saga can be designed using either of the following approaches:
Choreography Based - In this approach, the Saga participants figure out among themselves how to execute the Saga, and no third-party service or entity is involved. Decision making is distributed, and the business logic is embedded within the Microservices and scattered across the system.
This, however, creates tight coupling between the Saga participants: a change in one service may impact the other participants as well.
Orchestration Based - This is a much cleaner approach to implementing a Saga. In this approach, a centralized orchestrator controls the entire Saga execution and tells the participating services what to do. The services need to expose APIs, which the orchestrator invokes.
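To make the orchestration-based approach more concrete, here is a small sketch. The classes OrderService, InventoryService, and OrderSagaOrchestrator are hypothetical stand-ins, and their methods play the role of the APIs that real services would expose over the network.

```python
class OrderService:
    """Stand-in for the order Microservice and its API."""

    def __init__(self) -> None:
        self.orders = {}

    def create(self, order_id: str) -> None:
        self.orders[order_id] = "PENDING"

    def approve(self, order_id: str) -> None:
        self.orders[order_id] = "APPROVED"

    def cancel(self, order_id: str) -> None:
        self.orders[order_id] = "CANCELED"


class InventoryService:
    """Stand-in for the inventory Microservice and its API."""

    def __init__(self, stock: int) -> None:
        self.stock = stock

    def reserve(self, quantity: int) -> None:
        if quantity > self.stock:
            raise RuntimeError("out of stock")
        self.stock -= quantity


class OrderSagaOrchestrator:
    """Central coordinator: it alone knows the order of the Saga steps."""

    def __init__(self, orders: OrderService, inventory: InventoryService) -> None:
        self.orders = orders
        self.inventory = inventory

    def place_order(self, order_id: str, quantity: int) -> str:
        self.orders.create(order_id)
        try:
            self.inventory.reserve(quantity)
        except RuntimeError:
            # Compensate the step that already succeeded.
            self.orders.cancel(order_id)
            return "CANCELED"
        self.orders.approve(order_id)
        return "APPROVED"
```

Note that the participants know nothing about each other here. Only the orchestrator knows the sequence, which is what keeps this approach easier to follow than choreography.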
Inter-Service Communication
We talked about CQRS and Saga and how they help us overcome issues like joins and distributed transactions. Implementations like these depend heavily on how communication occurs between Microservices. Communication has to be robust and fault tolerant, or else it defeats the whole purpose of implementing these patterns. Event sourcing is one such mechanism. Before we talk about event sourcing, let us first understand domain events.
Domain Events
A domain event is an activity or event that has taken place inside a domain (application) and that others need to be aware of. Those others can be part of the same domain (inbound) or outside of that domain (outbound).
Some examples of domain events are Order Creation, Out Of Stock, and User Registration. Once an event has occurred, for example once an order is created, you may need to raise an event so that the shipping and packaging department, which can be in-house or an external party, can start acting on it.
Event Sourcing
Directly delivering domain events to the concerned Microservices over HTTP or TCP may not be the right thing to do, as it creates tight coupling between Microservices and has other drawbacks, such as failed HTTP calls or an unavailable service.
Event sourcing persists the state of a business entity, such as an Order or a Customer, as a sequence of state-changing events. Whenever the state of a business entity changes, a new event is published and appended to the list of events, and callbacks are raised for the subscribers of this event. Since saving an event is a single operation, it is inherently atomic. The current state of an application can be reconstructed by replaying the events.
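The append, notify, and replay steps described above can be sketched as follows. This is an in-memory illustration only: EventStore, Event, replay_order, and the event type names are hypothetical, and a real event store would persist the events durably.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Event:
    type: str
    data: dict


class EventStore:
    """Append-only list of events with subscriber callbacks."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def append(self, event: Event) -> None:
        self._events.append(event)          # saving the event is one operation
        for callback in self._subscribers:  # then subscribers are notified
            callback(event)

    def events(self) -> List[Event]:
        return list(self._events)


def replay_order(events: List[Event]) -> dict:
    """Rebuild the current state of an order by replaying its events."""
    state = {"status": None, "items": []}
    for event in events:
        if event.type == "OrderCreated":
            state = {"status": "CREATED", "items": list(event.data["items"])}
        elif event.type == "ItemAdded":
            state["items"].append(event.data["item"])
        elif event.type == "OrderCanceled":
            state["status"] = "CANCELED"
    return state
```

The order itself is never stored as a row that gets updated. Its current state is always derived by passing the stored events to replay_order.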
Event Sourcing & Saga
Let us go back to the original Saga discussion and see how we can use event sourcing to execute Saga.
Microservices use event sourcing to publish events, aggregate event data, and communicate with other services. A Microservice publishes an event, and all interested services can subscribe to that event and get a callback when the event occurs.
In the given example, a customer places an order via the Order Microservice. After creating the order, this Microservice adds the information to the event stream, which raises a callback in the Inventory Microservice. The Inventory service then updates the inventory in the inventory database. Since the Shipping service is also a subscriber of the order creation event, it receives the update as well and initiates the shipping process. The Shipping Microservice depends on the Customer Microservice to fetch customer data such as the name and shipping address.
So now these Microservices are not calling each other directly and are decoupled. However, as you may have noticed in the above diagram, there is still coupling between the Shipping and Customer Microservices. I will leave it to you to figure out how to decouple these two Microservices (Hint: use Kafka).
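The order, inventory, and shipping flow described above could be sketched like this. The EventBus class is a hypothetical in-process stand-in for the event stream, and the module-level dictionaries stand in for the databases of the individual services.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """In-process stand-in for an event stream with subscribers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()
inventory = {"book": 10}   # owned by the Inventory service
shipments: List[str] = []  # owned by the Shipping service


def inventory_on_order_created(order: dict) -> None:
    inventory[order["item"]] -= order["quantity"]


def shipping_on_order_created(order: dict) -> None:
    shipments.append(order["order_id"])


bus.subscribe("OrderCreated", inventory_on_order_created)
bus.subscribe("OrderCreated", shipping_on_order_created)


def create_order(order_id: str, item: str, quantity: int) -> dict:
    """Order service: it only publishes and never calls the other services."""
    order = {"order_id": order_id, "item": item, "quantity": quantity}
    bus.publish("OrderCreated", order)
    return order
```

The Order service has no reference to the Inventory or Shipping handlers. Adding another subscriber to "OrderCreated" would not require any change to create_order.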
We are almost done, apart from one small but very important concern in the system.
Consider two activities that occur when an order is placed.
1. Order Creation
2. Publishing Order Event
Each of these individual operations is atomic. However, we are still unsure about the atomicity of the two together: creating the order and publishing the order event. Either or both of them can fail, irrespective of the order of execution.
What options do we have to make sure that either both 1 and 2 succeed or both fail?
We have one lifesaver here, which is called the Listen to Yourself pattern. In this pattern, the Create Order service creates an order but does not commit it. It then tries to publish the order creation to the event stream and also becomes a consumer of the order creation event. As soon as it receives the order creation event, it commits the order creation transaction in the database.
Transaction log tailing is another pattern that addresses the above issue. However, I find Listen to Yourself much cleaner.
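The Listen to Yourself pattern described above can be sketched in Python as follows. This is a simplified, in-memory illustration under a few assumptions: InMemoryBus and BrokenBus are hypothetical stand-ins for the event stream, the pending and committed dictionaries stand in for an uncommitted and a committed database transaction, and discarding the pending order when publishing fails is my own choice for the sketch.

```python
from collections import defaultdict


class InMemoryBus:
    """Hypothetical stand-in for the event stream."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)


class BrokenBus(InMemoryBus):
    """Simulates an event stream that cannot be reached."""

    def publish(self, event_type, payload) -> None:
        raise ConnectionError("event stream unavailable")


class OrderService:
    def __init__(self, bus: InMemoryBus) -> None:
        self.bus = bus
        self.pending = {}    # created but not yet committed
        self.committed = {}  # committed orders
        # The service subscribes to its own event.
        bus.subscribe("OrderCreated", self._on_order_created)

    def create_order(self, order_id: str, item: str) -> bool:
        self.pending[order_id] = {"item": item}
        try:
            self.bus.publish("OrderCreated", {"order_id": order_id})
        except ConnectionError:
            # Assumption: if publishing fails, the uncommitted order is discarded.
            self.pending.pop(order_id, None)
            return False
        return True

    def _on_order_created(self, event: dict) -> None:
        # Commit only after our own event has come back from the stream.
        order = self.pending.pop(event["order_id"], None)
        if order is not None:
            self.committed[event["order_id"]] = order
```

With InMemoryBus, the order ends up in committed only because the service received its own event. With BrokenBus, nothing is published and nothing is committed, so the two activities succeed or fail together.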
So that's it for Microservices. The pattern is a simple yet powerful way to break down your monolithic system into smaller, independent ones. However, as complexity grows, managing and running the entire ecosystem becomes a bit challenging.
Careful planning, execution, and experience will surely hit the bullseye ◎.
Happy Reading...