The Principles of Isolation
When building microservices, one of the primary goals is to achieve a higher degree of isolation between the components of our application. When we talk about the Principles of Isolation, we are not referring to a rare B-side from the great late-'70s Mancunian recording artists Joy Division. Instead, we are referring to the types of isolation we hope to achieve within our services.

Isolation Dimensionality

As we move from monolithic to microservice-based architectures, we decompose the application into multiple, independently executing services. By physically separating components, we have an opportunity to introduce more isolation between them, which allows us to reduce the coupling between components and potentially increase each component's scalability.

So what do we mean by isolation? When working with microservices, we focus on four dimensions of isolation: State, Space, Time, and Failure.

Isolation of State

One of the primary characteristics of a microservice is its state. State refers specifically to a microservice's persisted data, which the service is wholly responsible for managing. Any access to this state from another service is made exclusively through the service's API; no other service reads or writes the service's database directly. This firewall prevents data from being modified without the owning service being aware of the change. Viewed from the perspective of a monolithic architecture, this may seem capricious, wasteful, and an unnecessary administrative overhead, but in a microservice architecture it provides an external firewall for the service's persistent state.

By firewalling the service's data, we allow each service to evolve how it represents state without regard to its consuming services. From the consuming service's perspective, the data exists only as it is represented through the API. This isolation allows not only the structure of the persisted data to change but also the mechanism by which the data is persisted. Each service now has the freedom to migrate its data to a different persistence technology if that change benefits the service. By isolating the internal representation and mechanism from the consumer, we no longer need coordination or consent to change the service's persistence implementation. From the consuming service's perspective, nothing will have changed (except for a possible change in the service's performance characteristics).
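As a minimal sketch of this idea, the Python below puts one service API in front of two interchangeable persistence mechanisms. All names here (InventoryService, DictStore, LogStore, the "sku" field) are hypothetical and chosen only for illustration; a real service would expose its API over the network rather than as method calls.

```python
class DictStore:
    """Hypothetical store: keeps the current quantity per SKU in a dict."""

    def __init__(self):
        self._rows = {}

    def load(self, sku):
        return self._rows.get(sku, 0)

    def save(self, sku, quantity):
        self._rows[sku] = quantity


class LogStore:
    """Hypothetical store: an append-only log; the latest entry for a SKU wins."""

    def __init__(self):
        self._log = []

    def load(self, sku):
        # Scan from the newest entry backwards.
        for logged_sku, quantity in reversed(self._log):
            if logged_sku == sku:
                return quantity
        return 0

    def save(self, sku, quantity):
        self._log.append((sku, quantity))


class InventoryService:
    """The only public surface; consumers never touch the store directly."""

    def __init__(self, store):
        self._store = store  # private persistence mechanism

    def add_stock(self, sku, amount):
        quantity = self._store.load(sku) + amount
        self._store.save(sku, quantity)
        return {"sku": sku, "quantity": quantity}

    def get_stock(self, sku):
        return {"sku": sku, "quantity": self._store.load(sku)}
```

A consumer calling add_stock and get_stock receives the same responses whether the service was built with DictStore or LogStore, which is the sense in which the persistence mechanism can change without coordination.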

Isolation of Space

Another important dimension of isolation is space. Space refers to the location in which the service is deployed. In monolithic applications, all components are deployed together and execute as a single process. The deployment strategy changes radically with microservice architectures: services are deployed independently, and each executes within its own process.

While this strategy increases deployment and administrative overhead, it also provides the isolation necessary to allow each service to be managed independently. A service that can be started and stopped independently of the other services is a service that can be re-deployed without taking the entire application offline. This ability to manage application components (services) independently allows defect fixes and new features to be deployed as soon as they are ready. New features reach production quickly, and application stability can improve because fixes are deployed in a timely manner.

The last aspect of spatial isolation is a service's physical location. Since a service no longer lives within the same process as its consuming services, there is no longer a requirement that all services be deployed to the same host. In fact, since all inter-service communication takes place over a network connection, services no longer even need to be co-located in the same data center.
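One way to picture this location independence is a lookup table from logical service names to network addresses. The sketch below is illustrative only: SERVICE_REGISTRY, the endpoint helper, and every address in it are made up, and real deployments typically rely on DNS or a service-discovery system instead of a hard-coded dict.

```python
from urllib.parse import urljoin

# Hypothetical registry: logical service name -> base URL.
SERVICE_REGISTRY = {
    "inventory": "http://10.0.1.15:8080/",
    "billing": "http://billing.eu-west.example.internal/",
}


def endpoint(service, path):
    """Build the URL a consumer would call for a given service and path."""
    return urljoin(SERVICE_REGISTRY[service], path)


# Consumers refer to the service by name only.
before = endpoint("inventory", "stock/widget")

# Moving the service to another host or data center changes only the registry.
SERVICE_REGISTRY["inventory"] = "http://inventory.us-east.example.internal/"
after = endpoint("inventory", "stock/widget")
```

The consuming code is identical before and after the move; only the registry entry differs.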

Isolation of Failure

In a monolithic architecture, it is quite common for a single failing component to bring down an entire application. A single uncaught exception can crash the monolithic application's single OS process, taking the application offline.

Since each service in a microservice architecture executes independently, the same uncaught exception will still cause the service to crash and become unavailable to its consuming services, but it no longer crashes the entire application. When a service fails, the application continues to run (in a degraded state).

When designing an application, we want to avoid any propagation of failure. Microservices-based applications often employ the Bulkhead Pattern to both isolate and mitigate failures. Failure propagation is avoided by replicating similar services into service pools which provide a degree of both fault-tolerance and scalability. If a service instance crashes, the remaining service replicas absorb the increased load until a new replica replaces the crashed instance. Additionally, these service pools can be scaled up and down elastically in response to changing load.
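The service-pool behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: Replica, ServicePool, and ReplicaDown are hypothetical names, health is a simple flag, and routing is plain round-robin. Production systems usually delegate this to a load balancer or orchestrator.

```python
import itertools


class ReplicaDown(Exception):
    """Raised when a request reaches a crashed replica."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ReplicaDown(self.name)
        return f"{self.name} handled {request}"


class ServicePool:
    """A pool of interchangeable replicas of one service."""

    def __init__(self, size):
        self._ids = itertools.count(1)
        self.replicas = [self._spawn() for _ in range(size)]
        self._next = 0

    def _spawn(self):
        return Replica(f"replica-{next(self._ids)}")

    def call(self, request):
        # Round-robin over the pool, trying each replica at most once.
        for _ in range(len(self.replicas)):
            replica = self.replicas[self._next % len(self.replicas)]
            self._next += 1
            try:
                return replica.handle(request)
            except ReplicaDown:
                continue  # the failure stays inside the pool
        raise RuntimeError("no healthy replicas")

    def replace_crashed(self):
        # Swap every crashed replica for a freshly spawned one.
        self.replicas = [r if r.healthy else self._spawn() for r in self.replicas]
```

If one replica is marked unhealthy, call keeps returning results from the remaining replicas, and replace_crashed restores the pool to full size. The caller only sees an error when every replica is down.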

Isolation of Time

As we have already mentioned, monolithic architectures execute all of the application's constituent components within the same OS process. Since these components are co-located, they are naturally able to invoke each other directly and generally follow a synchronous call pattern. While this simplifies the implementation, it forces each caller to wait until the call has completed before it can continue with other work.

Microservices generally limit synchronous calls to services that respond quickly. Long-running service calls should be made asynchronously. The caller then no longer sits idle waiting for the long-running service's response, which improves resource utilization in the caller.

Asynchronous operation is usually accomplished through the introduction of a message bus. The calling service sends a message to the message bus, which is responsible for delivering it to the receiving service. By handing the message to the message bus, we decouple the caller from the receiver. When the receiving service has finished its work, it returns any response to the caller in the same fashion (asynchronously).
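The request/response flow over a message bus can be approximated in a single process with Python's asyncio queues. This is only a sketch: the two asyncio.Queue objects stand in for the bus, and report_service, caller, and the message fields are hypothetical names. A real system would use a networked broker between separate processes.

```python
import asyncio


async def report_service(requests, replies):
    """Receiver: takes requests off the bus and publishes replies."""
    while True:
        request = await requests.get()
        await asyncio.sleep(0.05)  # stand-in for long-running work
        await replies.put({"id": request["id"], "result": request["n"] ** 2})


async def caller():
    requests = asyncio.Queue()  # bus channel: caller -> service
    replies = asyncio.Queue()   # bus channel: service -> caller
    worker = asyncio.create_task(report_service(requests, replies))
    timeline = []

    # Hand the message to the bus; this does not wait for the work to finish.
    await requests.put({"id": 1, "n": 7})
    timeline.append("request sent")

    # The caller is free to do something else in the meantime.
    timeline.append("caller does other work")

    # The response arrives later, also via the bus.
    reply = await replies.get()
    timeline.append(f"reply received: {reply['result']}")

    worker.cancel()
    return timeline


timeline = asyncio.run(caller())
```

The caller never invokes report_service directly; it only knows about the queues, which is the decoupling the message bus provides.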

Eventual consistency

Another aspect of temporal isolation relates to data. Since data is no longer shared directly between services, and there is no service-spanning ACID transaction mechanism, we may need to relax our data consistency model. Any attempt to implement a total consistency model would require a central coordinator responsible for ensuring that all participating services' data is in sync. As the number of participating services grows, the time required to ensure complete synchronization increases to the point at which the application no longer meets its responsiveness requirements.

Instead, we embrace the notion that there will be times when the data replicated between services is temporarily out of sync. Microservice architectures often employ Eventual Consistency to address this. Eventual Consistency is a consistency model that provides a guarantee that over time, all replica data will converge to a consistent state. This approach balances service availability with data consistency.
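To make convergence concrete, here is a small sketch in which two replicas receive the same update events at different times and in different orders. It assumes a last-writer-wins rule keyed on a per-record version number, which is just one of several ways to reach eventual consistency; DataReplica, deliver, and the sample events are hypothetical.

```python
class DataReplica:
    """One service's local copy of shared data: key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def apply(self, event):
        key, version, value = event
        current_version = self.data.get(key, (0, None))[0]
        # Last-writer-wins: keep an update only if it is newer than what we hold.
        if version > current_version:
            self.data[key] = (version, value)


def deliver(replica, events):
    """Apply a batch of events in the order they happen to arrive."""
    for event in events:
        replica.apply(event)


# Events are (key, version, value) tuples.
EVENTS = [
    ("alice", 1, "a@old.example"),
    ("alice", 2, "a@new.example"),
    ("bob", 1, "b@example.com"),
]

orders, billing = DataReplica(), DataReplica()

deliver(orders, EVENTS)        # orders has seen everything
deliver(billing, EVENTS[:1])   # billing has seen only the first event
out_of_sync = orders.data != billing.data

deliver(billing, reversed(EVENTS))  # the rest arrives late and out of order
converged = orders.data == billing.data
```

Between the two deliveries the replicas disagree; once every event has reached both, they hold the same data regardless of arrival order.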

Summary

We have seen the power and flexibility that can be derived by increasing isolation within the application. By isolating application state, we restrict access to a service's data strictly to the service's API. Limiting access to the service's API allows the underlying persistence model and mechanism to be completely abstracted from the application's other services. Isolating applications in space allows us to redistribute an application's components dynamically to manage resource utilization. Isolating the application in time using asynchronous service calls eliminates the need for resources to wait for long-running requests to finish and facilitates better distribution of load across the application. Finally, by isolating the application components from failure, we can prevent a single failing component from crashing the entire application.

Coming up

Now that we have had a chance to consider the types and benefits of isolation, we will take a look at the Laws of Scalability in our next article.