Distributed Logging

As we have seen in previous articles, one of the key characteristics of microservice-based applications is their distributed nature. Because services are spread across many different processes running in separate containers, it is significantly more difficult to troubleshoot these applications using the same methods we used with monolithic applications.
One of the most important troubleshooting resources is the application log. In a monolithic world, the log provides a chronological sequence of events that the troubleshooting engineer can step through. But what happens when your application has been decomposed into many different services, often running as replicas? Where do you begin? With microservices, we must approach logging differently.
Externalize and Centralize

As each microservice runs in a separate container, we need to externalize the storage of each service's log data. We do this for two reasons. First, container file systems are ephemeral and do not survive a restart or container crash; by writing the log data outside of the container, we retain the log information beyond the life of any container. Second, it is a generally accepted best practice to choose a common location outside of the application for persisting log data, to avoid losing data in the event of a large-scale application failure. By externalizing the data to a single location, we centralize all of the application's logging data, which significantly simplifies processing it.
Logging Data

With microservice architectures, each service should write its log data as raw text files containing JSON-encoded log messages. JSON provides a lightweight, universally readable and writable format that simplifies the production and consumption of logging events.
Every log event should include a timestamp, the service type, a log level, and a correlation id. The correlation id is populated by the first service in any chain of services, or by the API Gateway if one is used. It is passed along the chain of participating services and provides a simple key with which to trace a request's log events across multiple services.
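A minimal sketch of producing such an event might look like the following. The field names (`timestamp`, `service`, `level`, `correlation_id`) and the helper function are illustrative, not a standard:

```python
import json
import uuid
from datetime import datetime, timezone

def make_log_event(service, level, message, correlation_id=None):
    """Build a JSON log event (field names are illustrative assumptions)."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "service": service,
        "level": level,
        # The first service in the chain generates the id;
        # downstream services reuse the id they received.
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "message": message,
    })

event = make_log_event("orders-service", "INFO", "Order received")
print(event)
```

A downstream service would call `make_log_event` with the `correlation_id` it received from the caller, so that all events for one request share the same key.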
Once the JSON data has been generated and written to the local filesystem, it is picked up by a log collector.
Log Collectors

The first stage in a distributed logging pipeline is the log collector. A log collector is a daemon process running in the same container as the service; it is responsible for parsing the service's log files and transporting the logging data to a backend system. Two of the most popular log collectors are Logstash and Fluentd. Both are open source and run on Linux and Windows platforms.
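As a sketch, a Fluentd configuration for this stage tails the service's JSON log file and forwards parsed events to a backend. The file paths, tag, and host name here are illustrative, and the `elasticsearch` output assumes the `fluent-plugin-elasticsearch` plugin is installed:

```
# Tail the service's JSON-encoded log file (path is illustrative)
<source>
  @type tail
  path /var/log/app/orders-service.log
  pos_file /var/log/fluentd/orders-service.pos
  tag app.orders
  <parse>
    @type json
  </parse>
</source>

# Forward parsed events to the logging backend (host is illustrative)
<match app.**>
  @type elasticsearch
  host elasticsearch.internal
  port 9200
  logstash_format true
</match>
```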
Index the Log Data

The second stage of a distributed logging pipeline indexes the incoming data to make it searchable. Searchability is commonly accomplished by inserting each field of the log event into an indexed database record. By indexing each field of the log record, we can perform complex queries against the data.
One of the most popular tools for this is Elasticsearch. Elasticsearch provides an open-source, indexed, queryable NoSQL datastore for storing and querying logging events. It is built on top of the Apache Lucene search engine and exposes a RESTful API that simplifies queries and integration with third-party applications.
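With each field indexed, tracing a request becomes a single query against the correlation id. A sketch of an Elasticsearch search body, assuming the field names used earlier (`correlation_id`, `timestamp`) and a hypothetical id value:

```json
{
  "query": {
    "term": { "correlation_id": "d9f7c2e4-1a3b-4f6e-9c8d-5e2b7a1f0c3d" }
  },
  "sort": [
    { "timestamp": { "order": "asc" } }
  ]
}
```

Sent to the `_search` endpoint of the log index, this returns every service's log events for that one request, in chronological order.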