Using logstash, ElasticSearch and log4net for centralized logging in Windows

From my archive - originally published on 6 April 2015

The ability to collate and interrogate your logs is an essential part of any distributed architecture. Windows doesn’t have much of a native story here and solutions often involve stitching together different technologies via configuration.

Given the bewildering number of technologies associated with logging it can be difficult to know where to start. The solutions tend to follow a broadly similar architecture. Generally, you separate the logging clients from agents that sit on a host and forward log messages onto a remote collection mechanism.

There are several distinct concerns in play here:

  • Logging is responsible for actually recording an event.
  • A separate collection mechanism parses and routes the events.
  • The storage medium persists messages routed by the collection mechanism.
  • An analysis engine allows you to make sense of the logs.

LogStash acts as a collection mechanism where a system of plug-ins allows you to connect up a wide range of different data sources and destinations (a minimal configuration sketch follows the list below):

  • Input plug-ins read log messages from a source – e.g. log files, Windows event logs, message queues, etc.
  • Formatters parse input – e.g. CSV, JSON, lines in a text file
  • Output plug-ins that send the log messages on to a destination – e.g. ElasticSearch or even an intermediate pipeline
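
As a rough sketch, a logstash.conf file wires these plug-in types together in three sections. The stdin input and rubydebug stdout output shown here are placeholders for illustration only:

input {
    stdin { }
}

filter {
    # filters that parse and enrich each event go here
}

output {
    stdout { codec => rubydebug }
}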

Typically a “vanilla” setup would involve LogStash instances running on servers that read log files generated by log4net, parse the events and forward them to ElasticSearch for storage. An analysis tool such as Kibana can be used to throw up some visualisations and a dashboard.

This “ELK stack” combination is by no means your only choice. You can write to a different data source such as MongoDB. You can buffer your output in one of many different message queues or even something like Redis. The point is that by separating logging, collection, storage and analysis you buy yourself a great deal of flexibility.

Installing ElasticSearch as a storage medium

Firstly, you’ll need to make sure that you have Java installed and that your JAVA_HOME environment variable is set to the root directory of your java installation.
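
For example, assuming Java is installed under C:\Program Files\Java\jre7 (adjust the path to match your own installation), you can set the variable machine-wide from an elevated command prompt:

setx JAVA_HOME "C:\Program Files\Java\jre7" /M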

Production environments for ElasticSearch are set up as clusters and generally run on Linux. That said, it’s easy enough to set it up as a Windows service for a quick and dirty proof of concept; just don’t expect it to stay healthy for very long. Download and unzip the latest release and execute the following command in the installation’s bin directory:

service.bat install
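
The same batch file can also be used to start, stop or remove the service, e.g.

service.bat start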

You should have a new service running called “Elasticsearch 1.4.4 (elasticsearch-service-x64)” (or similar) but you can check that it’s ready to receive data by navigating to http://localhost:9200/. This should yield a JSON response that looks something like this:

{
  "status" : 200,
  "name" : "Bloodshed",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}

In more recent versions of ElasticSearch (i.e. v 2.0 onwards) you’ll need to adjust the configuration to open it up to the outside world as it only listens to localhost by default. This is done by adding the following line to your config/elasticsearch.yml file:

network.bind_host: 0

Setting up logstash as a service

Logstash can run as a standalone application, but it is best to use a service manager such as NSSM to run it as a service in Windows. This provides a little more resilience so the application is restarted if it fails.

Firstly, we create a new file called run.bat in Logstash’s bin directory to act as a startup command. This should contain the following text:

logstash.bat agent -f logstash.conf

You should also create a new directory to hold the output and error logs that are generated by the service manager.
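
For example, assuming an arbitrary location of C:\logs\nssm (this becomes the [LogPath] referred to below):

mkdir C:\logs\nssm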

Next, download and unzip the service helper NSSM – it’s a single executable file. Run the following command against it to install logstash as a service:

nssm install logstash

This will open up the NSSM dialog so you can enter the following settings for the new service (these can also be scripted from the command line, as shown in the sketch after this list):

  • In the Application tab enter the path to the run.bat file into “Path” and LogStash’s bin directory into “Startup directory”
  • In the Details tab enter something for the “Display name” and “Description” fields
  • In the I/O tab enter the paths for where you want output and error logs to be created – e.g. [LogPath]\stdout.log and [LogPath]\stderror.log
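
If you would rather script the installation than click through the dialog, NSSM can also take these settings from the command line via nssm set. A rough sketch, assuming LogStash has been unzipped to C:\logstash and the log directory created earlier is C:\logs\nssm (check nssm’s own help for the parameter names supported by your version):

nssm install logstash "C:\logstash\bin\run.bat"
nssm set logstash Description "Collects log4net log files and forwards them to ElasticSearch"
nssm set logstash AppDirectory "C:\logstash\bin"
nssm set logstash AppStdout "C:\logs\nssm\stdout.log"
nssm set logstash AppStderr "C:\logs\nssm\stderror.log"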

Configuring logstash

Inputs

Logstash is configured through a chunk of JSON-like configuration held in the logstash.conf file.

The first element is the input configuration which will use the file input plugin to read new entries generated by log4net. The entry below shows how this configuration looks for a single file.

input {
    file {
        path => "C:\logs\TestLog.log"
        type => "log4net"
        codec => multiline {
            pattern => "^(DEBUG|WARN|ERROR|INFO|FATAL)"
            negate => true
            what => previous
        }
    }
}

Note that a multiline codec is being used to handle log entries that are spread over multiple lines of text. This codec is configured to make Logstash start a new event every time it encounters a line beginning with one of log4net’s logging levels.
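
To illustrate, a hypothetical entry that includes an exception spans several lines in the log file; only the first line starts with a logging level, so the codec folds the remaining lines into the same event:

ERROR 2015-04-06 10:15:32,114 MyApp.OrderService [12] [WEBSRV01] Failed to process order
System.NullReferenceException: Object reference not set to an instance of an object.
   at MyApp.OrderService.Process(Order order)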

Filters

The next element configures the filter that converts the input to Logstash’s internal format. The example below uses Logstash’s grok filter to match the input against a regular expression:

filter {
  if [type] == "log4net" {
    grok {
      match => [ "message", "(?m)%{LOGLEVEL:level} %{TIMESTAMP_ISO8601:sourceTimestamp} %{DATA:logger} \[%{NUMBER:threadId}\]  \[%{IPORHOST:tempHost}\] %{GREEDYDATA:tempMessage}" ]
    }
    mutate {
        replace => [ "message" , "%{tempMessage}" ]
        replace => [ "host" , "%{tempHost}" ]
        remove_field => [ "tempMessage" ]
        remove_field => [ "tempHost" ]
    }
  }
}

The most important part of this configuration is the match string which is used to break up the log entry. Logstash defines more than 100 different regular expression patterns, the details of which can be found on GitHub.

The match string shown above depends on a particular conversion pattern being configured in log4net, as shown below:

<conversionPattern value="%-5level %utcdate{ISO8601} %logger [%thread] [%property{log4net:HostName}] %message %exception %newline" />
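
For context, this conversion pattern would normally sit inside a log4net file appender that writes to the path being watched by the file input above. A minimal sketch of such an appender (the appender name and rolling settings are illustrative rather than prescriptive):

<log4net>
  <appender name="RollingLogFileAppender" type="log4net.Appender.RollingFileAppender">
    <file value="C:\logs\TestLog.log" />
    <appendToFile value="true" />
    <rollingStyle value="Size" />
    <maximumFileSize value="10MB" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%-5level %utcdate{ISO8601} %logger [%thread] [%property{log4net:HostName}] %message %exception %newline" />
    </layout>
  </appender>
  <root>
    <level value="DEBUG" />
    <appender-ref ref="RollingLogFileAppender" />
  </root>
</log4net>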

You can test that the regular expression in your match string corresponds to your logging output using the Grok online debugging tool.

Note the use of “(?m)” at the beginning of the match string. This is a fix for a problem that can cause very long log entries with many carriage returns to be broken up into separate events.

The mutate filter has also been used here, mainly to ensure that the “host” and “message” properties in the output are drawn from specific parts of the log file input.

Outputs

The final part of the configuration defines the output plug-in:

output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
  }
}

This sends the formatted output to a vanilla installation of ElasticSearch running on the local host.

Checking that it works

Once you have started a configured LogStash service it should start pushing any new log entries into ElasticSearch. The easiest way to check whether this is happening is to interrogate ElasticSearch itself.

LogStash creates a new index every day with a name of the form logstash-yyyy.mm.dd. If log collection is happening then you’ll see an index matching this pattern being created and you can query it using the ElasticSearch API to check that records are being added, e.g.

GET http://localhost:9200/logstash-2015.01.12/_stats

This will yield a JSON payload in which the _all.primaries.docs.count property tells you how many log entries have been harvested by LogStash.
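
You can also run an ad-hoc search against the index to eyeball a few of the parsed events, e.g. the following (illustrative) query returns up to five entries whose level field is ERROR:

GET http://localhost:9200/logstash-2015.01.12/_search?q=level:ERROR&size=5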

Installing Kibana

There are numerous ways of extracting and querying data from ElasticSearch and you may even want to develop your own tools using the REST API. However, Kibana is a free tool that lets you interrogate data interactively and build up dashboards.

Once you have downloaded and unzipped the latest Kibana release you can set it up to run as a Windows service using NSSM in much the same way you did for LogStash, i.e. run the following command line:

nssm install kibana

Enter the following settings for the new service:

  • In the Application tab enter the full path of <root>/bin/kibana.bat into “Path” and the <root>/bin directory into “Startup directory”
  • In the Details tab enter something for the “Display name” and “Description” fields
  • In the I/O tab enter the paths for where you want output and error logs to be created – e.g. <LogPath>\stdout.log and <LogPath>\stderror.log

Once the Kibana service is running you can access it by visiting http://localhost:5601.
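
Note that Kibana expects to find ElasticSearch at http://localhost:9200 by default. If your instance lives elsewhere you can point Kibana at it by editing the config/kibana.yml file before starting the service (the elasticsearch_url setting shown below is the Kibana 4.x name; later versions use a different key), e.g. for a hypothetical remote host:

elasticsearch_url: "http://logserver01:9200"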

Kibana documentation is pretty thin on the ground and getting the data you want is largely a matter of trial and error. The following are some easy guides to getting basic dashboards up and running:

https://www.digitalocean.com/community/tutorials/how-to-use-kibana-dashboards-and-visualizations

https://blog.trifork.com/2013/11/28/use-kibana-to-analyze-your-images/