9 Tools for Log Monitoring and Processing Big Data

Log management, analysis, and processing are all important tasks when dealing with a large set of applications, because those applications are bound to produce logs containing both useful and useless messages.

You could say that nearly every application we use these days is able to produce a log file in some way, recording warnings, errors and critical messages that should be attended to immediately.
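To make those severity levels concrete, here is a minimal sketch using Python's standard `logging` module. The application name `demo_app` and the messages are made up for illustration; the log is written to an in-memory buffer instead of a real file so the example stays self-contained.

```python
import io
import logging


def build_logger(stream):
    """Return a logger that writes 'LEVEL message' lines to the given stream."""
    logger = logging.getLogger("demo_app")  # hypothetical application name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()  # stands in for a log file
log = build_logger(buffer)
log.info("service started")
log.warning("disk usage is high")
log.error("could not reach database")
log.critical("out of memory, shutting down")

# Keep only the lines that call for attention.
URGENT_LEVELS = {"WARNING", "ERROR", "CRITICAL"}
urgent = [
    line
    for line in buffer.getvalue().splitlines()
    if line.split(" ", 1)[0] in URGENT_LEVELS
]
print("\n".join(urgent))
```

The informational message is still recorded in the buffer; the filter at the end simply picks out the warning, error and critical lines, which is roughly the first thing any monitoring tool does with a log.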

It’s not the most pleasant of tasks to work through hundreds of gigabytes of log data using your own hand-built tools (which will probably end up causing errors of their own). Even so, large companies are known to build and run their own tooling rather than pay a third-party company to do all the work.
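For a sense of what a hand-built tool looks like, here is a small sketch that counts log lines by severity. It assumes every line looks like `<timestamp> <LEVEL> <message>`, which is an assumption of this example and not a standard; real logs vary a lot, and that is exactly where home-grown scripts tend to break.

```python
import io
import re
from collections import Counter

# Assumed line layout: "<timestamp> <LEVEL> <message>".
LEVEL_PATTERN = re.compile(r"^\S+\s+(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def count_levels(lines):
    """Count lines per severity level, reading one line at a time."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.match(line)
        # Lines that do not fit the assumed layout are counted separately.
        counts[match.group(1) if match else "UNPARSED"] += 1
    return counts


# An in-memory stand-in for a log file; an open file object works the same way.
sample = io.StringIO(
    "2014-05-01T10:00:01 INFO service started\n"
    "2014-05-01T10:00:02 ERROR could not reach database\n"
    "2014-05-01T10:00:03 ERROR could not reach database\n"
    "garbage line without a level\n"
)
counts = count_levels(sample)
print(dict(counts))
```

Because the function iterates line by line, it never loads the whole file into memory, but everything else (rotation, multiple servers, search, alerting) is still left for you to build, which is the gap the tools below try to fill.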

We’re not going to assume the position of a large company (a mid-sized one at most), and we will be looking at some of the most frequently used log monitoring and processing tools on the market. They’ve been around long enough to earn their place on this list, and before you criticize it for not including your favorite, just scroll to the comment box and recommend it!

1. Loggly

Loggly is a cloud-based log management service that is recommended for both beginners and experienced developers. It has a very generous free plan of 200MB per day, which will give you a glimpse of how it works and what it can do.

It has a very user-friendly dashboard, and within hours of starting to use Loggly to monitor your logs, you’ll have a better understanding of what is going on at any given time. It’s a direct “raw to visual” type of log monitoring platform.

2. Papertrail

Papertrail is known for being very easy to set up, and for offering an extensive set of features and additional tools that make log management a breeze. It has also managed to get noticed for its ability to scale to hundreds of servers.

I really enjoy their full-text search, which lets you narrow results down to exactly what you are looking for. You can also share those searches with co-workers, which makes a real difference when it comes to effective log management.

3. Logentries

Logentries couples powerful log analysis with ease of use. Designed with simplicity in mind, it allows IT operations, developers and business analysts to quickly and easily gain actionable insights from log data. It is trusted by customers around the world and analyzes billions of log events every day.

I think what stands out the most about Logentries is their real-time reporting system, as well as how easily the system integrates with the majority of the tools you already use. That includes popular platforms and frameworks like Node.js and Ruby on Rails.

4. logstash

logstash is a great way to learn more about log events, and how to log and work with that data. It provides a web interface for conducting searches on your logs, making it a very appealing choice as an open-source tool.

It recently (about 8 months ago) joined forces with Elasticsearch, a popular search server that I’ve covered in the past. It has good documentation, alongside an IRC channel where you can get answers to your questions.

5. Graylog2

Graylog2 is a field-tested open source data analytics system used and trusted all around the world. Search your logs, create charts, send reports and be alerted when something happens. It all runs on the existing JVM in your datacenter.

Graylog2: Java, Ruby, MongoDB-powered log management, monitoring, and alerting

It’s quite sophisticated for an open-source application, but not in a bad way. Graylog2 is a good fit for both small and medium-sized companies: it helps manage large volumes of incoming data without putting any strain on the end user.

6. Fluentd

Fluentd is an open source log collector/data stream processor used in production at hundreds of companies including Nintendo, Slideshare and LINE. Fluentd makes it easy to collect and store data from many places and helps developers analyze and monitor their log data. In this talk, Sada, co-founder of Treasure Data, goes over the basics of Fluentd and a couple of real-world use cases.

It’s built on top of Ruby, in just a couple of thousand lines of code, and depends mostly on its plugin ecosystem, which allows you to integrate more than 200 Ruby gems into your logging application. These plugins can help you better understand your logs, as well as extend their usability and readability.

7. Apache Flume

Flume’s sole purpose is to make the collection and aggregation of log data easy, accessible and, most importantly, reliable. It’s not a very lightweight log management tool, and it is best understood after reading the official blog post on how its architecture is structured.

All I’ve learned so far is that it’s quite popular among data scientists who work with real-time data, so that might be a good lead to follow.

8. Apache Kafka

Kafka is not strictly known as a log management application, but it does a great job of aggregating data that arrives from multiple sources into the Kafka ecosystem. Kafka is well known for being able to publish thousands upon thousands of messages and requests per second, a lot more than what other tools on the market might be able to do.

https://www.youtube.com/watch?v=MA_3fPBFBtg

LinkedIn is known for being the company that built Kafka from the ground up, a feat that many dream of, considering that multiple high-profile companies are now using Kafka for their systems management.

9. Scribe

The engineers over at Facebook took the time to build their own log server, and it goes by the name of Scribe. It was built to scale, and to aggregate logs from multiple servers at the same time. The Wikipedia page has more on this.
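The basic idea of aggregating logs from multiple servers can be sketched in a few lines of Python. To be clear, this is not how Scribe works internally; it is only a toy illustration that merges two already time-ordered log streams into one. The server names, timestamps and log lines are invented for the example.

```python
import heapq

# Two hypothetical servers, each with its own time-ordered log lines.
server_a = [
    "2014-05-01T10:00:01 web-1 GET /index 200",
    "2014-05-01T10:00:05 web-1 GET /login 500",
]
server_b = [
    "2014-05-01T10:00:02 web-2 GET /index 200",
    "2014-05-01T10:00:04 web-2 GET /about 200",
]


def timestamp_of(line):
    """Return the leading ISO 8601 timestamp, which sorts correctly as text."""
    return line.split(" ", 1)[0]


def merge_logs(*streams):
    """Merge time-ordered streams into a single time-ordered list."""
    return list(heapq.merge(*streams, key=timestamp_of))


merged = merge_logs(server_a, server_b)
for line in merged:
    print(line)
```

`heapq.merge` only looks at the next line of each stream, so the same approach works on open files without reading them fully into memory. A real aggregation server also has to deal with network failures, buffering and clock drift, which is where the dedicated tools earn their keep.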

Tools for Log Management and Analysis

I’m with you on the fact that the end of the post degraded a little bit. I suppose it’s easier to say good things about standard SaaS platforms than it is about open-source projects, which are meant to be tailored to your own specific needs.

All in all, it’s a great starting point for anyone who needs to learn more about log management, analysis and aggregation.