Elasticsearch, Inc., the commercial firm behind the open source Elasticsearch search engine, released version 1.4 of Logstash last week. Logstash is one of the most popular log management tools available today, though it competes in a crowded space with projects like Scribe, Flume, Chukwa, Fluentd, and Kafka.
The 1.4 release of Logstash contains a number of important improvements, the most obvious being startup that is now approximately three times faster. The new release maintains the radical emphasis on ease of use, which is a hallmark of the entire ELK (Elasticsearch, Logstash, and Kibana -- the last for reporting and visualization) stack.
Along with a quicker startup, Logstash 1.4 features an improved installation process. Version 1.4 also includes a simplified plug-in system that makes it even easier for users to tailor their Logstash installation to specific business needs, as well as redesigned Puppet modules that make it simpler to automate installation and configuration. You'll also find expanded documentation, with a new and improved getting-started guide.
The Logstash legacy
Logstash was born out of Jordan Sissel's background in devops and system administration, when he found himself constantly dealing with large numbers of log files and needed a centralized mechanism to aggregate and manage them. Logstash was originally conceived without any awareness that Elasticsearch even existed, but as Sissel puts it, "writing storage systems is boring." When he discovered Elasticsearch in 2009, it was a perfect fit to store all that log data. Sissel joined Elasticsearch in August 2013.
Over time, Logstash has grown along with the other components of the ELK stack to become part of a comprehensive platform for using log data and helping businesses gain insight into how customers are interacting with e-commerce sites, support systems, and more.
"Logstash can get data from unknown places and from any source and will clean it up, so you don't have to worry about the exact log types or reconciling different data formats," says Sissel. "We handle it all and let you slice and dice that data with Elasticsearch. Serve it up nice and pretty with a side of Kibana, and you have instant feedback on how to better please your customers and drive business success."
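To make the idea of reconciling different data formats concrete, here is a minimal Python sketch. It is not Logstash code and does not use Logstash's configuration language; the `normalize` function, the regular expression, and the field names are all assumptions made for illustration. It takes two differently shaped log lines (an Apache-style access line and a JSON application line with an assumed epoch-seconds `time` field) and turns both into events that share a common UTC `timestamp` field.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical pattern for an Apache-style access log line (illustrative only).
ACCESS_RE = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def normalize(line):
    """Turn one raw log line into a dict with a UTC ISO 8601 'timestamp'."""
    line = line.strip()
    if line.startswith("{"):
        # Assumed format: a JSON object carrying an epoch-seconds "time" field.
        event = json.loads(line)
        ts = datetime.fromtimestamp(event.pop("time"), tz=timezone.utc)
    else:
        match = ACCESS_RE.match(line)
        if match is None:
            # Keep unparseable lines, tagged, rather than silently dropping them.
            return {"message": line, "tags": ["parse_failure"]}
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        event = {
            "client": match.group("client"),
            "verb": match.group("verb"),
            "path": match.group("path"),
            "status": int(match.group("status")),
        }
    # Every event ends up with the same time field, whatever its source format.
    event["timestamp"] = ts.astimezone(timezone.utc).isoformat()
    return event
```

Once every event carries the same time field, events from unrelated sources can be indexed and queried together, which is the property the quote above describes.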
Democratizing business data
Sissel and the Elasticsearch team refer to this as "democratizing business data." ELK is especially good at dealing with "any data with an element of time associated with it," but it's not limited to log data. Almost any type of data is ultimately a candidate to be stored, analyzed, and visualized using ELK.
Of course, the idea of "democratizing access to data" raises issues related to security and access control. Elasticsearch currently does not have a native access-control facility, although it's on the road map. As Sissel explains, "We don't have it yet because security is something you can't do halfway, so we want to make sure it's very good before launching it."
In the meantime, Elasticsearch recommends implementing access control at the HTTP level using HTTP proxies and firewalls. Elasticsearch has published a documented example of this configuration.
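Separately from that documented setup, the kind of decision an HTTP-level filter makes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `is_allowed` function and its rule set are hypothetical, not taken from Elasticsearch's recommendation, and a real deployment would enforce such rules in a proxy or firewall rather than in application code. The sketch permits read-style requests, permits `POST` only for searches, and refuses anything aimed at the cluster and node administration endpoints.

```python
# Hypothetical policy for a proxy sitting in front of Elasticsearch.
READ_ONLY_METHODS = {"GET", "HEAD"}
ADMIN_PREFIXES = ("/_cluster", "/_nodes")  # administrative endpoints to hide


def is_allowed(method, path):
    """Return True if the proxy should forward this request."""
    method = method.upper()
    path = path.split("?", 1)[0]  # ignore the query string when matching

    # Never expose administrative endpoints, whatever the method.
    if any(path == p or path.startswith(p + "/") for p in ADMIN_PREFIXES):
        return False

    # Plain reads are fine.
    if method in READ_ONLY_METHODS:
        return True

    # Searches with a request body arrive as POST; allow only those.
    return method == "POST" and path.endswith("/_search")
```

With these assumed rules, a dashboard can query indices, but a request such as `DELETE /logstash-2014.03.20` never reaches the cluster.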
The rest of the world
While ELK is a powerful stack, it's not meant to be the be-all and end-all. As such, the creators have taken care to provide interoperability with the rest of the world. Logstash currently bundles output connectors for 60 or more different systems. Outputs include AWS S3 buckets, IRC, Solr, MongoDB, Redis, Riak, XMPP, and many more.
Sissel points out that Logstash can be used as part of a larger analytics workflow, such as complex event processing with Esper, Storm, or S4 -- or even batch processing with Hadoop. While Logstash does not include an HDFS output connector today, Sissel says it may arrive in the future, "if we see community demand for it."
Another case where Logstash is better used as a complement to other tools is the "document ingestion" scenario. Logstash is fundamentally an event- and log-based system, and it is an awkward fit for crawling a document repository and loading those documents into Elasticsearch. In such a scenario, a cleaner solution uses ManifoldCF or Nutch to handle "document" data, with Logstash as a peer to handle event- and log-oriented data.
Open source and support
Logstash is fully open source and licensed under the business-friendly Apache License Version 2.0 (ALv2). Source code is available on GitHub. Downloads of both Logstash and the rest of the ELK stack components are available at elasticsearch.org.
Organizations can receive support from the engineers who built Logstash and the ELK stack by subscribing to annual support offerings from Elasticsearch, Inc. ELK stack subscriptions also include free licenses to Marvel, a real-time monitoring system for ELK deployments.
Elasticsearch, Inc. has seen revenue growth of over 400 percent year over year and reached nearly 6 million downloads. Given the company's track record over the past few years -- and the history of the founders as contributors to projects like Logstash and Apache Lucene -- it's fair to expect a steady stream of innovative new products from Elasticsearch in the future.
This story, "What's new in Logstash and why you should care," was originally published by InfoWorld.