Installing Flume 0.9.4 Example Plugins

March 20th, 2012

As part of a project for my day job, I’ve been getting to grips with Flume. Chances are that if you’ve found this post, you’re already aware of what Flume does, but for the uninitiated:

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Hadoop’s HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.

The work that I’m doing requires me to manipulate events as they traverse a data flow. To do this I will extend Flume using its plugin functionality and a custom Decorator:

Sink decorators can add properties to the sink and can modify the data streams that pass through them. For example, you can use them to increase reliability via write ahead logging, increase network throughput via batching/compression, sampling, benchmarking, and even lightweight analytics.
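
To make the idea concrete, here is a minimal plain-Java sketch of the decorator pattern that passage describes: one sink wraps another and changes each event on its way through. This is not Flume's API. The `EventSink` interface, `CollectingSink`, `PrefixDecorator` and the use of plain strings as events are all hypothetical stand-ins chosen so the example compiles without Flume on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: none of these types are Flume classes.
class DecoratorSketch {

    // Hypothetical stand-in for a sink that receives events.
    interface EventSink {
        void append(String event);
    }

    // Terminal sink that simply collects events in memory.
    static class CollectingSink implements EventSink {
        final List<String> events = new ArrayList<>();

        @Override
        public void append(String event) {
            events.add(event);
        }
    }

    // Decorator: wraps another sink and modifies each event before
    // handing it to the wrapped sink.
    static class PrefixDecorator implements EventSink {
        private final EventSink inner;
        private final String prefix;

        PrefixDecorator(EventSink inner, String prefix) {
            this.inner = inner;
            this.prefix = prefix;
        }

        @Override
        public void append(String event) {
            inner.append(prefix + event);
        }
    }

    public static void main(String[] args) {
        CollectingSink sink = new CollectingSink();
        EventSink decorated = new PrefixDecorator(sink, "hello world! -- ");
        decorated.append("first event");
        decorated.append("second event");
        // Print what actually reached the wrapped sink.
        System.out.println(sink.events);
    }
}
```

Because the decorator implements the same interface as the sink it wraps, decorators can be stacked, which is what lets a data flow chain several of them in front of one sink.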

Flume ships with source code for two sample plugins, HelloWorld and HBaseSink. I planned to use the Decorator component of the HelloWorld plugin as the basis for my work, but following the Flume User Guide's instructions for installing the example plugins presented some problems:

  • The rpm packages provided by Cloudera do not include the sample plugin source code
  • The instructions in the Flume User Guide use ant, which requires a ‘build.xml’ file. The plugin source only includes a ‘pom.xml’, which is a Maven project file.

Not having used Java in anger for some time, I had to bring myself up to speed and work through a few issues to get up and running.
