[Flume Cookbook] Flume Terminology

In the second post in the series, let's examine some of the key Flume terms.
In part 1 of the series, we worked on setting up a single-node cluster. Before we dive deeper into Flume, let's look at some basic concepts that will help in understanding what follows.

Event

An Event is the byte payload that needs to be transmitted from source to destination. The byte payload is the data that the application needs to store. An Event also carries header information alongside the payload.

Flume Event
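
To make the shape of an Event concrete, here is a minimal Python sketch. It is not the Flume API (Flume itself is written in Java, where an Event exposes a map of string headers and a byte-array body); the `Event` class and the `host` header name are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Event:
    """Toy model of an Event: an opaque byte payload plus optional headers."""
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)


# The body is raw bytes; nothing in this model interprets its content.
# The "host" header is a hypothetical example of header information.
event = Event(body=b"user=42 action=login", headers={"host": "web-01"})
print(event.headers, event.body)
```

The point of the split is that routing decisions can be made from the headers without ever parsing the payload.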

Agent

The Flume Agent is central to a Flume deployment. An Agent is an independent process that hosts and manages Sources, Sinks and Channels.

Flume Agent

Source

A Source is how data is ingested into a Flume Agent. The Source may belong to the first Agent that ingests data into Flume, or to one of the intermediate Agents that relay data on its way to the end destination.

Channel

A Channel is a transient store where events are held until they are consumed by a Sink. A Channel is the link between a Source and a Sink: the Source accepts the data and pushes it to the Channel, and a Sink takes the data from the Channel and writes it either to the next hop or to the final destination. A Source can write to more than one Channel.

Sink

A Sink is responsible for consuming events from a Channel. It either forwards those events to the Source of the next Agent or, if it is the last Sink in the chain, writes them to the eventual destination, such as a file system or HDFS.
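
The Source, Channel and Sink roles described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Flume code: the class names, the `capacity` parameter and the list-based destinations are all hypothetical, and the Source here simply copies each event to every Channel it is attached to.

```python
from collections import deque


class Channel:
    """Toy in-memory channel: buffers events until a sink takes them."""

    def __init__(self, capacity=100):
        self.capacity = capacity      # hypothetical limit, for illustration
        self._events = deque()

    def put(self, event):
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self):
        # Returns the oldest buffered event, or None when the channel is empty.
        return self._events.popleft() if self._events else None


class Source:
    """Toy source: copies every accepted event to all of its channels."""

    def __init__(self, channels):
        self.channels = channels

    def accept(self, event):
        for channel in self.channels:
            channel.put(event)


class Sink:
    """Toy sink: drains one channel into a destination list."""

    def __init__(self, channel, destination):
        self.channel = channel
        self.destination = destination

    def drain(self):
        while True:
            event = self.channel.take()
            if event is None:
                break
            self.destination.append(event)


# One source feeding two channels, each drained by its own sink.
c1, c2 = Channel(), Channel()
source = Source([c1, c2])
source.accept(b"line 1")
source.accept(b"line 2")

hdfs_like, file_like = [], []     # stand-ins for real destinations
Sink(c1, hdfs_like).drain()
Sink(c2, file_like).drain()
print(hdfs_like, file_like)
```

Because the Channel sits between the two, the Source and the Sink never call each other directly; each side only needs to know about the Channel.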

Client

A Client is an implementation that resides at the point of origin of Events and can deliver Events to a Flume Agent. For example, the Flume Log4j appender acts as a Client and delivers logs to the configured Flume Agent.

Flow

Flow is the movement of Events across the Flume topology, from the Source to the eventual destination.
The picture below shows a sample flow with two Nodes/Agents.

Sample Flume Flow
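
A two-Agent flow like the one pictured can be approximated with the following Python sketch. It is only an illustration: the `Agent` class, its `deliver` callable and the list used as the destination are assumptions made for this example, and real Agents run as separate processes that talk over the network.

```python
from collections import deque


class Agent:
    """Toy agent: one source, one channel and one sink wired in sequence."""

    def __init__(self, name, deliver):
        self.name = name
        self.channel = deque()    # transient store between source and sink
        self.deliver = deliver    # callable the sink uses to pass events on

    def source(self, event):
        # Source side: accept an event and buffer it in the channel.
        self.channel.append(event)

    def run_sink(self):
        # Sink side: drain the channel and hand each event to the next hop.
        while self.channel:
            self.deliver(self.channel.popleft())


destination = []                                   # stands in for HDFS or a file
agent2 = Agent("agent2", deliver=destination.append)
agent1 = Agent("agent1", deliver=agent2.source)    # sink of agent1 feeds source of agent2

for line in (b"event-a", b"event-b"):              # a client hands events to agent1
    agent1.source(line)

agent1.run_sink()
agent2.run_sink()
print(destination)
```

Each hop only knows where to deliver next, which is what lets Agents be chained into longer flows.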

Topology

Topology is the way Flume Agents are arranged along the Flow path of Events from source to destination. A sample topology is shown below. Other topologies are possible, but we will come to them after the discussion on Sinks.

Sample Flume Topology

Reference

Flume NG Architecture

2 thoughts on “[Flume Cookbook] Flume Terminology”

  1. Hi Ashish,
    Could you Please inform me what is the use of Flume NG and how can use apache flume on window.

    Thanks
    Saurabh Sinng

  2. I am just starting with FLUME. You mentioned like single node installation of FLUME… Is there a distributed flavor? How do we install that? What are collectors, i saw them in few docs.. Do we need to physically configure flume with different physical nodes for being distributed or FLUME does it internally.
    I want to understand how FLUME is distributed….:)
