Apache Flume: Distributed Log Collection for Hadoop
The problem with HDFS and streaming data/logs
Flume configuration file overview
Starting up with "Hello World"
Interceptors, ETL, and Routing
Monitoring performance metrics
There Is No Spoon – The Realities of Real-time Distributed Data Collection
Transport time versus log time