Apache Flume Training Course

Course Code

apacheflume

Duration

35 hours (usually 5 days including breaks)

Requirements

  • Programming experience

Overview

Apache Flume is a distributed service for collecting, aggregating, and moving event log data from multiple sources into a centralized data store.

In this instructor-led, live training, participants will gain an in-depth understanding of the fundamentals of Apache Flume.

By the end of this training, participants will be able to:

  • Enhance their knowledge of Apache Flume features
  • Understand the architecture and data flow in Apache Flume
  • Apply what they have learned to real-world use cases and scenarios
  • Use Apache Flume for collecting, combining, and transferring large amounts of log data to a centralized data store

Audience

  • Developers
  • Engineers

Format of the course

  • Part lecture, part discussion, exercises and heavy hands-on practice

Course Outline

Introduction

Understanding the Fundamentals of Apache Flume

  • About Apache Flume
  • Understanding How Flume Works
  • Overview of the Important Components of Apache Flume
  • Architecture of Apache Flume
  • Data Flow Model
  • Reliability
  • Recoverability

Setting Up Apache Flume

  • Setting up and Configuring an Agent
  • Starting an Agent
  • Using Environment Variables
  • Logging Raw Stream of Data
  • Installing Third-Party Plugins

Ingesting Data from External Resources

  • Using Avro RPC Mechanism
  • Executing Commands
  • Exploring Network Streams

Setting Up a Multi-Agent Flow

Consolidating Events into a Single Channel

Defining a Flow Multiplexer

Flow Configuration

  • Defining the Flow
  • Setting Up Individual Components
  • Adding Multiple Flows in an Agent
  • Setting Up a Multi-Tier Flow
  • Fanning Out the Flow from a Single Source to Multiple Channels

Implementing a Flume Source

  • Using Avro Source
  • Using Thrift Source
  • Using Exec Source
  • Using JMS Source
  • Using Spooling Directory Source
  • Using Taildir Source
  • Using Twitter 1% firehose Source
  • Using Kafka Source
  • Using NetCat TCP Source
  • Using NetCat UDP Source
  • Using Sequence Generator Source
  • Using Syslog TCP Source
  • Using Multiport Syslog TCP Source
  • Using Syslog UDP Source
  • Using HTTP Source
  • Using Stress Source
  • Using Legacy Sources
  • Using Custom Source
  • Using Scribe Source

Implementing a Flume Sink

  • Using HDFS Sink
  • Using Hive Sink
  • Using Logger Sink
  • Using Avro Sink
  • Using Thrift Sink
  • Using IRC Sink
  • Using File Roll Sink
  • Using Null Sink
  • Using HBaseSinks
  • Using MorphlineSolrSink
  • Using ElasticSearchSink
  • Using Kite Dataset Sink
  • Using Kafka Sink
  • Using HTTP Sink
  • Using Custom Sink

Implementing a Flume Channel Interface

  • Using Memory Channel
  • Using JDBC Channel
  • Using Kafka Channel
  • Using File Channel
  • Using Spillable Memory Channel
  • Using Pseudo Transaction Channel
  • Using a Custom Channel

Using Flume Channel Selectors

  • Using the Replicating Channel Selector
  • Using the Multiplexing Channel Selector
  • Using a Custom Channel Selector

Implementing Flume Sink Processors

  • Using the Default Sink Processor
  • Using the Failover Sink Processor
  • Using the Load Balancing Sink Processor
  • Using a Custom Sink Processor

Using Event Serializers

Using Flume Interceptors

  • Using the Timestamp Interceptor
  • Using the Host Interceptor
  • Using the Static Interceptor
  • Using the Remove Header Interceptor
  • Using the UUID Interceptor
  • Using the Morphline Interceptor
  • Using the Search and Replace Interceptor
  • Using the Regex Filtering Interceptor
  • Using the Regex Extractor Interceptor

Understanding Flume Properties

Security Configurations on Apache Flume

Monitoring and Reporting in Apache Flume

Using Tools in Apache Flume

  • Using the File Channel Integrity Tool
  • Using the Event Validator Tool

Understanding Topology Design Considerations

Handling Agent Failures

Handling Compatibility

Troubleshooting

Summary and Conclusion

Closing Remarks
