Flume-ng源码解析之Channel组件
如果还没看过Flume-ng源码解析之启动流程,可以点击Flume-ng源码解析之启动流程 查看
1 接口介绍
组件的分析顺序是按照上一篇中启动顺序来分析的,首先是Channel,然后是Sink,最后是Source,在开始看组件源码之前我们先来看一下两个重要的接口,一个是LifecycleAware ,另一个是NamedComponent
1.1 LifecycleAware
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface LifecycleAware {
public void start();
public void stop();
public LifecycleState getLifecycleState();
}
非常简单就是三个方法,start()、stop()和getLifecycleState,这个接口是flume好多类都要实现的接口,包括Flume-ng源码解析之启动流程
所中提到PollingPropertiesFileConfigurationProvider(),只要涉及到生命周期的都会实现该接口,当然组件们也是要实现的!
1.2 NamedComponent
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface NamedComponent {
public void setName(String name);
public String getName();
}
这个没什么好讲的,就是用来设置名字的。
2 Channel
作为Flume三大核心组件之一的Channel,我们有必要来看看它的构成:
@InterfaceAudience.Public
@InterfaceStability.Stable
public interface Channel extends LifecycleAware, NamedComponent {
public void put(Event event) throws ChannelException;
public Event take() throws ChannelException;
public Transaction getTransaction();
}
那么从上面的接口中我们可以看到Channel的主要功能就是put()和take(),那么我们就来看一下它的具体实现。这里我们选择MemoryChannel作为例子,但是MemoryChannel太长了,我们就截取一小段来看看
public class MemoryChannel extends BasicChannelSemantics {
private static Logger LOGGER = LoggerFactory.getLogger(MemoryChannel.class);
private static final Integer defaultCapacity = Integer.valueOf(100);
private static final Integer defaultTransCapacity = Integer.valueOf(100);
public MemoryChannel() {
}
...
}
我们又看到它继承了BasicChannelSemantics ,从名字我们可以看出它是一个基础的Channel,我们继续看看看它的实现
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class BasicChannelSemantics extends AbstractChannel {
private ThreadLocal<BasicTransactionSemantics> currentTransaction
= new ThreadLocal<BasicTransactionSemantics>();
private boolean initialized = false;
protected void initialize() {}
protected abstract BasicTransactionSemantics createTransaction();
@Override
public void put(Event event) throws ChannelException {
BasicTransactionSemantics transaction = currentTransaction.get();
Preconditions.checkState(transaction != null,
"No transaction exists for this thread");
transaction.put(event);
}
@Override
public Event take() throws ChannelException {
BasicTransactionSemantics transaction = currentTransaction.get();
Preconditions.checkState(transaction != null,
"No transaction exists for this thread");
return transaction.take();
}
@Override
public Transaction getTransaction() {
if (!initialized) {
synchronized (this) {
if (!initialized) {
initialize();
initialized = true;
}
}
}
BasicTransactionSemantics transaction = currentTransaction.get();
if (transaction == null || transaction.getState().equals(
BasicTransactionSemantics.State.CLOSED)) {
transaction = createTransaction();
currentTransaction.set(transaction);
}
return transaction;
}
}
找了许久,终于发现了put()和take(),但是仔细一看,它们内部调用的是BasicTransactionSemantics 的put()和take(),有点失望,继续来看看BasicTransactionSemantics
public abstract class BasicTransactionSemantics implements Transaction {
private State state;
private long initialThreadId;
protected void doBegin() throws InterruptedException {}
protected abstract void doPut(Event event) throws InterruptedException;
protected abstract Event doTake() throws InterruptedException;
protected abstract void doCommit() throws InterruptedException;
protected abstract void doRollback() throws InterruptedException;
protected void doClose() {}
protected BasicTransactionSemantics() {
state = State.NEW;
initialThreadId = Thread.currentThread().getId();
}
protected void put(Event event) {
Preconditions.checkState(Thread.currentThread().getId() == initialThreadId,
"put() called from different thread than getTransaction()!");
Preconditions.checkState(state.equals(State.OPEN),
"put() called when transaction is %s!", state);
Preconditions.checkArgument(event != null,
"put() called with null event!");
try {
doPut(event);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new ChannelException(e.toString(), e);
}
}
protected Event take() {
Preconditions.checkState(Thread.currentThread().getId() == initialThreadId,
"take() called from different thread than getTransaction()!");
Preconditions.checkState(state.equals(State.OPEN),
"take() called when transaction is %s!", state);
try {
return doTake();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return null;
}
}
protected State getState() {
return state;
}
...//我们这里只是讨论put和take,所以一些暂时不涉及的方法就被我干掉,有兴趣恩典朋友可以自行阅读
protected static enum State {
NEW, OPEN, COMPLETED, CLOSED
}
}
又是一个抽象类,put()和take()内部调用的还是抽象方法doPut()和doTake(),看到这里,我相信没有耐心的同学已经崩溃了,但是就差最后一步了,既然是抽象类,那么最终Channel所使用的肯定是它的一个实现类,这时候我们可以回到一开始使用的MemoryChannel,到里面找找有没有线索,一看,MemoryChannel中就藏着个内部类
private class MemoryTransaction extends BasicTransactionSemantics {
private LinkedBlockingDeque<Event> takeList;
private LinkedBlockingDeque<Event> putList;
private final ChannelCounter channelCounter;
private int putByteCounter = 0;
private int takeByteCounter = 0;
public MemoryTransaction(int transCapacity, ChannelCounter counter) {
putList = new LinkedBlockingDeque<Event>(transCapacity);
takeList = new LinkedBlockingDeque<Event>(transCapacity);
channelCounter = counter;
}
@Override
protected void doPut(Event event) throws InterruptedException {
channelCounter.incrementEventPutAttemptCount();
int eventByteSize = (int) Math.ceil(estimateEventSize(event) / byteCapacitySlotSize);
if (!putList.offer(event)) {
throw new ChannelException(
"Put queue for MemoryTransaction of capacity " +
putList.size() + " full, consider committing more frequently, " +
"increasing capacity or increasing thread count");
}
putByteCounter += eventByteSize;
}
@Override
protected Event doTake() throws InterruptedException {
channelCounter.incrementEventTakeAttemptCount();
if (takeList.remainingCapacity() == 0) {
throw new ChannelException("Take list for MemoryTransaction, capacity " +
takeList.size() + " full, consider committing more frequently, " +
"increasing capacity, or increasing thread count");
}
if (!queueStored.tryAcquire(keepAlive, TimeUnit.SECONDS)) {
return null;
}
Event event;
synchronized (queueLock) {
event = queue.poll();
}
Preconditions.checkNotNull(event, "Queue.poll returned NULL despite semaphore " +
"signalling existence of entry");
takeList.put(event);
int eventByteSize = (int) Math.ceil(estimateEventSize(event) / byteCapacitySlotSize);
takeByteCounter += eventByteSize;
return event;
}
//...依然删除暂时不需要的方法
}
在这个类中我们可以看到doPut()和doTake()的实现方法,也明白MemoryChannel的put()和take()最终调用的是MemoryTransaction 的doPut()和doTake()。
有朋友看到这里以为这次解析就要结束了,其实好戏还在后头,Channel中还有两个重要的类ChannelProcessor和ChannelSelector,耐心地听我慢慢道来。
3 ChannelProcessor
ChannelProcessor 的作用就是执行put操作,将数据放到channel里面。每个ChannelProcessor实例都会配备一个ChannelSelector来决定event要put到那个channl当中
public class ChannelProcessor implements Configurable {
private static final Logger LOG = LoggerFactory.getLogger(ChannelProcessor.class);
private final ChannelSelector selector;
private final InterceptorChain interceptorChain;
public ChannelProcessor(ChannelSelector selector) {
this.selector = selector;
this.interceptorChain = new InterceptorChain();
}
public void initialize() {
this.interceptorChain.initialize();
}
public void close() {
this.interceptorChain.close();
}
public void configure(Context context) {
this.configureInterceptors(context);
}
private void configureInterceptors(Context context) {
//配置拦截器
}
public ChannelSelector getSelector() {
return this.selector;
}
public void processEventBatch(List<Event> events) {
...
while(i$.hasNext()) {
Event optChannel = (Event)i$.next();
List tx = this.selector.getRequiredChannels(optChannel);
...//将event放到Required队列
t1 = this.selector.getOptionalChannels(optChannel);
Object eventQueue;
...//将event放到Optional队列
}
...//event的分配操作
}
public void processEvent(Event event) {
event = this.interceptorChain.intercept(event);
if(event != null) {
List requiredChannels = this.selector.getRequiredChannels(event);
Iterator optionalChannels = requiredChannels.iterator();
...//event的分配操作
List optionalChannels1 = this.selector.getOptionalChannels(event);
Iterator i$1 = optionalChannels1.iterator();
...//event的分配操作
}
}
}
为了简化代码,我进行了一些删除,只保留需要讲解的部分,说白了Channel中的两个写入方法,都是需要从作为参数传入的selector中获取对应的channel来执行event的put操作。接下来我们来看看ChannelSelector
4 ChannelSelector
ChannelSelector是一个接口,我们可以通过ChannelSelectorFactory来创建它的子类,Flume提供了两个实现类MultiplexingChannelSelector和ReplicatingChannelSelector。
public interface ChannelSelector extends NamedComponent, Configurable {
void setChannels(List<Channel> var1);
List<Channel> getRequiredChannels(Event var1);
List<Channel> getOptionalChannels(Event var1);
List<Channel> getAllChannels();
}
通过ChannelSelectorFactory 的create来创建,create中调用getSelectorForType来获得一个selector,通过配置文件中的type来创建相应的子类
public class ChannelSelectorFactory {
private static final Logger LOGGER = LoggerFactory.getLogger(
ChannelSelectorFactory.class);
public static ChannelSelector create(List<Channel> channels,
Map<String, String> config) {
...
}
public static ChannelSelector create(List<Channel> channels,
ChannelSelectorConfiguration conf) {
String type = ChannelSelectorType.REPLICATING.toString();
if (conf != null) {
type = conf.getType();
}
ChannelSelector selector = getSelectorForType(type);
selector.setChannels(channels);
Configurables.configure(selector, conf);
return selector;
}
private static ChannelSelector getSelectorForType(String type) {
if (type == null || type.trim().length() == 0) {
return new ReplicatingChannelSelector();
}
String selectorClassName = type;
ChannelSelectorType selectorType = ChannelSelectorType.OTHER;
try {
selectorType = ChannelSelectorType.valueOf(type.toUpperCase(Locale.ENGLISH));
} catch (IllegalArgumentException ex) {
LOGGER.debug("Selector type {} is a custom type", type);
}
if (!selectorType.equals(ChannelSelectorType.OTHER)) {
selectorClassName = selectorType.getChannelSelectorClassName();
}
ChannelSelector selector = null;
try {
@SuppressWarnings("unchecked")
Class<? extends ChannelSelector> selectorClass =
(Class<? extends ChannelSelector>) Class.forName(selectorClassName);
selector = selectorClass.newInstance();
} catch (Exception ex) {
throw new FlumeException("Unable to load selector type: " + type
+ ", class: " + selectorClassName, ex);
}
return selector;
}
}
对于这两种Selector简单说一下:
1)MultiplexingChannelSelector
下面是一个channel selector 配置文件
agent_foo.sources.avro-AppSrv-source1.selector.type = multiplexing
agent_foo.sources.avro-AppSrv-source1.selector.header = State
agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA = mem-channel-1
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.optional.CA = mem-channel-1 file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ = file-channel-2
agent_foo.sources.avro-AppSrv-source1.selector.default = mem-channel-1
MultiplexingChannelSelector类中定义了三个属性,用于存储不同类型的channel
private Map<String, List<Channel>> channelMapping;
private Map<String, List<Channel>> optionalChannels;
private List<Channel> defaultChannels;
那么具体分配原则如下:
- 如果设置了maping,那么会event肯定会给指定的channel,如果同时设置了optional,也会发送给optionalchannel
- 如果没有设置maping,设置default,那么event会发送给defaultchannel,如果还同时设置了optional,那么也会发送给optionalchannel
- 如果maping和default都没指定,如果有指定option,那么会发送给optionalchannel,但是发送给optionalchannel不会进行失败重试
2)ReplicatingChannelSelector
分配原则比较简单
- 如果是replicating的话,那么如果没有指定optional,那么全部channel都有,如果某个channel指定为option的话,那么就要从requiredChannel移除,只发送给optionalchannel
5 总结:
作为一个承上启下的组件,Channel的作用就是将source来的数据通过自己流向sink,那么ChannelProcessor就起到将event put到分配好的channel中,而分配的规则是由selector决定的,flume提供的selector有multiplexing和replicating两种。所以ChannelProcessor一般都是在Source中被调用。那么Channel的take()肯定是在Sink中调用的。
Flume-ng源码解析之Channel组件的更多相关文章
- Flume-ng源码解析之Sink组件
作为启动流程中第二个启动的组件,我们今天来看看Sink的细节 1 Sink Sink在agent中扮演的角色是消费者,将event输送到特定的位置 首先依然是看代码,由代码我们可以看出Sink是一个接 ...
- Flume-ng源码解析之Source组件
如果你还没看过Flume-ng源码解析系列中的启动流程.Channel组件和Sink组件,可以点击下面链接: Flume-ng源码解析之启动流程 Flume-ng源码解析之Channel组件 Flum ...
- rest-framework源码解析和自定义组件----版本
版本 url中通过GET传参自定义的版本 12345678910111213141516171819202122 from django.http import HttpResponsefrom dj ...
- Spring源码解析系列汇总
相信我,你会收藏这篇文章的 本篇文章是这段时间撸出来的Spring源码解析系列文章的汇总,总共包含以下专题.喜欢的同学可以收藏起来以备不时之需 SpringIOC源码解析(上) 本篇文章搭建了IOC源 ...
- [源码解析] 并行分布式任务队列 Celery 之 EventDispatcher & Event 组件
[源码解析] 并行分布式任务队列 Celery 之 EventDispatcher & Event 组件 目录 [源码解析] 并行分布式任务队列 Celery 之 EventDispatche ...
- .Net Core缓存组件(Redis)源码解析
上一篇文章已经介绍了MemoryCache,MemoryCache存储的数据类型是Object,也说了Redis支持五中数据类型的存储,但是微软的Redis缓存组件只实现了Hash类型的存储.在分析源 ...
- .Net Core缓存组件(MemoryCache)源码解析
一.介绍 由于CPU从内存中读取数据的速度比从磁盘读取快几个数量级,并且存在内存中,减小了数据库访问的压力,所以缓存几乎每个项目都会用到.一般常用的有MemoryCache.Redis.MemoryC ...
- admin源码解析以及仿照admin设计stark组件
---恢复内容开始--- admin源码解析 一 启动:每个APP下的apps.py文件中. 首先执行每个APP下的admin.py 文件. def autodiscover(): autodisco ...
- admin源码解析及自定义stark组件
admin源码解析 单例模式 单例模式(Singleton Pattern)是一种常用的软件设计模式,该模式的主要目的是确保某一个类只有一个实例存在.当你希望在整个系统中,某个类只能出现一个实例时,单 ...
随机推荐
- Delphi流的操作
一.流的概念 流简单说是建立在面向对象基础上的一种抽象的处理数据的工具,它定义了一些处理数据的基本操作,如读取数据,写入数据等,程序员只需掌握对流进行操作,而不用关心流的另一头数据的真正流向.其实,流 ...
- Java Swing JScrollPane 设置滚动量
JScrollPane.getVerticalScrollBar().setUnitIncrement(20); 参考:http://bbs.csdn.net/topics/320249228
- 基于Doubango的iOS客户端开源框架
一.ios-ngn-statck工程 1.Tests ---功能测试 2.底层模块(c和c++) Doubango --- 基于3GPP IMS/RCS 并能用于嵌入式和桌面系统的开源框架 1) ti ...
- 蓝桥杯java试题《洗牌》
问题描述 小弱T在闲暇的时候会和室友打扑克,输的人就要负责洗牌.虽然小弱T不怎么会洗牌,但是他却总是输. 渐渐地小弱T发现了一个规律:只要自己洗牌,自己就一定会输.所以小弱T认为自己洗牌不够均匀,就独 ...
- 如何在Crystal框架项目中内置启动MetaQ服务?
当Crystal框架项目中需要使用消息机制,而项目规模不大.性能要求不高时,可内置启动MetaQ服务器. 分步指南 项目引入crystal-extend-metaq模块,如下: <depende ...
- Apple官方IOS开发入门教程[v0.2]
今天,又跑去找IOS开发入门教程了,结果发现没什么好的PDF. 后来发现,原来苹果官方有开发入门教程,而且写的很好.所以整理出来了,给大家分享一下. 我就不在这里贴pdf的内容了,下面有苹果官方教程的 ...
- 基于MAC OS 操作系统安装、配置mysql
$ sudo mv mysql-5.1.45-osx10.6-x86_64 /usr/local/mysql$ cd /usr/local$ sudo chown -R mysql:mysql mys ...
- ubuntu14.04 下手动安装java jdk
ubuntu14.04 下手动安装java jdk 第一步: 下载jdk.tar.gz (这里假设下载的文件名为jdk.tar.gz) 第二步: 解压 sudo tar -zxvf ./jdk.tar ...
- C++编程练习(13)----“排序算法 之 堆排序“
堆排序 堆是具有下列性质的完全二叉树:每个结点的值都大于或等于其左右孩子结点的值,称为大顶堆(也叫最大堆):或者每个结点的值都小于或等于其左右孩子结点的值,称为小顶堆(也叫最小堆). 最小堆和最大堆如 ...
- HUST 1584 摆放餐桌
1584 - 摆放餐桌 时间限制:1秒 内存限制:128兆 609 次提交 114 次通过 题目描述 BG准备在家办一个圣诞晚宴,他用一张大桌子招待来访的客人.这张桌子是一个圆形的,半径为R.BG邀请 ...