32:+:1

x:1

x<=8

https://flume.apache.org/FlumeUserGuide.html#flume-topology-design

Flume topology design

【拓扑 分层原因 0-缓冲1-路由】

The first step in designing a Flume topology is to enumerate all sources and destinations (terminal sinks) for your data. These will define the edge points of your topology. The next consideration is whether to introduce intermediate aggregation tiers or event routing. If you are collecting data form a large number of sources, it can be helpful to aggregate the data in order to simplify ingestion at the terminal sink. An aggregation tier can also smooth out burstiness from sources or unavailability at sinks, by acting as a buffer. If you are routing data between different locations, you may also want to split flows at various points: this creates sub-topologies which may themselves include aggregation points.

Sizing a Flume deployment

【猝发 事件数 字节数 每层最大吞吐量】

Once you have an idea of what your topology will look like, the next question is how much hardware and networking capacity is needed. This starts by quantifying how much data you generate. That is not always a simple task! Most data streams are bursty (for instance, due to diurnal patterns) and potentially unpredictable. A good starting point is to think about the maximum throughput you’ll have in each tier of the topology, both in terms of events per second and bytes per second. Once you know the required throughput of a given tier, you can calulate a lower bound on how many nodes you require for that tier. To determine attainable throughput, it’s best to experiment with Flume on your hardware, using synthetic or sampled event data. In general, disk-based channels should get 10’s of MB/s and memory based channels should get 100’s of MB/s or more. Performance will vary widely, however depending on hardware and operating environment.

Sizing aggregate throughput gives you a lower bound on the number of nodes you will need to each tier. There are several reasons to have additional nodes, such as increased redundancy and better ability to absorb bursts in load.

flume topology design . tier num 分层数目的更多相关文章

  1. Flume官方文档翻译——Flume 1.7.0 User Guide (unreleased version)(一)

    Flume 1.7.0 User Guide Introduction(简介) Overview(综述) System Requirements(系统需求) Architecture(架构) Data ...

  2. 【翻译】Flume 1.8.0 User Guide(用户指南) Processors

    翻译自官网flume1.8用户指南,原文地址:Flume 1.8.0 User Guide 篇幅限制,分为以下5篇: [翻译]Flume 1.8.0 User Guide(用户指南) [翻译]Flum ...

  3. Flume interceptor 使用注意事项

    1. 在使用 Regex Filtering Interceptor的时候一个属性是excludeEvents 当它的值为true 的时候,过滤掉匹配到当前正则表达式的一行 当它的值为false的时候 ...

  4. storm的并发

    1 storm并行的基本概念 storm集群中的一个机器可以运行一个或者多个worker,对应于一个或者多个topologies. 1个worker进程运行1个或多个excutor线程.每个worke ...

  5. 直线电机设计与优化(TFLM,FSLM)论文阅读笔记3

    2.21-(2.7论文引出)傅里叶对开关磁通电机建模 Modeling of Flux Switching Permanent Magnet Machines With Fourier Analysi ...

  6. C算法编程题(七)购物

    前言 上一篇<C算法编程题(六)串的处理> 有些朋友看过我写的这个算法编程题系列,都说你写的不是什么算法,也不是什么C++,大家也给我提出用一些C++特性去实现问题更方便些,在这里谢谢大家 ...

  7. Sizing and Capacity Planning for SharePoint 2013 - Resources

    http://blogs.msdn.com/b/sanjaynarang/archive/2013/04/06/sizing-and-capacity-planning-for-sharepoint- ...

  8. 微软职位内部推荐-Software Development Engineering II

    微软近期Open的职位: Job Title: Software Development Engineering II Work Location: Suzhou, China Enterprise ...

  9. 微软职位内部推荐-Senior Development Engineer

    微软近期Open的职位: Job Title: Senior Software Development Engineering Work Location: Suzhou, China Enterpr ...

随机推荐

  1. Python远程视频监控程序

    老板由于事务繁忙无法经常亲临教研室,于是让我搞个监控系统,让他在办公室就能看到教研室来了多少人.o(>﹏<)o||| 最初我的想法是直接去网上下个软件,可是找来找去不是有毒就是收费,无奈技 ...

  2. 机器学习实战读书笔记(二)k-近邻算法

    knn算法: 1.优点:精度高.对异常值不敏感.无数据输入假定 2.缺点:计算复杂度高.空间复杂度高. 3.适用数据范围:数值型和标称型. 一般流程: 1.收集数据 2.准备数据 3.分析数据 4.训 ...

  3. NYOJ90 整数划分(经典递归和dp)

    整数划分 时间限制:3000 ms  |  内存限制:65535 KB 难度:3   描述 将正整数n表示成一系列正整数之和:n=n1+n2+…+nk,  其中n1≥n2≥…≥nk≥1,k≥1.  正 ...

  4. sudo apt-get upgrade 不成功遇到问题

    一. sudo apt-get update 和 sudo apt-get upgrade 出错:(Ubuntu更新过程被中断后的问题) Ubuntu的更新过程是先下载完源里的文件就开始执行升级,如果 ...

  5. Java---详解方法传值问题

    过程解析: 1.首先执行int[] arr={3,5,6,1,7,9,0},遇到数组先执行等式右边的,{3,5,6,1,7,9,0}会在堆内存中开辟一块空间,分成7小块,下标分别从0~6,先进行系统初 ...

  6. Ant -----ant标签和自定义任务

    随便记一下 Ant的用法吧.ant ,maven, gradle ,三个打包工具到齐了,Ant 常见标签解析,ant 自定义task . <?xml version="1.0" ...

  7. Extjs grid 单元格编辑

    实现grid勾选后出现编辑按钮,通过增加一个字段checked来控制 事件如下: selectionchange: function (thi, selected, eOpts) { for (var ...

  8. 数据结构------------------二叉查找树(BST)的java实现

    数据结构------------------二叉查找树(BST)的java实现 二叉查找树(BST)是一种能够将链表插入的灵活性和有序数组查找的高效性相结合的一种数据结构.它的定义如下: 二叉查找树是 ...

  9. Linux下快速删除输错的密码技巧(快速删除输入的命令)

    1.[Esc]+[退格键(Backspace)] 2.[Ctrl]+[U] 说明:以上两个快捷键都会删除全部输错的命令或密码. 参考: http://blog.csdn.net/u013895662/ ...

  10. http put/delete方式请求

    HttpClient使用Delete方式提交数据 1. http请求主要有以下几种方法来对指定资源做不同操作: HTTP/1.1协议中共定义了八种方法(有时也叫“动作”)来表明Request-URI指 ...