Flume topology design: number of tiers
32:+:1
x:1
x<=8
https://flume.apache.org/FlumeUserGuide.html#flume-topology-design
Flume topology design
[Topology: reasons for adding tiers: 0 - buffering, 1 - routing]
The first step in designing a Flume topology is to enumerate all sources and destinations (terminal sinks) for your data. These will define the edge points of your topology. The next consideration is whether to introduce intermediate aggregation tiers or event routing. If you are collecting data from a large number of sources, it can be helpful to aggregate the data in order to simplify ingestion at the terminal sink. An aggregation tier can also smooth out burstiness from sources or unavailability at sinks, by acting as a buffer. If you are routing data between different locations, you may also want to split flows at various points: this creates sub-topologies which may themselves include aggregation points.
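To make the fan-in idea concrete, here is a minimal sketch of how a set of sources might be spread across an aggregation tier. It is not a Flume API: the function `assign_to_aggregators`, the host names, and the round-robin policy are all assumptions chosen for illustration.

```python
from collections import defaultdict


def assign_to_aggregators(sources, aggregators):
    """Map every source to exactly one aggregator, round-robin.

    Hypothetical helper for planning only; Flume itself is wired up
    through agent configuration files, not through this function.
    """
    if not aggregators:
        raise ValueError("need at least one aggregator")
    fan_in = defaultdict(list)
    for i, src in enumerate(sources):
        fan_in[aggregators[i % len(aggregators)]].append(src)
    return dict(fan_in)


# Assumed edge points: ten web hosts feeding three aggregation nodes.
sources = [f"web{i:02d}" for i in range(10)]
aggregators = ["agg-a", "agg-b", "agg-c"]

topology = assign_to_aggregators(sources, aggregators)
for agg, members in topology.items():
    print(agg, "<-", len(members), "sources")
```

Listing the edges this way shows the fan-in each aggregation node has to absorb before any configuration is written.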
Sizing a Flume deployment
[Burstiness; events per second; bytes per second; maximum throughput per tier]
Once you have an idea of what your topology will look like, the next question is how much hardware and networking capacity is needed. This starts by quantifying how much data you generate. That is not always a simple task! Most data streams are bursty (for instance, due to diurnal patterns) and potentially unpredictable. A good starting point is to think about the maximum throughput you’ll have in each tier of the topology, both in terms of events per second and bytes per second. Once you know the required throughput of a given tier, you can calculate a lower bound on how many nodes you require for that tier. To determine attainable throughput, it’s best to experiment with Flume on your hardware, using synthetic or sampled event data. In general, disk-based channels should get tens of MB/s and memory-based channels should get hundreds of MB/s or more. Performance will vary widely, however, depending on hardware and operating environment.
Sizing aggregate throughput gives you a lower bound on the number of nodes you will need for each tier. There are several reasons to have additional nodes, such as increased redundancy and better ability to absorb bursts in load.
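The lower-bound calculation described above can be sketched as follows. All figures are assumed for illustration, not measurements: the per-node rates should come from your own experiments, and the helper names `min_nodes` and `with_headroom` are hypothetical.

```python
import math


def min_nodes(peak_events_per_s, peak_bytes_per_s,
              node_events_per_s, node_bytes_per_s):
    """Lower bound on nodes for one tier.

    The tier must satisfy both the event-rate and the byte-rate limit,
    so the bound is the larger of the two requirements.
    """
    if node_events_per_s <= 0 or node_bytes_per_s <= 0:
        raise ValueError("per-node throughput must be positive")
    by_events = math.ceil(peak_events_per_s / node_events_per_s)
    by_bytes = math.ceil(peak_bytes_per_s / node_bytes_per_s)
    return max(by_events, by_bytes, 1)


def with_headroom(lower_bound, spare_nodes=1, burst_factor=1.0):
    """Add burst capacity and spare nodes on top of the lower bound."""
    return math.ceil(lower_bound * burst_factor) + spare_nodes


# Assumed peak load for the tier: 50,000 events/s and 120 MB/s.
# Assumed measured per-node capacity: 20,000 events/s and 30 MB/s.
lower = min_nodes(50_000, 120_000_000, 20_000, 30_000_000)
planned = with_headroom(lower, spare_nodes=1, burst_factor=1.5)
print("lower bound:", lower)
print("planned nodes:", planned)
```

With these assumed numbers the byte rate is the binding constraint (4 nodes versus 3 for the event rate), which is why both dimensions are worth checking. The burst factor and spare-node count are policy choices, not values the guide prescribes.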