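StreamSets Processors Overview
A processor performs one kind of operation on data, and a pipeline can use multiple processors.
Depending on the execution mode, processors are available for standalone, cluster, and edge (agent) pipelines,
and there are also development processors that help with testing.
Standalone pipelines only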
- Record Deduplicator - Removes duplicate records.
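
To make the idea of record deduplication concrete, here is a minimal plain-Python sketch. It is not the StreamSets implementation or API: the function name `deduplicate`, the `fields` parameter, and the choice of hashing a JSON rendering of the record are all assumptions made for illustration.

```python
import hashlib
import json


def deduplicate(records, fields=None):
    """Yield (record, is_duplicate) pairs.

    Hypothetical helper: compares records on `fields` if given,
    otherwise on the whole record.
    """
    seen = set()
    for record in records:
        # Build the comparison key from selected fields or the full record.
        key_source = record if fields is None else {f: record.get(f) for f in fields}
        payload = json.dumps(key_source, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()
        is_duplicate = digest in seen
        seen.add(digest)
        yield record, is_duplicate


records = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "a"}]
unique = [r for r, dup in deduplicate(records) if not dup]
by_value = [r for r, dup in deduplicate(records, fields=["v"]) if not dup]
```

Keeping only digests rather than whole records bounds the memory used per record, at the cost of not being able to inspect what was seen earlier.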
Standalone and cluster pipelines
- Aggregator - Performs aggregations and displays the results in Monitor mode and writes the results to events when enabled. This processor does not update the records being evaluated.
- Base64 Field Decoder - Decodes Base64 encoded data to binary data.
- Base64 Field Encoder - Encodes binary data using Base64.
- Data Parser - Parses NetFlow or syslog data embedded in a field.
- Delay - Delays passing a batch to the rest of the pipeline.
- Expression Evaluator - Performs calculations on data. Can also add or modify record header attributes.
- Field Flattener - Flattens nested fields.
- Field Hasher - Uses an algorithm to encode sensitive data.
- Field Masker - Masks sensitive string data.
- Field Merger - Merges fields in complex lists or maps.
- Field Order - Orders fields in a map or list-map root field type and outputs the fields into a list-map or list root field type.
- Field Pivoter - Pivots data in a list, map, or list-map field and creates a record for each item in the field.
- Field Remover - Removes fields from a record.
- Field Renamer - Renames fields in a record.
- Field Replacer - Replaces field values.
- Field Splitter - Splits the string values in a field into different fields.
- Field Type Converter - Converts the data types of fields.
- Field Zip - Merges list data from two fields.
- Geo IP - Returns geolocation and IP intelligence information for a specified IP address.
- Groovy Evaluator - Processes records based on custom Groovy code.
- HBase Lookup - Performs key-value lookups in HBase to enrich records with data.
- Hive Metadata - Works with the Hive Metastore destination as part of the Drift Synchronization Solution for Hive.
- HTTP Client - Sends requests to an HTTP resource URL and writes the results to a field.
- JavaScript Evaluator - Processes records based on custom JavaScript code.
- JDBC Lookup - Performs lookups in a database table through a JDBC connection.
- JDBC Tee - Writes data to a database table through a JDBC connection, and enriches records with data from generated database columns.
- JSON Generator - Serializes data from a field to a JSON-encoded string.
- JSON Parser - Parses a JSON object embedded in a string field.
- Jython Evaluator - Processes records based on custom Jython code.
- Kudu Lookup - Performs lookups in Kudu to enrich records with data.
- Log Parser - Parses log data in a field based on the specified log format.
- PostgreSQL Metadata - Tracks structural changes in source data then creates and alters PostgreSQL tables as part of the Drift Synchronization Solution for PostgreSQL.
- Redis Lookup - Performs key-value lookups in Redis to enrich records with data.
- Salesforce Lookup - Performs lookups in Salesforce to enrich records with data.
- Schema Generator - Generates a schema for each record and writes the schema to a record header attribute.
- Spark Evaluator - Processes data based on a custom Spark application.
- SQL Parser - Parses SQL queries in a string field.
- Static Lookup - Performs key-value lookups in local memory.
- Stream Selector - Routes data to different streams based on conditions.
- Value Replacer (Deprecated) - Replaces existing nulls or specified values with constants or nulls.
- Whole File Transformer - Transforms Avro files to Parquet.
- XML Flattener - Flattens XML data in a string field.
- XML Parser - Parses XML data in a string field.
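
Two of the processors listed above, Field Flattener and Stream Selector, can be pictured with a short plain-Python sketch. This only illustrates the concepts and does not use any StreamSets API: the names `flatten` and `route`, the `.` separator, the stream names, and the rule that a record goes to every matching stream (or to `default` when nothing matches) are assumptions of this sketch.

```python
def flatten(value, prefix="", sep="."):
    """Flatten nested dicts and lists into a single-level dict (hypothetical helper)."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Leaf value: emit it under the accumulated field name.
        return {prefix: value}
    flat = {}
    for key, child in items:
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(child, name, sep))
    return flat


def route(record, conditions):
    """Return the names of the streams a record would go to (hypothetical helper).

    `conditions` is a list of (stream_name, predicate) pairs.
    """
    matched = [name for name, predicate in conditions if predicate(record)]
    return matched or ["default"]


record = {"a": {"b": 1, "c": [10, 20]}, "d": 2}
flat = flatten(record)

conditions = [
    ("errors", lambda r: r.get("status", 0) >= 500),
    ("slow", lambda r: r.get("latency_ms", 0) > 1000),
]
streams = route({"status": 503, "latency_ms": 1500}, conditions)
fallback = route({"status": 200, "latency_ms": 12}, conditions)
```

Expressing each condition as a named predicate keeps the routing table easy to read and extend, which mirrors how conditions are listed one per output stream in a pipeline design.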
Edge pipelines
- Expression Evaluator - Performs calculations on data. Can also add or modify record header attributes.
- Field Remover - Removes fields from a record.
- JavaScript Evaluator - Processes records based on custom JavaScript code.
- Stream Selector - Routes data to different streams based on conditions.
Test processors
- Dev Identity
- Dev Random Error
- Dev Record Creator