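Storm-MongoDB Explained
I have spent the last couple of days studying Trident and hit more pitfalls than I could stand, so here is what I learned.
First, writing to the database.
Let's start with the methods the official API gives us:
// A simple mapper implementation used by the MongoInsertBolt below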
public class SimpleMongoMapper implements MongoMapper {
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return document;
}
public SimpleMongoMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
// Write to the database
String url = "mongodb://127.0.0.1:27017/test";
String collectionName = "wordcount";
MongoMapper mapper = new SimpleMongoMapper()
.withFields("word", "count");
MongoInsertBolt insertBolt = new MongoInsertBolt(url, collectionName, mapper);
Here is the update mapper from the official documentation as well:
public class SimpleMongoUpdateMapper implements MongoMapper {
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return new Document("$set", document);
}
public SimpleMongoUpdateMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
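The only difference between the two mappers above is the shape of the document they return: the insert mapper returns the selected fields directly, while the update mapper wraps them in a "$set" operator. The following is a minimal sketch of that difference using plain java.util maps in place of org.bson.Document and ITuple, so it runs without Storm or MongoDB on the classpath. The class and method names (MapperShapeSketch, toInsertDocument, toUpdateDocument) are made up for illustration and are not part of storm-mongodb.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapperShapeSketch {
    // Stand-in for SimpleMongoMapper.toDocument: copy the selected fields as-is.
    public static Map<String, Object> toInsertDocument(Map<String, Object> tuple, String... fields) {
        Map<String, Object> document = new LinkedHashMap<>();
        for (String field : fields) {
            document.put(field, tuple.get(field));
        }
        return document;
    }

    // Stand-in for SimpleMongoUpdateMapper.toDocument: same fields, wrapped in "$set".
    public static Map<String, Object> toUpdateDocument(Map<String, Object> tuple, String... fields) {
        Map<String, Object> wrapper = new LinkedHashMap<>();
        wrapper.put("$set", toInsertDocument(tuple, fields));
        return wrapper;
    }

    public static void main(String[] args) {
        // A plain map plays the role of the incoming tuple (an assumption of this sketch).
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("word", "storm");
        tuple.put("count", 3);
        System.out.println(toInsertDocument(tuple, "word", "count")); // {word=storm, count=3}
        System.out.println(toUpdateDocument(tuple, "word", "count")); // {$set={word=storm, count=3}}
    }
}
```

The "$set" wrapper is what tells MongoDB to overwrite only the listed fields on a matched document rather than replace the whole document.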
Below is my own code, from the spout through to the final store and update:
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
/**
* @author cwc
* @date 2018-06-01
* @description: Fake data factory
* @version 1.0.0
*/
public class MongodbSpout extends BaseRichSpout{
private static final long serialVersionUID = 1L;
private SpoutOutputCollector collector;
/**
* Emitted as the "word" field
*/
private static final Map<Integer, String> LASTNAME = new HashMap<Integer, String>();
static {
LASTNAME.put(0, "anderson");
LASTNAME.put(1, "watson");
LASTNAME.put(2, "ponting");
LASTNAME.put(3, "dravid");
LASTNAME.put(4, "lara");
}
/**
* Emitted as the second field (declared as "hello" in declareOutputFields)
*/
private static final Map<Integer, String> COMPANYNAME = new HashMap<Integer, String>();
static {
COMPANYNAME.put(0, "abc");
COMPANYNAME.put(1, "dfg");
COMPANYNAME.put(2, "pqr");
COMPANYNAME.put(3, "ecd");
COMPANYNAME.put(4, "awe");
}
public void open(Map conf, TopologyContext context,
SpoutOutputCollector spoutOutputCollector) {
this.collector = spoutOutputCollector;
}
public void nextTuple() {
final Random rand = new Random();
int randomNumber = rand.nextInt(5);
this.collector.emit(new Values(LASTNAME.get(randomNumber), COMPANYNAME.get(randomNumber)));
System.out.println("Data incoming!");
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// The number of fields declared here must match the number of values emitted above
declarer.declare(new Fields("word","hello"));
}
}
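The comment in declareOutputFields above points at an easy mistake: the number of declared fields has to equal the number of values passed to emit. A small sketch of a fail-fast check for that rule follows. FieldArityCheck and requireSameArity are hypothetical names for illustration, not Storm API, and the sketch uses plain lists instead of Fields and Values so it runs on its own.

```java
import java.util.Arrays;
import java.util.List;

public class FieldArityCheck {
    // Hypothetical helper: fail fast when emitted values and declared fields disagree.
    public static void requireSameArity(List<String> declaredFields, List<Object> values) {
        if (declaredFields.size() != values.size()) {
            throw new IllegalArgumentException("Declared " + declaredFields.size()
                    + " fields but emitted " + values.size() + " values");
        }
    }

    public static void main(String[] args) {
        List<String> declared = Arrays.asList("word", "hello");
        // Two values for two fields: passes silently.
        requireSameArity(declared, Arrays.<Object>asList("anderson", "abc"));
        try {
            // One value for two fields: rejected.
            requireSameArity(declared, Arrays.<Object>asList("anderson"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Declared 2 fields but emitted 1 values
        }
    }
}
```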
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.tuple.ITuple;
import org.bson.Document;
/**
* @author cwc
* @date 2018-06-01
* @description:
* @version 1.0.0
*/
public class SimpleMongoMapper implements MongoMapper {
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return document;
}
public SimpleMongoMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.tuple.ITuple;
import org.bson.Document;
/**
* @author cwc
* @date 2018-06-05
* @description: Mapper used for updating data
* @version 1.0.0
*/
public class SimpleMongoUpdateMapper implements MongoMapper {
private static final long serialVersionUID = 1L;
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return new Document("$set", document);
}
public SimpleMongoUpdateMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
/**
* @author cwc
* @date 2018-05-30
* @description: Prints the data it receives
* @version 1.0.0
*/
public class MongoOutBolt extends BaseRichBolt{
private static final long serialVersionUID = 1L;
private OutputCollector collector;
@Override
public void execute(Tuple tuple) {
String str = tuple.getString(0);
// String strs = tuple.getString(1);
System.err.println(str);
// BaseRichBolt does not ack automatically, so ack explicitly
collector.ack(tuple);
}
@Override
public void prepare(Map arg0, TopologyContext arg1, OutputCollector collector) {
// TODO Auto-generated method stub
this.collector=collector;
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("MongoOutBolt"));
}
}
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
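import org.apache.storm.mongodb.bolt.MongoInsertBolt;
import org.apache.storm.mongodb.bolt.MongoLookupBolt;
import org.apache.storm.mongodb.bolt.MongoUpdateBolt;
import org.apache.storm.mongodb.common.QueryFilterCreator;
import org.apache.storm.mongodb.common.SimpleQueryFilterCreator;
import org.apache.storm.mongodb.common.mapper.MongoLookupMapper;
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.mongodb.common.mapper.SimpleMongoLookupMapper;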
import org.apache.storm.topology.TopologyBuilder;
/**
* @author cwc
* @date 2018-06-01
* @description: Write, update, and read with storm-mongodb
* @version 1.0.0
*/
public class MongodbMain {
private static String url = "mongodb://172.xx.xx.x:27017/test";
private static String collectionName = "storm";
public static void main(String[]args){
// lookMongodb(url, collectionName, args);
// writeMongodb(url, collectionName,args);
updateMongodb(url, collectionName,args);
}
/**
* Write data to MongoDB
* @param url
* @param collectionName
*/
public static void writeMongodb(String url,String collectionName,String[] args){
// Field names must match the ones MongodbSpout declares ("word", "hello")
MongoMapper mapper = new SimpleMongoMapper()
.withFields("word", "hello");
MongoInsertBolt insertBolt = new MongoInsertBolt(url, collectionName, mapper);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("mongodb-save", new MongodbSpout(), 2);
builder.setBolt("save", insertBolt, 1).shuffleGrouping("mongodb-save");
Config conf = new Config();
String name = MongodbMain.class.getSimpleName();
if (args != null && args.length > 0) {
String nimbus = args[0];
conf.put(Config.NIMBUS_HOST, nimbus);
conf.setNumWorkers(3);
try {
StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
} catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(name, conf, builder.createTopology());
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
cluster.shutdown();
}
}
/**
* Update data in MongoDB
* @param url
* @param collectionName
*/
public static void updateMongodb(String url,String collectionName,String[] args){
MongoMapper mapper =new SimpleMongoUpdateMapper()
.withFields("word", "hello");
QueryFilterCreator updateQueryCreator = new SimpleQueryFilterCreator()
.withField("word");
MongoUpdateBolt updateBolt = new MongoUpdateBolt(url, collectionName, updateQueryCreator, mapper);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("mongodb-update", new MongodbSpout(), 2);
builder.setBolt("update", updateBolt, 1).shuffleGrouping("mongodb-update");
Config conf = new Config();
String name = MongodbMain.class.getSimpleName();
if (args != null && args.length > 0) {
String nimbus = args[0];
conf.put(Config.NIMBUS_HOST, nimbus);
conf.setNumWorkers(3);
try {
StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
} catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(name, conf, builder.createTopology());
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
cluster.shutdown();
}
}
/**
* Read data from MongoDB
* @param url
* @param collectionName
*/
public static void lookMongodb(String url,String collectionName,String[] args){
MongodbSpout spout =new MongodbSpout();
MongoLookupMapper mapper = new SimpleMongoLookupMapper()
.withFields("word", "hello");
QueryFilterCreator filterCreator = new SimpleQueryFilterCreator()
.withField("word");
MongoLookupBolt lookupBolt = new MongoLookupBolt(url, collectionName, filterCreator, mapper);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("mongodb-look", new MongodbSpout(), 2);
builder.setBolt("mongodb-out", lookupBolt, 1).shuffleGrouping("mongodb-look");
builder.setBolt("out", new MongoOutBolt(), 1).shuffleGrouping("mongodb-out");
Config conf = new Config();
String name = MongodbMain.class.getSimpleName();
if (args != null && args.length > 0) {
String nimbus = args[0];
conf.put(Config.NIMBUS_HOST, nimbus);
conf.setNumWorkers(3);
try {
StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
} catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(name, conf, builder.createTopology());
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
cluster.shutdown();
}
}
}
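The update path in updateMongodb pairs a filter on "word" with a "$set" document built from the tuple. The sketch below models that flow against an in-memory list so the effect is visible without a running MongoDB. UpdateFlowSketch and its methods are illustrative names, not storm-mongodb classes, and the sketch deliberately leaves unmatched tuples alone (it does not model upsert).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class UpdateFlowSketch {
    // In-memory stand-in for a MongoDB collection; each document is a plain map.
    private final List<Map<String, Object>> collection = new ArrayList<>();

    public void insert(Map<String, Object> document) {
        collection.add(new LinkedHashMap<>(document));
    }

    public List<Map<String, Object>> documents() {
        return collection;
    }

    // Find the first document whose filterField equals the tuple's value,
    // then overwrite setFields on it (the effect of a "$set" update).
    // Returns true when a document matched; unmatched tuples change nothing.
    public boolean updateOne(String filterField, Map<String, Object> tuple, String... setFields) {
        Object wanted = tuple.get(filterField);
        for (Map<String, Object> document : collection) {
            if (Objects.equals(document.get(filterField), wanted)) {
                for (String field : setFields) {
                    document.put(field, tuple.get(field));
                }
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UpdateFlowSketch store = new UpdateFlowSketch();
        Map<String, Object> existing = new LinkedHashMap<>();
        existing.put("word", "lara");
        existing.put("hello", "old");
        store.insert(existing);

        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("word", "lara");
        tuple.put("hello", "awe");
        System.out.println(store.updateOne("word", tuple, "word", "hello")); // true
        System.out.println(store.documents()); // [{word=lara, hello=awe}]
    }
}
```

Because the spout only ever emits five distinct "word" values, an update topology like the one above keeps rewriting the same handful of documents once they exist.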
Reading from MongoDB through Storm still has some problems for now; I am short on time and will sort them out later.
Now let's look at the Trident code:
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.mongodb.trident.state.MongoState;
import org.apache.storm.mongodb.trident.state.MongoStateFactory;
import org.apache.storm.mongodb.trident.state.MongoStateUpdater;
import org.apache.storm.trident.Stream;
import org.apache.storm.trident.TridentTopology;
import org.apache.storm.trident.state.StateFactory;
import org.apache.storm.trident.testing.FixedBatchSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import com.sunsheen.jfids.bigdata.storm.demo.mongodb.MongodbSpout;
import com.sunsheen.jfids.bigdata.storm.demo.mongodb.SimpleMongoMapper;
/**
* @author cwc
* @date 2018-06-05
* @description: Writing plain data through the high-level (Trident) storm-mongodb API
* @version 1.0.0
*/
public class MongoTridentState {
public static void main(String[]args){
String url = "mongodb://172.xxx.xxx.xxx:27017/test";
String collectionName = "storm";
Config conf = new Config();
conf.setMaxSpoutPending(3);
if (args != null && args.length > 1) {
// Cluster mode: args[1] is the topology name, so at least two arguments are required
try {
StormSubmitter.submitTopology(args[1], conf, mongoTrident(url,collectionName));
} catch (AlreadyAliveException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (InvalidTopologyException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
// Local mode
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("test", conf, mongoTrident(url,collectionName));
try {
Thread.sleep(100000);
cluster.shutdown();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
/**
* Write to MongoDB
* @param url
* @param collectionName
* @return
*/
public static StormTopology mongoTrident(String url,String collectionName){
// For testing only
// FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence", "key"), 5000, new Values("the cow jumped over the moon", 1l),
// new Values("the man went to the store and bought some candy", 2l), new Values("four score and seven years ago", 3l),
// new Values("how many apples can you eat", 4l), new Values("to be or not to be the person", 5l));
// spout.setCycle(true);
MongodbSpout spout =new MongodbSpout();
MongoMapper mapper = new SimpleMongoMapper()
.withFields("word");
MongoState.Options options = new MongoState.Options()
.withUrl(url)
.withCollectionName(collectionName)
.withMapper(mapper);
StateFactory factory = new MongoStateFactory(options);
TridentTopology topology = new TridentTopology();
Stream stream = topology.newStream("stream", spout);
stream.partitionPersist(factory, new Fields("word"), new MongoStateUpdater(), new Fields());
return topology.build();
}
}
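Trident hands tuples to a state in batches, and partitionPersist with new Fields("word") passes only the "word" field to the state updater. As I understand it, the state then maps each tuple of a batch to a document and writes the batch together, but treat that as an assumption and check the storm-mongodb source for your version. The sketch below models that batch-at-a-time behaviour with plain collections. BatchPersistSketch, updateState, and writeCalls are illustrative names in this sketch, not the real MongoState API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchPersistSketch {
    // In-memory stand-in for the target collection.
    private final List<Map<String, Object>> collection = new ArrayList<>();
    // Counts how many write calls were made, to show one write per batch.
    private int writeCalls = 0;

    // Map every tuple of the batch to a document, then store them in one call.
    public void updateState(List<Map<String, Object>> batch, String... fields) {
        List<Map<String, Object>> documents = new ArrayList<>();
        for (Map<String, Object> tuple : batch) {
            Map<String, Object> document = new LinkedHashMap<>();
            for (String field : fields) {
                document.put(field, tuple.get(field));
            }
            documents.add(document);
        }
        collection.addAll(documents);
        writeCalls++;
    }

    public int size() {
        return collection.size();
    }

    public int writeCalls() {
        return writeCalls;
    }

    public static void main(String[] args) {
        BatchPersistSketch state = new BatchPersistSketch();
        List<Map<String, Object>> batch = new ArrayList<>();
        for (String word : new String[] {"anderson", "watson", "lara"}) {
            Map<String, Object> tuple = new LinkedHashMap<>();
            tuple.put("word", word);
            tuple.put("hello", "ignored"); // not selected, so it is not persisted
            batch.add(tuple);
        }
        state.updateState(batch, "word");
        System.out.println(state.size());       // 3
        System.out.println(state.writeCalls()); // 1
    }
}
```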
That is all for this walkthrough of storm-mongodb for now; I will add to it and dig further when I have time.