I've spent the last couple of days digging into Trident and hit more pitfalls than I could stand, so let me share what I found.

First, writing to the database.

Let's start with what the official API provides:

// A simple helper mapper used by the MongoInsertBolt below
public class SimpleMongoMapper implements MongoMapper {
    private String[] fields;

    @Override
    public Document toDocument(ITuple tuple) {
        Document document = new Document();
        for (String field : fields) {
            document.append(field, tuple.getValueByField(field));
        }
        return document;
    }

    public SimpleMongoMapper withFields(String... fields) {
        this.fields = fields;
        return this;
    }
}

// Writing to the database
String url = "mongodb://127.0.0.1:27017/test";
String collectionName = "wordcount";

MongoMapper mapper = new SimpleMongoMapper()
        .withFields("word", "count");

MongoInsertBolt insertBolt = new MongoInsertBolt(url, collectionName, mapper);

While we're at it, here is the official site's mapper for updates:

public class SimpleMongoUpdateMapper implements MongoMapper {
    private String[] fields;

    @Override
    public Document toDocument(ITuple tuple) {
        Document document = new Document();
        for (String field : fields) {
            document.append(field, tuple.getValueByField(field));
        }
        // Wrap the mapped fields in $set so the update modifies only those fields
        return new Document("$set", document);
    }

    public SimpleMongoUpdateMapper withFields(String... fields) {
        this.fields = fields;
        return this;
    }
}
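
The update mapper alone is not enough: MongoUpdateBolt also needs a QueryFilterCreator to build the query that matches the documents to update. A minimal pairing on the "word" field, reusing the url and collectionName from the snippet above (the same wiring appears in the full example further down):

// SimpleQueryFilterCreator matches documents on a single field
QueryFilterCreator updateQueryCreator = new SimpleQueryFilterCreator()
        .withField("word");
MongoUpdateBolt updateBolt = new MongoUpdateBolt(url, collectionName,
        updateQueryCreator, new SimpleMongoUpdateMapper().withFields("word", "count"));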

Below is my own code, from the spout all the way to storage and updates:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

/**
 * @author cwc
 * @date 2018-06-01
 * @description: a fake-data factory
 * @version 1.0.0
 */
public class MongodbSpout extends BaseRichSpout {
    private static final long serialVersionUID = 1L;
    private SpoutOutputCollector collector;
    private final Random rand = new Random();

    /**
     * Emitted as the first output field ("word")
     */
    private static final Map<Integer, String> LASTNAME = new HashMap<Integer, String>();
    static {
        LASTNAME.put(0, "anderson");
        LASTNAME.put(1, "watson");
        LASTNAME.put(2, "ponting");
        LASTNAME.put(3, "dravid");
        LASTNAME.put(4, "lara");
    }

    /**
     * Emitted as the second output field ("hello")
     */
    private static final Map<Integer, String> COMPANYNAME = new HashMap<Integer, String>();
    static {
        COMPANYNAME.put(0, "abc");
        COMPANYNAME.put(1, "dfg");
        COMPANYNAME.put(2, "pqr");
        COMPANYNAME.put(3, "ecd");
        COMPANYNAME.put(4, "awe");
    }

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector spoutOutputCollector) {
        this.collector = spoutOutputCollector;
    }

    @Override
    public void nextTuple() {
        int randomNumber = rand.nextInt(5);
        this.collector.emit(new Values(LASTNAME.get(randomNumber), COMPANYNAME.get(randomNumber)));
        System.out.println("Incoming data!");
        // Brief pause so the demo doesn't flood the console and the database
        Utils.sleep(500);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // The field count here must match the number of values emitted above
        declarer.declare(new Fields("word", "hello"));
    }
}

import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.tuple.ITuple;
import org.bson.Document;

/**
 * @author cwc
 * @date 2018-06-01
 * @description: mapper that turns a tuple into a Mongo document for inserts
 * @version 1.0.0
 */
public class SimpleMongoMapper implements MongoMapper {
    private String[] fields;

    @Override
    public Document toDocument(ITuple tuple) {
        Document document = new Document();
        for (String field : fields) {
            document.append(field, tuple.getValueByField(field));
        }
        return document;
    }

    public SimpleMongoMapper withFields(String... fields) {
        this.fields = fields;
        return this;
    }
}

import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.tuple.ITuple;
import org.bson.Document;

/**
 * @author cwc
 * @date 2018-06-05
 * @description: mapper used for updating data
 * @version 1.0.0
 */
public class SimpleMongoUpdateMapper implements MongoMapper {
    private static final long serialVersionUID = 1L;
    private String[] fields;

    @Override
    public Document toDocument(ITuple tuple) {
        Document document = new Document();
        for (String field : fields) {
            document.append(field, tuple.getValueByField(field));
        }
        // Wrap in $set so only the mapped fields are modified
        return new Document("$set", document);
    }

    public SimpleMongoUpdateMapper withFields(String... fields) {
        this.fields = fields;
        return this;
    }
}

import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;

/**
 * @author cwc
 * @date 2018-05-30
 * @description: prints the data it receives
 * @version 1.0.0
 */
public class MongoOutBolt extends BaseRichBolt {
    private static final long serialVersionUID = 1L;
    private OutputCollector collector;

    @Override
    public void execute(Tuple tuple) {
        String str = tuple.getString(0);
        // String strs = tuple.getString(1);
        System.err.println(str);
    }

    @Override
    public void prepare(Map arg0, TopologyContext arg1, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("MongoOutBolt"));
    }
}

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.mongodb.bolt.MongoInsertBolt;
import org.apache.storm.mongodb.bolt.MongoLookupBolt;
import org.apache.storm.mongodb.bolt.MongoUpdateBolt;
import org.apache.storm.mongodb.common.QueryFilterCreator;
import org.apache.storm.mongodb.common.SimpleQueryFilterCreator;
import org.apache.storm.mongodb.common.mapper.MongoLookupMapper;
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.topology.TopologyBuilder;

/**
 * @author cwc
 * @date 2018-06-01
 * @description: storm-mongodb write, update, and read
 * @version 1.0.0
 */
public class MongodbMain {
    private static String url = "mongodb://172.xx.xx.x:27017/test";
    private static String collectionName = "storm";

    public static void main(String[] args) {
        // lookMongodb(url, collectionName, args);
        // writeMongodb(url, collectionName, args);
        updateMongodb(url, collectionName, args);
    }

    /**
     * Write data to MongoDB
     * @param url
     * @param collectionName
     */
    public static void writeMongodb(String url, String collectionName, String[] args) {
        // The mapped field names must match the fields declared by MongodbSpout
        MongoMapper mapper = new SimpleMongoMapper()
                .withFields("word", "hello");
        MongoInsertBolt insertBolt = new MongoInsertBolt(url, collectionName, mapper);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("mongodb-save", new MongodbSpout(), 2);
        builder.setBolt("save", insertBolt, 1).shuffleGrouping("mongodb-save");

        Config conf = new Config();
        String name = MongodbMain.class.getSimpleName();
        if (args != null && args.length > 0) {
            String nimbus = args[0];
            conf.put(Config.NIMBUS_HOST, nimbus);
            conf.setNumWorkers(3);
            try {
                StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
            } catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
                e.printStackTrace();
            }
        } else {
            conf.setMaxTaskParallelism(3);
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology(name, conf, builder.createTopology());
            try {
                Thread.sleep(100000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            cluster.shutdown();
        }
    }

    /**
     * Update MongoDB data
     * @param url
     * @param collectionName
     */
    public static void updateMongodb(String url, String collectionName, String[] args) {
        MongoMapper mapper = new SimpleMongoUpdateMapper()
                .withFields("word", "hello");
        // Match the documents to update by their "word" field
        QueryFilterCreator updateQueryCreator = new SimpleQueryFilterCreator()
                .withField("word");
        MongoUpdateBolt updateBolt = new MongoUpdateBolt(url, collectionName, updateQueryCreator, mapper);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("mongodb-update", new MongodbSpout(), 2);
        builder.setBolt("update", updateBolt, 1).shuffleGrouping("mongodb-update");

        Config conf = new Config();
        String name = MongodbMain.class.getSimpleName();
        if (args != null && args.length > 0) {
            String nimbus = args[0];
            conf.put(Config.NIMBUS_HOST, nimbus);
            conf.setNumWorkers(3);
            try {
                StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
            } catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
                e.printStackTrace();
            }
        } else {
            conf.setMaxTaskParallelism(3);
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology(name, conf, builder.createTopology());
            try {
                Thread.sleep(100000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            cluster.shutdown();
        }
    }

    /**
     * Read data from MongoDB
     * @param url
     * @param collectionName
     */
    public static void lookMongodb(String url, String collectionName, String[] args) {
        MongoLookupMapper mapper = new SimpleMongoLookupMapper()
                .withFields("word", "hello");
        QueryFilterCreator filterCreator = new SimpleQueryFilterCreator()
                .withField("word");
        MongoLookupBolt lookupBolt = new MongoLookupBolt(url, collectionName, filterCreator, mapper);

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("mongodb-look", new MongodbSpout(), 2);
        builder.setBolt("mongodb-out", lookupBolt, 1).shuffleGrouping("mongodb-look");
        builder.setBolt("out", new MongoOutBolt(), 1).shuffleGrouping("mongodb-out");

        Config conf = new Config();
        String name = MongodbMain.class.getSimpleName();
        if (args != null && args.length > 0) {
            String nimbus = args[0];
            conf.put(Config.NIMBUS_HOST, nimbus);
            conf.setNumWorkers(3);
            try {
                StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
            } catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
                e.printStackTrace();
            }
        } else {
            conf.setMaxTaskParallelism(3);
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology(name, conf, builder.createTopology());
            try {
                Thread.sleep(100000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            cluster.shutdown();
        }
    }
}

Reading from MongoDB with Storm still has some issues; for lack of time I'll sort those out later.
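
One missing piece above: lookMongodb references a SimpleMongoLookupMapper that never appears in this post. Here is a sketch along the lines of the official storm-mongodb docs (untested here, so treat it as a starting point): it prefers values already present in the incoming tuple and falls back to the looked-up Mongo document.

import java.util.ArrayList;
import java.util.List;

import org.apache.storm.mongodb.common.mapper.MongoLookupMapper;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.ITuple;
import org.apache.storm.tuple.Values;
import org.bson.Document;

public class SimpleMongoLookupMapper implements MongoLookupMapper {
    private String[] fields;

    @Override
    public List<Values> toTuple(ITuple input, Document doc) {
        Values values = new Values();
        for (String field : fields) {
            if (input.contains(field)) {
                // Keep the value that arrived on the tuple
                values.add(input.getValueByField(field));
            } else {
                // Otherwise take it from the looked-up document
                values.add(doc.get(field));
            }
        }
        List<Values> result = new ArrayList<Values>();
        result.add(values);
        return result;
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields(fields));
    }

    public SimpleMongoLookupMapper withFields(String... fields) {
        this.fields = fields;
        return this;
    }
}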

Now for the Trident code:

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.mongodb.trident.state.MongoState;
import org.apache.storm.mongodb.trident.state.MongoStateFactory;
import org.apache.storm.mongodb.trident.state.MongoStateUpdater;
import org.apache.storm.trident.Stream;
import org.apache.storm.trident.TridentTopology;
import org.apache.storm.trident.state.StateFactory;
import org.apache.storm.trident.testing.FixedBatchSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

import com.sunsheen.jfids.bigdata.storm.demo.mongodb.MongodbSpout;
import com.sunsheen.jfids.bigdata.storm.demo.mongodb.SimpleMongoMapper;

/**
 * @author cwc
 * @date 2018-06-05
 * @description: the high-level storm-mongodb write interface (Trident state), writing plain data
 * @version 1.0.0
 */
public class MongoTridentState {

    public static void main(String[] args) {
        String url = "mongodb://172.xxx.xxx.xxx:27017/test";
        String collectionName = "storm";

        Config conf = new Config();
        conf.setMaxSpoutPending(3);
        if (args != null && args.length > 0) {
            // On a cluster: topology name taken from the first CLI argument
            try {
                StormSubmitter.submitTopology(args[0], conf, mongoTrident(url, collectionName));
            } catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
                e.printStackTrace();
            }
        } else {
            // Locally
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("test", conf, mongoTrident(url, collectionName));
            try {
                Thread.sleep(100000);
                cluster.shutdown();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    /**
     * Write to MongoDB
     * @param url
     * @param collectionName
     * @return
     */
    public static StormTopology mongoTrident(String url, String collectionName) {
        // For testing only:
        // FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence", "key"), 5000,
        //         new Values("the cow jumped over the moon", 1L),
        //         new Values("the man went to the store and bought some candy", 2L),
        //         new Values("four score and seven years ago", 3L),
        //         new Values("how many apples can you eat", 4L),
        //         new Values("to be or not to be the person", 5L));
        // spout.setCycle(true);
        MongodbSpout spout = new MongodbSpout();

        MongoMapper mapper = new SimpleMongoMapper()
                .withFields("word");

        MongoState.Options options = new MongoState.Options()
                .withUrl(url)
                .withCollectionName(collectionName)
                .withMapper(mapper);

        StateFactory factory = new MongoStateFactory(options);

        TridentTopology topology = new TridentTopology();
        Stream stream = topology.newStream("stream", spout);
        stream.partitionPersist(factory, new Fields("word"), new MongoStateUpdater(), new Fields());
        return topology.build();
    }
}
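
One pitfall worth flagging here: the commented-out FixedBatchSpout declares the fields ("sentence", "key"), while the mapper and partitionPersist above use "word". If you switch to the test spout, every field name has to line up with what the spout declares. A minimal sketch of the matching test wiring, as a drop-in body for mongoTrident (names are illustrative):

// Test wiring: all field names must match the spout's declared fields
FixedBatchSpout testSpout = new FixedBatchSpout(new Fields("sentence", "key"), 3,
        new Values("the cow jumped over the moon", 1L),
        new Values("four score and seven years ago", 2L),
        new Values("how many apples can you eat", 3L));
testSpout.setCycle(true);

MongoMapper testMapper = new SimpleMongoMapper()
        .withFields("sentence", "key");   // matches the spout, not "word"

StateFactory testFactory = new MongoStateFactory(new MongoState.Options()
        .withUrl(url)
        .withCollectionName(collectionName)
        .withMapper(testMapper));

TridentTopology topology = new TridentTopology();
Stream stream = topology.newStream("test-stream", testSpout);
stream.partitionPersist(testFactory, new Fields("sentence", "key"),
        new MongoStateUpdater(), new Fields());
return topology.build();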

That's all for this storm-mongodb walkthrough for now; I'll add to it and dig deeper when I have time.
