Storm-Mongodb详解
这两天研究Trident,踩坑踩的遭不住,来和大家分享下。
首先写入数据库:
先看官方API给我们提供的方法:
//这是用来辅助下面MongoInsert的简单实现类
public class SimpleMongoMapper implements MongoMapper {
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return document;
}
public SimpleMongoMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
//写入数据库
String url = "mongodb://127.0.0.1:27017/test";
String collectionName = "wordcount";
MongoMapper mapper = new SimpleMongoMapper()
.withFields("word", "count");
MongoInsertBolt insertBolt = new MongoInsertBolt(url, collectionName, mapper);
一起把官网关于更新的mapper写了:
public class SimpleMongoUpdateMapper implements MongoMapper {
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return new Document("$set", document);
}
public SimpleMongoUpdateMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
下面是自己的代码,从spout到最后的存储与跟新:
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
/**
* @author cwc
* @date 2018年6月1日
* @description:假数据生产厂
* @version 1.0.0
*/
public class MongodbSpout extends BaseRichSpout{
private static final long serialVersionUID = 1L;
private SpoutOutputCollector collector;
/**
* 作为字段word输出
*/
private static final Map<Integer, String> LASTNAME = new HashMap<Integer, String>();
static {
LASTNAME.put(0, "anderson");
LASTNAME.put(1, "watson");
LASTNAME.put(2, "ponting");
LASTNAME.put(3, "dravid");
LASTNAME.put(4, "lara");
}
/**
* 作为字段val输出
*/
private static final Map<Integer, String> COMPANYNAME = new HashMap<Integer, String>();
static {
COMPANYNAME.put(0, "abc");
COMPANYNAME.put(1, "dfg");
COMPANYNAME.put(2, "pqr");
COMPANYNAME.put(3, "ecd");
COMPANYNAME.put(4, "awe");
}
public void open(Map conf, TopologyContext context,
SpoutOutputCollector spoutOutputCollector) {
this.collector = spoutOutputCollector;
}
public void nextTuple() {
final Random rand = new Random();
int randomNumber = rand.nextInt(5);
this.collector.emit (new Values(LASTNAME.get(randomNumber),COMPANYNAME.get(randomNumber)));
System.out.println("数据来袭!!!!!!");
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
//这边字段数量与上面的传输数量注意要一致
declarer.declare(new Fields("word","hello"));
}
}
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.tuple.ITuple;
import org.bson.Document;
/**
* @author cwc
* @date 2018年6月1日
* @description:
* @version 1.0.0
*/
public class SimpleMongoMapper implements MongoMapper {
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return document;
}
public SimpleMongoMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.tuple.ITuple;
import org.bson.Document;
/**
* @author cwc
* @date 2018年6月5日
* @description: 用于更新数据的mapper
* @version 1.0.0
*/
public class SimpleMongoUpdateMapper implements MongoMapper {
private static final long serialVersionUID = 1L;
private String[] fields;
@Override
public Document toDocument(ITuple tuple) {
Document document = new Document();
for(String field : fields){
document.append(field, tuple.getValueByField(field));
}
return new Document("$set", document);
}
public SimpleMongoUpdateMapper withFields(String... fields) {
this.fields = fields;
return this;
}
}
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
/**
* @author cwc
* @date 2018年5月30日
* @description:打印拿到的数据
* @version 1.0.0
*/
public class MongoOutBolt extends BaseRichBolt{
private static final long serialVersionUID = 1L;
private OutputCollector collector;
@Override
public void execute(Tuple tuple) {
String str =tuple.getString(0);
// String strs =tuple.getString(1);
System.err.println(str);
}
@Override
public void prepare(Map arg0, TopologyContext arg1, OutputCollector collector) {
// TODO Auto-generated method stub
this.collector=collector;
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declare(new Fields("MongoOutBolt"));
}
}
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.mongodb.bolt.MongoInsertBolt;
import org.apache.storm.mongodb.bolt.MongoUpdateBolt;
import org.apache.storm.mongodb.common.QueryFilterCreator;
import org.apache.storm.mongodb.common.SimpleQueryFilterCreator;
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.topology.TopologyBuilder;
/**
* @author cwc
* @date 2018年6月1日
* @description:storm-mongodb的写入,更新,读取
* @version 1.0.0
*/
public class MongodbMain {
private static String url = "mongodb://172.xx.xx.x:27017/test";
private static String collectionName = "storm";
public static void main(String[]args){
// lookMongodb(url, collectionName, args);
// writeMongodb(url, collectionName,args);
updateMongodb(url, collectionName,args);
}
/**
* 将数据写入到Mongodb
* @param url
* @param collectionName
*/
public static void writeMongodb(String url,String collectionName,String[] args){
MongoMapper mapper = new SimpleMongoMapper()
.withFields("word", "val","xx");
MongoInsertBolt insertBolt = new MongoInsertBolt(url, collectionName, mapper);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("mongodb-save", new MongodbSpout(), 2);
builder.setBolt("save", insertBolt, 1).shuffleGrouping("mongodb-save");
Config conf = new Config();
String name = MongodbMain.class.getSimpleName();
if (args != null && args.length > 0) {
String nimbus = args[0];
conf.put(Config.NIMBUS_HOST, nimbus);
conf.setNumWorkers(3);
try {
StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
} catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(name, conf, builder.createTopology());
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
cluster.shutdown();
}
}
/**
* 更新mongodb数据
* @param url
* @param collectionName
*/
public static void updateMongodb(String url,String collectionName,String[] args){
MongoMapper mapper =new SimpleMongoUpdateMapper()
.withFields("word", "hello");
QueryFilterCreator updateQueryCreator = new SimpleQueryFilterCreator()
.withField("word");
MongoUpdateBolt updateBolt = new MongoUpdateBolt(url, collectionName, updateQueryCreator, mapper);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("mongodb-update", new MongodbSpout(), 2);
builder.setBolt("update", updateBolt, 1).shuffleGrouping("mongodb-update");
Config conf = new Config();
String name = MongodbMain.class.getSimpleName();
if (args != null && args.length > 0) {
String nimbus = args[0];
conf.put(Config.NIMBUS_HOST, nimbus);
conf.setNumWorkers(3);
try {
StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
} catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(name, conf, builder.createTopology());
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
cluster.shutdown();
}
}
/**
* 读取mongodb数据
* @param url
* @param collectionName
*/
public static void lookMongodb(String url,String collectionName,String[] args){
MongodbSpout spout =new MongodbSpout();
MongoLookupMapper mapper = new SimpleMongoLookupMapper()
.withFields("word", "hello");
QueryFilterCreator filterCreator = new SimpleQueryFilterCreator()
.withField("word");
MongoLookupBolt lookupBolt = new MongoLookupBolt(url, collectionName, filterCreator, mapper);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("mongodb-look", new MongodbSpout(), 2);
builder.setBolt("mongodb-out", lookupBolt, 1).shuffleGrouping("mongodb-look");
builder.setBolt("out", new MongoOutBolt(), 1).shuffleGrouping("mongodb-out");
Config conf = new Config();
String name = MongodbMain.class.getSimpleName();
if (args != null && args.length > 0) {
String nimbus = args[0];
conf.put(Config.NIMBUS_HOST, nimbus);
conf.setNumWorkers(3);
try {
StormSubmitter.submitTopologyWithProgressBar(name, conf, builder.createTopology());
} catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(name, conf, builder.createTopology());
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
cluster.shutdown();
}
}
}
关于storm读取mongodb暂时还有些问题,因为时间原因 过段时间进行解决。
再看看Trident代码:
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.AlreadyAliveException;
import org.apache.storm.generated.AuthorizationException;
import org.apache.storm.generated.InvalidTopologyException;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.mongodb.common.mapper.MongoMapper;
import org.apache.storm.mongodb.trident.state.MongoState;
import org.apache.storm.mongodb.trident.state.MongoStateFactory;
import org.apache.storm.mongodb.trident.state.MongoStateUpdater;
import org.apache.storm.trident.Stream;
import org.apache.storm.trident.TridentTopology;
import org.apache.storm.trident.state.StateFactory;
import org.apache.storm.trident.testing.FixedBatchSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import com.sunsheen.jfids.bigdata.storm.demo.mongodb.MongodbSpout;
import com.sunsheen.jfids.bigdata.storm.demo.mongodb.SimpleMongoMapper;
/**
* @author cwc
* @date 2018年6月5日
* @description:Storm-mongodb写入高级接口,写入普通数据
* @version 1.0.0
*/
public class MongoTridentState {
public static void main(String[]args){
String url = "mongodb://172.xxx.xxx.xxx:27017/test";
String collectionName = "storm";
Config conf = new Config();
conf.setMaxSpoutPending(3);
if (args != null && args.length > 0) {
//服务器
try {
StormSubmitter.submitTopology(args[1], conf, mongoTrident(url,collectionName));
} catch (AlreadyAliveException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (InvalidTopologyException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (AuthorizationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} else {
//本地
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("test", conf, mongoTrident(url,collectionName));
try {
Thread.sleep(100000);
cluster.shutdown();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
/**
* 写入到mongodb
* @param url
* @param collectionName
* @return
*/
public static StormTopology mongoTrident(String url,String collectionName){
//测试专用
// FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence", "key"), 5000, new Values("the cow jumped over the moon", 1l),
// new Values("the man went to the store and bought some candy", 2l), new Values("four score and seven years ago", 3l),
// new Values("how many apples can you eat", 4l), new Values("to be or not to be the person", 5l));
// spout.setCycle(true);
MongodbSpout spout =new MongodbSpout();
MongoMapper mapper = new SimpleMongoMapper()
.withFields("word");
MongoState.Options options = new MongoState.Options()
.withUrl(url)
.withCollectionName(collectionName)
.withMapper(mapper);
StateFactory factory = new MongoStateFactory(options);
TridentTopology topology = new TridentTopology();
Stream stream = topology.newStream("stream", spout);
stream.partitionPersist(factory, new Fields("word"), new MongoStateUpdater(), new Fields());
return topology.build();
}
}
关于storm-mongodb的详解就暂时写到这里,改天有时间再进行补充,研究。
Storm-Mongodb详解的更多相关文章
- Storm配置项详解【转】
Storm配置项详解 ——阿里数据平台技术博客:storm配置项详解 什么是Storm? Storm是twitter开源的一套实时数据处理框架,基于该框架你可以通过简单的编程来实现对数据流的实时处理变 ...
- centos7安装mongodb详解
记录一下linux下安装mongodb数据库过程. 安装mongodb #下载linux版本的tar文件# 例如笔者下载的是:mongodb-linux-x86_64-rhel70-3.4.4.tg ...
- CentOS 安装 Mongodb详解 --- 无Linux基础
先去官方下载离线安装包:https://www.mongodb.com/ ftp连接一下服务器,把离线包上传上去 XShell连接一下: 解压文件(你输一点就可以按tab键,它会自动补全):tar - ...
- Storm命令详解
在Linux终端直接输入storm,不带任何参数信息,或者输入storm help,可以查看storm命令行客户端(Command line client)提供的帮助信息.Storm 0.9.0.1版 ...
- mongodb 详解 error:10061 由于目标计算机积极拒绝,无法连接解决方法
mongodb下载地址(32位):下载地址 自己选择版本 建立如下与mongodb并行的两个文件夹data和log. 然后建立mongo.config. 在mongo.config配置文件中输入: # ...
- MongoDB详解学习历程
MongoDB是一个基于分布式文件存储的数据库,它是介于关系数据库和非关系数据库之间的产品. MongoDB支持的数据结构非常松散,类似json的bjson格式,因此可以存储比较复杂的数据类型.Mon ...
- 【转】Storm并行度详解
1.Storm并行度相关的概念 Storm集群有很多节点,按照类型分为nimbus(主节点).supervisor(从节点),在conf/storm.yaml中配置了一个supervisor,有多个槽 ...
- Storm Trident详解
Trident是基于Storm进行实时留处理的高级抽象,提供了对实时流4的聚集,投影,过滤等操作,从而大大减少了开发Storm程序的工作量.Trident还提供了针对数据库或则其他持久化存储的有状态的 ...
- storm配置详解
storm的配置文件在${STORM_HOME}/conf/storm.yaml.下面详细说明storm的配置信息. java.libary.path:storm本身依赖包的路径,有多个路径的时候使用 ...
- Storm之详解spout、blot
1.Topology的构造backtype.storm.topology.TopologyBuilder 2.Spout组件的编写实现接口 backtype.storm.topology.IRichS ...
随机推荐
- 解决 docker.io 上拉取 images Get https://registry-1.docker.io/v2/: net/http: TLS handshake timeout
处理方式 使用如下命令获取 registry-1.docker.io 可用的 ip dig @114.114.114.114 registry-1.docker.io 看到如下输出结果 ; <& ...
- 09-5.部署 EFK 插件
09-5.部署 EFK 插件 EFK 对应的目录:kubernetes/cluster/addons/fluentd-elasticsearch $ cd /opt/k8s/kubernetes/cl ...
- Eclipse Mac OS 安装中文简体语言包
打开Eclipse软件,在导航Eclipse下拉菜单中点开 About Eclipse 查看版本 我的是 Eclipse IDE for Enterprise Java Developers. Ver ...
- 如何设计scalable 的系统 (转载)
Design a Scalable System Design a system that scales to millions of users (AWS based) Step 1: Outlin ...
- Jaba_Web--JDBC 删除记录操作模板
import java.sql.Connection; import java.sql.DriverManager; import java.sql.PreparedStatement; import ...
- LCA 学习总结
怎么说,LCA裸题直接套板子,大家都会做,这样的题没必要看,剩下的题发先LCA只是一个工具就像是搜索一样,只是一个工具而不是一种算法,所以借助这套工具在其图论问题如最长路,数据结构等问题上再去发挥作用 ...
- 商汤提出解偶检测中分类和定位分支的新方法TSD,COCO 51.2mAP | CVPR 2020
目前很多研究表明目标检测中的分类分支和定位分支存在较大的偏差,论文从sibling head改造入手,跳出常规的优化方向,提出TSD方法解决混合任务带来的内在冲突,从主干的proposal中学习不同的 ...
- 【Hexo】使用Hexo+github pages+travis ci 实现自动化部署
目录 一.说明 二.成品展示 三.前期准备 本地安装 node.js 本地安装 git github 账号 创建仓库 travis ci 账号 四.安装 Hexo 五.使用 hexo 搭建博客 六.部 ...
- 2-MyBatisPlus教程(HelloWorld)
1,准备数据 DROP TABLE IF EXISTS user; CREATE TABLE user ( id ) NOT NULL COMMENT '主键ID', name ) NULL DEFA ...
- 线段树的区间合并 B - LCIS
B - LCIS HDU - 3308 这个是一个很简单很明显的线段树的区间合并,不过区间合并的题目都还是有点难写,建议存个板子. #include <cstdio> #include & ...