Goal: write data stored on HDFS into an HBase table.

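When Hadoop MapReduce output has to end up in an HBase table, the most efficient route is usually to have the job emit HFiles and then load them into HBase, because HFile is HBase's internal storage format. The job in this post takes the simpler route: its reducer writes Put objects to the table through TableOutputFormat. The steps are as follows.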

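1. We have a text file on HDFS (its contents are not reproduced here; judging by the job below, each line holds tab-separated fields: id, name, age):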

2. Create a table t1 in HBase

   Statement (HBase shell): create 't1','cf'

3. Write the MR job

    package cn.tendency.wenzhouhbase.hadoop;

    import java.io.IOException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
    import org.apache.hadoop.hbase.mapreduce.TableReducer;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class Hadoop2Hbase {

        public static void main(String[] args) throws Exception {
            // Start from the HBase defaults, then override the cluster settings.
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "192.168.1.124,192.168.1.125,192.168.1.126");
            conf.set("hbase.zookeeper.property.clientPort", "2181");
            conf.set("hbase.master.port", "60000");
            conf.set("hbase.rootdir", "hdfs://192.168.1.122:9000/hbase");
            // Target table for TableOutputFormat.
            conf.set(TableOutputFormat.OUTPUT_TABLE, "t1");

            // Job.getInstance replaces the deprecated Job constructor.
            Job job = Job.getInstance(conf, Hadoop2Hbase.class.getSimpleName());
            TableMapReduceUtil.addDependencyJars(job);
            job.setJarByClass(Hadoop2Hbase.class);
            job.setMapperClass(HbaseMapper.class);
            job.setReducerClass(HbaseReducer.class);
            job.setMapOutputKeyClass(LongWritable.class);
            job.setMapOutputValueClass(Text.class);
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TableOutputFormat.class);
            FileInputFormat.setInputPaths(job, "hdfs://192.168.1.123:9000/mytest/*");

            // Propagate job success or failure as the process exit code.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }

        static class HbaseMapper extends Mapper<LongWritable, Text, LongWritable, Text> {

            // One formatter per mapper instance; a map task runs single-threaded.
            private final SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString();
                String[] split = line.split("\t");
                // Row key = first field + current timestamp; the original line follows it.
                String rowKey = split[0] + sdf.format(new Date());
                context.write(key, new Text(rowKey + "\t" + line));
            }
        }

        static class HbaseReducer extends TableReducer<LongWritable, Text, NullWritable> {

            private static final byte[] CF = Bytes.toBytes("cf");

            @Override
            protected void reduce(LongWritable key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {
                for (Text text : values) {
                    String record = text.toString();
                    // Layout: rowkey \t id \t name \t age [\t addr]
                    String[] split = record.split("\t");
                    if (split.length < 4) {
                        continue; // skip malformed lines instead of failing the task
                    }
                    Put put = new Put(Bytes.toBytes(split[0]));
                    put.addColumn(CF, Bytes.toBytes("oneColumn"), Bytes.toBytes(record));
                    put.addColumn(CF, Bytes.toBytes("id"), Bytes.toBytes(split[1]));
                    put.addColumn(CF, Bytes.toBytes("name"), Bytes.toBytes(split[2]));
                    put.addColumn(CF, Bytes.toBytes("age"), Bytes.toBytes(split[3]));
                    // put.addColumn(CF, Bytes.toBytes("addr"), Bytes.toBytes(split[4]));
                    context.write(NullWritable.get(), put);
                }
            }
        }
    }
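
To make the data flow of the job above easier to follow, here is a minimal plain-Java sketch of the two string transformations it performs, with no Hadoop or HBase dependency. The class name RowKeySketch, its method names, and the sample line "1001\tzhangsan\t25" are hypothetical and used only for illustration; the input file itself is not reproduced here. The map stands in for the cells of one Put.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

public class RowKeySketch {

    // Same pattern as the mapper: a timestamp with one-second resolution.
    private static final String TS_PATTERN = "yyyyMMddHHmmss";

    /** Mapper side: prefix the line with "<first field><timestamp>" as the row key. */
    public static String buildRecord(String line, Date now) {
        String firstField = line.split("\t")[0];
        String rowKey = firstField + new SimpleDateFormat(TS_PATTERN).format(now);
        return rowKey + "\t" + line;
    }

    /** Reducer side: split the record into the row key and the cf:* cells. */
    public static Map<String, String> toCells(String record) {
        String[] split = record.split("\t");
        if (split.length < 4) {
            throw new IllegalArgumentException("expected row key + 3 fields: " + record);
        }
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("rowkey", split[0]);
        cells.put("cf:oneColumn", record);
        cells.put("cf:id", split[1]);
        cells.put("cf:name", split[2]);
        cells.put("cf:age", split[3]);
        return cells;
    }

    public static void main(String[] args) {
        // Hypothetical input line: id, name, age separated by tabs.
        String record = buildRecord("1001\tzhangsan\t25", new Date());
        toCells(record).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

One consequence worth noting: because the timestamp has one-second resolution, two input lines that share the same first field and are mapped within the same second get the same row key, so the later Put overwrites the earlier one. If that matters for your data, a key component that is unique per line would be needed.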
