I ran into quite a few pitfalls while trying to reproduce this experiment:

https://www.iteblog.com/archives/1889.html

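import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Note: StringWriter and HBaseConnectionFactory are referenced as in the original code
// and must be available on the classpath. The map output key type has to be a Hadoop
// WritableComparable for the job to run.

/**
 * Created by Administrator on 2017/8/18.
 */
public class IteblogBulkLoadDriver {

    public static class IteblogBulkLoadMapper extends Mapper<LongWritable, Text, StringWriter, Put> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws InterruptedException, IOException {
            if (value == null) {
                return;
            }
            String line = value.toString();
            // Fields are separated by '^'.
            String[] items = line.split("\\^");
            if (items.length < 3) {
                System.out.println("================less 3");
                return;
            }
            System.out.println(line);
            // The field indices 0, 1, 2 below are assumed; adjust them to the actual column layout.
            String rowKey = items[0] + items[1];
            Put put = new Put(Bytes.toBytes(items[0])); // ROWKEY
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("url"), Bytes.toBytes(items[1]));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes(items[2]));
            context.write(new StringWriter().append(rowKey), put);
        }
    }

    public static class HBaseHFileReducer extends Reducer<StringWriter, Put, ImmutableBytesWritable, Put> {
        @Override
        protected void reduce(StringWriter key, Iterable<Put> values, Context context)
                throws IOException, InterruptedException {
            ImmutableBytesWritable k = new ImmutableBytesWritable(Bytes.toBytes(key.toString()));
            // Only the first Put for each key is written.
            Put val = values.iterator().next();
            context.write(k, val);
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // String SRC_PATH = "hdfs:/slave1:8020/maats5/pay/logdate=20170906";
        // String DESC_PATH = "hdfs:/slave1:8020/maats5_test/pay/logdate=20170906";
        String SRC_PATH = args[0];  // input path (assumed to be the first argument)
        String DESC_PATH = args[1]; // HFile output path (assumed to be the second argument)

        Configuration conf = HBaseConnectionFactory.config;
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");

        Job job = Job.getInstance(conf);
        job.setJarByClass(IteblogBulkLoadDriver.class);
        job.setMapperClass(IteblogBulkLoadMapper.class);
        job.setMapOutputKeyClass(StringWriter.class);
        job.setMapOutputValueClass(Put.class);
        job.setReducerClass(HBaseHFileReducer.class);
        job.setOutputFormatClass(HFileOutputFormat2.class);

        HTable table = new HTable(conf, "maatstest");
        HFileOutputFormat2.configureIncrementalLoad(job, table, table.getRegionLocator());

        FileInputFormat.addInputPath(job, new Path(SRC_PATH));
        FileOutputFormat.setOutputPath(job, new Path(DESC_PATH));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}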

When using the bulk loader (LoadIncrementalHFiles, doBulkLoad) you can only add items that are "lexically ordered", i.e. you need to make sure that the items you add are sorted by row key.
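To make the ordering rule concrete, here is a minimal sketch in plain Java (no HBase dependency) of the unsigned, byte-by-byte comparison that "lexically ordered" refers to. The class RowKeyOrder and its methods are made up for illustration and are not part of the HBase API. The sketch assumes the row keys are UTF-8 strings. It shows that "row10" sorts before "row9", which is an easy way to end up with "Added a key not lexically larger than previous" when keys are built from unpadded numbers.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of HBase.
final class RowKeyOrder {

    // Compare two row keys byte by byte, treating each byte as unsigned (0..255).
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter key sorts first.
        return a.length - b.length;
    }

    // True if every key is strictly larger than the key before it.
    static boolean isStrictlyIncreasing(List<byte[]> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (compare(keys.get(i - 1), keys.get(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<byte[]> keys = new ArrayList<>();
        keys.add(utf8("row9"));
        keys.add(utf8("row10"));
        keys.add(utf8("row2"));

        // Unsorted input violates the ordering rule.
        System.out.println(isStrictlyIncreasing(keys)); // prints false

        // After sorting with the same comparison the order is row10, row2, row9.
        keys.sort(RowKeyOrder::compare);
        for (byte[] k : keys) {
            System.out.println(new String(k, StandardCharsets.UTF_8));
        }
        System.out.println(isStrictlyIncreasing(keys)); // prints true
    }
}
```

A check like isStrictlyIncreasing also rejects duplicate keys, since the same key twice in a row is not "larger than previous" either.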

https://stackoverflow.com/questions/25860114/hfile-creation-added-a-key-not-lexically-larger-than-previous-key

http://ganliang13.iteye.com/blog/1884921
