Hadoop3 Cluster Setup: Adding a Custom UDTF to Hive

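A quick summary of the requirement:

The system's userid has the following format:

The first three characters represent the country

The next three characters represent the province

The three characters after that represent the city

All remaining characters represent the store

(This is a made-up requirement; the point is simply to split a string.)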
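Before involving Hive at all, the splitting rule can be sketched in plain Java. This is only an illustration: the class name `UserIdSplitter`, the sample userid, the 10-character minimum (3 + 3 + 3 prefix characters plus at least one store character), and the `"0"` placeholders for invalid input are assumptions made for this sketch.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hive: applies the userid format described above.
public class UserIdSplitter {
    // Assumed minimum: 3 + 3 + 3 prefix characters plus at least one store character.
    private static final int MIN_LENGTH = 10;

    /** Returns {userid, country, province, city, store}, or five "0" placeholders for invalid input. */
    public static String[] split(String userid) {
        if (userid == null || userid.length() < MIN_LENGTH) {
            return new String[] {"0", "0", "0", "0", "0"};
        }
        return new String[] {
            userid,                  // original value, kept for easy comparison
            userid.substring(0, 3),  // country
            userid.substring(3, 6),  // province
            userid.substring(6, 9),  // city
            userid.substring(9)      // store: everything that remains
        };
    }

    public static void main(String[] args) {
        // "086021010A0001" is an invented sample value.
        System.out.println(Arrays.toString(split("086021010A0001")));
        System.out.println(Arrays.toString(split("short")));
    }
}
```

Keeping the rule in a small pure function like this makes it easy to check the index arithmetic on its own, separately from any Hive wiring.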

Here is the full UDTF code:

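package com.venn.udtf;

import java.util.ArrayList;

// StringUtils is assumed to come from commons-lang 2.x, which hive-exec 1.2.1 depends on.
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

/**
 * Created by venn on 5/20/2018.
 * SplitString : split a userid string
 * first 3 characters : country
 * next 3 characters : province
 * next 3 characters : city
 * all remaining characters : store
 */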

public class SplitString extends GenericUDTF {

    /**
     * Declare the output column names and types.
     * This uses hive-exec 1.2.1; I wanted to use 2.3.3 but could not work out
     * how to initialize the column names with that version.
     *
     * @param args object inspectors for the function's arguments
     * @return a struct inspector describing the five output columns
     * @throws UDFArgumentException if the argument count or category is wrong
     */
    @Override
    public StructObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
        if (args.length != 1) {
            throw new UDFArgumentLengthException("SplitString takes only one argument");
        }
        if (args[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
            throw new UDFArgumentException("SplitString takes string as a parameter");
        }
        ArrayList<String> fieldNames = new ArrayList<String>();
        ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
        fieldNames.add("userid"); // first column: the input string, output unchanged for easy comparison
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldNames.add("country"); // second column: country
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldNames.add("province"); // third column: province
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldNames.add("city"); // fourth column: city
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        fieldNames.add("story"); // fifth column: store
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        // Return the struct inspector for the output row
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
    }

    /**
     * Process one input row.
     *
     * @param objects the function arguments; objects[0] is the userid
     * @throws HiveException declared by the GenericUDTF contract
     */
    @Override
    public void process(Object[] objects) throws HiveException {
        String[] result = new String[5];
        try {
            // If the input does not meet the requirement, return 0 0 0 0 0
            if (objects[0] == null || StringUtils.isEmpty(objects[0].toString()) || objects[0].toString().length() < 10) {
                result[0] = "0";
                result[1] = "0";
                result[2] = "0";
                result[3] = "0";
                result[4] = "0";
            } else {
                String userid = objects[0].toString();
                result[0] = userid;
                result[1] = userid.substring(0, 3); // country
                result[2] = userid.substring(3, 6); // province
                result[3] = userid.substring(6, 9); // city
                result[4] = userid.substring(9);    // store
            }
            // Emit the output row
            forward(result);
        } catch (Exception e) {
            // Exceptions are swallowed here, so a row that fails is silently dropped.
        }
    }

    @Override
    public void close() throws HiveException {
        // Nothing to clean up.
    }

}

Writing a Hive UDTF involves three parts:

initialize : declares the output column names and types

process : does the string processing

forward : emits the result row
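To make the division of labour among these three parts concrete, here is a Hive-free sketch of the call order. `MiniUdtf` and `UdtfLifecycleSketch` are hypothetical stand-ins written for this illustration and are not Hive classes; in real Hive the framework supplies the collector, calls initialize once, and then calls process for each input row. Only two output columns are used to keep the sketch short, and the sample userids are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class UdtfLifecycleSketch {

    /** Hypothetical stand-in for a UDTF; none of this is Hive API. */
    static class MiniUdtf {
        private final Consumer<String[]> collector; // receives every forwarded row

        MiniUdtf(Consumer<String[]> collector) {
            this.collector = collector;
        }

        // Like initialize: called once to declare the output columns.
        List<String> initialize() {
            return Arrays.asList("userid", "country");
        }

        // Like process: called once per input row (input assumed to have 3+ characters).
        void process(String userid) {
            forward(new String[] {userid, userid.substring(0, 3)});
        }

        // Like forward: hands one output row to the collector.
        void forward(String[] row) {
            collector.accept(row);
        }
    }

    /** Drives the steps in the order a query engine would and returns the emitted rows. */
    public static List<String[]> run(List<String> inputs) {
        List<String[]> rows = new ArrayList<>();
        MiniUdtf udtf = new MiniUdtf(rows::add);
        udtf.initialize();
        for (String input : inputs) {
            udtf.process(input);
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : run(Arrays.asList("086021010A0001", "001002003B7"))) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

The point of the sketch is that process never returns rows directly; every output row reaches the caller only through forward.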

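For usage, see the previous post, "Hadoop3 Cluster Setup: Adding a Custom UDF to Hive". Package the jar, upload it to the server, and edit $HIVE_HOME/bin/.hiverc
to add the following (more than one jar can be added):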

[hadoop@venn05 bin]$ more .hiverc

add jar /opt/hadoop/idp_hd/viewstat/lib/hivefunction-1.0-SNAPSHOT.jar;

create temporary function split_area as 'com.venn.udtf.SplitString';

The result looks like this:

hive> select split_area(userid) from sqoop_test limit ;

OK
