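Hive learning series, part 5 (Hive and Elasticsearch interaction, in plenty of detail; yes, I'm back to show off again)

Working with Elasticsearch from Hive

I. Importing data from a Hive table into Elasticsearch

1. First, create the Elasticsearch index as follows: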

curl -XPUT '10.81.179.209:9200/zebra_info_demo?pretty' -H 'Content-Type: application/json' -d'
{
    "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 2
    },
    "mappings": {
        "zebra_info": {
            "properties": {
                "name": {"type": "text"},
                "type": {"type": "text"},
                "province": {"type": "text"},
                "city": {"type": "text"},
                "citycode": {"type": "text", "index": false},
                "district": {"type": "text"},
                "adcode": {"type": "text", "index": false},
                "township": {"type": "text"},
                "business_circle": {"type": "text"},
                "formatted_address": {"type": "text"},
                "location": {"type": "geo_point"},
                "extensions": {
                    "type": "nested",
                    "properties": {
                        "map_lat": {"type": "double", "index": false},
                        "map_lng": {"type": "double", "index": false},
                        "avg_price": {"type": "double", "index": false},
                        "shops": {"type": "short", "index": false},
                        "good_comments": {"type": "short", "index": false},
                        "lvl": {"type": "short", "index": false},
                        "leisure_type": {"type": "text", "index": false},
                        "fun_type": {"type": "text", "index": false},
                        "numbers": {"type": "short", "index": false}
                    }
                }
            }
        }
    }
}
'

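2. Check the Elasticsearch version and download the matching elasticsearch-hadoop-hive jar

Check the version of the running Elasticsearch cluster first, because the connector jar has to match it.

This article uses version 5.6.9.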
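
The original command for this step did not survive, so here is a minimal sketch of one way to do it. It assumes the standard Elasticsearch root endpoint, which returns a JSON document containing `version.number`. The helper name `es_version` is made up for this example.

```shell
# Hypothetical helper: read the JSON returned by the Elasticsearch root
# endpoint on stdin and print the value of version.number.
es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage against the cluster from this article (needs network access):
#   curl -s '10.81.179.209:9200' | es_version
```

Running plain `curl '10.81.179.209:9200'` and reading the `number` field by eye works just as well; the helper is only handy in scripts.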

Download the jar from the Maven repository:

https://repo.maven.apache.org/maven2/org/elasticsearch/elasticsearch-hadoop-hive/

Pick the version that matches your Elasticsearch cluster.

3. Upload the downloaded jar to HDFS.

In this article the jar is stored at hdfs:///udf/elasticsearch-hadoop-hive-5.6.9.jar
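
The download and upload steps can be sketched as a small script. It assumes the usual Maven repository layout (`<base>/<version>/<artifact>-<version>.jar`) and a configured `hdfs` client on the machine; the function names `jar_url` and `fetch_and_upload` are invented for this example.

```shell
# Base URL taken from the article; the per-version path below assumes the
# standard Maven repository layout.
BASE_URL='https://repo.maven.apache.org/maven2/org/elasticsearch/elasticsearch-hadoop-hive'

# Print the download URL of the connector jar for a given version.
jar_url() {
  printf '%s/%s/elasticsearch-hadoop-hive-%s.jar\n' "$BASE_URL" "$1" "$1"
}

# Download the jar and copy it to the HDFS directory used in this article.
# Needs curl and an hdfs client, so it is only defined here, not executed.
fetch_and_upload() {
  jar_version="$1"
  jar_file="elasticsearch-hadoop-hive-${jar_version}.jar"
  curl -fSL -o "$jar_file" "$(jar_url "$jar_version")" &&
    hdfs dfs -mkdir -p /udf &&
    hdfs dfs -put -f "$jar_file" /udf/
}

# Usage: fetch_and_upload 5.6.9
```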

4. All set: create the table and start using it

-- Clear previously added jars, then register the connector for this session
DELETE jars;
add jar hdfs:///udf/elasticsearch-hadoop-hive-5.6.9.jar;

DROP TABLE IF EXISTS zebra_info_demo;

-- Column types mirror the Elasticsearch mapping created in step 1
CREATE EXTERNAL TABLE zebra_info_demo(
    name string,
    `type` string,
    province string,
    city string,
    citycode string,
    district string,
    adcode string,
    township string,
    business_circle string,
    formatted_address string,
    location string,
    extensions STRUCT<map_lat:double, map_lng:double, avg_price:double, shops:smallint, good_comments:smallint, lvl:smallint, leisure_type:STRING, fun_type:STRING, numbers:smallint>
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = '10.81.179.209:9200',
'es.index.auto.create' = 'false',
'es.resource' = 'zebra_info_demo/zebra_info',
'es.read.metadata' = 'true',
'es.mapping.names' = 'name:name, type:type, province:province, city:city, citycode:citycode, district:district, adcode:adcode, township:township, business_circle:business_circle, formatted_address:formatted_address, location:location, extensions:extensions');

5. Populate it with data and you're done.

INSERT INTO TABLE zebra_info_demo
SELECT
    a.name,
    a.brands,
    a.province,
    a.city,
    null as citycode,
    null as district,
    null as adcode,
    null as township,
    a.business_circle,
    null as formatted_address,
    -- geo_point accepts a "lat, lon" string
    concat(a.map_lat, ', ', a.map_lng) as `location`,
    named_struct('map_lat', cast(a.map_lat as double), 'map_lng', cast(a.map_lng as double), 'avg_price', cast(0 as DOUBLE), 'shops', 0S, 'good_comments', 0S, 'lvl', cast(a.lv1 as SMALLINT), 'leisure_type', '', 'fun_type', '', 'numbers', 0S) as extensions
from medicalsite_childclinic a;

Result: [screenshot not preserved]
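
One way to check the import without a screenshot is to compare document counts on both sides. The sketch below assumes the standard Elasticsearch `_count` endpoint; `es_count` is a made-up helper name, and no actual numbers from the original run are implied.

```shell
# Hypothetical helper: read an Elasticsearch _count response on stdin
# and print the numeric "count" value.
es_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (needs network access to the cluster):
#   curl -s '10.81.179.209:9200/zebra_info_demo/zebra_info/_count' | es_count
# Compare with the number of source rows in Hive:
#   hive -e 'SELECT count(*) FROM medicalsite_childclinic;'
```

If the two numbers match, every source row made it into the index.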

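II. Given existing Elasticsearch indices, create Hive tables on top of them to work with the data. You can even JOIN across them. One word: slick.

1. First, a look at the indices and the data

The existing indices are defined as follows: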

curl -XPUT '10.81.179.209:9200/join_tests?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "cities": {
      "properties": {
        "province": {
          "type": "string"
        },
        "city": {
          "type": "string"
        }
      }
    }
  }
}
'

curl -XPUT '10.81.179.209:9200/join_tests1?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "shop": {
      "properties": {
        "name": {
          "type": "string"
        },
        "city": {
          "type": "string"
        }
      }
    }
  }
}
'

The data looks like this: [screenshot not preserved]
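
To try the join without the original data, a couple of documents can be loaded through the bulk API. This is only a sketch: the sample values in the usage comments are invented for illustration and are not the author's data, and `bulk_body` is a hypothetical helper name.

```shell
# Hypothetical helper: print a bulk-API body (newline-delimited JSON).
# $1 = index, $2 = mapping type, remaining arguments = one JSON document each.
bulk_body() {
  es_index="$1"
  es_type="$2"
  shift 2
  for doc in "$@"; do
    # Action line, then the document itself on the following line.
    printf '{"index":{"_index":"%s","_type":"%s"}}\n' "$es_index" "$es_type"
    printf '%s\n' "$doc"
  done
}

# Usage with invented sample rows (needs network access to the cluster):
#   bulk_body join_tests cities '{"province":"Zhejiang","city":"Hangzhou"}' |
#     curl -s -XPOST '10.81.179.209:9200/_bulk' \
#          -H 'Content-Type: application/x-ndjson' --data-binary @-
#   bulk_body join_tests1 shop '{"name":"Shop A","city":"Hangzhou"}' |
#     curl -s -XPOST '10.81.179.209:9200/_bulk' \
#          -H 'Content-Type: application/x-ndjson' --data-binary @-
```

Giving both sample documents the same `city` value means the join in the next step has at least one matching row.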

2. Create the tables and write some wickedly fun SQL.

DELETE jars;
add jar hdfs:///udf/elasticsearch-hadoop-hive-5.6.9.jar;

-- Hive table backed by the existing join_tests/cities index
CREATE EXTERNAL TABLE join_tests(
    province string,
    city string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = '10.81.179.209:9200',
'es.index.auto.create' = 'false',
'es.resource' = 'join_tests/cities',
'es.read.metadata' = 'true',
'es.mapping.names' = 'province:province, city:city');

-- Hive table backed by the existing join_tests1/shop index
CREATE EXTERNAL TABLE join_tests1(
    name string,
    city string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = '10.81.179.209:9200',
'es.index.auto.create' = 'false',
'es.resource' = 'join_tests1/shop',
'es.read.metadata' = 'true',
'es.mapping.names' = 'name:name, city:city');

-- Join the two Elasticsearch-backed tables on city
SELECT
    a.province,
    b.city,
    b.name
from join_tests a LEFT JOIN join_tests1 b on a.city = b.city;

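3. Result: [screenshot not preserved]

Closing remarks

One useful tool worth recommending is Apache Hue, which can be used to manage HDFS files and to run Hive operations, MySQL operations, and more.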
