Creating a Hive Table from JSON

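This post shows a way to turn a JSON string into the matching Hive table definition using Scala. It is fairly simple, so I will just post the code without walking through it. If you have questions, leave a comment and we can discuss.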

package com.gabry.hive

import org.json4s._
import org.json4s.native.JsonMethods._
import scala.io.Source

class Json2Hive {

  /**
    * json4s AST, for reference:
    *
    * sealed abstract class JValue
    * case object JNothing extends JValue // 'zero' for JValue
    * case object JNull extends JValue
    * case class JString(s: String) extends JValue
    * case class JDouble(num: Double) extends JValue
    * case class JDecimal(num: BigDecimal) extends JValue
    * case class JInt(num: BigInt) extends JValue
    * case class JBool(value: Boolean) extends JValue
    * case class JObject(obj: List[JField]) extends JValue
    * case class JArray(arr: List[JValue]) extends JValue
    * type JField = (String, JValue)
    *
    * Example of the target DDL:
    * create table student_test(id INT, info struct< name:string,age:INT >)
    * Example input:
    * jsonString:{ "people_type":1,"people":{"person_id": 5,"test_count": 5,"para":{"name":"jack","age":6}}}
    */

  // Values of top-level fields sit at level 2 and follow the field name after a
  // space; values at any other depth follow the field name after ":".
  private def fieldDelimiter(level: Int) = if (level == 2) " " else ":"

  // Walks the parsed JSON and appends the Hive column definitions to hql.
  private def decodeJson(jv: Any, level: Int, hql: StringBuilder): Unit = {
    jv match {
      case _: JString  => hql.append(fieldDelimiter(level) + "string,")
      case _: JDouble  => hql.append(fieldDelimiter(level) + "double,")
      case _: JDecimal => hql.append(fieldDelimiter(level) + "decimal,")
      case _: JInt     => hql.append(fieldDelimiter(level) + "bigint,")
      case _: JBool    => hql.append(fieldDelimiter(level) + "boolean,")
      // A JField is a (name, value) pair: emit the name, then the value's type
      case (name: String, value: JValue) =>
        hql.append(name)
        decodeJson(value, level + 1, hql)
      case ja: JArray =>
        // Hive arrays are typed by their element: take the type of the first
        // element (string for an empty array) and emit array<type>
        val elem = new StringBuilder()
        decodeJson(ja.arr.headOption.getOrElse(JNull), level + 1, elem)
        val elemType = elem.toString.trim.stripPrefix(":").stripSuffix(",")
        hql.append(fieldDelimiter(level) + "array<" + elemType + ">,")
      case jo: JObject =>
        // The root object becomes the column list; nested objects become structs
        if (level != 0) hql.append(" struct<")
        jo.obj.foreach(decodeJson(_, level + 1, hql))
        // Drop the comma left behind by the last field
        if (hql.nonEmpty && hql.last == ',') hql.deleteCharAt(hql.length - 1)
        if (level != 0) hql.append(">,")
      // null carries no type information, so fall back to string
      case JNull => hql.append(fieldDelimiter(level) + "string,")
      // Anything unexpected is reported on stderr so it does not mix with the DDL
      case other => Console.err.println(other)
    }
  }

  def toHive(jsonStr: String, tableName: String): String = {
    val jsonObj = parse(jsonStr)
    val hql = new StringBuilder()
    decodeJson(jsonObj, 0, hql)
    "create table %s ( %s )".format(tableName, hql.toString())
  }
}

object Json2Hive {

  val json2hive = new Json2Hive()

  def main(args: Array[String]): Unit = {
    if (args.length != 2) {
      println("usage : json2hive jsonFile hiveTableName")
      sys.exit(1)
    }
    val jsonFile = args(0)
    val hiveTableName = args(1)
    //val jsonstr ="{ \"people_type\":0,\"people_num\":0.1,\"people\":{\"person_id\": 5,\"test_count\": 5,\"para\":{\"name\":\"jack\",\"age\":6}},\"gender\":1}"
    //val jsonstr ="{ \"people_type\":0,\"object\":{\"f1\":1,\"f2\":1},\"gender\":1}"
    /* A JSON string is awkward to pass as a command-line argument, so a JSON file is used instead */
    val file = Source.fromFile(jsonFile, "UTF-8")
    /* Convert each JSON string in the file (one per line) into the corresponding Hive table */
    try file.getLines().foreach(line => println(json2hive.toHive(line, hiveTableName)))
    finally file.close()
  }
}

Here is the test result:

create table example ( people_type bigint,people_num double,people struct<person_id:bigint,test_count:bigint,para struct<name:string,age:bigint>>,gender bigint )
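
One thing to watch in the result: Hive writes struct members as name:type, so a struct nested inside another struct would need to come out as para:struct<...> rather than para struct<...>. Below is a minimal sketch of one way to get that, not the code used for the result above. It assumes json4s is on the classpath; the names HiveTypeSketch, hiveType and createTable are made up for illustration. The idea is to have a function return the Hive type of a value and let the caller choose the separator (a space for top-level columns, a colon inside a struct), instead of deriving it from the nesting level. It also assumes arrays are homogeneous and takes the element type from the first element.

```scala
import org.json4s._
import org.json4s.native.JsonMethods.parse

object HiveTypeSketch {

  // Hypothetical helper: map a json4s value to a Hive type string.
  def hiveType(jv: JValue): String = jv match {
    case JString(_)  => "string"
    case JDouble(_)  => "double"
    case JDecimal(_) => "decimal"
    case JInt(_)     => "bigint"
    case JBool(_)    => "boolean"
    case JObject(fields) =>
      // Struct members are written as name:type at every depth
      fields.map { case (name, value) => s"$name:${hiveType(value)}" }
        .mkString("struct<", ",", ">")
    case JArray(elems) =>
      // Assumption: homogeneous array, so the first element decides the type;
      // an empty array falls back to array<string>
      s"array<${elems.headOption.map(hiveType).getOrElse("string")}>"
    case _ => "string" // JNull, JNothing and anything else fall back to string
  }

  // Hypothetical helper: top-level columns are written as "name type".
  def createTable(json: String, table: String): String = parse(json) match {
    case JObject(fields) =>
      fields.map { case (name, value) => s"$name ${hiveType(value)}" }
        .mkString(s"create table $table ( ", ",", " )")
    case other =>
      throw new IllegalArgumentException(s"expected a JSON object, got: $other")
  }

  def main(args: Array[String]): Unit = {
    val json =
      """{"id":1,"tags":["a","b"],"people":{"person_id":5,"para":{"name":"jack","age":6}}}"""
    println(createTable(json, "example"))
  }
}
```

With this sample input the nested object is emitted as para:struct<name:string,age:bigint> and the array column as tags array<string>.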
