Writing in-memory data to a Hive table with Spark

hive-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
    <!-- Hive metastore service, used by Spark SQL -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://master:9083</value>
        <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>
    <!-- JDBC URL of the MySQL server and the database name (metastore).
         The createDatabaseIfNotExist=true parameter after the ? creates the
         database automatically if it does not exist. -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <!-- JDBC driver class used to connect to MySQL -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <!-- MySQL user name and password -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>username to use against metastore database</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
        <description>password to use against metastore database</description>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
        <description>Whether to print the names of the columns in query output.</description>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
        <description>Whether to include the current database in the Hive prompt.</description>
    </property>
</configuration>
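Spark only reads hive-site.xml if the file is on its classpath (for a local IDE run that usually means the project's resources directory; on a cluster, Spark's conf directory). As an alternative, the metastore address can be passed on the SparkSession builder itself. The following is a minimal sketch, not the author's setup: the object name MetastoreFromCode is hypothetical, and the thrift URI is simply copied from the file above.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name; the thrift URI is the one from hive-site.xml above.
object MetastoreFromCode {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("MetastoreFromCode")
      // Same key as in hive-site.xml; it must be set before the session is created.
      .config("hive.metastore.uris", "thrift://master:9083")
      .enableHiveSupport()
      .getOrCreate()

    // Quick connectivity check: list the databases visible through the metastore.
    spark.sql("show databases").show()

    spark.stop()
  }
}
```

If the hadoop10 database shows up in the output, the session is talking to the same metastore that the Hive CLI uses.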

The example code follows.

package spark_sql

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructField, StructType}
// Project-local helper that generates random test data (its source is not shown here).
// It is only needed for the commented-out random-data variant in main.
// import test.ProductData

/**
  * @Program: spark01
  * @Author: 努力就是魅力
  * @Since: 2018-10-19 08:30
  *         Description:
  *
  *         Writes in-memory data to a Hive table with Spark. This is a complete, runnable example.
  *
  *    Result of querying the Hive table:
  *         hive (hadoop10)> select * from data_block;
  *         OK
  *         data_block.ip	data_block.time	data_block.phonenum
  *         40.234.66.122	2018-10-12 09:35:21
  *         5.150.203.160	2018-10-03 14:41:09	13389202989
  *
  */
case class Datablock(ip: String, time: String, phoneNum: String)

object WriteTabletoHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("WriteTableToHive")
      .config("spark.sql.warehouse.dir", "D:\\reference-data\\spark01\\spark-warehouse")
      .enableHiveSupport()
      .getOrCreate()

    import spark.implicits._

    // Explicit all-string schema. It is not used below, because the
    // Datablock case class already gives the Dataset its schema.
    val schemaString = "ip time phoneNum"
    val fields = schemaString.split(" ")
      .map(fieldName => StructField(fieldName, StringType, nullable = true))
    val schema = StructType(fields)

    // Random-data variant (requires the test.ProductData helper):
    // val datablockDS = Seq(Datablock(ProductData.getRandomIp,ProductData.getRecentAMonthRandomTime("yyyy-MM-dd HH:mm:ss"),ProductData.getRandomPhoneNumber)).toDS()
    // Fixed-data variant, enabled so that the example compiles on its own:
    val datablockDS = Seq(Datablock("192.168.40.122", "2018-01-01 12:25:25", "18866556699")).toDS()

    datablockDS.show()

    // Register the data as a temporary view so that it can be queried with SQL.
    datablockDS.toDF().createOrReplaceTempView("dataBlock")

    // Append the query result to the Hive table hadoop10.data_block.
    spark.sql("select * from dataBlock")
      .write.mode("append")
      .saveAsTable("hadoop10.data_block")
  }
}
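The example above goes through a temporary view before writing. As a hedged sketch of a shorter path, a Dataset can also be appended to the table directly and then read back to confirm the write. The object name AppendAndVerify and the helper appendAndCount are hypothetical names introduced for illustration; the case class repeats the one from the example, so the sketch should live in its own file or package to avoid a name clash.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Same shape as the case class in the example above.
case class Datablock(ip: String, time: String, phoneNum: String)

// Hypothetical helper object, not part of the original post.
object AppendAndVerify {

  /** Appends `rows` to `table` and returns the table's row count afterwards. */
  def appendAndCount(spark: SparkSession, rows: Seq[Datablock], table: String): Long = {
    import spark.implicits._
    // Write the Dataset directly, without a temporary view in between.
    rows.toDS().write.mode(SaveMode.Append).saveAsTable(table)
    // Read the table back through the catalog to confirm the write.
    spark.table(table).count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("AppendAndVerify")
      .enableHiveSupport()
      .getOrCreate()

    val rows = Seq(Datablock("192.168.40.122", "2018-01-01 12:25:25", "18866556699"))
    val total = appendAndCount(spark, rows, "hadoop10.data_block")
    println(s"hadoop10.data_block now holds $total rows")

    spark.table("hadoop10.data_block").show()
    spark.stop()
  }
}
```

SaveMode.Append is the typed equivalent of the "append" string used in the example. The temporary view is still useful when the data has to be filtered or joined with SQL before it is written.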
