Spark SQL External Data Sources JDBC: Testing the Official Write Implementation

This post uses the Spark SQL External Data Sources JDBC implementation to write the data of an RDD into a MySQL database.

Key APIs in jdbc.scala:

/**
 * Save this RDD to a JDBC database at `url` under the table name `table`.
 * This will run a `CREATE TABLE` and a bunch of `INSERT INTO` statements.
 * If you pass `true` for `allowExisting`, it will drop any table with the
 * given name; if you pass `false`, it will throw if the table already
 * exists.
 */
def createJDBCTable(url: String, table: String, allowExisting: Boolean)

/**
 * Save this RDD to a JDBC database at `url` under the table name `table`.
 * Assumes the table already exists and has a compatible schema.  If you
 * pass `true` for `overwrite`, it will `TRUNCATE` the table before
 * performing the `INSERT`s.
 *
 * The table must already exist on the database.  It must have a schema
 * that is compatible with the schema of this RDD; inserting the rows of
 * the RDD in order via the simple statement
 * `INSERT INTO table VALUES (?, ?, ..., ?)` should not fail.
 */
def insertIntoJDBC(url: String, table: String, overwrite: Boolean)
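The Scaladoc above says the rows are inserted with a statement of the form `INSERT INTO table VALUES (?, ?, ..., ?)`. As a rough illustration of what that implies, the following plain-Scala sketch builds such a statement from a column count. The object `InsertStatementSketch` and its `insertStatement` helper are hypothetical names introduced here for illustration; they are not part of Spark.

```scala
// Illustrative sketch only; this is not Spark code.
object InsertStatementSketch {
  // Hypothetical helper: one "?" placeholder per column, no column names listed,
  // so the values are matched to the table's columns purely by position.
  def insertStatement(table: String, columnCount: Int): String = {
    require(columnCount > 0, "a table needs at least one column")
    val placeholders = Seq.fill(columnCount)("?").mkString(", ")
    s"INSERT INTO $table VALUES ($placeholders)"
  }
}
```

Because the statement lists no column names, the number and order of the RDD's fields have to match the table's columns, which is why the schema must be compatible.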

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
import org.apache.spark.sql.jdbc._

val sqlContext = new SQLContext(sc)
import sqlContext._

// Data preparation
val url = "jdbc:mysql://hadoop000:3306/test?user=root&password=root"

// Two-column rows and their schema (name, id)
val arr2x2 = Array[Row](Row.apply("dave", 42), Row.apply("mary", 222))
val arr1x2 = Array[Row](Row.apply("fred", 3))
val schema2 = StructType(StructField("name", StringType) :: StructField("id", IntegerType) :: Nil)

// Three-column rows and their schema (name, id, seq)
val arr2x3 = Array[Row](Row.apply("dave", 42, 1), Row.apply("mary", 222, 2))
val schema3 = StructType(StructField("name", StringType) :: StructField("id", IntegerType) :: StructField("seq", IntegerType) :: Nil)

================================CREATE======================================

val srdd = sqlContext.applySchema(sc.parallelize(arr2x2), schema2)
srdd.createJDBCTable(url, "person", false)
sqlContext.jdbcRDD(url, "person").collect.foreach(println)

[dave,42]
[mary,222]

==============================CREATE with overwrite========================================

val srdd = sqlContext.applySchema(sc.parallelize(arr2x3), schema3)
srdd.createJDBCTable(url, "person2", false)
sqlContext.jdbcRDD(url, "person2").collect.foreach(println)

[mary,222,2]
[dave,42,1]

val srdd2 = sqlContext.applySchema(sc.parallelize(arr1x2), schema2)
srdd2.createJDBCTable(url, "person2", true)
sqlContext.jdbcRDD(url, "person2").collect.foreach(println)

[fred,3]

================================CREATE then INSERT to append======================================

val srdd = sqlContext.applySchema(sc.parallelize(arr2x2), schema2)
val srdd2 = sqlContext.applySchema(sc.parallelize(arr1x2), schema2)
srdd.createJDBCTable(url, "person3", false)
sqlContext.jdbcRDD(url, "person3").collect.foreach(println)

[mary,222]
[dave,42]

srdd2.insertIntoJDBC(url, "person3", false)
sqlContext.jdbcRDD(url, "person3").collect.foreach(println)

[mary,222]
[dave,42]
[fred,3]

================================CREATE then INSERT to truncate======================================

val srdd = sqlContext.applySchema(sc.parallelize(arr2x2), schema2)
val srdd2 = sqlContext.applySchema(sc.parallelize(arr1x2), schema2)
srdd.createJDBCTable(url, "person4", false)
sqlContext.jdbcRDD(url, "person4").collect.foreach(println)

[dave,42]
[mary,222]

srdd2.insertIntoJDBC(url, "person4", true)
sqlContext.jdbcRDD(url, "person4").collect.foreach(println)

[fred,3]
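The create, append, and truncate behaviours exercised so far can be summarised with a toy in-memory model. This is only a sketch of the semantics described in the Scaladoc, assuming a table is just a named list of rows; `JdbcWriteModel` and its methods are hypothetical names, and nothing here touches Spark or a real database.

```scala
import scala.collection.mutable

// Toy in-memory stand-in for a database; illustrative only, not Spark code.
object JdbcWriteModel {
  type Record = Seq[Any]
  private val tables = mutable.Map.empty[String, Vector[Record]]

  // Models createJDBCTable: with allowExisting = true an existing table is
  // replaced; with allowExisting = false an existing table causes an error.
  def create(table: String, rows: Seq[Record], allowExisting: Boolean): Unit = {
    if (tables.contains(table) && !allowExisting)
      throw new IllegalStateException(s"Table $table already exists")
    tables(table) = rows.toVector
  }

  // Models insertIntoJDBC: the table must already exist; overwrite = true
  // empties it before the new rows are added, overwrite = false appends.
  def insert(table: String, rows: Seq[Record], overwrite: Boolean): Unit = {
    val existing = tables.getOrElse(
      table, throw new IllegalStateException(s"Table $table does not exist"))
    val kept = if (overwrite) Vector.empty[Record] else existing
    tables(table) = kept ++ rows
  }

  // Returns the current rows of a table.
  def read(table: String): Vector[Record] = tables(table)
}
```

In this model, creating a table with two rows and then calling `insert` with `overwrite = false` leaves three rows, while `overwrite = true` leaves only the newly inserted ones, which matches the person3 and person4 runs above.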

================================Incompatible INSERT to append======================================

val srdd = sqlContext.applySchema(sc.parallelize(arr2x2), schema2)
val srdd2 = sqlContext.applySchema(sc.parallelize(arr2x3), schema3)
srdd.createJDBCTable(url, "person5", false)
srdd2.insertIntoJDBC(url, "person5", true)

    java.sql.SQLException: Column count doesn't match value count at row 1
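The exception comes from the database: the person5 table has two columns, but each inserted row carries three values. One way to catch this earlier is to compare column counts before inserting. The sketch below assumes the column names of both sides are already available as plain sequences; `ArityCheckSketch` and `arityMismatch` are hypothetical names, not a Spark API.

```scala
// Illustrative sketch only; this is not Spark code.
object ArityCheckSketch {
  // Hypothetical helper: returns a description of the mismatch when the
  // column counts differ, or None when they are equal.
  def arityMismatch(tableColumns: Seq[String], rddColumns: Seq[String]): Option[String] =
    if (tableColumns.length == rddColumns.length) None
    else Some(
      s"table has ${tableColumns.length} columns but the RDD has ${rddColumns.length}")
}
```

This check covers only the number of columns. Type compatibility, which the Scaladoc also requires, would need a separate comparison.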
