Installing and Deploying Linkis 1.3.2 on CDH 6.3.2 as the Compute Service for Qualitis Data Quality Analysis
A quick-start guide to setting up Qualitis
1. Installing the base software
Gradle (4.6)
MySQL (5.5+)
JDK (1.8.0_141)
Linkis (1.0.0+), with the Spark engine installed (required). See "How to install Linkis".
DataSphereStudio (1.0.0+), optional. Required if you want to use workflows. See "How to install DataSphereStudio".
(1) Install Gradle (4.6)
Tencent Cloud mirror: https://mirrors.cloud.tencent.com/gradle/gradle-4.6-bin.zip
(1.1) Clone the latest 1.0.0 source: git clone https://gitee.com/WeBank/Qualitis.git
(1.2) Build the source with Gradle: gradle clean distZip
(1.3) After the build, a build folder appears in the project root directory; the qualitis-1.0.0.zip inside it is the install package.
Unpack the install package:
zip package:
unzip qualitis-{version}.zip
tar package:
tar -zxvf qualitis-{VERSION}.tar.gz
(1.4) Connect to MySQL and load the initial data:
mysql -u {USERNAME} -p {PASSWORD} -h {IP} --default-character-set=utf8
// mysql -u qualitis -p qualitis -h 10.130.1.75 --default-character-set=utf8
use qualitis;
source conf/database/init.sql;
If the MySQL connection fails, switch the qualitis account to the native password plugin:
ALTER USER 'qualitis'@'%' IDENTIFIED WITH 'mysql_native_password' BY 'qualitis';
FLUSH PRIVILEGES;
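Once connected, you can sanity-check that init.sql ran (a quick check; the table-name prefix assumes the stock Qualitis schema):
use qualitis;
show tables like 'qualitis%';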
(1.5) Edit the configuration file
vim conf/application-dev.yml
spring:
  datasource:
    username: qualitis
    password: qualitis
    url: jdbc:mysql://10.130.1.75:3306/qualitis?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=utf-8
    driver-class-name: com.mysql.jdbc.Driver
    type: com.zaxxer.hikari.HikariDataSource
task:
  persistent:
    type: jdbc
    username: qualitis
    password: qualitis
    address: jdbc:mysql://10.130.1.75:3306/qualitis?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=utf-8
    tableName: qualitis_application_task_result
  execute:
    limit_thread: 10
    rule_size: 10
front_end:
  home_page: http://127.0.0.1:8090/#/dashboard
  domain_name: http://127.0.0.1:8090
  local: zh_CN
  center: dev
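With the database initialized and the configuration in place, Qualitis can be started (a minimal sketch; it assumes the standard start script shipped in the unpacked package):
cd qualitis-{VERSION}
sh bin/start.sh
The dashboard should then be reachable at the front_end address configured above (http://127.0.0.1:8090 here).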
(2) Install MySQL (5.5+)
Omitted here.
Install JDK 1.8
tar -zxvf jdk-8u192-linux-x64.tar.gz
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_192
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
java -version
(3) Install Linkis
Make sure the CDH big-data trio is installed on the cluster: Spark, Hive, and Hadoop.
Clone the source code: git clone the Apache Linkis 1.3.2 source.
Edit Maven's mirror configuration so the build can resolve its JAR dependencies:
<mirrors>
  <mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>*,!cloudera</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
  </mirror>
  <mirror>
    <id>aliyunmaven</id>
    <mirrorOf>*,!cloudera</mirrorOf>
    <name>Aliyun public repository</name>
    <url>https://maven.aliyun.com/repository/public</url>
  </mirror>
  <mirror>
    <id>aliyunmaven-spring-plugin</id>
    <mirrorOf>*,!cloudera</mirrorOf>
    <name>spring-plugin</name>
    <url>https://maven.aliyun.com/repository/spring-plugin</url>
  </mirror>
  <mirror>
    <id>maven-default-http-blocker</id>
    <mirrorOf>external:http:*</mirrorOf>
    <name>Pseudo repository to mirror external repositories initially using HTTP.</name>
    <url>http://0.0.0.0/</url>
    <blocked>true</blocked>
  </mirror>
</mirrors>
<profiles>
  <profile>
    <id>aliyun</id>
    <repositories>
      <repository>
        <id>aliyun</id>
        <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
        <releases>
          <enabled>true</enabled>
        </releases>
        <snapshots>
          <enabled>true</enabled>
          <updatePolicy>always</updatePolicy>
        </snapshots>
      </repository>
    </repositories>
  </profile>
  <profile>
    <id>maven-central</id>
    <repositories>
      <repository>
        <id>maven-central</id>
        <url>https://repo.maven.apache.org/maven2</url>
        <releases>
          <enabled>true</enabled>
        </releases>
        <snapshots>
          <enabled>true</enabled>
          <updatePolicy>always</updatePolicy>
        </snapshots>
      </repository>
    </repositories>
  </profile>
  <profile>
    <id>cloudera</id>
    <repositories>
      <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        <releases>
          <enabled>true</enabled>
        </releases>
        <snapshots>
          <enabled>true</enabled>
          <updatePolicy>always</updatePolicy>
        </snapshots>
      </repository>
    </repositories>
  </profile>
</profiles>
<activeProfiles>
  <activeProfile>aliyun</activeProfile>
  <activeProfile>maven-central</activeProfile>
  <activeProfile>cloudera</activeProfile>
</activeProfiles>
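To confirm the mirrors and profiles are actually picked up before starting a long build, Maven's help plugin can dump the resolved configuration (standard maven-help-plugin goals):
# Show the merged settings and the active profiles
mvn help:effective-settings
mvn help:active-profiles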
Adjust the dependency versions in the top-level pom.xml to match the big-data suite on your own Linux cluster; here it is CDH 6.3.2:
<!-- Spark version -->
<spark.version>2.4.0</spark.version>
<!-- Hive version -->
<hive.version>2.1.1-cdh6.3.2</hive.version>
<package.hive.version>2.1.1_cdh6.3.2</package.hive.version>
<!-- Hadoop version -->
<hadoop.version>3.0.0-cdh6.3.2</hadoop.version>
<hadoop-hdfs-client.artifact>hadoop-hdfs-client</hadoop-hdfs-client.artifact>
<hadoop-hdfs-client-shade.version>3.0.0-cdh6.3.2</hadoop-hdfs-client-shade.version>
<!-- ZooKeeper version -->
<zookeeper.version>3.4.5-cdh6.3.2</zookeeper.version>
<!-- Maven and JDK versions -->
<java.version>1.8</java.version>
<maven.version>3.8.1</maven.version>
<!-- Scala version -->
<scala.version>2.11.12</scala.version>
<scala.binary.version>2.11</scala.binary.version>
Update the Hive version paths in linkis-engineconn-plugins/hive/src/main/assembly/distribution.xml:
<outputDirectory>/dist/${package.hive.version}/lib</outputDirectory>
<fileSet>
  <directory>${basedir}/src/main/resources</directory>
  <includes>
    <include>*</include>
  </includes>
  <fileMode>0777</fileMode>
  <outputDirectory>dist/${package.hive.version}/conf</outputDirectory>
  <lineEnding>unix</lineEnding>
</fileSet>
<fileSet>
  <directory>${basedir}/target</directory>
  <includes>
    <include>*.jar</include>
  </includes>
  <excludes>
    <exclude>*doc.jar</exclude>
  </excludes>
  <fileMode>0777</fileMode>
  <outputDirectory>plugin/${package.hive.version}</outputDirectory>
</fileSet>
Update the Hive and Spark version defaults in the code:
linkis-computation-governance/linkis-manager/linkis-label-common/src/main/java/org/apache/linkis/manager/label/conf/LabelCommonConfig.java
public static final CommonVars<String> SPARK_ENGINE_VERSION =
CommonVars.apply("wds.linkis.spark.engine.version", "2.4.0");
public static final CommonVars<String> HIVE_ENGINE_VERSION =
CommonVars.apply("wds.linkis.hive.engine.version", "2.1.1_cdh6.3.2");
linkis-computation-governance/linkis-computation-governance-common/src/main/scala/org/apache/linkis/governance/common/conf/GovernanceCommonConf.scala
val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "2.4.0")
val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "2.1.1_cdh6.3.2")
Refresh Maven until the project imports without errors, then build:
- mvn clean
- mvn -N install
- mvn clean install '-Dmaven.test.skip=true'
On success the distribution package is at linkis-dist/target/apache-linkis-1.3.2-bin.tar.gz
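A quick sanity check that the tarball was produced (just a listing; the path comes from the build output above):
ls -lh linkis-dist/target/apache-linkis-1.3.2-bin.tar.gz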
Key configuration changes:
/opt/linkis_unbintargz/deploy-config/db.sh
### Used to store user's custom variables, user's configuration, UDFs and functions, while providing the JobHistory service
MYSQL_HOST=10.130.1.37
MYSQL_PORT=3306
MYSQL_DB=linkis_test
MYSQL_USER=root
MYSQL_PASSWORD=123456
### Provide the DB information of Hive metadata database.
### Attention! If there are special characters like "&", they need to be enclosed in quotation marks.
HIVE_META_URL="jdbc:mysql://10.130.1.37:3306/hive?useUnicode=true&characterEncoding=UTF-8"
HIVE_META_USER="hive"
HIVE_META_PASSWORD="123456"
/opt/linkis_unbintargz/deploy-config/linkis-env.sh
#!/bin/bash
# SSH_PORT=22
### deploy user
deployUser=hadoop
##If you don't set it, a random password string will be generated during installation
deployPwd=
LINKIS_SERVER_VERSION=v1
WORKSPACE_USER_ROOT_PATH=file:///tmp/linkis/
HDFS_USER_ROOT_PATH=hdfs:///tmp/linkis
### Path to store started engines and engine logs, must be local
ENGINECONN_ROOT_PATH=/linkis/tmp
RESULT_SET_ROOT_PATH=hdfs:///tmp/linkis
YARN_RESTFUL_URL="http://10.130.1.37:8088"
HADOOP_HOME=${HADOOP_HOME:-"/opt/cloudera/parcels/CDH/lib/hadoop"}
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop"}
HADOOP_KERBEROS_ENABLE=${HADOOP_KERBEROS_ENABLE:-"false"}
HADOOP_KEYTAB_PATH=${HADOOP_KEYTAB_PATH:-"/appcom/keytab/"}
## Hadoop env version
HADOOP_VERSION=${HADOOP_VERSION:-"3.0.0"}
#Hive
HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
HIVE_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hive/conf
#Spark
SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
SPARK_CONF_DIR=/opt/cloudera/parcels/CDH/lib/spark/conf
## Engine version conf
#SPARK_VERSION
SPARK_VERSION=2.4.0
##HIVE_VERSION
HIVE_VERSION=2.1.1_cdh6.3.2
#PYTHON_VERSION=python2
EUREKA_PORT=20303
export EUREKA_PREFER_IP=false
EUREKA_HEAP_SIZE="256M"
GATEWAY_PORT=9501
MANAGER_PORT=9101
ENGINECONNMANAGER_PORT=9102
ENTRANCE_PORT=9104
PUBLICSERVICE_PORT=9105
export SERVER_HEAP_SIZE="512M"
##The installation directory must be different from the decompression directory
LINKIS_HOME=/opt/linkis_bin
##The extended lib such mysql-connector-java-*.jar
#LINKIS_EXTENDED_LIB=/appcom/common/linkisExtendedLib
LINKIS_VERSION=1.3.2
# for install
LINKIS_PUBLIC_MODULE=lib/linkis-commons/public-module
export PROMETHEUS_ENABLE=false
export ENABLE_HDFS=true
export ENABLE_HIVE=true
export ENABLE_SPARK=true
/opt/linkis_unbintargz/linkis-package/db/linkis_dml.sql
-- Variables:
SET @SPARK_LABEL="spark-2.4.0";
SET @HIVE_LABEL="hive-2.1.1_cdh6.3.2";
/opt/linkis_unbintargz/linkis-package/db/linkis_ddl.sql
Change the metrics column of the linkis_cg_ec_resource_info_record table to type longtext.
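If the schema has already been imported, the same change can be applied in place (a sketch; run it against the Linkis database created above):
ALTER TABLE linkis_cg_ec_resource_info_record MODIFY COLUMN metrics LONGTEXT;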
Install Linkis: in /opt/linkis_unbintargz/bin run sh start.sh; choose 1 at the first prompt and 2 at the second.
Add the MySQL driver.
Download a driver JAR that matches your MySQL version; mine is MySQL 8.0+:
cp mysql-connector-java-8.0.30.jar /opt/linkis_bin/lib/linkis-spring-cloud-services/linkis-mg-gateway/
cp mysql-connector-java-8.0.30.jar /opt/linkis_bin/lib/linkis-commons/public-module/
Add the ZooKeeper client JAR (zookeeper 3.4.5-cdh6.3.2, matching the <zookeeper.version> set earlier) into:
$LINKIS_HOME/lib/linkis-engines/hive/dist/version/lib
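For example (a sketch; the JAR location assumes the CDH parcel layout on this cluster):
cp /opt/cloudera/parcels/CDH/jars/zookeeper-3.4.5-cdh6.3.2.jar $LINKIS_HOME/lib/linkis-engines/hive/dist/version/lib/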
Rename Spark's bundled Guava JAR to avoid a Guava version conflict:
mv /opt/cloudera/parcels/CDH/lib/spark/jars/guava-11.0.2.jar /opt/cloudera/parcels/CDH/lib/spark/jars/guava-11.0.2.jar.back
Switch users to start the services
Every node needs the hadoop user configured.
[root@cdh-01 sbin]# su hadoop
[hadoop@cdh-01 sbin]$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: hadoop@CQCZ.COM
Valid starting Expires Service principal
2024-01-15T02:07:40 2024-01-16T02:07:40 krbtgt/CQCZ.COM@CQCZ.COM
renew until 2024-01-20T11:41:36
Run: in /opt/linkis_bin/sbin, sh linkis-start-all.sh
http://IP:20303/ — the Eureka registry should show six registered services.
Install the frontend: install and configure NGINX.
The Linkis nginx config file defaults to /etc/nginx/conf.d/linkis.conf; the nginx logs are /var/log/nginx/access.log and /var/log/nginx/error.log.
Point the proxy at the gateway address; the relevant block looks roughly like the sketch below.
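A minimal sketch of the API section of linkis.conf (the IP/port follow GATEWAY_PORT=9501 configured earlier and must match your gateway host):
location /api {
    proxy_pass http://127.0.0.1:9501;  # Linkis gateway address
    proxy_set_header Host $host;
}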
Reload NGINX: nginx -s reload
http://IP:8188/ — the account and password were printed when the installation succeeded.
Verify the Linkis installation:
(1) Shell engine test — succeeds: sh bin/linkis-cli -submitUser hadoop -engineType shell-1 -codeType shell -code "whoami"
[root@cdh-01 linkis_bin]# sh bin/linkis-cli -submitUser hadoop -engineType shell-1 -codeType shell -code "whoami"
=Java Start Command=
exec /usr/java/jdk1.8.0_181-cloudera/bin/java -server -Xms32m -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/linkis_bin/logs/linkis-cli -XX:ErrorFile=/opt/linkis_bin/logs/linkis-cli/ps_err_pid%p.log -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -classpath /opt/linkis_bin/conf/linkis-cli:/opt/linkis_bin/lib/linkis-computation-governance/linkis-client/linkis-cli/:/opt/linkis_bin/lib/linkis-commons/public-module/:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib: -Dconf.root=/opt/linkis_bin/conf/linkis-cli -Dconf.file=linkis-cli.properties -Dlog.path=/opt/linkis_bin/logs/linkis-cli -Dlog.file=linkis-client.root.log.20240111180441418462103 org.apache.linkis.cli.application.LinkisClientApplication '-submitUser hadoop -engineType shell-1 -codeType shell -code whoami'
[INFO] LogFile path: /opt/linkis_bin/logs/linkis-cli/linkis-client.root.log.20240111180441418462103
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] user does not specify proxy-user, will use current submit-user "hadoop" by default.
[INFO] connecting to linkis gateway:http://127.0.0.1:9501
JobId:1
TaskId:1
ExecId:exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_shell_0
[INFO] Job is successfully submitted!
2024-01-11 18:04:48.004 INFO Job with jobId : 1 and execID : LINKISCLI_hadoop_shell_0 submitted
2024-01-11 18:04:48.004 INFO Your job is Scheduled. Please wait it to run.
2024-01-11 18:04:48.004 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
whoami
SCRIPT CODE
2024-01-11 18:04:48.004 INFO Your job is accepted, jobID is LINKISCLI_hadoop_shell_0 and jobReqId is 1 in ServiceInstance(linkis-cg-entrance, cdh-01:9104). Please wait it to be scheduled
2024-01-11 18:04:48.004 INFO job is scheduled.
2024-01-11 18:04:48.004 INFO Your job is being scheduled by orchestrator.
2024-01-11 18:04:48.004 INFO Your job is Running now. Please wait it to complete.
2024-01-11 18:04:48.004 INFO job is running.
2024-01-11 18:04:48.004 INFO JobRequest (1) was submitted to Orchestrator.
2024-01-11 18:04:48.004 INFO Background is starting a new engine for you,execId TaskID_1_otJobId_astJob_0_codeExec_0 mark id is mark_0, it may take several seconds, please wait
2024-01-11 18:05:18.005 INFO Succeed to create new ec : ServiceInstance(linkis-cg-engineconn, cdh-01:40180)
2024-01-11 18:05:18.005 INFO Task submit to ec: ServiceInstance(linkis-cg-engineconn, cdh-01:40180) get engineConnExecId is: 1
2024-01-11 18:05:18.005 INFO EngineConn local log path: ServiceInstance(linkis-cg-engineconn, cdh-01:40180) /linkis/tmp/hadoop/20240111/shell/5a57511a-b2ff-4aee-a88d-21afff176d8a/logs
cdh-01:40180_0 >> whoami
hadoop
2024-01-11 18:05:23.005 INFO Congratulations! Your job : LINKISCLI_hadoop_shell_0 executed with status succeed and 2 results.
2024-01-11 18:05:23.005 INFO Task creation time(任务创建时间): 2024-01-11 18:04:45, Task scheduling time(任务调度时间): 2024-01-11 18:04:48, Task start time(任务开始时间): 2024-01-11 18:04:48, Mission end time(任务结束时间): 2024-01-11 18:05:23
2024-01-11 18:05:23.005 INFO Task submit to Orchestrator time:2024-01-11 18:04:48, Task request EngineConn time:2024-01-11 18:04:48, Task submit to EngineConn time:2024-01-11 18:05:18
2024-01-11 18:05:23.005 INFO Your mission(您的任务) 1 The total time spent is(总耗时时间为): 38.2 s
2024-01-11 18:05:23.005 INFO Congratulations. Your job completed with status Success.
2024-01-11 18:05:23.005 INFO job is completed.
[INFO] Job execute successfully! Will try get execute result
Result:====
TaskId:1
ExecId: exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_shell_0
User:hadoop
Current job status:SUCCEED
extraMsg:
result:
[INFO] Retrieving result-set, may take time if result-set is large, please do not exit program.
============ RESULT SET 1 ============
hadoop
############Execute Success!!!########
(2) Hive engine test — succeeds: sh bin/linkis-cli -submitUser hadoop -engineType hive-2.1.1_cdh6.3.2 -codeType hql -code "show tables"
[root@cdh-01 linkis_bin]# sh bin/linkis-cli -submitUser hadoop -engineType hive-2.1.1_cdh6.3.2 -codeType hql -code "show tables"
=Java Start Command=
exec /usr/java/jdk1.8.0_181-cloudera/bin/java -server -Xms32m -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/linkis_bin/logs/linkis-cli -XX:ErrorFile=/opt/linkis_bin/logs/linkis-cli/ps_err_pid%p.log -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -classpath /opt/linkis_bin/conf/linkis-cli:/opt/linkis_bin/lib/linkis-computation-governance/linkis-client/linkis-cli/:/opt/linkis_bin/lib/linkis-commons/public-module/:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib: -Dconf.root=/opt/linkis_bin/conf/linkis-cli -Dconf.file=linkis-cli.properties -Dlog.path=/opt/linkis_bin/logs/linkis-cli -Dlog.file=linkis-client.root.log.20240112102740414902455 org.apache.linkis.cli.application.LinkisClientApplication '-submitUser hadoop -engineType hive-2.1.1_cdh6.3.2 -codeType hql -code show tables'
[INFO] LogFile path: /opt/linkis_bin/logs/linkis-cli/linkis-client.root.log.20240112102740414902455
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] user does not specify proxy-user, will use current submit-user "hadoop" by default.
[INFO] connecting to linkis gateway:http://127.0.0.1:9501
JobId:2
TaskId:2
ExecId:exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_hive_0
[INFO] Job is successfully submitted!
2024-01-12 10:27:46.027 INFO Job with jobId : 2 and execID : LINKISCLI_hadoop_hive_0 submitted
2024-01-12 10:27:46.027 INFO Your job is Scheduled. Please wait it to run.
2024-01-12 10:27:46.027 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
show tables
SCRIPT CODE
2024-01-12 10:27:46.027 INFO Your job is accepted, jobID is LINKISCLI_hadoop_hive_0 and jobReqId is 2 in ServiceInstance(linkis-cg-entrance, cdh-01:9104). Please wait it to be scheduled
2024-01-12 10:27:46.027 INFO job is scheduled.
2024-01-12 10:27:46.027 INFO Your job is being scheduled by orchestrator.
2024-01-12 10:27:46.027 INFO Your job is Running now. Please wait it to complete.
2024-01-12 10:27:46.027 INFO job is running.
2024-01-12 10:27:46.027 INFO JobRequest (2) was submitted to Orchestrator.
2024-01-12 10:27:46.027 INFO Background is starting a new engine for you,execId TaskID_2_otJobId_astJob_0_codeExec_0 mark id is mark_0, it may take several seconds, please wait
2024-01-12 10:28:09.028 INFO Succeed to create new ec : ServiceInstance(linkis-cg-engineconn, cdh-01:34976)
2024-01-12 10:28:09.028 INFO Task submit to ec: ServiceInstance(linkis-cg-engineconn, cdh-01:34976) get engineConnExecId is: 1
2024-01-12 10:28:10.028 INFO EngineConn local log path: ServiceInstance(linkis-cg-engineconn, cdh-01:34976) /linkis/tmp/hadoop/20240112/hive/55b6d890-65e9-4edb-922b-2379948c0e88/logs
HiveEngineExecutor_0 >> show tables
Time taken: 825 ms, begin to fetch results.
Fetched 1 col(s) : 1 row(s) in hive
2024-01-12 10:28:10.014 WARN [Linkis-Default-Scheduler-Thread-2] org.apache.linkis.engineconn.computation.executor.hook.executor.ExecuteOnceHook 53 beforeExecutorExecute - execute once become effective, register lock listener
2024-01-12 10:28:12.028 INFO Congratulations! Your job : LINKISCLI_hadoop_hive_0 executed with status succeed and 2 results.
2024-01-12 10:28:12.028 INFO Task creation time(任务创建时间): 2024-01-12 10:27:43, Task scheduling time(任务调度时间): 2024-01-12 10:27:46, Task start time(任务开始时间): 2024-01-12 10:27:46, Mission end time(任务结束时间): 2024-01-12 10:28:12
2024-01-12 10:28:12.028 INFO Task submit to Orchestrator time:2024-01-12 10:27:46, Task request EngineConn time:2024-01-12 10:27:46, Task submit to EngineConn time:2024-01-12 10:28:09
2024-01-12 10:28:12.028 INFO Your mission(您的任务) 2 The total time spent is(总耗时时间为): 28.8 s
2024-01-12 10:28:12.028 INFO Congratulations. Your job completed with status Success.
2024-01-12 10:28:12.028 INFO job is completed.
[INFO] Job execute successfully! Will try get execute result
Result:====
TaskId:2
ExecId: exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_hive_0
User:hadoop
Current job status:SUCCEED
extraMsg:
result:
[INFO] Retrieving result-set, may take time if result-set is large, please do not exit program.
============ RESULT SET 1 ============
----------- META DATA ------------
columnName comment dataType
tab_name from deserializer string
------------ END OF META DATA ------------
student
############Execute Success!!!########
(3) Spark engine test — succeeds: sh bin/linkis-cli -submitUser hadoop -engineType spark-2.4.0 -codeType sql -code "show tables"
Grant permissions on the HDFS directories:
[root@cdh-01 linkis_bin]# su hdfs
[hdfs@cdh-01 linkis_bin]$ hdfs dfs -chmod -R 777 /user
Log in: kinit hadoop
Check the ticket: klist
[hdfs@cdh-01 linkis_bin]$ kinit hadoop
Password for hadoop@CQCZ.COM:
[hdfs@cdh-01 linkis_bin]$ klist
Ticket cache: FILE:/tmp/krb5cc_995
Default principal: hadoop@CQCZ.COM
Valid starting       Expires              Service principal
2024-01-15T15:51:57 2024-01-16T15:51:57 krbtgt/CQCZ.COM@CQCZ.COM
renew until 2024-01-22T15:51:57
Grant ownership:
hdfs dfs -chown -R hadoop:hadoop /tmp/linkis/
[hdfs@cdh-01 linkis_bin]$ hdfs dfs -chown -R hadoop:hadoop /tmp/linkis/
chown: changing ownership of '/tmp/linkis': Permission denied. user=hadoop is not the owner of inode=/tmp/linkis
Since the hdfs user is denied above, grant ownership as root instead — the install directory and the workspace:
sudo chown -R hadoop:hadoop /opt/linkis_bin
sudo chown -R hadoop:hadoop /tmp/linkis/
Give the hadoop user ownership of its Kerberos keytab:
chown hadoop:hadoop /home/hadoop.keytab
ls -l /home/hadoop.keytab
[hadoop@cdh-01 linkis_bin]$ sh bin/linkis-cli -submitUser hadoop -engineType spark-2.4.0 -codeType sql -code "show tables"
=Java Start Command=
exec /usr/java/jdk1.8.0_181-cloudera/bin/java -server -Xms32m -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/linkis_bin/logs/linkis-cli -XX:ErrorFile=/opt/linkis_bin/logs/linkis-cli/ps_err_pid%p.log -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -classpath /opt/linkis_bin/conf/linkis-cli:/opt/linkis_bin/lib/linkis-computation-governance/linkis-client/linkis-cli/:/opt/linkis_bin/lib/linkis-commons/public-module/:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib: -Dconf.root=/opt/linkis_bin/conf/linkis-cli -Dconf.file=linkis-cli.properties -Dlog.path=/opt/linkis_bin/logs/linkis-cli -Dlog.file=linkis-client.hadoop.log.20240116180612983607979 org.apache.linkis.cli.application.LinkisClientApplication '-submitUser hadoop -engineType spark-2.4.0 -codeType sql -code show tables'
[INFO] LogFile path: /opt/linkis_bin/logs/linkis-cli/linkis-client.hadoop.log.20240116180612983607979
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] user does not specify proxy-user, will use current submit-user "hadoop" by default.
[INFO] connecting to linkis gateway:http://127.0.0.1:9501
JobId:2
TaskId:2
ExecId:exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_spark_1
[INFO] Job is successfully submitted!
2024-01-16 18:06:15.006 INFO Program is substituting variables for you
2024-01-16 18:06:15.006 INFO Variables substitution ended successfully
2024-01-16 18:06:15.006 WARN The code you submit will not be limited by the limit
2024-01-16 18:06:15.006 INFO Your job is Scheduled. Please wait it to run.
2024-01-16 18:06:15.006 INFO Job with jobId : 2 and execID : LINKISCLI_hadoop_spark_1 submitted
2024-01-16 18:06:15.006 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
show tables
SCRIPT CODE
2024-01-16 18:06:15.006 INFO Your job is accepted, jobID is LINKISCLI_hadoop_spark_1 and jobReqId is 2 in ServiceInstance(linkis-cg-entrance, cdh-01:9104). Please wait it to be scheduled
2024-01-16 18:06:15.006 INFO job is scheduled.
2024-01-16 18:06:15.006 INFO Your job is being scheduled by orchestrator.
2024-01-16 18:06:15.006 INFO Your job is Running now. Please wait it to complete.
2024-01-16 18:06:15.006 INFO job is running.
2024-01-16 18:06:15.006 INFO JobRequest (2) was submitted to Orchestrator.
2024-01-16 18:06:16.006 INFO Background is starting a new engine for you,execId TaskID_2_otJobId_astJob_1_codeExec_1 mark id is mark_1, it may take several seconds, please wait
2024-01-16 18:07:05.007 INFO Succeed to create new ec : ServiceInstance(linkis-cg-engineconn, cdh-01:42520)
2024-01-16 18:07:05.007 INFO Task submit to ec: ServiceInstance(linkis-cg-engineconn, cdh-01:42520) get engineConnExecId is: 1
2024-01-16 18:07:05.007 INFO EngineConn local log path: ServiceInstance(linkis-cg-engineconn, cdh-01:42520) /linkis/tmp/hadoop/20240116/spark/b7fdd9c4-0137-44e0-b7ed-a4203773a310/logs
2024-01-16 18:07:05.007 INFO yarn application id: application_1705390532662_0002
cdh-01:42520 >> show tables
cdh-01:42520 >> Time taken: 952 ms, Fetched 1 row(s).
2024-01-16 18:07:05.337 WARN [Linkis-Default-Scheduler-Thread-2] org.apache.linkis.engineconn.computation.executor.hook.executor.ExecuteOnceHook 53 beforeExecutorExecute - execute once become effective, register lock listener
2024-01-16 18:07:06.530 WARN [Linkis-Default-Scheduler-Thread-2] com.cloudera.spark.lineage.LineageWriter 69 logWarning - Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
2024-01-16 18:07:12.645 WARN [Linkis-Default-Scheduler-Thread-2] org.apache.linkis.engineplugin.spark.executor.SQLSession$ 149 showDF - Time taken: 952 ms, Fetched 1 row(s).
2024-01-16 18:07:12.007 INFO Congratulations! Your job : LINKISCLI_hadoop_spark_1 executed with status succeed and 2 results.
2024-01-16 18:07:12.007 INFO Task creation time(任务创建时间): 2024-01-16 18:06:15, Task scheduling time(任务调度时间): 2024-01-16 18:06:15, Task start time(任务开始时间): 2024-01-16 18:06:15, Mission end time(任务结束时间): 2024-01-16 18:07:12
2024-01-16 18:07:12.007 INFO Task submit to Orchestrator time:2024-01-16 18:06:15, Task request EngineConn time:2024-01-16 18:06:16, Task submit to EngineConn time:2024-01-16 18:07:05
2024-01-16 18:07:12.007 INFO Your mission(您的任务) 2 The total time spent is(总耗时时间为): 57.6 s
2024-01-16 18:07:12.007 INFO Congratulations. Your job completed with status Success.
2024-01-16 18:07:12.007 INFO job is completed.
[INFO] Job execute successfully! Will try get execute result
Result:====
TaskId:2
ExecId: exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_spark_1
User:hadoop
Current job status:SUCCEED
extraMsg:
result:
[INFO] Retrieving result-set, may take time if result-set is large, please do not exit program.
============ RESULT SET 1 ============
----------- META DATA ------------
columnName comment dataType
database NULL string
tableName NULL string
isTemporary NULL boolean
------------ END OF META DATA ------------
default student false
############Execute Success!!!########
Create the database Hive uses to record errors:
create database admin_ind
If this fails with a JDBC driver error, add the MySQL driver to the Spark jars directory as well:
cp mysql-connector-java-8.0.30.jar /opt/cloudera/parcels/CDH/lib/spark/jars
vim /opt/cloudera/parcels/CDH/lib/spark/conf/spark-defaults.conf
spark.driver.extraClassPath=/opt/cloudera/parcels/CDH/lib/spark/jars/mysql-connector-java-8.0.30.jar
spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/spark/jars/mysql-connector-java-8.0.30.jar
spark.jars=/opt/cloudera/parcels/CDH/lib/spark/jars/mysql-connector-java-8.0.30.jar
When execution misbehaves, restarting the engine manager service is usually enough:
sh sbin/linkis-daemon.sh restart cg-linkismanager
Debugging SQL directly in spark-shell:
spark-shell
spark.sql("select count(*) from default.student where (name = 'lisi') and (sex is null)").show
spark.sql("select * from default.student").show
Adding an engine: JDBC
Build the engine plugin on its own (requires a Maven environment):
# build
cd ${linkis_code_dir}/linkis-engineconn-plugins/jdbc/
mvn clean install
# the built engine plugin package ends up in:
${linkis_code_dir}/linkis-engineconn-plugins/jdbc/target/out/
Upload the JDBC engine to ${LINKIS_HOME}/lib/linkis-engineconn-plugins:
linkis-engineconn-plugins/
├── jdbc
│ ├── dist
│ │ └── 4
│ │ ├── conf
│ │ └── lib
│ └── plugin
│ └── 4
Refresh the engines by restarting the linkis-cg-linkismanager service:
cd ${LINKIS_HOME}/sbin
sh linkis-daemon.sh restart cg-linkismanager
Test that the JDBC engine loads:
[hadoop@cdh-01 linkis_bin]$ sh ./bin/linkis-cli -engineType jdbc-4 \
  -codeType jdbc -code "show tables" \
  -submitUser hadoop -proxyUser hadoop \
  -runtimeMap wds.linkis.jdbc.connect.url=jdbc:mysql://127.0.0.1:3306/linkis_test \
  -runtimeMap wds.linkis.jdbc.driver=com.mysql.jdbc.Driver \
  -runtimeMap wds.linkis.jdbc.username=root \
  -runtimeMap wds.linkis.jdbc.password=123456
=Java Start Command=
exec /usr/java/jdk1.8.0_181-cloudera/bin/java -server -Xms32m -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/linkis_bin/logs/linkis-cli -XX:ErrorFile=/opt/linkis_bin/logs/linkis-cli/ps_err_pid%p.log -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -classpath /opt/linkis_bin/conf/linkis-cli:/opt/linkis_bin/lib/linkis-computation-governance/linkis-client/linkis-cli/:/opt/linkis_bin/lib/linkis-commons/public-module/:.:/usr/java/jdk1.8.0_181-cloudera/lib:/usr/java/jdk1.8.0_181-cloudera/jre/lib: -Dconf.root=/opt/linkis_bin/conf/linkis-cli -Dconf.file=linkis-cli.properties -Dlog.path=/opt/linkis_bin/logs/linkis-cli -Dlog.file=linkis-client.hadoop.log.20240113154250491733531 org.apache.linkis.cli.application.LinkisClientApplication '-engineType jdbc-4 -codeType jdbc -code show tables -submitUser hadoop -proxyUser hadoop -runtimeMap wds.linkis.jdbc.connect.url=jdbc:mysql://127.0.0.1:3306/linkis_test -runtimeMap wds.linkis.jdbc.driver=com.mysql.jdbc.Driver -runtimeMap wds.linkis.jdbc.username=root -runtimeMap wds.linkis.jdbc.password=123456'
[INFO] LogFile path: /opt/linkis_bin/logs/linkis-cli/linkis-client.hadoop.log.20240113154250491733531
[INFO] User does not provide usr-configuration file. Will use default config
[INFO] connecting to linkis gateway:http://127.0.0.1:9501
JobId:9
TaskId:9
ExecId:exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_jdbc_0
[INFO] Job is successfully submitted!
2024-01-13 15:42:52.042 INFO Program is substituting variables for you
2024-01-13 15:42:52.042 INFO Variables substitution ended successfully
2024-01-13 15:42:53.042 WARN The code you submit will not be limited by the limit
2024-01-13 15:42:53.042 INFO Job with jobId : 9 and execID : LINKISCLI_hadoop_jdbc_0 submitted
2024-01-13 15:42:53.042 INFO Your job is Scheduled. Please wait it to run.
2024-01-13 15:42:53.042 INFO You have submitted a new job, script code (after variable substitution) is
SCRIPT CODE
show tables
SCRIPT CODE
2024-01-13 15:42:53.042 INFO Your job is accepted, jobID is LINKISCLI_hadoop_jdbc_0 and jobReqId is 9 in ServiceInstance(linkis-cg-entrance, cdh-01:9104). Please wait it to be scheduled
2024-01-13 15:42:53.042 INFO job is scheduled.
2024-01-13 15:42:53.042 INFO Your job is being scheduled by orchestrator.
2024-01-13 15:42:53.042 INFO Your job is Running now. Please wait it to complete.
2024-01-13 15:42:53.042 INFO job is running.
2024-01-13 15:42:53.042 INFO JobRequest (9) was submitted to Orchestrator.
2024-01-13 15:42:53.042 INFO Background is starting a new engine for you,execId TaskID_9_otJobId_astJob_3_codeExec_3 mark id is mark_3, it may take several seconds, please wait
2024-01-13 15:43:16.043 INFO Succeed to create new ec : ServiceInstance(linkis-cg-engineconn, cdh-01:44029)
2024-01-13 15:43:16.043 INFO Task submit to ec: ServiceInstance(linkis-cg-engineconn, cdh-01:44029) get engineConnExecId is: 1
2024-01-13 15:43:16.043 INFO EngineConn local log path: ServiceInstance(linkis-cg-engineconn, cdh-01:44029) /linkis/tmp/hadoop/20240113/jdbc/bc7cd343-b072-4800-a01b-6392d4c12211/logs
hdfs:///tmp/linkis/hadoop/linkis/2024-01-13/154253/LINKISCLI/9/_0.dolphin
2024-01-13 15:43:19.043 INFO Congratulations! Your job : LINKISCLI_hadoop_jdbc_0 executed with status succeed and 2 results.
2024-01-13 15:43:19.043 INFO Task creation time(任务创建时间): 2024-01-13 15:42:52, Task scheduling time(任务调度时间): 2024-01-13 15:42:53, Task start time(任务开始时间): 2024-01-13 15:42:53, Mission end time(任务结束时间): 2024-01-13 15:43:19
2024-01-13 15:43:19.043 INFO Task submit to Orchestrator time:2024-01-13 15:42:53, Task request EngineConn time:2024-01-13 15:42:53, Task submit to EngineConn time:2024-01-13 15:43:16
2024-01-13 15:43:19.043 INFO Your mission(您的任务) 9 The total time spent is(总耗时时间为): 26.6 s
2024-01-13 15:43:19.043 INFO Congratulations. Your job completed with status Success.
2024-01-13 15:43:19.043 INFO job is completed.
[INFO] Job execute successfully! Will try get execute result
Result:====
TaskId:9
ExecId: exec_id018011linkis-cg-entrancecdh-01:9104LINKISCLI_hadoop_jdbc_0
User:hadoop
Current job status:SUCCEED
extraMsg:
result:
[INFO] Retrieving result-set, may take time if result-set is large, please do not exit program.
============ RESULT SET 1 ============
----------- META DATA ------------
columnName comment dataType
Tables_in_linkis_test string
------------ END OF META DATA ------------
linkis_cg_ec_resource_info_record
linkis_cg_engine_conn_plugin_bml_resources
linkis_cg_manager_engine_em
linkis_cg_manager_label
linkis_cg_manager_label_resource
linkis_cg_manager_label_service_instance
linkis_cg_manager_label_user
linkis_cg_manager_label_value_relation
linkis_cg_manager_linkis_resources
linkis_cg_manager_lock
linkis_cg_manager_metrics_history
linkis_cg_manager_service_instance
linkis_cg_manager_service_instance_metrics
linkis_cg_rm_external_resource_provider
linkis_cg_tenant_label_config
linkis_cg_user_ip_config
linkis_mg_gateway_auth_token
linkis_ps_bml_project
linkis_ps_bml_project_resource
linkis_ps_bml_project_user
linkis_ps_bml_resources
linkis_ps_bml_resources_permission
linkis_ps_bml_resources_task
linkis_ps_bml_resources_version
linkis_ps_common_lock
linkis_ps_configuration_category
linkis_ps_configuration_config_key
linkis_ps_configuration_config_value
linkis_ps_configuration_key_engine_relation
linkis_ps_cs_context_history
linkis_ps_cs_context_id
linkis_ps_cs_context_listener
linkis_ps_cs_context_map
linkis_ps_cs_context_map_listener
linkis_ps_datasource_access
linkis_ps_datasource_field
linkis_ps_datasource_import
linkis_ps_datasource_lineage
linkis_ps_datasource_table
linkis_ps_datasource_table_info
linkis_ps_dm_datasource
linkis_ps_dm_datasource_env
linkis_ps_dm_datasource_type
linkis_ps_dm_datasource_type_key
linkis_ps_dm_datasource_version
linkis_ps_error_code
linkis_ps_instance_info
linkis_ps_instance_label
linkis_ps_instance_label_relation
linkis_ps_instance_label_value_relation
linkis_ps_job_history_detail
linkis_ps_job_history_group_history
linkis_ps_resources_download_history
linkis_ps_udf_baseinfo
linkis_ps_udf_manager
linkis_ps_udf_shared_group
linkis_ps_udf_shared_info
linkis_ps_udf_tree
linkis_ps_udf_user_load
linkis_ps_udf_version
linkis_ps_variable_key
linkis_ps_variable_key_user
############Execute Success!!!########
Connecting Qualitis to Linkis
-- check the gateway tokens
select * from linkis_mg_gateway_auth_token;
-- check the installed engine plugins
select * from linkis_cg_engine_conn_plugin_bml_resources;
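If no token is reserved for Qualitis, one can be registered for it (a sketch; the token name, column list, and the requirement that it match the token configured on the Qualitis side are assumptions — verify against your linkis_mg_gateway_auth_token schema and the Qualitis configuration):
INSERT INTO linkis_mg_gateway_auth_token
  (token_name, legal_users, legal_hosts, business_owner, create_time, update_time, elapse_day, update_by)
VALUES
  ('QUALITIS-AUTH', '*', '*', 'BDP', NOW(), NOW(), -1, 'LINKIS');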