Note 1: The docs section of the Griffin website contains little tutorial content; most of it lives in the griffin-doc/ directory of the Griffin source tree.

Note 2: The griffin-doc/ directory also contains a detailed deployment guide; it is worth skimming it once before installing.


1 Preparation

To reduce errors during installation, check the measure/pom.xml and service/pom.xml files after downloading Griffin to confirm which versions of the prerequisite software to install.
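For example, a quick way to list what the poms pin (the exact property names are an assumption and vary by Griffin version; grep the poms to see what is actually there):

# List pinned dependency versions in the two poms
grep -E '<(spark|scala|hive|hadoop)[^>]*version>' measure/pom.xml service/pom.xml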

Software    Version
JDK         jdk1.8.0_301
Scala       scala-2.11.8
MySQL       mysql-5.6.24
Hadoop      hadoop-2.7.3
Hive        hive-2.3.9
Spark       spark-2.2.1
Livy        livy-0.7.1
ES          es-7.16.2
Maven       maven-3.8.4
Griffin     griffin-0.5.0

The official tutorial says Node.js and npm can be installed ahead of time with the following commands (not tested here):

$ sudo apt install nodejs
$ sudo apt install npm
$ node -v
$ npm -v

mysql

For the installation procedure, see the 尚硅谷 MySQL installation guide under the file directory.

Create a database named quartz in MySQL, then run the Init_quartz_mysql_innodb.sql script to initialize the Quartz tables:

mysql> create database quartz;
Query OK, 1 row affected (0.00 sec)
mysql> use quartz;
Database changed
mysql> source /root/Init_quartz_mysql_innodb.sql;
Query OK, 0 rows affected, 1 warning (0.01 sec)
...
mysql> show tables;
+--------------------------+
| Tables_in_quartz |
+--------------------------+
| QRTZ_BLOB_TRIGGERS |
| QRTZ_CALENDARS |
| QRTZ_CRON_TRIGGERS |
| QRTZ_FIRED_TRIGGERS |
| QRTZ_JOB_DETAILS |
| QRTZ_LOCKS |
| QRTZ_PAUSED_TRIGGER_GRPS |
| QRTZ_SCHEDULER_STATE |
| QRTZ_SIMPLE_TRIGGERS |
| QRTZ_SIMPROP_TRIGGERS |
| QRTZ_TRIGGERS |
+--------------------------+
11 rows in set (0.00 sec)
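The init script ships with the Griffin source; in 0.5.0 it is expected under service/src/main/resources/, though the path may differ by version. One way to locate it and load it non-interactively:

# Locate the Quartz init script in the source tree
find griffin-0.5.0-src -name 'Init_quartz_mysql_innodb.sql'
# Load it into the quartz database (equivalent to the `source` command above)
mysql -uroot -p quartz < /root/Init_quartz_mysql_innodb.sql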

hive

Hive installation steps: see here.

Add the following to hive-site.xml:

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://bigdata101:9083</value>
</property>

Start the services:

nohup /opt/hive-2.3.9/bin/hive --service metastore &
nohup /opt/hive-2.3.9/bin/hive --service hiveserver2 &

Hive 2.x needs both the metastore and hiveserver2 services running; otherwise you will hit: Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
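A quick sanity check that both services are listening, assuming the default ports (9083 for the metastore, 10000 for hiveserver2):

netstat -tlnp | grep -E '9083|10000'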

hadoop

Hadoop installation steps: see here.

spark

After downloading and unpacking, apply the following configuration.

[root@bigdata101 conf]# cat spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://bigdata101:9000/spark_directory
spark.master yarn-client
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.yarn.jars hdfs://bigdata101:9000/home/spark_lib/*
spark.yarn.dist.files hdfs://bigdata101:9000/home/spark_conf/hive-site.xml
spark.sql.broadcastTimeout 500

[root@bigdata101 conf]# cat spark-env.sh
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
SPARK_MASTER_HOST=bigdata101
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=8082
SPARK_LOCAL_IP=bigdata101
YARN_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://bigdata101:9000/spark_directory"

Copy the following files:

# Copy hive's lib/datanucleus-*.jar into spark's jars/ directory
[root@bigdata101 opt]# cp /opt/hive-2.3.9/lib/datanucleus-*.jar $SPARK_HOME/jars/
# Copy hive's conf/hive-site.xml into spark's conf/ directory
[root@bigdata101 opt]# cp /opt/hive-2.3.9/conf/hive-site.xml $SPARK_HOME/conf/
# Create /home/spark_conf on hdfs and upload hive-site.xml to it
[root@bigdata101 opt]# hadoop fs -mkdir -p /home/spark_conf
[root@bigdata101 opt]# hadoop fs -put /opt/hive-2.3.9/conf/hive-site.xml /home/spark_conf/
[root@bigdata101 opt]# hadoop fs -ls /home/spark_conf
Found 1 items
-rw-r--r--   3 root supergroup       1259 2021-12-25 18:18 /home/spark_conf/hive-site.xml
# Create /home/spark_lib on hdfs and upload the files under spark's jars/ directory to it
[root@bigdata101 opt]# hadoop fs -mkdir /home/spark_lib
[root@bigdata101 opt]# hadoop fs -put $SPARK_HOME/jars/* hdfs://bigdata101:9000/home/spark_lib/

If LZO is configured in Hadoop, also copy the hadoop-lzo-0.4.21-SNAPSHOT.jar file into spark's conf/ directory.
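A sketch of that copy, assuming the jar sits under Hadoop's common directory (adjust the source path to wherever hadoop-lzo was built on your cluster):

cp /opt/hadoop-2.7.3/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar $SPARK_HOME/conf/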

livy

Download: http://archive.apache.org/dist/incubator/livy/

[root@bigdata101 livy-0.7.1]# cat conf/livy.conf
livy.server.host = bigdata101
livy.spark.master = yarn
livy.spark.deployMode = client
livy.repl.enableHiveContext = true
livy.server.port = 8998

Install Spark before starting Livy, or Livy will fail with errors.

Start it with bin/livy-server start.
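A quick sanity check against the Livy REST API, using the livy.server.port=8998 configured above:

curl http://bigdata101:8998/sessions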

es

Installation steps: see here.

After starting ES, create the griffin index and verify it:

[zgg@bigdata101 es-7.16.2]$ curl -k -H "Content-Type: application/json" -X PUT http://bigdata101:9200/griffin?include_type_name=true \
> -d '{
>     "aliases": {},
>     "mappings": {
>         "accuracy": {
>             "properties": {
>                 "name": {
>                     "fields": {
>                         "keyword": {
>                             "ignore_above": 256,
>                             "type": "keyword"
>                         }
>                     },
>                     "type": "text"
>                 },
>                 "tmst": {
>                     "type": "date"
>                 }
>             }
>         }
>     },
>     "settings": {
>         "index": {
>             "number_of_replicas": "2",
>             "number_of_shards": "5"
>         }
>     }
> }'
{"acknowledged":true,"shards_acknowledged":true,"index":"griffin"} # 查看所有索引
[zgg@bigdata101 es-7.16.2]$ curl -X GET http://bigdata101:9200/_cat/indices
green open .geoip_databases HDSgUJ23Qwe-TI6KZ4e6Rw 1 1 43 46 224.8mb 111.9mb
green open griffin          h9fUJ3IpT9Sxcr4RbZKEKQ 5 2  0  0  3.3kb   1.1kb

# Inspect the contents of the griffin index
[zgg@bigdata101 es-7.16.2]$ curl -X GET http://bigdata101:9200/griffin/_search?pretty
{
  "took" : 27,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

maven

Download: https://maven.apache.org/download.cgi

[root@bigdata101 maven-3.8.4]# cat conf/settings.xml
<!-- Add the Aliyun mirror (and a few fallbacks) -->
<mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>central</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
<mirror>
    <id>UK</id>
    <name>UK Central</name>
    <url>http://uk.maven.org/maven2</url>
    <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
    <id>repo1</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo1.maven.org/maven2/</url>
</mirror>
<mirror>
    <id>repo2</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo2.maven.org/maven2/</url>
</mirror>

Configure environment variables:

[root@bigdata101 opt]# cat /etc/profile
export JAVA_HOME=/opt/jdk1.8.0_301
export ES_JAVA_HOME=/opt/elasticsearch-7.16.2/jdk
export HADOOP_HOME=/opt/hadoop-2.7.3
export MAVEN_HOME=/opt/maven-3.8.4
export SPARK_HOME=/opt/spark-2.2.1
export SCALA_HOME=/opt/scala-2.11.8
export LIVY_HOME=/opt/livy-0.7.1
export HIVE_HOME=/opt/hive-2.3.9
export GRIFFIN_HOME=/opt/griffin-0.5.0
export HADOOP_COMMON_HOME=/opt/hadoop-2.7.3
export HADOOP_COMMON_LIB_NATIVE_DIR=/opt/hadoop-2.7.3/lib/native
export HADOOP_HDFS_HOME=/opt/hadoop-2.7.3
export HADOOP_INSTALL=/opt/hadoop-2.7.3
export HADOOP_MAPRED_HOME=/opt/hadoop-2.7.3
export HADOOP_USER_CLASSPATH_FIRST=true
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_HOME=/opt/hadoop-2.7.3
export PATH=.:$LIVY_HOME/bin:$GRIFFIN_HOME/bin:$MAVEN_HOME/bin:$SPARK_HOME/bin:$SCALA_HOME/bin:$HIVE_HOME/bin:$HIVE_HOME/sbin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH
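After editing /etc/profile, reload it so the new variables take effect in the current shell:

source /etc/profile
# Spot-check a couple of the variables
echo $GRIFFIN_HOME $SPARK_HOME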

2 Installing Griffin

Download: https://griffin.apache.org/docs/download.html

Configuration files

After downloading and unpacking, edit the configuration files as follows.

Edit service/src/main/resources/application.properties:

# The griffin server defaults to port 8080; 8090 is used here
server.port=8090
spring.application.name=griffin_service
spring.datasource.url=jdbc:mysql://bigdata101:3306/quartz?autoReconnect=true&useSSL=false
spring.datasource.username=root
spring.datasource.password=000000
spring.jpa.generate-ddl=true
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.jpa.show-sql=true
# Hive metastore
hive.metastore.uris=thrift://bigdata101:9083
hive.metastore.dbname=default
hive.hmshandler.retry.attempts=15
hive.hmshandler.retry.interval=2000ms
#Hive jdbc
hive.jdbc.className=org.apache.hive.jdbc.HiveDriver
hive.jdbc.url=jdbc:hive2://localhost:10000/
hive.need.kerberos=false
hive.keytab.user=xxx@xx.com
hive.keytab.path=/path/to/keytab/file
# Hive cache time
cache.evict.hive.fixedRate.in.milliseconds=900000
# Kafka schema registry
kafka.schema.registry.url=http://bigdata101:8081
# Update job instance state at regular intervals
jobInstance.fixedDelay.in.milliseconds=60000
# Expired time of a job instance: 7 days = 604800000 milliseconds. The time unit only supports milliseconds
jobInstance.expired.milliseconds=604800000
# schedule predicate job every 5 minutes and repeat 12 times at most
# interval time unit: s = second, m = minute, h = hour, d = day; only these four units are supported
predicate.job.interval=5m
predicate.job.repeat.count=12
# external properties directory location
external.config.location=
# external BATCH or STREAMING env
external.env.location=
# login strategy ("default" or "ldap")
login.strategy=default
# ldap
ldap.url=ldap://hostname:port
ldap.email=@example.com
ldap.searchBase=DC=org,DC=example
ldap.searchPattern=(sAMAccountName={0})
# hdfs default name
fs.defaultFS=hdfs://bigdata101:9000
# elasticsearch
elasticsearch.host=bigdata101
elasticsearch.port=9200
elasticsearch.scheme=http
# elasticsearch.user = user
# elasticsearch.password = password
# livy
livy.uri=http://bigdata101:8998/batches
livy.need.queue=false
livy.task.max.concurrent.count=20
livy.task.submit.interval.second=3
livy.task.appId.retry.count=3
livy.need.kerberos=false
livy.server.auth.kerberos.principal=livy/kerberos.principal
livy.server.auth.kerberos.keytab=/path/to/livy/keytab/file
# yarn url
yarn.uri=http://bigdata102:8088
# griffin event listener
internal.event.listeners=GriffinJobEventHook

logging.file=logs/griffin-service.log

Edit service/src/main/resources/quartz.properties:

org.quartz.scheduler.instanceName=spring-boot-quartz
org.quartz.scheduler.instanceId=AUTO
org.quartz.threadPool.threadCount=5
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
# If you use postgresql as your database, set this property value to org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
# If you use mysql as your database, set this property value to org.quartz.impl.jdbcjobstore.StdJDBCDelegate
# If you use h2 as your database, it's ok to set this property value to StdJDBCDelegate, PostgreSQLDelegate or others
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.useProperties=true
org.quartz.jobStore.misfireThreshold=60000
org.quartz.jobStore.tablePrefix=QRTZ_
org.quartz.jobStore.isClustered=true
org.quartz.jobStore.clusterCheckinInterval=20000

Edit service/src/main/resources/sparkProperties.json:

{
    "file": "hdfs://bigdata101:9000/griffin/griffin-measure.jar",
    "className": "org.apache.griffin.measure.Application",
    "queue": "default",
    "numExecutors": 2,
    "executorCores": 1,
    "driverMemory": "1g",
    "executorMemory": "1g",
    "conf": {
        "spark.yarn.dist.files": "hdfs://bigdata101:9000/home/spark_conf/hive-site.xml"
    },
    "files": []
}
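The "file" entry must point at the measure jar on HDFS; it is uploaded in the deployment step below. Once that is done, a quick check that the path resolves:

hadoop fs -ls hdfs://bigdata101:9000/griffin/griffin-measure.jar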

Edit service/src/main/resources/env/env_batch.json:

You can also keep this configuration in a standalone file and point jobs at it when they run; the same goes for env_streaming.json.

{
    "spark": {
        "log.level": "INFO"
    },
    "sinks": [
        {
            "name": "console",
            "type": "CONSOLE",
            "config": {
                "max.log.lines": 10
            }
        },
        {
            "name": "hdfs",
            "type": "HDFS",
            "config": {
                "path": "hdfs://bigdata101:9000/griffin/persist",
                "max.persist.lines": 10000,
                "max.lines.per.file": 10000
            }
        },
        {
            "name": "elasticsearch",
            "type": "ELASTICSEARCH",
            "config": {
                "method": "post",
                "api": "http://bigdata101:9200/griffin/accuracy",
                "connection.timeout": "1m",
                "retry": 10
            }
        }
    ],
    "griffin.checkpoint": []
}

Edit service/src/main/resources/env/env_streaming.json:

{
    "spark": {
        "log.level": "WARN",
        "checkpoint.dir": "hdfs://bigdata101:9000/griffin/checkpoint/${JOB_NAME}",
        "init.clear": true,
        "batch.interval": "1m",
        "process.interval": "5m",
        "config": {
            "spark.default.parallelism": 4,
            "spark.task.maxFailures": 5,
            "spark.streaming.kafkaMaxRatePerPartition": 1000,
            "spark.streaming.concurrentJobs": 4,
            "spark.yarn.maxAppAttempts": 5,
            "spark.yarn.am.attemptFailuresValidityInterval": "1h",
            "spark.yarn.max.executor.failures": 120,
            "spark.yarn.executor.failuresValidityInterval": "1h",
            "spark.hadoop.fs.hdfs.impl.disable.cache": true
        }
    },
    "sinks": [
        {
            "type": "CONSOLE",
            "config": {
                "max.log.lines": 100
            }
        },
        {
            "type": "HDFS",
            "config": {
                "path": "hdfs://bigdata101:9000/griffin/persist",
                "max.persist.lines": 10000,
                "max.lines.per.file": 10000
            }
        },
        {
            "type": "ELASTICSEARCH",
            "config": {
                "method": "post",
                "api": "http://bigdata101:9200/griffin/accuracy"
            }
        }
    ],
    "griffin.checkpoint": [
        {
            "type": "zk",
            "config": {
                "hosts": "zk:2181",
                "namespace": "griffin/infocache",
                "lock.path": "lock",
                "mode": "persist",
                "init.clear": true,
                "close.clear": false
            }
        }
    ]
}

Edit service/pom.xml: comment out the PostgreSQL dependency and enable the MySQL driver:

<!--
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>${postgresql.version}</version>
</dependency>
-->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
</dependency>
<!-- Note: the <version> tag is removed from the mysql dependency -->

Build

[root@bigdata101 griffin-0.5.0-src]# mvn clean
[root@bigdata101 griffin-0.5.0-src]# mvn -T2C install -DskipTests
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Griffin 0.5.0 0.5.0:
[INFO]
[INFO] Apache Griffin 0.5.0 ............................... SUCCESS [ 3.069 s]
[INFO] Apache Griffin :: UI :: Default UI ................. SUCCESS [02:05 min]
[INFO] Apache Griffin :: Web Service ...................... SUCCESS [01:02 min]
[INFO] Apache Griffin :: Measures ......................... SUCCESS [01:39 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:08 min (Wall Clock)
[INFO] Finished at: 2021-12-26T12:04:02+08:00
[INFO] ------------------------------------------------------------------------

Deploy the jars

# Environment variables were configured above
[root@bigdata101 opt]# mkdir griffin-0.5.0
# Copy measure-0.5.0.jar into the griffin-0.5.0 directory and rename it
[root@bigdata101 griffin-0.5.0-src]# cp measure/target/measure-0.5.0.jar $GRIFFIN_HOME/griffin-measure.jar
# Copy service-0.5.0.jar into the griffin-0.5.0 directory and rename it
[root@bigdata101 griffin-0.5.0-src]# cp service/target/service-0.5.0.jar $GRIFFIN_HOME/griffin-service.jar
[root@bigdata101 griffin-0.5.0]# ls
griffin-measure.jar  griffin-service.jar
[root@bigdata101 griffin-0.5.0]# hadoop fs -mkdir -p /griffin/persist
[root@bigdata101 griffin-0.5.0]# hadoop fs -mkdir -p /griffin/checkpoint
# Upload griffin-measure.jar to hdfs
[root@bigdata101 griffin-0.5.0]# hadoop fs -put $GRIFFIN_HOME/griffin-measure.jar /griffin/
[root@bigdata101 griffin-0.5.0]# hadoop fs -ls /griffin
Found 3 items
drwxr-xr-x   - root supergroup          0 2021-12-26 12:27 /griffin/checkpoint
-rw-r--r--   3 root supergroup   46538384 2021-12-26 12:27 /griffin/griffin-measure.jar
drwxr-xr-x   - root supergroup          0 2021-12-26 12:27 /griffin/persist
# Start the service
[root@bigdata101 griffin-0.5.0]# nohup java -jar $GRIFFIN_HOME/griffin-service.jar >$GRIFFIN_HOME/service.out 2>&1 &

[root@bigdata101 griffin-0.5.0]# jps
2098 DataNode
2996 Master
3748 RunJar
2646 -- process information unavailable
2374 NodeManager
5206 griffin-service.jar
5271 Jps
3064 Worker
1994 NameNode
2555 LivyServer
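If griffin-service.jar does not show up in jps, the startup log usually explains why; assuming the service.out redirect used above:

# Watch the startup log; a successful boot ends with a Spring "Started ..." line
tail -f $GRIFFIN_HOME/service.out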

Open http://bigdata101:8090/ in a browser:

There is no default username or password; just click login.

For an example of using the UI, see here.

3 Batch test

Get the test data: https://griffin.apache.org/data/batch

[root@bigdata101 griffin-0.5.0]# mkdir data/

wget https://griffin.apache.org/data/batch/create-table.hql
wget https://griffin.apache.org/data/batch/delta_src
wget https://griffin.apache.org/data/batch/delta_tgt
wget https://griffin.apache.org/data/batch/demo_basic
wget https://griffin.apache.org/data/batch/demo_src
wget https://griffin.apache.org/data/batch/demo_tgt
wget https://griffin.apache.org/data/batch/gen-hive-data.sh
wget https://griffin.apache.org/data/batch/gen_delta_src.sh
wget https://griffin.apache.org/data/batch/gen_demo_data.sh
wget https://griffin.apache.org/data/batch/insert-data.hql.template

[root@bigdata101 data]# ls
create-table.hql  delta_tgt   demo_src  gen_delta_src.sh  gen-hive-data.sh
delta_src         demo_basic  demo_tgt  gen_demo_data.sh  insert-data.hql.template
[root@bigdata101 data]# chmod 777 *
# Edit the LOCATION paths in create-table.hql
[root@bigdata101 data]# vi create-table.hql
...
LOCATION
'hdfs://bigdata101:9000/griffin/data/batch/demo_src';
...
LOCATION
'hdfs://bigdata101:9000/griffin/data/batch/demo_tgt';
...

# Generate the test data
[root@bigdata101 data]# ./gen-hive-data.sh

hive> select * from demo_src limit 5;
OK
0       1       1       20211226        13
0       2       2       20211226        13
0       3       3       20211226        13
0       4       4       20211226        13
0       5       5       20211226        13
hive> desc demo_src;
OK
id                      bigint
age                     int
desc                    string
dt                      string
hour                    string
# Partition Information
# col_name              data_type       comment
dt                      string
hour                    string
hive> select * from demo_tgt limit 5;
OK
0       1       1       20211226        13
0       2       2       20211226        13
0       3       3       20211226        13
0       4       4       20211226        13
0       5       5       20211226        13
hive> desc demo_tgt;
OK
id                      bigint
age                     int
desc                    string
dt                      string
hour                    string
# Partition Information
# col_name              data_type       comment
dt                      string
hour                    string
Time taken: 0.468 seconds, Fetched: 11 row(s)

Accuracy Measure

Create the rule

[root@bigdata101 griffin-0.5.0]# mkdir measures
[root@bigdata101 measures]# vi dq.json
{
    "name": "batch_accu",
    "process.type": "batch",
    "data.sources": [
        {
            "name": "src",
            "baseline": true,
            "connectors": [
                {
                    "type": "hive",
                    "version": "2.3",
                    "config": {
                        "database": "default",
                        "table.name": "demo_src"
                    }
                }
            ]
        },
        {
            "name": "tgt",
            "connectors": [
                {
                    "type": "hive",
                    "version": "2.3",
                    "config": {
                        "database": "default",
                        "table.name": "demo_tgt"
                    }
                }
            ]
        }
    ],
    "evaluate.rule": {
        "rules": [
            {
                "dsl.type": "griffin-dsl",
                "dq.type": "accuracy",
                "out.dataframe.name": "accu",
                "rule": "src.id = tgt.id AND src.age = tgt.age AND src.desc = tgt.desc",
                "details": {
                    "source": "src",
                    "target": "tgt",
                    "miss": "miss_count",
                    "total": "total_count",
                    "matched": "matched_count"
                },
                "out": [
                    {
                        "type": "metric",
                        "name": "accu"
                    },
                    {
                        "type": "record",
                        "name": "missRecords"
                    }
                ]
            }
        ]
    },
    "sinks": ["CONSOLE", "HDFS"]
}

Specify where results are stored

[root@bigdata101 measures]# vi env.json
{
    "spark": {
        "log.level": "WARN"
    },
    "sinks": [
        {
            "type": "CONSOLE",
            "config": {
                "max.log.lines": 10
            }
        },
        {
            "type": "HDFS",
            "config": {
                "path": "hdfs://bigdata101:9000/griffin/persist",
                "max.persist.lines": 10000,
                "max.lines.per.file": 10000
            }
        },
        {
            "type": "ELASTICSEARCH",
            "config": {
                "method": "post",
                "api": "http://bigdata101:9200/griffin/accuracy",
                "connection.timeout": "1m",
                "retry": 10
            }
        }
    ],
    "griffin.checkpoint": []
}

Submit the job

[root@bigdata101 griffin-0.5.0]# ls
data griffin-measure.jar griffin-service.jar measures service.out
[root@bigdata101 griffin-0.5.0]# ls measures/
dq.json  env.json
[root@bigdata101 griffin-0.5.0]# spark-submit --class org.apache.griffin.measure.Application --master yarn --deploy-mode client --queue default --driver-memory 1g --executor-memory 1g --num-executors 2 griffin-measure.jar measures/env.json measures/dq.json
......
data source timeRanges: src -> (1640586475986, 1640586475986], tgt -> (1640586475986, 1640586475986]
[1640586475986] batch_accu start: application_1640586390189_0001
batch_accu [1640586475986] metrics:
{"name":"batch_accu","tmst":1640586475986,"value":{"total_count":375000,"miss_count":1475,"matched_count":373525,"matchedFraction":0.9960666666666667},"applicationId":"application_1640586390189_0001"}
[1640586475986] 1640586626794: process using time: 150808 ms
[1640586475986] batch_accu finish

In the metrics line, matchedFraction = matched_count / total_count = 373525 / 375000 ≈ 0.9961. Check the /griffin/persist/ directory on hdfs:

[root@bigdata101 opt]# hadoop fs -ls /griffin/persist/batch_accu/1640586475986
Found 5 items
-rw-r--r-- 1 root supergroup 0 2021-12-27 14:30 /griffin/persist/batch_accu/1640586475986/_FINISH
-rw-r--r-- 1 root supergroup 138 2021-12-27 14:30 /griffin/persist/batch_accu/1640586475986/_LOG
-rw-r--r-- 1 root supergroup 200 2021-12-27 14:30 /griffin/persist/batch_accu/1640586475986/_METRICS
-rw-r--r-- 1 root supergroup 30 2021-12-27 14:28 /griffin/persist/batch_accu/1640586475986/_START
-rw-r--r--   1 root supergroup     128324 2021-12-27 14:30 /griffin/persist/batch_accu/1640586475986/missRecords

[root@bigdata101 opt]# hadoop fs -cat /griffin/persist/batch_accu/1640586475986/_START
application_1640586390189_0001

[root@bigdata101 opt]# hadoop fs -cat /griffin/persist/batch_accu/1640586475986/_LOG
================ log of Mon Dec 27 14:27:55 CST 2021 ================
--- Mon Dec 27 14:30:26 CST 2021 ---
process using time: 150808 ms

[root@bigdata101 opt]# hadoop fs -cat /griffin/persist/batch_accu/1640586475986/_METRICS
{"name":"batch_accu","tmst":1640586475986,"value":{"total_count":375000,"miss_count":1475,"matched_count":373525,"matchedFraction":0.9960666666666667},"applicationId":"application_1640586390189_0001"}

[root@bigdata101 opt]# hadoop fs -cat /griffin/persist/batch_accu/1640586475986/missRecords
....
{"id":124,"age":1628,"desc":"1628","dt":"20211227","hour":"13","__tmst":1640586475986}
{"id":124,"age":1628,"desc":"1628","dt":"20211227","hour":"14","__tmst":1640586475986}
{"id":124,"age":1479,"desc":"1479","dt":"20211227","hour":"15","__tmst":1640586475986}
{"id":124,"age":1479,"desc":"1479","dt":"20211227","hour":"15","__tmst":1640586475986}
{"id":124,"age":1555,"desc":"1555","dt":"20211227","hour":"14","__tmst":1640586475986}
{"id":124,"age":1555,"desc":"1555","dt":"20211227","hour":"13","__tmst":1640586475986}

Profiling Measure

Profile the demo_src table: total row count, maximum age, and maximum desc length.

In src.id.count(), src refers to the data source defined with "name": "src", not to the Hive table name.

Create the rule

[root@bigdata101 measures]# vi dq2.json
{
    "name": "batch_Prof",
    "process.type": "batch",
    "data.sources": [
        {
            "name": "src",
            "baseline": true,
            "connectors": [
                {
                    "type": "hive",
                    "version": "2.3",
                    "config": {
                        "database": "default",
                        "table.name": "demo_src"
                    }
                }
            ]
        }
    ],
    "evaluate.rule": {
        "rules": [
            {
                "dsl.type": "griffin-dsl",
                "dq.type": "profiling",
                "out.dataframe.name": "prof",
                "rule": "src.id.count() as id_count,src.age.max() as age_max,src.desc.length().max() as desc_length_max",
                "out": [
                    {
                        "type": "metric",
                        "name": "prof"
                    }
                ]
            }
        ]
    },
    "sinks": ["CONSOLE", "HDFS"]
}

Submit the job using the env.json from test 1:

[root@bigdata101 griffin-0.5.0]# spark-submit --class org.apache.griffin.measure.Application --master yarn --deploy-mode client --queue default --driver-memory 1g --executor-memory 1g --num-executors 2 griffin-measure.jar measures/env.json measures/dq2.json
...
{"name":"batch_prof","tmst":1640589832672,"value":{"id_count":375000,"age_max":2000,"desc_length_max":4},"applicationId":"application_1640586390189_0004"}

Writing Rules

The rule field can also contain plain SQL, e.g. to output the demo_src table ordered by age descending.

Note that the query selects from the data source name (src), not from the Hive table name.

Create the rule

[root@bigdata101 measures]# vi dq3.json
{
    "name": "batch_prof",
    "process.type": "batch",
    "data.sources": [
        {
            "name": "src",
            "baseline": true,
            "connectors": [
                {
                    "type": "hive",
                    "version": "2.3",
                    "config": {
                        "database": "default",
                        "table.name": "demo_src"
                    }
                }
            ]
        }
    ],
    "evaluate.rule": {
        "rules": [
            {
                "dsl.type": "griffin-dsl",
                "dq.type": "profiling",
                "out.dataframe.name": "prof",
                "rule": "select age from src order by age desc limit 5",
                "out": [
                    {
                        "type": "metric",
                        "name": "prof"
                    }
                ]
            }
        ]
    },
    "sinks": ["CONSOLE", "HDFS"]
}

Submit the job using the env.json from test 1:

[root@bigdata101 griffin-0.5.0]# spark-submit --class org.apache.griffin.measure.Application --master yarn --deploy-mode client --queue default --driver-memory 1g --executor-memory 1g --num-executors 2 griffin-measure.jar measures/env.json measures/dq3.json
...
{"name":"batch_prof","tmst":1640590110604,"value":{"prof":[{"age":2000},{"age":2000},{"age":2000},{"age":1999},{"age":1999}]},"applicationId":"application_1640586390189_0005"}

4 Streaming test [has issues, unresolved]

Get the test data: https://griffin.apache.org/data/streaming/

wget https://griffin.apache.org/data/streaming/gen-data.sh --no-check-certificate
wget https://griffin.apache.org/data/streaming/source.temp --no-check-certificate
wget https://griffin.apache.org/data/streaming/streaming-data.sh --no-check-certificate
wget https://griffin.apache.org/data/streaming/target.temp --no-check-certificate

[root@bigdata101 data]# chmod 777 *

# Review streaming-data.sh and gen-data.sh first and adjust them to your environment.
# Executing ./streaming-data.sh generates JSON string data files and produces them into the kafka topics every minute.
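streaming-data.sh expects the source and target topics to exist. A sketch of creating them, assuming an older Kafka whose tools still take --zookeeper (newer releases use --bootstrap-server) and hypothetical partition/replication settings:

kafka-topics.sh --create --zookeeper bigdata101:2181 --partitions 1 --replication-factor 1 --topic source
kafka-topics.sh --create --zookeeper bigdata101:2181 --partitions 1 --replication-factor 1 --topic target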

Create the rule

[root@bigdata101 measures]# vi dq4.json
{
    "name": "streaming_accu",
    "process.type": "streaming",
    "data.sources": [
        {
            "name": "src",
            "baseline": true,
            "connectors": [
                {
                    "type": "kafka",
                    "version": "2.3",
                    "config": {
                        "kafka.config": {
                            "bootstrap.servers": "bigdata101:9092,bigdata102:9092,bigdata103:9092",
                            "group.id": "griffin",
                            "auto.offset.reset": "largest",
                            "auto.commit.enable": "false"
                        },
                        "topics": "source",
                        "key.type": "java.lang.String",
                        "value.type": "java.lang.String"
                    },
                    "pre.proc": [
                        {
                            "dsl.type": "df-opr",
                            "rule": "from_json"
                        }
                    ]
                }
            ],
            "checkpoint": {
                "type": "json",
                "file.path": "hdfs://bigdata101:9000/griffin/streaming/dump/source",
                "info.path": "source",
                "ready.time.interval": "10s",
                "ready.time.delay": "0",
                "time.range": ["-5m", "0"],
                "updatable": true
            }
        },
        {
            "name": "tgt",
            "connectors": [
                {
                    "type": "kafka",
                    "version": "2.3",
                    "config": {
                        "kafka.config": {
                            "bootstrap.servers": "bigdata101:9092,bigdata102:9092,bigdata103:9092",
                            "group.id": "griffin",
                            "auto.offset.reset": "largest",
                            "auto.commit.enable": "false"
                        },
                        "topics": "target",
                        "key.type": "java.lang.String",
                        "value.type": "java.lang.String"
                    },
                    "pre.proc": [
                        {
                            "dsl.type": "df-opr",
                            "rule": "from_json"
                        }
                    ]
                }
            ],
            "checkpoint": {
                "type": "json",
                "file.path": "hdfs://bigdata101:9000/griffin/streaming/dump/target",
                "info.path": "target",
                "ready.time.interval": "10s",
                "ready.time.delay": "0",
                "time.range": ["-1m", "0"]
            }
        }
    ],
    "evaluate.rule": {
        "rules": [
            {
                "dsl.type": "griffin-dsl",
                "dq.type": "accuracy",
                "out.dataframe.name": "accu",
                "rule": "src.id = tgt.id AND src.name = tgt.name AND src.color = tgt.color AND src.time = tgt.time",
                "details": {
                    "source": "src",
                    "target": "tgt",
                    "miss": "miss_count",
                    "total": "total_count",
                    "matched": "matched_count"
                },
                "out": [
                    {
                        "type": "metric",
                        "name": "accu"
                    },
                    {
                        "type": "record",
                        "name": "missRecords"
                    }
                ]
            }
        ]
    },
    "sinks": ["CONSOLE", "HDFS"]
}

env.json

[root@bigdata101 measures]# vi env.json
{
    "spark": {
        "log.level": "WARN",
        "checkpoint.dir": "hdfs://bigdata101:9000/griffin/checkpoint",
        "batch.interval": "20s",
        "process.interval": "1m",
        "init.clear": true,
        "config": {
            "spark.default.parallelism": 4,
            "spark.task.maxFailures": 5,
            "spark.streaming.kafkaMaxRatePerPartition": 1000,
            "spark.streaming.concurrentJobs": 4,
            "spark.yarn.maxAppAttempts": 5,
            "spark.yarn.am.attemptFailuresValidityInterval": "1h",
            "spark.yarn.max.executor.failures": 120,
            "spark.yarn.executor.failuresValidityInterval": "1h",
            "spark.hadoop.fs.hdfs.impl.disable.cache": true
        }
    },
    "sinks": [
        {
            "type": "console"
        },
        {
            "type": "hdfs",
            "config": {
                "path": "hdfs://bigdata101:9000/griffin/persist"
            }
        },
        {
            "type": "elasticsearch",
            "config": {
                "method": "post",
                "api": "http://bigdata101:9200/griffin/accuracy"
            }
        }
    ],
    "griffin.checkpoint": [
        {
            "type": "zk",
            "config": {
                "hosts": "bigdata101:2181,bigdata102:2182,bigdata103:2181",
                "namespace": "griffin/infocache",
                "lock.path": "lock",
                "mode": "persist",
                "init.clear": true,
                "close.clear": false
            }
        }
    ]
}

Start Kafka, then run ./streaming-data.sh to produce data.

[root@bigdata101 data]# ./streaming-data.sh
>>>>>>>>>>>>>>>>>>>>>>insert data at 2021-12-27_16:15:54
>>>>>>>>>>>>>>>>>>>>>>insert data at 2021-12-27_16:16:57
....

Submit the job [possibly killed for lack of memory; unverified]

[root@bigdata101 griffin-0.5.0]# spark-submit --class org.apache.griffin.measure.Application --master yarn --deploy-mode client --queue default --driver-memory 1g --executor-memory 1g --num-executors 2 griffin-measure.jar measures/env.json measures/dq4.json
....
21/12/27 21:38:41 ERROR offset.OffsetCheckpointInZK: delete /lock error: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/lock
21/12/27 21:38:41 ERROR offset.OffsetCheckpointInZK: delete /info.final error: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info.final
21/12/27 21:38:41 ERROR offset.OffsetCheckpointInZK: delete /info error: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info
21/12/27 21:38:49 WARN streaming.Checkpoint: Checkpoint directory hdfs://bigdata101:9000/griffin/checkpoint does not exist
21/12/27 21:39:10 WARN offset.OffsetCheckpointInZK: read /info.final/last.proc.time warn: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info.final/last.proc.time
21/12/27 21:39:10 WARN offset.OffsetCheckpointInZK: read /info.final/ready.time warn: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info.final/ready.time
21/12/27 21:39:14 WARN offset.OffsetCheckpointInZK: read /info/source/old.cache.index warn: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info/source/old.cache.index
21/12/27 21:39:14 WARN offset.OffsetCheckpointInZK: read /info.final/last.proc.time warn: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info.final/last.proc.time
21/12/27 21:39:14 WARN offset.OffsetCheckpointInZK: read /info.final/ready.time warn: KeeperErrorCode = NoNode for /griffin/infocache/streaming_accu/info.final/ready.time
data source timeRanges: src -> (-300001, -1], tgt -> (-60001, -1]
[1640612329475] streaming_accu start: application_1640606908996_0015
Killed
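"Killed" with no stack trace usually means the OS or YARN terminated the driver, which fits the memory suspicion. One way to dig further, using the application id printed above:

# Pull the YARN logs for the killed run and scan for memory-related messages
yarn logs -applicationId application_1640606908996_0015 | grep -iE 'error|kill|memor'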

5 Problems

Problem 1

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Griffin 0.6.0 0.6.0:
[INFO]
[INFO] Apache Griffin 0.6.0 ............................... SUCCESS [ 6.164 s]
[INFO] Apache Griffin :: UI :: Default UI ................. FAILURE [06:54 min]
[INFO] Apache Griffin :: Web Service ...................... SUCCESS [02:10 min]
[INFO] Apache Griffin :: Measures ......................... SUCCESS [02:44 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 07:01 min (Wall Clock)
[INFO] Finished at: 2021-12-26T11:55:52+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.6:npm (npm install) on project ui: Failed to run task: 'npm install' failed. org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1) -> [Help 1]

Modify the following content in ui/pom.xml.

Original:

<!-- It will install nodejs and npm -->
<execution>
    <id>install node and npm</id>
    <goals>
        <goal>install-node-and-npm</goal>
    </goals>
    <configuration>
        <nodeVersion>${node.version}</nodeVersion>
        <npmVersion>${npm.version}</npmVersion>
    </configuration>
</execution>

Modified:

<!-- It will install nodejs and npm -->
<execution>
    <id>install node and npm</id>
    <goals>
        <goal>install-node-and-npm</goal>
    </goals>
    <configuration>
        <nodeVersion>v10.16.3</nodeVersion>
        <npmVersion>6.11.3</npmVersion>
        <nodeDownloadRoot>http://nodejs.org/dist/</nodeDownloadRoot>
        <!-- <npmDownloadRoot>http://registry.npmjs.org/npm/</npmDownloadRoot> -->
        <npmDownloadRoot>http://registry.npm.taobao.org/npm/</npmDownloadRoot>
    </configuration>
</execution>

If the following error appears, delete the <nodeDownloadRoot> and <npmDownloadRoot> lines.

[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.6:install-node-and-npm (install node and npm) on project ui: Could not download npm: Got error code 404 from the server. -> [Help 1]

Problem 2

[ERROR]
[ERROR] ERROR in /opt/soft/griffin-0.5.0/ui/angular/node_modules/@types/jquery/JQuery.d.ts (4137,26): Cannot find name 'SVGElementTagNameMap'.
[ERROR] ERROR in /opt/soft/griffin-0.5.0/ui/angular/node_modules/@types/jquery/JQuery.d.ts (4137,89): Cannot find name 'SVGElementTagNameMap'.
[ERROR]
[ERROR] npm ERR! Linux 3.10.0-514.el7.x86_64
[ERROR] npm ERR! argv "/opt/soft/griffin-0.5.0/ui/.tmp/node/node" "/opt/soft/griffin-0.5.0/ui/.tmp/node/node_modules/npm/bin/npm-cli.js" "run" "build "
[ERROR] npm ERR! node v6.11.3
[ERROR] npm ERR! npm v3.10.10
[ERROR] npm ERR! code ELIFECYCLE
[ERROR] npm ERR! griffin@0.0.0 build: `ng build`
[ERROR] npm ERR! Exit status 1
[ERROR] npm ERR!
[ERROR] npm ERR! Failed at the griffin@0.0.0 build script 'ng build'.
[ERROR] npm ERR! Make sure you have the latest version of node.js and npm installed.
[ERROR] npm ERR! If you do, this is most likely a problem with the griffin package,
[ERROR] npm ERR! not with npm itself.
[ERROR] npm ERR! Tell the author that this fails on your system:
[ERROR] npm ERR! ng build
[ERROR] npm ERR! You can get information on how to open an issue for this project with:
[ERROR] npm ERR! npm bugs griffin
[ERROR] npm ERR! Or if that isn't available, you can get their info via:
[ERROR] npm ERR! npm owner ls griffin
[ERROR] npm ERR! There is likely additional logging output above.
[ERROR]
[ERROR] npm ERR! Please include the following file with any support request:
[ERROR] npm ERR! /opt/soft/griffin-0.5.0/ui/angular/npm-debug.log
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Griffin 0.5.0 0.5.0:
[INFO]
[INFO] Apache Griffin 0.5.0 ............................... SUCCESS [ 16.497 s]
[INFO] Apache Griffin :: UI :: Default UI ................. FAILURE [24:12 min]
[INFO] Apache Griffin :: Web Service ...................... SKIPPED
[INFO] Apache Griffin :: Measures ......................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24:28 min
[INFO] Finished at: 2020-06-30T15:04:04+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.6:npm (npm build) on project ui: Failed to run task: 'npm run build' failed. org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command

vi /opt/griffin-0.5.0-source/ui/angular/node_modules/@types/jquery/JQuery.d.ts

Delete line 4137:

find<K extends keyof SVGElementTagNameMap>(selector_element: K | JQuery<K>): JQuery<SVGElementTagNameMap[K]>;

Problem 3

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Griffin 0.5.0 0.5.0:
[INFO]
[INFO] Apache Griffin 0.5.0 ............................... SUCCESS [ 9.715 s]
[INFO] Apache Griffin :: UI :: Default UI ................. SKIPPED
[INFO] Apache Griffin :: Web Service ...................... FAILURE [02:55 min]
[INFO] Apache Griffin :: Measures ......................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:08 min (Wall Clock)
[INFO] Finished at: 2019-05-27T14:27:07+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project service: Could not resolve dependencies for project org.apache.griffin:service:jar:0.5.0: Could not find artifact io.confluent:kafka-schema-registry-client:jar:3.2.0 in nexus-aliyun (http://maven.aliyun.com/nexus/content/groups/public) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :service

Download kafka-schema-registry-client and install it into the local Maven repository manually:

mvn install:install-file -DgroupId=io.confluent -DartifactId=kafka-schema-registry-client -Dversion=3.2.0 -Dpackaging=jar -Dfile=kafka-schema-registry-client-3.2.0.jar

Problem 4 [unresolved]

griffin-0.6.0 builds successfully, but running nohup java -jar $GRIFFIN_HOME/griffin-service.jar >$GRIFFIN_HOME/service.out 2>&1 & produces the following errors in turn:

no main manifest attribute, in /opt/griffin-0.6.0/griffin-service.jar

Adding the Spring Boot Maven build plugin to the pom and re-running mvn install may fix this. (Untested; from here.)

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: javax/servlet/Filter
Caused by: java.lang.ClassNotFoundException: javax.servlet.Filter

Version 0.5.0 does not exhibit this problem.

Problem 5

When submitting the test spark job, the following error appears:

NoSuchMethodError: org.apache.spark.SparkConf.setAll(Lscala/collection/Traversable;)Lorg/apache/spark/SparkConf

The installed Spark version does not match the version Griffin was built against; check measure/pom.xml and install the matching Spark version.
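A quick comparison (the spark.version property name is an assumption; check the pom for the exact tag):

grep 'spark.version' measure/pom.xml
spark-submit --version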

Problem 6

When submitting the test spark job, the following error appears:

ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found

hadoop-lzo-0.4.21-SNAPSHOT.jar 文件复制到 spark 的 conf/ 目录中。

Problem 7

During the streaming test, the exception java.io.CharConversionException: Invalid UTF-32 character 0x8000000(above 10ffff) at char #1, byte #7 is caused by Spark not having LZO support configured; apply the fix from Problem 6.


References:

A summary of many common errors: https://blog.csdn.net/weixin_43160819/article/details/118298681

https://blog.csdn.net/weixin_40004348/article/details/107191430

https://blog.csdn.net/github_39577257/article/details/90607081

尚硅谷 Griffin tutorial

https://blog.csdn.net/guanghuotainkong/article/details/112168871
