1. Query the table schema

show create table data_lake_ods.test

CREATE TABLE spark_catalog.data_lake_ods.test (
  `user_number` BIGINT NOT NULL,
  `subclazz_number` BIGINT NOT NULL,
  `clazz_number` BIGINT,
  `clazz_lesson_number` BIGINT NOT NULL,
  `lesson_live_property` BIGINT,
  `lesson_video_property` BIGINT,
  `lesson_live_length` BIGINT,
  `lesson_video_length` BIGINT,
  `lesson_standard_length` BIGINT,
  `lesson_length` BIGINT,
  `live_learn_duration` BIGINT,
  `video_learn_duration` BIGINT,
  `learn_duration` BIGINT,
  `is_valid_live_learn` BIGINT,
  `is_valid_learn` BIGINT,
  `companion_learn_duration` BIGINT,
  `learn_combine_duration` BIGINT,
  `companion_lesson_length` BIGINT,
  `is_should_attend_user` BIGINT,
  `is_live_attend_user` BIGINT,
  `is_combine_valid_learn_user` BIGINT,
  `is_black_user` BIGINT)
USING iceberg
LOCATION '/user/hive/warehouse/data_lake_ods.db/test'
TBLPROPERTIES (
  'catalog-database' = 'data_lake_ods',
  'catalog-name' = 'spark_catalog',
  'catalog-table' = 'test',
  'catalog-type' = 'hive',
  'connector' = 'iceberg',
  'current-snapshot-id' = '5677214384524195741',
  'format' = 'iceberg/parquet',
  'format-version' = '2',
  'identifier-fields' = '[clazz_lesson_number,subclazz_number,user_number]',
  'table.drop.base-path.enabled' = 'true',
  'uri' = 'thrift://127.0.0.1:7004,******',
  'write-parallelism' = '16',
  'write.distribution-mode' = 'hash',
  'write.merge.mode' = 'merge-on-read',
  'write.metadata.delete-after-commit.enabled' = 'true',
  'write.metadata.metrics.default' = 'full',
  'write.metadata.previous-versions-max' = '100',
  'write.update.mode' = 'merge-on-read',
  'write.upsert.enabled' = 'true')
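
Before walking the files by hand, note that Iceberg also exposes this information through metadata tables in Spark. A minimal sketch, assuming Spark 3 with the Iceberg runtime configured:

select snapshot_id, committed_at, operation from spark_catalog.data_lake_ods.test.snapshots;

The steps below recover the same answer directly from the files on HDFS, which helps when no Spark session is available or when the metadata itself needs debugging.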

2. Using the LOCATION path from the DDL, list the metadata directory to find the latest metadata.json file

hdfs dfs -ls /user/hive/warehouse/data_lake_ods.db/test/metadata/
The latest file is: /user/hive/warehouse/data_lake_ods.db/test/metadata/00059-1d8e6694-f873-45a3-9697-ce2762f78c3a.metadata.json
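
Because the metadata files carry a zero-padded version prefix (00059-...), the newest one sorts last lexicographically. A small sketch for picking it out automatically, assuming a standard HDFS client:

hdfs dfs -ls /user/hive/warehouse/data_lake_ods.db/test/metadata/ | awk '{print $NF}' | grep '\.metadata\.json$' | sort | tail -1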

3. View the file contents

hdfs dfs -cat /user/hive/warehouse/data_lake_ods.db/test/metadata/00059-1d8e6694-f873-45a3-9697-ce2762f78c3a.metadata.json

4. Find the current snapshot id in the JSON; it is the same id recorded in the table properties above:

"current-snapshot-id" : 5677214384524195741,

The matching entry in the "snapshots" array:

{
  "sequence-number" : 59,
  "snapshot-id" : 5677214384524195741,
  "parent-snapshot-id" : 922737561337808536,
  "timestamp-ms" : 1701683779323,
  "summary" : {
    "operation" : "overwrite",
    "flink.operator-id" : "e883208d19e3c34f8aaf2a3168a63337",
    "flink.job-id" : "000000001de734af0000000000000000",
    "flink.max-committed-checkpoint-id" : "59",
    "added-data-files" : "16",
    "added-equality-delete-files" : "16",
    "added-position-delete-files" : "16",
    "added-delete-files" : "32",
    "added-records" : "611418",
    "added-files-size" : "19072474",
    "added-position-deletes" : "245062",
    "added-equality-deletes" : "366356",
    "changed-partition-count" : "1",
    "total-records" : "1084028212",
    "total-files-size" : "41957333041",
    "total-data-files" : "944",
    "total-delete-files" : "1680",
    "total-position-deletes" : "10749681",
    "total-equality-deletes" : "1073278531"
  },
  "manifest-list" : "/user/hive/warehouse/data_lake_ods.db/test/metadata/snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro",
  "schema-id" : 0
}
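
If jq is installed locally, the current snapshot entry can be extracted in one step instead of scanning the JSON by eye; a sketch:

hdfs dfs -cat /user/hive/warehouse/data_lake_ods.db/test/metadata/00059-1d8e6694-f873-45a3-9697-ce2762f78c3a.metadata.json | jq '."current-snapshot-id" as $id | .snapshots[] | select(."snapshot-id" == $id)'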

5. Take the manifest-list path from the snapshot entry and download the file locally for analysis

hadoop fs -get /user/hive/warehouse/data_lake_ods.db/test/metadata/snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro ~/test/iceberg

6. View the Avro file contents

java -jar ./avro-tools-1.8.1.jar tojson ./snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro | grep '5677214384524195741'

The output, filtered to the entries added by the current snapshot, is as follows. Each entry is one manifest; content 0 marks a data manifest and content 1 marks a delete manifest:
{"manifest_path":"/user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro","manifest_length":12233,"partition_spec_id":0,"content":0,"sequence_number":59,"min_sequence_number":59,"added_snapshot_id":567721438452419574,"added_data_files_count":16,"existing_data_files_count":0,"deleted_data_files_count":0,"added_rows_count":611418,"existing_rows_count":0,"deleted_rows_count":0,"partitions":{"array":[]}}
{"manifest_path":"/user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m1.avro","manifest_length":10241,"partition_spec_id":0,"content":1,"sequence_number":59,"min_sequence_number":59,"added_snapshot_id":567721438452419574,"added_data_files_count":32,"existing_data_files_count":0,"deleted_data_files_count":0,"added_rows_count":611418,"existing_rows_count":0,"deleted_rows_count":0,"partitions":{"array":[]}}

7. Inspect one of the manifest files

hadoop fs -get /user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro ~/test/iceberg
java -jar ./avro-tools-1.8.1.jar tojson ./644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro | less
Sample record. The status field marks this entry as added (1) in the current snapshot, and lower_bounds/upper_bounds hold binary-encoded per-column min/max statistics, which is why they render as escaped bytes:
{"status":1,"snapshot_id":{"long":5677214384524195741},"sequence_number":null,"file_sequence_number":null,"data_file":{"content":0,"file_path":"/user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet","file_format":"PARQUET","partition":{},"record_count":38138,"file_size_in_bytes":899517,"column_sizes":{"array":[{"key":1,"value":178768},{"key":2,"value":90942},{"key":3,"value":68674},{"key":4,"value":95511},{"key":5,"value":9520},{"key":6,"value":6952},{"key":7,"value":47567},{"key":8,"value":42993},{"key":9,"value":29052},{"key":10,"value":39705},{"key":11,"value":59926},{"key":12,"value":40560},{"key":13,"value":67603},{"key":14,"value":8252},{"key":15,"value":8365},{"key":16,"value":7477},{"key":17,"value":67708},{"key":18,"value":6603},{"key":19,"value":5850},{"key":20,"value":5150},{"key":21,"value":5096},{"key":22,"value":246}]},"value_counts":{"array":[{"key":1,"value":38138},{"key":2,"value":38138},{"key":3,"value":38138},{"key":4,"value":38138},{"key":5,"value":38138},{"key":6,"value":38138},{"key":7,"value":38138},{"key":8,"value":38138},{"key":9,"value":38138},{"key":10,"value":38138},{"key":11,"value":38138},{"key":12,"value":38138},{"key":13,"value":38138},{"key":14,"value":38138},{"key":15,"value":38138},{"key":16,"value":38138},{"key":17,"value":38138},{"key":18,"value":38138},{"key":19,"value":38138},{"key":20,"value":38138},{"key":21,"value":38138},{"key":22,"value":38138}]},"null_value_counts":{"array":[{"key":1,"value":0},{"key":2,"value":0},{"key":3,"value":0},{"key":4,"value":0},{"key":5,"value":11091},{"key":6,"value":11092},{"key":7,"value":11091},{"key":8,"value":11092},{"key":9,"value":11091},{"key":10,"value":11091},{"key":11,"value":11091},{"key":12,"value":11091},{"key":13,"value":11091},{"key":14,"value":11091},{"key":15,"value":11091},{"key":16,"value":11091},{"key":17,"value":11091},{"key":18,"value":11091},{"key":19,"value":0},{"key":20,"value":0},{"key":21,"value":0},{"key":22,"value":0}]},"nan_value_counts":{"array":[]},"lower_bounds":{"array":[{"key":1,"value":"'\u0006\u0002\u0000\u0000\u0000\u0000\u0000"},{"key":2,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":3,"value":"\u0000\u0000<U+008D><Ú\u000B\t\u0000"},{"key":4,"value":",\u0000ß<Ú\u000B\t\u0000"},{"key":5,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":6,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":7,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":8,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":9,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":10,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":11,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":12,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":13,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":14,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":15,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":16,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":17,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":18,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":19,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":20,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":21,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":22,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"}]}
,"upper_bounds":{"array":[{"key":1,"value":"C<U+0095>\u0014\u0006<U+008E>å\u0000\u0000"},{"key":2,"value":"<U+0080>\u0004 <U+0084>´ÇW\u0000"},{"key":3,"value":"\u0000à`t_º{\u0005"},{"key":4,"value":"\u0000°ÃZ\"<U+0097>|\u0005"},{"key":5,"value":"\u0003\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":6,"value":"\u0003\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":7,"value":"<U+0081>L\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":8,"value":"zL\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":9,"value":"`T\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":10,"value":"$4\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":11,"value":"18\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":12,"value":"£8\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":13,"value":"~:\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":14,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":15,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":16,"value":"<U+0091>\u001A\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":17,"value":"~:\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":18,"value":"B\u001B\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":19,"value":"\u0002\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":20,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":21,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":22,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"}]},"key_metadata":null,"split_offsets":{"array":[4]},"equality_ids":null,"sort_order_id":{"int":0}}}

8. Inspect the data file at the file_path obtained from the manifest entry in step 7

(1) View the first N rows

hadoop jar ./parquet-tools-1.11.0.jar head -n 10 /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet

Sample output:

user_number = 19100811
subclazz_number = 22388336842768640
clazz_number = 10995946454325888
clazz_lesson_number = 10995946455505548
lesson_live_property = 0
lesson_video_property = 0
lesson_live_length = 6091
lesson_video_length = 0
lesson_standard_length = 7200
lesson_length = 6091
live_learn_duration = 476
video_learn_duration = 0
learn_duration = 476
is_valid_live_learn = 0
is_valid_learn = 0
companion_learn_duration = 0
learn_combine_duration = 476
companion_lesson_length = 0
is_should_attend_user = 0
is_live_attend_user = 1
is_combine_valid_learn_user = 0
is_black_user = 0

(2) View the schema. The trailing "= N" on each field is the Parquet field id, which Iceberg sets to its own field id:

hadoop jar ./parquet-tools-1.11.0.jar schema /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet
message table {
required int64 user_number = 1;
required int64 subclazz_number = 2;
optional int64 clazz_number = 3;
required int64 clazz_lesson_number = 4;
optional int64 lesson_live_property = 5;
optional int64 lesson_video_property = 6;
optional int64 lesson_live_length = 7;
optional int64 lesson_video_length = 8;
optional int64 lesson_standard_length = 9;
optional int64 lesson_length = 10;
optional int64 live_learn_duration = 11;
optional int64 video_learn_duration = 12;
optional int64 learn_duration = 13;
optional int64 is_valid_live_learn = 14;
optional int64 is_valid_learn = 15;
optional int64 companion_learn_duration = 16;
optional int64 learn_combine_duration = 17;
optional int64 companion_lesson_length = 18;
optional int64 is_should_attend_user = 19;
optional int64 is_live_attend_user = 20;
optional int64 is_combine_valid_learn_user = 21;
optional int64 is_black_user = 22;
}

(3) View the file metadata (footer). R and D are the Parquet repetition and definition levels, and the row-group section lists per-column sizes, encodings, and min/max statistics:
hadoop jar /home/hadoop/test/iceberg/parquet-tools-1.11.0.jar meta /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet

file:                        /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet
creator: parquet-mr version 1.13.1 (build db4183109d5b734ec5930d870cdae161e408ddba)
extra: iceberg.schema = {"type":"struct","schema-id":0,"identifier-field-ids":[1,2,4],"fields":[{"id":1,"name":"user_number","required":true,"type":"long"},{"id":2,"name":"subclazz_number","required":true,"type":"long"},{"id":3,"name":"clazz_number","required":false,"type":"long"},{"id":4,"name":"clazz_lesson_number","required":true,"type":"long"},{"id":5,"name":"lesson_live_property","required":false,"type":"long"},{"id":6,"name":"lesson_video_property","required":false,"type":"long"},{"id":7,"name":"lesson_live_length","required":false,"type":"long"},{"id":8,"name":"lesson_video_length","required":false,"type":"long"},{"id":9,"name":"lesson_standard_length","required":false,"type":"long"},{"id":10,"name":"lesson_length","required":false,"type":"long"},{"id":11,"name":"live_learn_duration","required":false,"type":"long"},{"id":12,"name":"video_learn_duration","required":false,"type":"long"},{"id":13,"name":"learn_duration","required":false,"type":"long"},{"id":14,"name":"is_valid_live_learn","required":false,"type":"long"},{"id":15,"name":"is_valid_learn","required":false,"type":"long"},{"id":16,"name":"companion_learn_duration","required":false,"type":"long"},{"id":17,"name":"learn_combine_duration","required":false,"type":"long"},{"id":18,"name":"companion_lesson_length","required":false,"type":"long"},{"id":19,"name":"is_should_attend_user","required":false,"type":"long"},{"id":20,"name":"is_live_attend_user","required":false,"type":"long"},{"id":21,"name":"is_combine_valid_learn_user","required":false,"type":"long"},{"id":22,"name":"is_black_user","required":false,"type":"long"}]}
file schema: table
--------------------------------------------------------------------------------
user_number: REQUIRED INT64 R:0 D:0
subclazz_number: REQUIRED INT64 R:0 D:0
clazz_number: OPTIONAL INT64 R:0 D:1
clazz_lesson_number: REQUIRED INT64 R:0 D:0
lesson_live_property: OPTIONAL INT64 R:0 D:1
lesson_video_property: OPTIONAL INT64 R:0 D:1
lesson_live_length: OPTIONAL INT64 R:0 D:1
lesson_video_length: OPTIONAL INT64 R:0 D:1
lesson_standard_length: OPTIONAL INT64 R:0 D:1
lesson_length: OPTIONAL INT64 R:0 D:1
live_learn_duration: OPTIONAL INT64 R:0 D:1
video_learn_duration: OPTIONAL INT64 R:0 D:1
learn_duration: OPTIONAL INT64 R:0 D:1
is_valid_live_learn: OPTIONAL INT64 R:0 D:1
is_valid_learn: OPTIONAL INT64 R:0 D:1
companion_learn_duration: OPTIONAL INT64 R:0 D:1
learn_combine_duration: OPTIONAL INT64 R:0 D:1
companion_lesson_length: OPTIONAL INT64 R:0 D:1
is_should_attend_user: OPTIONAL INT64 R:0 D:1
is_live_attend_user: OPTIONAL INT64 R:0 D:1
is_combine_valid_learn_user: OPTIONAL INT64 R:0 D:1
is_black_user: OPTIONAL INT64 R:0 D:1

row group 1: RC:38138 TS:1269627 OFFSET:4
--------------------------------------------------------------------------------
user_number: INT64 GZIP DO:4 FPO:112797 SZ:178768/238215/1.33 VC:38138 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 132647, max: 252398150128963, num_nulls: 0]
subclazz_number: INT64 GZIP DO:178772 FPO:216044 SZ:90942/116042/1.28 VC:38138 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 24707901098558592, num_nulls: 0]
clazz_number: INT64 GZIP DO:269714 FPO:287587 SZ:68674/79130/1.15 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 2546306737045504, max: 395114311462215680, num_nulls: 0]
clazz_lesson_number: INT64 GZIP DO:338388 FPO:374110 SZ:95511/115465/1.21 VC:38138 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 2546306742419500, max: 395357041109217280, num_nulls: 0]
lesson_live_property: INT64 GZIP DO:433899 FPO:433948 SZ:9520/11698/1.23 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 3, num_nulls: 11091]
lesson_video_property: INT64 GZIP DO:443419 FPO:443469 SZ:6952/9243/1.33 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 3, num_nulls: 11092]
lesson_live_length: INT64 GZIP DO:450371 FPO:456873 SZ:47567/64261/1.35 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 19585, num_nulls: 11091]
lesson_video_length: INT64 GZIP DO:497938 FPO:505857 SZ:42993/70204/1.63 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 19578, num_nulls: 11092]
lesson_standard_length: INT64 GZIP DO:540931 FPO:543782 SZ:29052/49418/1.70 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 21600, num_nulls: 11091]
lesson_length: INT64 GZIP DO:569983 FPO:575653 SZ:39705/61757/1.56 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 13348, num_nulls: 11091]
live_learn_duration: INT64 GZIP DO:609688 FPO:625930 SZ:59926/99798/1.67 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14385, num_nulls: 11091]
video_learn_duration: INT64 GZIP DO:669614 FPO:680817 SZ:40560/80714/1.99 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14499, num_nulls: 11091]
learn_duration: INT64 GZIP DO:710174 FPO:728644 SZ:67603/106804/1.58 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14974, num_nulls: 11091]
is_valid_live_learn: INT64 GZIP DO:777777 FPO:777819 SZ:8252/8680/1.05 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 11091]
is_valid_learn: INT64 GZIP DO:786029 FPO:786071 SZ:8365/8705/1.04 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 11091]
companion_learn_duration: INT64 GZIP DO:794394 FPO:795351 SZ:7477/12507/1.67 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 6801, num_nulls: 11091]
learn_combine_duration: INT64 GZIP DO:801871 FPO:820446 SZ:67708/107108/1.58 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14974, num_nulls: 11091]
companion_lesson_length: INT64 GZIP DO:869579 FPO:869883 SZ:6603/9775/1.48 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 6978, num_nulls: 11091]
is_should_attend_user: INT64 GZIP DO:876182 FPO:876227 SZ:5850/9754/1.67 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 2, num_nulls: 0]
is_live_attend_user: INT64 GZIP DO:882032 FPO:882077 SZ:5150/5094/0.99 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
is_combine_valid_learn_user: INT64 GZIP DO:887182 FPO:887227 SZ:5096/5042/0.99 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
is_black_user: INT64 GZIP DO:892278 FPO:892323 SZ:246/213/0.87 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
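
Having traced the chain from snapshot to manifest list, manifest, and data file, the snapshot itself can also be queried directly with time travel; a sketch, assuming Spark 3.3+ with the Iceberg runtime:

select count(*) from spark_catalog.data_lake_ods.test version as of 5677214384524195741;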
