1. Look up the table definition

show create table data_lake_ods.test

CREATE TABLE spark_catalog.data_lake_ods.test (
`user_number` BIGINT NOT NULL,
`subclazz_number` BIGINT NOT NULL,
`clazz_number` BIGINT,
`clazz_lesson_number` BIGINT NOT NULL,
`lesson_live_property` BIGINT,
`lesson_video_property` BIGINT,
`lesson_live_length` BIGINT,
`lesson_video_length` BIGINT,
`lesson_standard_length` BIGINT,
`lesson_length` BIGINT,
`live_learn_duration` BIGINT,
`video_learn_duration` BIGINT,
`learn_duration` BIGINT,
`is_valid_live_learn` BIGINT,
`is_valid_learn` BIGINT,
`companion_learn_duration` BIGINT,
`learn_combine_duration` BIGINT,
`companion_lesson_length` BIGINT,
`is_should_attend_user` BIGINT,
`is_live_attend_user` BIGINT,
`is_combine_valid_learn_user` BIGINT,
`is_black_user` BIGINT)
USING iceberg
LOCATION '/user/hive/warehouse/data_lake_ods.db/test'
TBLPROPERTIES(
'catalog-database' = 'data_lake_ods',
'catalog-name' = 'spark_catalog',
'catalog-table' = 'test',
'catalog-type' = 'hive',
'connector' = 'iceberg',
'current-snapshot-id' = '5677214384524195741',
'format' = 'iceberg/parquet',
'format-version' = '2',
'identifier-fields' = '[clazz_lesson_number,subclazz_number,user_number]',
'table.drop.base-path.enabled' = 'true',
'uri' = 'thrift://127.0.0.1:7004,******',
'write-parallelism' = '16',
'write.distribution-mode' = 'hash',
'write.merge.mode' = 'merge-on-read',
'write.metadata.delete-after-commit.enabled' = 'true',
'write.metadata.metrics.default' = 'full',
'write.metadata.previous-versions-max' = '100',
'write.update.mode' = 'merge-on-read',
'write.upsert.enabled' = 'true')

2. With the table location known, find the latest metadata.json file

hdfs dfs -ls /user/hive/warehouse/data_lake_ods.db/test/metadata/
The latest file is: /user/hive/warehouse/data_lake_ods.db/test/metadata/00059-1d8e6694-f873-45a3-9697-ce2762f78c3a.metadata.json
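Picking the latest file by eye works for a short listing, but it can also be scripted. The following is a minimal sketch that assumes the metadata files follow the `<version>-<uuid>.metadata.json` naming seen above and that the highest version prefix is the newest commit. The helper name `latest_metadata_file` and the `00058-...` entry are made up for illustration.

```python
import re

# Assumed naming scheme: "<version>-<uuid>.metadata.json", as in the listing above.
METADATA_NAME = re.compile(r"(\d+)-[0-9a-f-]+\.metadata\.json")


def latest_metadata_file(paths):
    """Return the path with the highest numeric version prefix, or None."""
    best_version, best_path = -1, None
    for path in paths:
        file_name = path.rsplit("/", 1)[-1]
        match = METADATA_NAME.fullmatch(file_name)
        if match and int(match.group(1)) > best_version:
            best_version, best_path = int(match.group(1)), path
    return best_path


base = "/user/hive/warehouse/data_lake_ods.db/test/metadata/"
listing = [
    # Hypothetical older version, only here to give the function a choice.
    base + "00058-aaaaaaaa-0000-0000-0000-000000000000.metadata.json",
    base + "00059-1d8e6694-f873-45a3-9697-ce2762f78c3a.metadata.json",
    # Manifest lists live in the same directory and must be ignored.
    base + "snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro",
]
print(latest_metadata_file(listing))
```

In practice the `listing` would come from the path column of the `hdfs dfs -ls` output.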

3. View the file contents

hdfs dfs -cat /user/hive/warehouse/data_lake_ods.db/test/metadata/00059-1d8e6694-f873-45a3-9697-ce2762f78c3a.metadata.json

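4. Find "current-snapshot-id" : 5677214384524195741 in the output. It is the same ID as the 'current-snapshot-id' table property shown in step 1. The entry in the snapshot list with that ID looks like this: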

"current-snapshot-id" : 5677214384524195741,
{
"sequence-number" : 59,
"snapshot-id" : 5677214384524195741,
"parent-snapshot-id" : 922737561337808536,
"timestamp-ms" : 1701683779323,
"summary" : {
"operation" : "overwrite",
"flink.operator-id" : "e883208d19e3c34f8aaf2a3168a63337",
"flink.job-id" : "000000001de734af0000000000000000",
"flink.max-committed-checkpoint-id" : "59",
"added-data-files" : "16",
"added-equality-delete-files" : "16",
"added-position-delete-files" : "16",
"added-delete-files" : "32",
"added-records" : "611418",
"added-files-size" : "19072474",
"added-position-deletes" : "245062",
"added-equality-deletes" : "366356",
"changed-partition-count" : "1",
"total-records" : "1084028212",
"total-files-size" : "41957333041",
"total-data-files" : "944",
"total-delete-files" : "1680",
"total-position-deletes" : "10749681",
"total-equality-deletes" : "1073278531"
},
"manifest-list" : "/user/hive/warehouse/data_lake_ods.db/test/metadata/snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro",
"schema-id" : 0
}
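The lookup in this step can be sketched in a few lines of Python. The JSON below is a trimmed stand-in for the real metadata.json: only the fields used here are kept, the values of the current snapshot are copied from the output above, and the parent snapshot's manifest-list name is a placeholder. The function name `current_snapshot` is made up for illustration.

```python
import json
from datetime import datetime, timezone

# Trimmed stand-in for the metadata.json printed in step 3.
metadata_text = """
{
  "current-snapshot-id": 5677214384524195741,
  "snapshots": [
    {
      "snapshot-id": 922737561337808536,
      "manifest-list": "hypothetical-parent-manifest-list.avro"
    },
    {
      "snapshot-id": 5677214384524195741,
      "parent-snapshot-id": 922737561337808536,
      "timestamp-ms": 1701683779323,
      "manifest-list": "/user/hive/warehouse/data_lake_ods.db/test/metadata/snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro"
    }
  ]
}
"""


def current_snapshot(metadata):
    """Return the entry of `snapshots` whose id equals `current-snapshot-id`."""
    wanted = metadata["current-snapshot-id"]
    for snapshot in metadata["snapshots"]:
        if snapshot["snapshot-id"] == wanted:
            return snapshot
    raise LookupError(f"snapshot {wanted} not found")


snapshot = current_snapshot(json.loads(metadata_text))
# timestamp-ms is milliseconds since the Unix epoch.
committed_at = datetime.fromtimestamp(snapshot["timestamp-ms"] / 1000, tz=timezone.utc)
print(snapshot["manifest-list"])
print(committed_at.isoformat())
```

The summary counters are also easy to sanity-check by hand: 16 equality delete files plus 16 position delete files give the 32 reported as added-delete-files.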

5. Take the manifest-list path from the snapshot and download the file for local analysis

hadoop fs -get /user/hive/warehouse/data_lake_ods.db/test/metadata/snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro ~/test/iceberg

6. View the contents of the Avro file

java -jar ./avro-tools-1.8.1.jar tojson ./snap-5677214384524195741-1-644d4e57-0953-40c0-915d-faf7dd8a3ab2.avro | grep '5677214384524195741'
Output:
{"manifest_path":"/user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro","manifest_length":12233,"partition_spec_id":0,"content":0,"sequence_number":59,"min_sequence_number":59,"added_snapshot_id":5677214384524195741,"added_data_files_count":16,"existing_data_files_count":0,"deleted_data_files_count":0,"added_rows_count":611418,"existing_rows_count":0,"deleted_rows_count":0,"partitions":{"array":[]}}
{"manifest_path":"/user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m1.avro","manifest_length":10241,"partition_spec_id":0,"content":1,"sequence_number":59,"min_sequence_number":59,"added_snapshot_id":5677214384524195741,"added_data_files_count":32,"existing_data_files_count":0,"deleted_data_files_count":0,"added_rows_count":611418,"existing_rows_count":0,"deleted_rows_count":0,"partitions":{"array":[]}}
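The two lines differ mainly in their "content" field. In the Iceberg spec, content 0 marks a manifest that tracks data files and content 1 marks one that tracks delete files. As a rough sketch, the `tojson` output can be summarized like this. The entries below are trimmed copies of the two lines above, and the helper name `summarize` is made up for illustration.

```python
import json

# Meaning of manifest_file.content in the Iceberg spec.
CONTENT_KINDS = {0: "data", 1: "deletes"}

# Trimmed copies of the two manifest-list entries printed above.
manifest_list_json = """
{"manifest_path":"/user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro","content":0,"added_data_files_count":16,"added_rows_count":611418}
{"manifest_path":"/user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m1.avro","content":1,"added_data_files_count":32,"added_rows_count":611418}
"""


def summarize(text):
    """Return (kind, added file count, manifest file name) for each JSON line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        kind = CONTENT_KINDS[entry["content"]]
        file_name = entry["manifest_path"].rsplit("/", 1)[-1]
        rows.append((kind, entry["added_data_files_count"], file_name))
    return rows


for row in summarize(manifest_list_json):
    print(row)
```

The counts line up with the snapshot summary from step 4: 16 added data files in the data manifest and 32 added delete files in the delete manifest.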

7. Inspect one of the manifest files

hadoop fs -get /user/hive/warehouse/data_lake_ods.db/test/metadata/644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro ~/test/iceberg
java -jar ./avro-tools-1.8.1.jar tojson ./644d4e57-0953-40c0-915d-faf7dd8a3ab2-m0.avro | less
Sample record:
{"status":1,"snapshot_id":{"long":5677214384524195741},"sequence_number":null,"file_sequence_number":null,"data_file":{"content":0,"file_path":"/user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet","file_format":"PARQUET","partition":{},"record_count":38138,"file_size_in_bytes":899517,"column_sizes":{"array":[{"key":1,"value":178768},{"key":2,"value":90942},{"key":3,"value":68674},{"key":4,"value":95511},{"key":5,"value":9520},{"key":6,"value":6952},{"key":7,"value":47567},{"key":8,"value":42993},{"key":9,"value":29052},{"key":10,"value":39705},{"key":11,"value":59926},{"key":12,"value":40560},{"key":13,"value":67603},{"key":14,"value":8252},{"key":15,"value":8365},{"key":16,"value":7477},{"key":17,"value":67708},{"key":18,"value":6603},{"key":19,"value":5850},{"key":20,"value":5150},{"key":21,"value":5096},{"key":22,"value":246}]},"value_counts":{"array":[{"key":1,"value":38138},{"key":2,"value":38138},{"key":3,"value":38138},{"key":4,"value":38138},{"key":5,"value":38138},{"key":6,"value":38138},{"key":7,"value":38138},{"key":8,"value":38138},{"key":9,"value":38138},{"key":10,"value":38138},{"key":11,"value":38138},{"key":12,"value":38138},{"key":13,"value":38138},{"key":14,"value":38138},{"key":15,"value":38138},{"key":16,"value":38138},{"key":17,"value":38138},{"key":18,"value":38138},{"key":19,"value":38138},{"key":20,"value":38138},{"key":21,"value":38138},{"key":22,"value":38138}]},"null_value_counts":{"array":[{"key":1,"value":0},{"key":2,"value":0},{"key":3,"value":0},{"key":4,"value":0},{"key":5,"value":11091},{"key":6,"value":11092},{"key":7,"value":11091},{"key":8,"value":11092},{"key":9,"value":11091},{"key":10,"value":11091},{"key":11,"value":11091},{"key":12,"value":11091},{"key":13,"value":11091},{"key":14,"value":11091},{"key":15,"value":11091},{"key":16,"value":11091},{"key":17,"value":11091},{"key":18,"value":11091},{"key":19,"value":0},{"key":20,"value":0},{"key":21,"value":0},{"key":22,"value":0}]},"nan_value_counts":{"array":[]},"lower_bounds":{"array":[{"key":1,"value":"'\u0006\u0002\u0000\u0000\u0000\u0000\u0000"},{"key":2,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":3,"value":"\u0000\u0000\u008D<Ú\u000B\t\u0000"},{"key":4,"value":",\u0000ß<Ú\u000B\t\u0000"},{"key":5,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":6,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":7,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":8,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":9,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":10,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":11,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":12,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":13,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":14,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":15,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":16,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":17,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":18,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":19,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":20,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":21,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":22,"value":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"}]},"upper_bounds":{"array":[{"key":1,"value":"C\u0095\u0014\u0006\u008Eå\u0000\u0000"},{"key":2,"value":"\u0080\u0004 \u0084´ÇW\u0000"},{"key":3,"value":"\u0000à`t_º{\u0005"},{"key":4,"value":"\u0000°ÃZ\"\u0097|\u0005"},{"key":5,"value":"\u0003\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":6,"value":"\u0003\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":7,"value":"\u0081L\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":8,"value":"zL\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":9,"value":"`T\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":10,"value":"$4\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":11,"value":"18\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":12,"value":"£8\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":13,"value":"~:\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":14,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":15,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":16,"value":"\u0091\u001A\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":17,"value":"~:\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":18,"value":"B\u001B\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":19,"value":"\u0002\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":20,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":21,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"key":22,"value":"\u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000"}]},"key_metadata":null,"split_offsets":{"array":[4]},"equality_ids":null,"sort_order_id":{"int":0}}}
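The lower_bounds and upper_bounds values look like noise because they are raw bytes. Iceberg stores a bound for a long column as 8 bytes in little-endian order, and the Avro JSON encoding prints each byte as one character. A minimal sketch of decoding them follows. The function name `decode_long_bound` is made up, and the two strings are the key 1 (user_number) bounds from the sample record, with the non-printable bytes 0x95, 0x8E and 0xE5 written as escapes.

```python
def decode_long_bound(text):
    """Decode an Iceberg long bound as printed by avro-tools tojson.

    Each character of the JSON string stands for one byte (code points 0-255),
    and the 8 bytes form a little-endian signed 64-bit integer.
    """
    raw = text.encode("latin-1")
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes for a long bound, got {len(raw)}")
    return int.from_bytes(raw, "little", signed=True)


# Bounds for field id 1 (user_number), copied from the sample record.
lower_user_number = "'\u0006\u0002\u0000\u0000\u0000\u0000\u0000"
upper_user_number = "C\u0095\u0014\u0006\u008e\u00e5\u0000\u0000"

print(decode_long_bound(lower_user_number))  # 132647
print(decode_long_bound(upper_user_number))  # 252398150128963
```

Both values agree with the min and max that parquet-tools reports for user_number in step 8 (3).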

8. Inspect the data file at the path obtained in step 7

(1) Show the first N rows

hadoop jar ./parquet-tools-1.11.0.jar head -n 10  /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet
Sample row:

user_number = 19100811
subclazz_number = 22388336842768640
clazz_number = 10995946454325888
clazz_lesson_number = 10995946455505548
lesson_live_property = 0
lesson_video_property = 0
lesson_live_length = 6091
lesson_video_length = 0
lesson_standard_length = 7200
lesson_length = 6091
live_learn_duration = 476
video_learn_duration = 0
learn_duration = 476
is_valid_live_learn = 0
is_valid_learn = 0
companion_learn_duration = 0
learn_combine_duration = 476
companion_lesson_length = 0
is_should_attend_user = 0
is_live_attend_user = 1
is_combine_valid_learn_user = 0
is_black_user = 0

(2) Show the file schema

hadoop jar ./parquet-tools-1.11.0.jar schema /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet
message table {
required int64 user_number = 1;
required int64 subclazz_number = 2;
optional int64 clazz_number = 3;
required int64 clazz_lesson_number = 4;
optional int64 lesson_live_property = 5;
optional int64 lesson_video_property = 6;
optional int64 lesson_live_length = 7;
optional int64 lesson_video_length = 8;
optional int64 lesson_standard_length = 9;
optional int64 lesson_length = 10;
optional int64 live_learn_duration = 11;
optional int64 video_learn_duration = 12;
optional int64 learn_duration = 13;
optional int64 is_valid_live_learn = 14;
optional int64 is_valid_learn = 15;
optional int64 companion_learn_duration = 16;
optional int64 learn_combine_duration = 17;
optional int64 companion_lesson_length = 18;
optional int64 is_should_attend_user = 19;
optional int64 is_live_attend_user = 20;
optional int64 is_combine_valid_learn_user = 21;
optional int64 is_black_user = 22;
}
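The number after each `=` is the Iceberg field ID, which is what the keys in the manifest statistics from step 7 refer to. As a small sketch, the schema text can be parsed to check that the table's identifier fields from step 1 are all `required` columns. The helper name `parse_fields` is made up, and only the first four columns are included to keep the sample short.

```python
import re

# Matches lines such as "required int64 user_number = 1;".
FIELD = re.compile(r"(required|optional)\s+(\w+)\s+(\w+)\s*=\s*(\d+);")


def parse_fields(schema_text):
    """Map column name -> (field id, is_required) from parquet-tools schema output."""
    return {
        name: (int(field_id), repetition == "required")
        for repetition, _physical_type, name, field_id in FIELD.findall(schema_text)
    }


# First four columns of the schema printed above.
schema_text = """
message table {
required int64 user_number = 1;
required int64 subclazz_number = 2;
optional int64 clazz_number = 3;
required int64 clazz_lesson_number = 4;
}
"""

# From the 'identifier-fields' table property in step 1.
identifier_fields = ["clazz_lesson_number", "subclazz_number", "user_number"]

fields = parse_fields(schema_text)
identifier_ids = sorted(fields[name][0] for name in identifier_fields)
all_required = all(fields[name][1] for name in identifier_fields)
print(identifier_ids, all_required)  # [1, 2, 4] True
```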

(3) Show the file metadata
hadoop jar /home/hadoop/test/iceberg/parquet-tools-1.11.0.jar meta /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet

file:                        /user/hive/warehouse/data_lake_ods.db/test/data/00009-0-a2f03c5f-eec9-4a15-bafe-9b360af4fde5-00162.parquet
creator: parquet-mr version 1.13.1 (build db4183109d5b734ec5930d870cdae161e408ddba)
extra: iceberg.schema = {"type":"struct","schema-id":0,"identifier-field-ids":[1,2,4],"fields":[{"id":1,"name":"user_number","required":true,"type":"long"},{"id":2,"name":"subclazz_number","required":true,"type":"long"},{"id":3,"name":"clazz_number","required":false,"type":"long"},{"id":4,"name":"clazz_lesson_number","required":true,"type":"long"},{"id":5,"name":"lesson_live_property","required":false,"type":"long"},{"id":6,"name":"lesson_video_property","required":false,"type":"long"},{"id":7,"name":"lesson_live_length","required":false,"type":"long"},{"id":8,"name":"lesson_video_length","required":false,"type":"long"},{"id":9,"name":"lesson_standard_length","required":false,"type":"long"},{"id":10,"name":"lesson_length","required":false,"type":"long"},{"id":11,"name":"live_learn_duration","required":false,"type":"long"},{"id":12,"name":"video_learn_duration","required":false,"type":"long"},{"id":13,"name":"learn_duration","required":false,"type":"long"},{"id":14,"name":"is_valid_live_learn","required":false,"type":"long"},{"id":15,"name":"is_valid_learn","required":false,"type":"long"},{"id":16,"name":"companion_learn_duration","required":false,"type":"long"},{"id":17,"name":"learn_combine_duration","required":false,"type":"long"},{"id":18,"name":"companion_lesson_length","required":false,"type":"long"},{"id":19,"name":"is_should_attend_user","required":false,"type":"long"},{"id":20,"name":"is_live_attend_user","required":false,"type":"long"},{"id":21,"name":"is_combine_valid_learn_user","required":false,"type":"long"},{"id":22,"name":"is_black_user","required":false,"type":"long"}]}

file schema: table
--------------------------------------------------------------------------------
user_number: REQUIRED INT64 R:0 D:0
subclazz_number: REQUIRED INT64 R:0 D:0
clazz_number: OPTIONAL INT64 R:0 D:1
clazz_lesson_number: REQUIRED INT64 R:0 D:0
lesson_live_property: OPTIONAL INT64 R:0 D:1
lesson_video_property: OPTIONAL INT64 R:0 D:1
lesson_live_length: OPTIONAL INT64 R:0 D:1
lesson_video_length: OPTIONAL INT64 R:0 D:1
lesson_standard_length: OPTIONAL INT64 R:0 D:1
lesson_length: OPTIONAL INT64 R:0 D:1
live_learn_duration: OPTIONAL INT64 R:0 D:1
video_learn_duration: OPTIONAL INT64 R:0 D:1
learn_duration: OPTIONAL INT64 R:0 D:1
is_valid_live_learn: OPTIONAL INT64 R:0 D:1
is_valid_learn: OPTIONAL INT64 R:0 D:1
companion_learn_duration: OPTIONAL INT64 R:0 D:1
learn_combine_duration: OPTIONAL INT64 R:0 D:1
companion_lesson_length: OPTIONAL INT64 R:0 D:1
is_should_attend_user: OPTIONAL INT64 R:0 D:1
is_live_attend_user: OPTIONAL INT64 R:0 D:1
is_combine_valid_learn_user: OPTIONAL INT64 R:0 D:1
is_black_user: OPTIONAL INT64 R:0 D:1

row group 1: RC:38138 TS:1269627 OFFSET:4
--------------------------------------------------------------------------------
user_number: INT64 GZIP DO:4 FPO:112797 SZ:178768/238215/1.33 VC:38138 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 132647, max: 252398150128963, num_nulls: 0]
subclazz_number: INT64 GZIP DO:178772 FPO:216044 SZ:90942/116042/1.28 VC:38138 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 24707901098558592, num_nulls: 0]
clazz_number: INT64 GZIP DO:269714 FPO:287587 SZ:68674/79130/1.15 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 2546306737045504, max: 395114311462215680, num_nulls: 0]
clazz_lesson_number: INT64 GZIP DO:338388 FPO:374110 SZ:95511/115465/1.21 VC:38138 ENC:BIT_PACKED,PLAIN_DICTIONARY ST:[min: 2546306742419500, max: 395357041109217280, num_nulls: 0]
lesson_live_property: INT64 GZIP DO:433899 FPO:433948 SZ:9520/11698/1.23 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 3, num_nulls: 11091]
lesson_video_property: INT64 GZIP DO:443419 FPO:443469 SZ:6952/9243/1.33 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 3, num_nulls: 11092]
lesson_live_length: INT64 GZIP DO:450371 FPO:456873 SZ:47567/64261/1.35 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 19585, num_nulls: 11091]
lesson_video_length: INT64 GZIP DO:497938 FPO:505857 SZ:42993/70204/1.63 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 19578, num_nulls: 11092]
lesson_standard_length: INT64 GZIP DO:540931 FPO:543782 SZ:29052/49418/1.70 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 21600, num_nulls: 11091]
lesson_length: INT64 GZIP DO:569983 FPO:575653 SZ:39705/61757/1.56 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 13348, num_nulls: 11091]
live_learn_duration: INT64 GZIP DO:609688 FPO:625930 SZ:59926/99798/1.67 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14385, num_nulls: 11091]
video_learn_duration: INT64 GZIP DO:669614 FPO:680817 SZ:40560/80714/1.99 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14499, num_nulls: 11091]
learn_duration: INT64 GZIP DO:710174 FPO:728644 SZ:67603/106804/1.58 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14974, num_nulls: 11091]
is_valid_live_learn: INT64 GZIP DO:777777 FPO:777819 SZ:8252/8680/1.05 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 11091]
is_valid_learn: INT64 GZIP DO:786029 FPO:786071 SZ:8365/8705/1.04 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 11091]
companion_learn_duration: INT64 GZIP DO:794394 FPO:795351 SZ:7477/12507/1.67 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 6801, num_nulls: 11091]
learn_combine_duration: INT64 GZIP DO:801871 FPO:820446 SZ:67708/107108/1.58 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 14974, num_nulls: 11091]
companion_lesson_length: INT64 GZIP DO:869579 FPO:869883 SZ:6603/9775/1.48 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 6978, num_nulls: 11091]
is_should_attend_user: INT64 GZIP DO:876182 FPO:876227 SZ:5850/9754/1.67 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 2, num_nulls: 0]
is_live_attend_user: INT64 GZIP DO:882032 FPO:882077 SZ:5150/5094/0.99 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
is_combine_valid_learn_user: INT64 GZIP DO:887182 FPO:887227 SZ:5096/5042/0.99 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
is_black_user: INT64 GZIP DO:892278 FPO:892323 SZ:246/213/0.87 VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
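Each row-group line packs several numbers into short labels. As I read the parquet-tools output, SZ is compressed size / uncompressed size / ratio, VC is the value count, and ST holds the column statistics. The sketch below pulls those out of one line; the helper name `column_stats` is made up, and the sample line is the lesson_live_property line from the output above.

```python
import re

SIZE = re.compile(r"SZ:(\d+)/(\d+)/")
VALUES = re.compile(r"VC:(\d+)")
NULLS = re.compile(r"num_nulls: (\d+)")


def column_stats(line):
    """Extract sizes, value count and null count from one column-chunk line."""
    name = line.split(":", 1)[0]
    compressed, uncompressed = (int(part) for part in SIZE.search(line).groups())
    values = int(VALUES.search(line).group(1))
    nulls = int(NULLS.search(line).group(1))
    return {
        "name": name,
        "compressed": compressed,
        "uncompressed": uncompressed,
        "ratio": round(uncompressed / compressed, 2),
        "values": values,
        "nulls": nulls,
        "null_fraction": nulls / values,
    }


line = (
    "lesson_live_property: INT64 GZIP DO:433899 FPO:433948 SZ:9520/11698/1.23 "
    "VC:38138 ENC:RLE,BIT_PACKED,PLAIN_DICTIONARY ST:[min: 0, max: 3, num_nulls: 11091]"
)
stats = column_stats(line)
print(stats["name"], stats["compressed"], stats["ratio"], stats["nulls"])
```

The same numbers appear in the manifest entry from step 7: lesson_live_property has field ID 5, and key 5 is 9520 in column_sizes, 38138 in value_counts and 11091 in null_value_counts. So the statistics Iceberg keeps in the manifest mirror what is recorded in the Parquet footer.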
