HBase (Part 2): Shell Operations

I. Basic Operations
- 1. Entering the shell
- 2. Viewing help
II. Namespace Operations
III. Table Operations
IV. Data Operations
V. Summary

I. Basic Operations

1. Entering the shell

bin/hbase shell

[hadoop@hadoop102 hbase]$ bin/hbase shell

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/opt/module/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/opt/module/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

HBase Shell

Use "help" to get list of supported commands.

Use "exit" to quit this interactive shell.

For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell

Version 2.0.5, r76458dd074df17520ad451ded198cd832138e929, Mon Mar 18 00:41:49 UTC 2019

Took 0.0064 seconds

2. Viewing help

hbase(main):001:0> help

HBase Shell, version 2.0.5, r76458dd074df17520ad451ded198cd832138e929, Mon Mar 18 00:41:49 UTC 2019

Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.

Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

COMMAND GROUPS:

  Group name: general

  Commands: processlist, status, table_help, version, whoami

  Group name: ddl

  Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters

  Group name: namespace

  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

  Group name: dml

  Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

  Group name: tools

  Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, flush, is_in_maintenance_mode, list_deadservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, trace, unassign, wal_roll, zk_dump

  Group name: replication

  Commands: add_peer, append_peer_exclude_namespaces, append_peer_exclude_tableCFs, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_exclude_namespaces, remove_peer_exclude_tableCFs, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_tableCFs, show_peer_tableCFs, update_peer_config

  Group name: snapshots

  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot

  Group name: configuration

  Commands: update_all_config, update_config

  Group name: quotas

  Commands: list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota

  Group name: security

  Commands: grant, list_security_capabilities, revoke, user_permission

  Group name: procedures

  Commands: list_locks, list_procedures

  Group name: visibility labels

  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

  Group name: rsgroup

  Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup

SHELL USAGE:

Quote all names in HBase Shell such as table and column names.  Commas delimit

command parameters.  Type <RETURN> after entering a command to run it.

Dictionaries of configuration used in the creation and alteration of tables are

Ruby Hashes. They look like this:

  {'key1' => 'value1', 'key2' => 'value2', ...}

and are opened and closed with curley-braces.  Key/values are delimited by the

'=>' character combination.  Usually keys are predefined constants such as

NAME, VERSIONS, COMPRESSION, etc.  Constants do not need to be quoted.  Type

'Object.constants' to see a (messy) list of all constants in the environment.

If you are using binary keys or values and need to enter them in the shell, use

double-quote'd hexadecimal representation. For example:

  hbase> get 't1', "key\x03\x3f\xcd"

  hbase> get 't1', "key\003\023\011"

  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"

The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.

For more on the HBase Shell, see http://hbase.apache.org/book.html

II. Namespace Operations

1. Create a namespace

hbase(main):004:0> create_namespace 'bigdata'

Took 0.3157 seconds

2. List namespaces

hbase(main):007:0> list_namespace

NAMESPACE

hbase

3 row(s)

Took 0.0408 seconds

3. Drop a namespace

hbase(main):008:0> drop_namespace 'bigdata'

Took 0.2925 seconds

Note: a namespace can be dropped only when it contains no tables.

III. Table Operations

1. List all tables

hbase(main):009:0> list

TABLE

0 row(s)

Took 0.0078 seconds

=> []

2. Create a table

create 'namespace:table','column_family_1','column_family_2'...

Pre-splitting

create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']

hbase(main):014:0> create 'bigdata:student','basic_info','extral_info'

Created table bigdata:student

Took 1.4615 seconds

=> Hbase::Table - bigdata:student
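The SPLITS option in the pre-splitting example above supplies four split keys, which divide the rowkey space into five regions. The sketch below is a toy model of that idea, not HBase code: the helpers region_index and region_range are hypothetical names, and it assumes rowkeys are compared byte by byte, the way HBase orders them.

```ruby
# Toy model of pre-splitting: N split keys divide the rowkey space into N + 1 regions.
# The split keys are the ones from the SPLITS example; the helper names are made up.
SPLITS = ['10', '20', '30', '40'].freeze

# Returns the 0-based index of the region whose range contains rowkey.
# Region i covers [SPLITS[i - 1], SPLITS[i]); the first region has no lower
# bound and the last region has no upper bound.
def region_index(rowkey, splits = SPLITS)
  splits.count { |split| split <= rowkey }
end

# Returns the [start_key, end_key) pair of a region; '' stands for "unbounded".
def region_range(index, splits = SPLITS)
  start_key = index.zero? ? '' : splits[index - 1]
  end_key = index == splits.length ? '' : splits[index]
  [start_key, end_key]
end

# Rowkeys are compared as strings, so '1001' sorts between '10' and '20'.
%w[05 1001 25 3 45].each do |key|
  idx = region_index(key)
  puts "#{key.ljust(5)} -> region #{idx} #{region_range(idx).inspect}"
end
```

Because the comparison is lexicographic rather than numeric, a rowkey such as '1001' lands in the region that starts at '10', which is worth keeping in mind when choosing split keys.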

3. Describe a table

describe 'namespace:table'

hbase(main):015:0> describe 'bigdata:student'

Table bigdata:student is ENABLED

bigdata:student

COLUMN FAMILIES DESCRIPTION

{NAME => 'basic_info', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}

{NAME => 'extral_info', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}

2 row(s)

Took 0.2060 seconds

4. Alter a table

1) Delete a column family

alter 'namespace:table','delete'=>'column_family'

hbase(main):016:0> alter 'bigdata:student','delete'=>'extral_info'

Updating all regions with the new schema...

1/1 regions updated.

Done.

Took 2.1364 seconds

2) Change the number of versions kept

alter 'bigdata:student',{NAME=>'column_family',VERSIONS=>3}

hbase(main):022:0> alter 'bigdata:student',{NAME=>'info1',VERSIONS=>3}

Updating all regions with the new schema...

1/1 regions updated.

Done.

Took 2.0003 seconds

5. Drop a table

Disable the table first, then drop it; dropping a table that is still enabled fails with an error.

hbase(main):026:0> disable 'student'

Took 0.4833 seconds

hbase(main):027:0> drop 'student'

Took 0.4740 seconds

IV. Data Operations

1. Insert data

put 'namespace:table','rowkey','column_family:qualifier','value'

hbase(main):031:0> put 'bigdata:student','1001','basic_info:name','zhangsan'

Took 0.1080 seconds

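2. Update data

The syntax is the same as for inserting data: a put to a rowkey (and column) that already exists acts as an update.

put 'namespace:table','rowkey','column_family:qualifier','value'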
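Why a second put behaves like an update can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: ToyTable and its methods are hypothetical, the value 'lisi' and the timestamps are made-up sample data, and real HBase stores cells very differently. It models just two facts: each put adds a new timestamped version of the cell, and a read returns the newest one.

```ruby
# Toy in-memory model of cell versioning; NOT how HBase stores data.
class ToyTable
  def initialize(max_versions: 1)
    @max_versions = max_versions
    # [rowkey, column] => list of [timestamp, value] pairs, newest first
    @cells = Hash.new { |hash, key| hash[key] = [] }
  end

  # A put never edits a value in place: it adds a new version and then
  # drops the oldest versions beyond max_versions.
  def put(rowkey, column, value, timestamp)
    versions = @cells[[rowkey, column]]
    versions << [timestamp, value]
    versions.sort_by! { |ts, _| -ts }
    versions.pop while versions.length > @max_versions
    nil
  end

  # A get returns the value with the newest timestamp, or nil if absent.
  def get(rowkey, column)
    newest = @cells.fetch([rowkey, column], []).first
    newest && newest[1]
  end

  # All retained values for a cell, newest first.
  def versions(rowkey, column)
    @cells.fetch([rowkey, column], []).map { |_, value| value }
  end
end

table = ToyTable.new(max_versions: 3)
table.put('1001', 'basic_info:name', 'zhangsan', 1)
table.put('1001', 'basic_info:name', 'lisi', 2) # same rowkey and column: an "update"
puts table.get('1001', 'basic_info:name')
p table.versions('1001', 'basic_info:name')
```

With max_versions set to 1 the older value is no longer retained at all, which loosely mirrors a column family created with VERSIONS => '1'.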

3. Viewing data with scan

1) Scan the whole table

scan 'namespace:table'

hbase(main):035:0> scan 'bigdata:student'

ROW                    COLUMN+CELL

1001                   column=basic_info:name, timestamp=1594726281691, value=zhangsan

1 row(s)

Took 0.0300 seconds

2) Scan the table starting from a given rowkey

scan 'namespace:table',{STARTROW=>'start_rowkey'}

hbase(main):036:0> scan 'bigdata:student',{STARTROW=>'1001'}

ROW                 COLUMN+CELL

1001                column=basic_info:name, timestamp=1594726281691, value=zhangsan

1 row(s)

Took 0.0254 seconds
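A scan with STARTROW returns rows whose rowkey is greater than or equal to the start key, in rowkey order. The following sketch models that with plain strings; scan_from is a hypothetical helper, the sample rowkeys are invented, and the only assumption is that rowkeys sort byte by byte rather than numerically.

```ruby
# Toy model of a STARTROW scan over byte-wise sorted rowkeys.
# scan_from is a made-up helper, not part of the HBase shell.
def scan_from(rowkeys, start_row)
  rowkeys.sort.select { |key| key >= start_row }
end

sample_rowkeys = %w[1001 1002 101 2 11] # invented sample data

# Sorted order is 1001, 1002, 101, 11, 2, so starting at '101'
# skips '1001' and '1002' but still includes '11' and '2'.
p scan_from(sample_rowkeys, '101')
```

The result can look surprising when rowkeys are numbers stored as strings, which is one reason fixed-width or zero-padded rowkeys are often preferred.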

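3) Scan a single column family

scan 'namespace:table',{COLUMNS=>'column_family'}

4) Scan a single column

scan 'namespace:table',{COLUMNS=>'column_family:qualifier'}

4. Viewing data with get

1) Get a row by rowkey

get 'namespace:table','rowkey'

2) Get one column family of a row

get 'namespace:table','rowkey','column_family'

3) Get one column of a row

get 'namespace:table','rowkey','column_family:qualifier'

4) Get by rowkey and column qualifier

get 'namespace:table','rowkey','column_family:qualifier'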

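5. Delete data

1) By rowkey

deleteall 'namespace:table','rowkey'

Note: use the deleteall command here, not the delete command.

2) By rowkey + column family

delete 'namespace:table','rowkey','column_family'

3) By rowkey + column family + column qualifier

delete 'namespace:table','rowkey','column_family:qualifier'

6. Count rows

count 'namespace:table'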

hbase(main):037:0> count 'bigdata:student'

1 row(s)

Took 0.0514 seconds

=> 1

7. Truncate a table

truncate 'namespace:table'

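V. Summary

1. Create a namespace

	create_namespace 'namespace'

2. Create a table

	create 'table','column_family_1','column_family_2',...

3. Alter the table schema

	Add a column family: alter 'table',{NAME=>'column_family',VERSIONS=>4}

	Modify a column family: alter 'table',NAME=>'column_family_to_modify',VERSIONS=>number_of_versions

4. Insert data

	put 'namespace:table','rowkey','column_family:qualifier','value'

5. Updating data works the same way as inserting

6. Delete data

	1. Delete a cell: delete 'namespace:table','rowkey','column_family:qualifier'

	2. Delete an entire row: deleteall 'namespace:table','rowkey'

7. Drop a table:

	1. Disable the table: disable 'namespace:table'

	2. Drop it: drop 'namespace:table'

8. Truncate a table

	truncate 'namespace:table'

9. Count the rows in a table

	count 'namespace:table'

10. Drop a namespace

	drop_namespace 'namespace' [allowed only when the namespace contains no tables]

11. Query data

	1. Query by rowkey:

		1. Get an entire row: get 'namespace:table','rowkey'

		2. Get one column family: get 'namespace:table','rowkey','column_family'

		3. Get one column: get 'namespace:table','rowkey','column_family:qualifier'

	2. Scan data

		1. Scan the whole table: scan 'namespace:table'

		2. Scan one column family: scan 'namespace:table',{COLUMNS=>'column_family'}

		3. Scan one column: scan 'namespace:table',{COLUMNS=>'column_family:qualifier'}

12. List all tables: list

13. List all namespaces: list_namespace

14. Show a table's schema: describe 'namespace:table'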

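MemStore flush triggers

1. When any one memstore in a region reaches 128 MB, all memstores in that region are flushed.

2. While a region is under peak write load, the flush may be delayed even after a memstore reaches 128 MB. Once all memstores in the region together reach 512 MB, writes are blocked and the region flushes all of its memstores.

3. When the total size of all memstores of all regions on a RegionServer reaches JAVA_heap * globalmemstore * upperlimit, the RegionServer sorts its regions by memstore memory usage in descending order and flushes the memstores of the largest regions first. Flushing stops once the total memstore memory on the RegionServer falls below JAVA_heap * globalmemstore * lowerlimit.

4. If the RegionServer is under peak write load, this flush is likewise delayed. Once all memstores on the RegionServer together reach JAVA_heap * globalmemstore, writes are blocked and a flush is performed.

5. After a certain interval [1 hour] has elapsed, a flush is performed.

6. When the number of HLog files exceeds 32, a flush is performed. Regions are ordered by time, and those that have gone longest without a flush are flushed first.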
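To make the size-based triggers concrete, the sketch below turns them into numbers. It is an illustration under assumptions, not HBase code: the function names are hypothetical, a block multiplier of 4 is assumed because it reproduces the 128 MB and 512 MB figures above, and the global fraction 0.4 and lower limit 0.95 are example values that should be checked against the actual cluster configuration. Only the stop-flushing and block-writes thresholds for the RegionServer are computed.

```ruby
# Region-level thresholds (points 1 and 2).
# block_multiplier = 4 is an assumption that reproduces the 512 MB figure.
def region_thresholds(flush_size_mb: 128, block_multiplier: 4)
  { flush_mb: flush_size_mb, block_writes_mb: flush_size_mb * block_multiplier }
end

# RegionServer-level thresholds (points 3 and 4), in MB.
# global_fraction and lower_limit are assumed example values,
# not values read from a real cluster.
def regionserver_thresholds(heap_mb:, global_fraction: 0.4, lower_limit: 0.95)
  block_writes = heap_mb * global_fraction
  { stop_flushing_below_mb: (block_writes * lower_limit).round,
    block_writes_mb: block_writes.round }
end

p region_thresholds
p regionserver_thresholds(heap_mb: 10 * 1024) # a hypothetical 10 GB heap
```

For the hypothetical 10 GB heap, writes would be blocked once memstores total about 4 GB, and a forced flush would stop once usage falls below roughly 3.8 GB.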

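minor compaction:

	Merges small files into larger files; it is triggered when the number of small files reaches 3.

	It only merges files and does not remove expired data or data that has been deleted.

major compaction:

	Runs once every 7 days by default. In production, automatic major compaction is usually turned off and run manually instead [major_compact 'table'].

	During the merge, data marked as deleted and expired data are removed.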

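split

	Before version 0.94: when any store in a region reaches 10 GB, the region is split in two.

	Version 0.94 to 2.0: when any store in a region reaches Min(R^3 * flushsize[128 MB], filesize[10 GB]), the region is split in two.

		R is the number of regions, on the current RegionServer, of the table that the region belongs to.

	Version 2.0: if the table that the region belongs to has exactly 1 region on the current RegionServer, the region is split when a store reaches 2 * flushsize[128 MB]; otherwise it is split only when a store reaches 10 GB.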
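The three split rules can be compared side by side with a little arithmetic. The sketch below simply evaluates the formulas as stated in the text, using the 128 MB flush size and 10 GB maximum file size quoted there; the function names are hypothetical, and the constants used by the real split policy classes may differ.

```ruby
FLUSH_SIZE_MB = 128            # flushsize from the text
MAX_FILE_SIZE_MB = 10 * 1024   # filesize (10 GB) from the text

# Before 0.94: a fixed threshold, independent of the region count.
def split_threshold_pre_094(_region_count)
  MAX_FILE_SIZE_MB
end

# 0.94 to 2.0: Min(R^3 * flushsize, filesize), as written in the text.
def split_threshold_094_to_20(region_count)
  [region_count**3 * FLUSH_SIZE_MB, MAX_FILE_SIZE_MB].min
end

# 2.0: 2 * flushsize while the table has a single region on this server,
# otherwise the maximum file size.
def split_threshold_20(region_count)
  region_count == 1 ? 2 * FLUSH_SIZE_MB : MAX_FILE_SIZE_MB
end

(1..5).each do |r|
  puts "R=#{r}: 0.94-2.0 => #{split_threshold_094_to_20(r)} MB, " \
       "2.0 => #{split_threshold_20(r)} MB"
end
```

Under these formulas the 0.94 to 2.0 rule grows quickly with R and is capped at 10 GB from R = 5 onward, while the 2.0 rule only splits early for the very first region.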