Backing Up and Restoring HBase Data

There are two strategies for backing up HBase:
1> Backing it up with a full cluster shutdown
2> Backing it up on a live cluster

A full shutdown backup first stops HBase (or disables all tables), then uses Hadoop's distcp command to copy the contents of the HBase directory either to another directory on the same HDFS or to a different HDFS. To restore from a full shutdown backup, copy the backed-up files back to the HBase directory using distcp.
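The full shutdown procedure above could be scripted roughly as follows. This is only a sketch under stated assumptions: the HBase root directory is /hbase and the backup directory is /backup/HBaseBackUp (the paths used later in this article), the hbase and hadoop scripts are on PATH, and using -overwrite for the restore is my own choice rather than something the original states. By default the script only prints the commands.

```shell
# Sketch of a full-shutdown backup and restore.
# Assumptions: HBase root is /hbase, backup dir is /backup/HBaseBackUp,
# and stop-hbase.sh, start-hbase.sh and hadoop are on PATH.
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them
HBASE_ROOT=/hbase
BACKUP_DIR=/backup/HBaseBackUp

run() {
  # Print the command in dry-run mode, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

full_backup() {
  run stop-hbase.sh                               # stop HBase so the files stop changing
  run hadoop distcp "$HBASE_ROOT" "$BACKUP_DIR"   # copy the HBase directory
  run start-hbase.sh
}

full_restore() {
  run stop-hbase.sh
  # -overwrite copies the contents of the backup onto the existing HBase directory
  run hadoop distcp -overwrite "$BACKUP_DIR" "$HBASE_ROOT"
  run start-hbase.sh
}
```

Calling full_backup with DRY_RUN=1 lists the three commands in order, which makes it easy to review them before running with DRY_RUN=0.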

There are several approaches for a live cluster backup:
1> Using the CopyTable utility to copy data from one table to another
2> Exporting an HBase table to HDFS files, and importing the files back to HBase
3> HBase cluster replication

The CopyTable utility can be used to copy data from one table to another, either on the same cluster or on a different cluster. The Export utility dumps the data of a table to HDFS on the same cluster. As the counterpart of Export, the Import utility restores the data from the dump files.
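To make the Export/Import pairing concrete, here is a small sketch that builds both command lines. The Import class takes a table name and an input directory; the table name HiddenIPInfo and the directory /backup/HBaseExport are the ones used in this article, the helper names are hypothetical, and the hbase script is assumed to be on PATH. As far as I know the target table has to exist before Import is run.

```shell
# Sketch: build matching Export and Import commands for one table.
# Assumptions: `hbase` is on PATH; helper names are illustrative only.
EXPORT_CLASS=org.apache.hadoop.hbase.mapreduce.Export
IMPORT_CLASS=org.apache.hadoop.hbase.mapreduce.Import

export_cmd() {
  # $1 = table to dump, $2 = HDFS output directory
  echo "hbase $EXPORT_CLASS $1 $2"
}

import_cmd() {
  # $1 = existing table to load into, $2 = HDFS directory holding the dump files
  echo "hbase $IMPORT_CLASS $1 $2"
}
```

For example, `export_cmd HiddenIPInfo /backup/HBaseExport` prints the dump command, and `import_cmd HiddenIPInfo /backup/HBaseExport` prints the command that loads the same files back.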

Method 1:

landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export

Usage: Export [-D <property=value>]* <tablename> <outputdir> [<versions> [<starttime> [<endtime>]] [^[regex pattern] or [Prefix] to filter]]

  Note: -D properties will be applied to the conf used.

  For example:

   -D mapred.output.compress=true

   -D mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec

   -D mapred.output.compression.type=BLOCK

  Additionally, the following SCAN properties can be specified

  to control/limit what is exported..

   -D hbase.mapreduce.scan.column.family=<familyName>

   -D hbase.mapreduce.include.deleted.rows=true

For performance consider the following properties:

   -Dhbase.client.scanner.caching=100

   -Dmapred.map.tasks.speculative.execution=false

   -Dmapred.reduce.tasks.speculative.execution=false

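In the following command, IPAddress is the column family to export (several column families can be listed, separated by ","), HiddenIPInfo is the HBase table to export, and /backup/HBaseExport is the output directory, which is created automatically during the export.

landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export -D mapred.output.compress=true -D mapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec -D mapred.output.compression.type=BLOCK -D hbase.mapreduce.scan.column.family=IPAddress HiddenIPInfo /backup/HBaseExport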
13/12/10 20:12:15 INFO mapreduce.Export: versions=1, starttime=0, endtime=9223372036854775807, keepDeletedCells=false
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.zookeeper.ZooKeeper, using jar /home/landen/UntarFile/hbase-0.94.12/lib/zookeeper-3.4.5.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class com.google.protobuf.Message, using jar /home/landen/UntarFile/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class com.google.common.collect.ImmutableSet, using jar /home/landen/UntarFile/hbase-0.94.12/lib/guava-11.0.2.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.util.Bytes, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableInputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
..........................
13/12/10 20:12:29 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 0 -> slave1:,
13/12/10 20:12:32 INFO mapred.JobClient: Running job: job_201312042044_0033
13/12/10 20:12:33 INFO mapred.JobClient: map 0% reduce 0%
13/12/10 20:12:53 INFO mapred.JobClient: map 100% reduce 0%
13/12/10 20:12:58 INFO mapred.JobClient: Job complete: job_201312042044_0033
13/12/10 20:12:59 INFO mapred.JobClient: Counters: 29
13/12/10 20:12:59 INFO mapred.JobClient:   Job Counters
13/12/10 20:12:59 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=11992
13/12/10 20:12:59 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/10 20:12:59 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/10 20:12:59 INFO mapred.JobClient:     Rack-local map tasks=1
13/12/10 20:12:59 INFO mapred.JobClient:     Launched map tasks=1
13/12/10 20:12:59 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/12/10 20:12:59 INFO mapred.JobClient:   HBase Counters
13/12/10 20:12:59 INFO mapred.JobClient:     REMOTE_RPC_CALLS=0
13/12/10 20:12:59 INFO mapred.JobClient:     RPC_CALLS=6
13/12/10 20:12:59 INFO mapred.JobClient:     RPC_RETRIES=0
13/12/10 20:12:59 INFO mapred.JobClient:     NOT_SERVING_REGION_EXCEPTION=0
13/12/10 20:12:59 INFO mapred.JobClient:     NUM_SCANNER_RESTARTS=0
13/12/10 20:12:59 INFO mapred.JobClient:     MILLIS_BETWEEN_NEXTS=6
13/12/10 20:12:59 INFO mapred.JobClient:     BYTES_IN_RESULTS=1493
13/12/10 20:12:59 INFO mapred.JobClient:     BYTES_IN_REMOTE_RESULTS=0
13/12/10 20:12:59 INFO mapred.JobClient:     REGIONS_SCANNED=1
13/12/10 20:12:59 INFO mapred.JobClient:     REMOTE_RPC_RETRIES=0
13/12/10 20:12:59 INFO mapred.JobClient:   File Output Format Counters
13/12/10 20:12:59 INFO mapred.JobClient:     Bytes Written=775
13/12/10 20:12:59 INFO mapred.JobClient:   FileSystemCounters
13/12/10 20:12:59 INFO mapred.JobClient:     HDFS_BYTES_READ=69
13/12/10 20:12:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=35024
13/12/10 20:12:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=775
13/12/10 20:12:59 INFO mapred.JobClient:   File Input Format Counters
13/12/10 20:12:59 INFO mapred.JobClient:     Bytes Read=0
13/12/10 20:12:59 INFO mapred.JobClient:   Map-Reduce Framework
13/12/10 20:12:59 INFO mapred.JobClient:     Map input records=3
13/12/10 20:12:59 INFO mapred.JobClient:     Physical memory (bytes) snapshot=94224384
13/12/10 20:12:59 INFO mapred.JobClient:     Spilled Records=0
13/12/10 20:12:59 INFO mapred.JobClient:     CPU time spent (ms)=1110
13/12/10 20:12:59 INFO mapred.JobClient:     Total committed heap usage (bytes)=82116608
13/12/10 20:12:59 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=395390976
13/12/10 20:12:59 INFO mapred.JobClient:     Map output records=3
13/12/10 20:12:59 INFO mapred.JobClient:     SPLIT_RAW_BYTES=69
landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop fs -ls /backup/HBaseExport/
Warning: $HADOOP_HOME is deprecated.

Found 3 items
-rw-r--r--   1 landen supergroup          0 2013-12-10 20:12 /backup/HBaseExport/_SUCCESS
drwxr-xr-x   - landen supergroup          0 2013-12-10 20:12 /backup/HBaseExport/_logs
-rw-r--r--   1 landen supergroup        775 2013-12-10 20:12 /backup/HBaseExport/part-m-00000
landen@Master:~/UntarFile/hadoop-1.0.4$
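The listing above contains a _SUCCESS file, which the MapReduce job writes when it completes. A backup script could check for that marker before trusting the dump. The helper below is a hypothetical sketch that reads the output of `hadoop fs -ls` on standard input; it only looks at the text of the listing and does not validate the data itself.

```shell
# Sketch: succeed only if a `hadoop fs -ls <dir>` listing (read from stdin)
# contains the _SUCCESS marker file. The function name is illustrative.
export_succeeded() {
  grep -q '/_SUCCESS$'
}
```

A possible use is `hadoop fs -ls /backup/HBaseExport/ | export_succeeded && echo "export finished"`.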

Method 2:

landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable

Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>

Options:

 rs.class     hbase.regionserver.class of the peer cluster

              specify if different from current cluster

 rs.impl      hbase.regionserver.impl of the peer cluster

 startrow     the start row

 stoprow      the stop row

 starttime    beginning of the time range (unixtime in millis)

              without endtime means from starttime to forever

 endtime      end of the time range.  Ignored if no starttime specified.

 versions     number of cell versions to copy

 new.name     new table's name

 peer.adr     Address of the peer cluster given in the format

              hbase.zookeeer.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent

 families     comma-separated list of families to copy

              To copy from cf1 to cf2, give sourceCfName:destCfName.

              To keep the same name, just give "cfName"

 all.cells    also copy delete markers and deleted cells

Args:

 tablename    Name of the table to copy

Examples:

 To copy 'TestTable' to a cluster that uses replication for a 1 hour window:

 $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable

 (--peer.adr specifies the location of the other cluster.)

For performance consider the following general options:

-Dhbase.client.scanner.caching=100

-Dmapred.map.tasks.speculative.execution=false

CopyTable is a utility that copies the data of one table to another table, either on the same cluster or on a different HBase cluster. You can copy to a table on the same cluster; however, if you have another cluster that you want to treat as a backup, you can use CopyTable as a live backup option to copy the data of a table to the backup cluster. CopyTable can be configured with a start and an end timestamp. If they are specified, only the data with a timestamp within that time frame is copied. This feature makes incremental backup of an HBase table possible in some situations.

"Incremental backup" is a method that backs up only the data that has changed since the last backup.

Note: Since the cluster keeps running, there is a risk that edits could be missed during the copy process.
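As a rough illustration of the incremental idea, the sketch below computes a time window in Unix milliseconds and builds a CopyTable command for it, using only the --starttime, --endtime and --peer.adr options from the usage text. The helper names are hypothetical and the hbase script is assumed to be on PATH. Keep in mind that a plain time-range copy does not carry over deletes unless all.cells is also used, which is one reason this only works "in some situations".

```shell
# Sketch: incremental CopyTable over a time window (Unix time in milliseconds).
# Helper names are illustrative; `hbase` is assumed to be on PATH.

now_ms() {
  # date +%s returns seconds, so multiply to get milliseconds
  echo "$(( $(date +%s) * 1000 ))"
}

hours_before() {
  # $1 = end timestamp (ms), $2 = number of hours to go back
  echo "$(( $1 - $2 * 3600000 ))"
}

copytable_incremental_cmd() {
  # $1 = start (ms), $2 = end (ms), $3 = peer cluster address, $4 = table
  echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=$1 --endtime=$2 --peer.adr=$3 $4"
}
```

For a backup of the last hour one might run `end=$(now_ms); start=$(hours_before "$end" 1)` and pass both values to copytable_incremental_cmd together with the peer address and table name.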

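In the following command, the data of the table HiddenIPInfo is copied into another table, BackUpHiddenIPInfo, as a backup (preferably the copy should go to a different cluster).

landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --families=IPAddress --new.name=BackUpHiddenIPInfo HiddenIPInfo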
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.zookeeper.ZooKeeper, using jar /home/landen/UntarFile/hbase-0.94.12/lib/zookeeper-3.4.5.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.protobuf.Message, using jar /home/landen/UntarFile/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.common.collect.ImmutableSet, using jar /home/landen/UntarFile/hbase-0.94.12/lib/guava-11.0.2.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.util.Bytes, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableInputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.zookeeper.ZooKeeper, using jar /home/landen/UntarFile/hbase-0.94.12/lib/zookeeper-3.4.5.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.protobuf.Message, using jar /home/landen/UntarFile/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.common.collect.ImmutableSet, using jar /home/landen/UntarFile/hbase-0.94.12/lib/guava-11.0.2.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.util.Bytes, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.io.ImmutableBytesWritable, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Writable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableInputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.io.ImmutableBytesWritable, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Writable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableOutputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar

.................................................

13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/landen/UntarFile/hadoop-1.0.4/libexec/../lib/native/Linux-i386-32:/home/landen/UntarFile/hbase-0.94.12/lib/native/Linux-i386-32
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-24-generic-pae
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:user.name=landen
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/landen
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/landen/UntarFile/hbase-0.94.12
13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=Slave1:2222,Master:2222,Slave2:2222 sessionTimeout=180000 watcher=hconnection
13/12/10 16:16:04 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 16010@Master
13/12/10 16:16:04 INFO zookeeper.ClientCnxn: Opening socket connection to server Master/10.21.244.79:2222. Will not attempt to authenticate using SASL (unknown error)
13/12/10 16:16:04 INFO zookeeper.ClientCnxn: Socket connection established to Master/10.21.244.79:2222, initiating session
13/12/10 16:16:04 INFO zookeeper.ClientCnxn: Session establishment complete on server Master/10.21.244.79:2222, sessionid = 0x42db7cbd1f0005, negotiated timeout = 180000
13/12/10 16:16:04 DEBUG client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465; serverName=Slave1,60020,1386661855439
13/12/10 16:16:04 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is Slave1:60020
13/12/10 16:16:05 DEBUG client.MetaScanner: Scanning .META. starting at row=BackUpHiddenIPInfo,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465
13/12/10 16:16:05 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for BackUpHiddenIPInfo,,1386662946878.48312c3f9b8715670432c413ca44f2f6. is Slave1:60020
13/12/10 16:16:05 INFO mapreduce.TableOutputFormat: Created table instance for BackUpHiddenIPInfo
13/12/10 16:16:05 DEBUG client.MetaScanner: Scanning .META. starting at row=HiddenIPInfo,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465
13/12/10 16:16:05 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for HiddenIPInfo,,1386509509553.9e1062d691dd4c25cdc030f8c3fc9860. is Slave1:60020
13/12/10 16:16:05 DEBUG client.MetaScanner: Scanning .META. starting at row=HiddenIPInfo,,00000000000000 for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465
13/12/10 16:16:05 ERROR mapreduce.TableInputFormatBase: Cannot resolve the host name for /10.21.244.124 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '124.244.21.10.in-addr.arpa'
13/12/10 16:16:05 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 0 -> slave1:,
13/12/10 16:16:07 INFO mapred.JobClient: Running job: job_201312042044_0030
13/12/10 16:16:08 INFO mapred.JobClient: map 0% reduce 0%
13/12/10 16:16:27 INFO mapred.JobClient: map 100% reduce 0%
13/12/10 16:16:32 INFO mapred.JobClient: Job complete: job_201312042044_0030
13/12/10 16:16:32 INFO mapred.JobClient: Counters: 28
13/12/10 16:16:32 INFO mapred.JobClient:   Job Counters
13/12/10 16:16:32 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=12305
13/12/10 16:16:32 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/10 16:16:32 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/10 16:16:32 INFO mapred.JobClient:     Rack-local map tasks=1
13/12/10 16:16:32 INFO mapred.JobClient:     Launched map tasks=1
13/12/10 16:16:32 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/12/10 16:16:32 INFO mapred.JobClient:   HBase Counters
13/12/10 16:16:32 INFO mapred.JobClient:     REMOTE_RPC_CALLS=0
13/12/10 16:16:32 INFO mapred.JobClient:     RPC_CALLS=6
13/12/10 16:16:32 INFO mapred.JobClient:     RPC_RETRIES=0
13/12/10 16:16:32 INFO mapred.JobClient:     NOT_SERVING_REGION_EXCEPTION=0
13/12/10 16:16:32 INFO mapred.JobClient:     NUM_SCANNER_RESTARTS=0
13/12/10 16:16:32 INFO mapred.JobClient:     MILLIS_BETWEEN_NEXTS=162
13/12/10 16:16:32 INFO mapred.JobClient:     BYTES_IN_RESULTS=1493
13/12/10 16:16:32 INFO mapred.JobClient:     BYTES_IN_REMOTE_RESULTS=0
13/12/10 16:16:32 INFO mapred.JobClient:     REGIONS_SCANNED=1
13/12/10 16:16:32 INFO mapred.JobClient:     REMOTE_RPC_RETRIES=0
13/12/10 16:16:32 INFO mapred.JobClient:   File Output Format Counters
13/12/10 16:16:32 INFO mapred.JobClient:     Bytes Written=0
13/12/10 16:16:32 INFO mapred.JobClient:   FileSystemCounters
13/12/10 16:16:32 INFO mapred.JobClient:     HDFS_BYTES_READ=69
13/12/10 16:16:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=34919
13/12/10 16:16:32 INFO mapred.JobClient:   File Input Format Counters
13/12/10 16:16:32 INFO mapred.JobClient:     Bytes Read=0
13/12/10 16:16:32 INFO mapred.JobClient:   Map-Reduce Framework
13/12/10 16:16:32 INFO mapred.JobClient:     Map input records=3
13/12/10 16:16:32 INFO mapred.JobClient:     Physical memory (bytes) snapshot=83361792
13/12/10 16:16:32 INFO mapred.JobClient:     Spilled Records=0
13/12/10 16:16:32 INFO mapred.JobClient:     CPU time spent (ms)=170
13/12/10 16:16:32 INFO mapred.JobClient:     Total committed heap usage (bytes)=55443456
13/12/10 16:16:32 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=395317248
13/12/10 16:16:32 INFO mapred.JobClient:     Map output records=3
13/12/10 16:16:32 INFO mapred.JobClient:     SPLIT_RAW_BYTES=69
hbase(main):016:0> describe 'BackUpHiddenIPInfo'
DESCRIPTION                                                                   ENABLED
'BackUpHiddenIPInfo', {NAME => 'IPAddress', DATA_BLOCK_ENCODING => 'NONE', B true
LOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSI
ONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS =>
'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false
', BLOCKCACHE => 'true'}
1 row(s) in 0.0670 seconds

hbase(main):017:0> scan 'BackUpHiddenIPInfo'
ROW                            COLUMN+CELL
125.111.251.118               column=IPAddress:city, timestamp=1386597147615, value=Ningbo
125.111.251.118               column=IPAddress:countrycode, timestamp=1386597147615, value=CN
125.111.251.118               column=IPAddress:countryname, timestamp=1386597147615, value=China
125.111.251.118               column=IPAddress:latitude, timestamp=1386597147615, value=29.878204
125.111.251.118               column=IPAddress:longitude, timestamp=1386597147615, value=121.5495
125.111.251.118               column=IPAddress:region, timestamp=1386597147615, value=02
125.111.251.118               column=IPAddress:regionname, timestamp=1386597147615, value=Zhejiang
125.111.251.118               column=IPAddress:timezone, timestamp=1386597147615, value=Asia/Shanghai
221.12.10.218                 column=IPAddress:city, timestamp=1386597147615, value=Hangzhou
221.12.10.218                 column=IPAddress:countrycode, timestamp=1386597147615, value=CN
221.12.10.218                 column=IPAddress:countryname, timestamp=1386597147615, value=China
221.12.10.218                 column=IPAddress:latitude, timestamp=1386597147615, value=30.293594
221.12.10.218                 column=IPAddress:longitude, timestamp=1386597147615, value=120.16141
221.12.10.218                 column=IPAddress:region, timestamp=1386597147615, value=02
221.12.10.218                 column=IPAddress:regionname, timestamp=1386597147615, value=Zhejiang
221.12.10.218                 column=IPAddress:timezone, timestamp=1386597147615, value=Asia/Shanghai
60.180.248.201                column=IPAddress:city, timestamp=1386597147615, value=Wenzhou
60.180.248.201                column=IPAddress:countrycode, timestamp=1386597147615, value=CN
60.180.248.201                column=IPAddress:countryname, timestamp=1386597147615, value=China
60.180.248.201                column=IPAddress:latitude, timestamp=1386597147615, value=27.999405
60.180.248.201                column=IPAddress:longitude, timestamp=1386597147615, value=120.66681
60.180.248.201                column=IPAddress:region, timestamp=1386597147615, value=02
60.180.248.201                column=IPAddress:regionname, timestamp=1386597147615, value=Zhejiang
60.180.248.201                column=IPAddress:timezone, timestamp=1386597147615, value=Asia/Shanghai
3 row(s) in 0.0600 seconds

Method 3:

landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop distcp

Warning: $HADOOP_HOME is deprecated.

distcp [OPTIONS] <srcurl>* <desturl>

OPTIONS:

-p[rbugp]              Preserve status

                       r: replication number

                       b: block size

                       u: user

                       g: group

                       p: permission

                       -p alone is equivalent to -prbugp

-i                     Ignore failures

-log <logdir>          Write logs to <logdir>

-m <num_maps>          Maximum number of simultaneous copies

-overwrite             Overwrite destination

-update                Overwrite if src size different from dst size

-skipcrccheck          Do not use CRC check to determine if src is

                       different from dest. Relevant only if -update

                       is specified

-f <urilist_uri>       Use list at <urilist_uri> as src list

-filelimit <n>         Limit the total number of files to be <= n

-sizelimit <n>         Limit the total size to be <= n bytes

-delete                Delete the files existing in the dst but not in src

-mapredSslConf <f>     Filename of SSL configuration for mapper task

NOTE 1: if -overwrite or -update are set, each source URI is

      interpreted as an isomorphic update to an existing directory.

For example:

hadoop distcp -p -update "hdfs://A:8020/user/foo/bar" "hdfs://B:8020/user/foo/baz"

     would update all descendants of 'baz' also in 'bar'; it would

     *not* update /user/foo/baz/bar

NOTE 2: The parameter <n> in -filelimit and -sizelimit can be

     specified with symbolic representation.  For examples,

       1230k = 1230 * 1024 = 1259520

       891g = 891 * 1024^3 = 956703965184

Generic options supported are

-conf <configuration file>     specify an application configuration file

-D <property=value>            use value for given property

-fs <local|namenode:port>      specify a namenode

-jt <local|jobtracker:port>    specify a job tracker

-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster

-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.

-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is

bin/hadoop command [genericOptions] [commandOptions]

distcp (distributed copy) is a tool provided by Hadoop for copying a large dataset within the same HDFS cluster or between different HDFS clusters. It uses MapReduce to copy files in parallel, handle errors and recovery, and report the job status. Because HBase stores all its files, including system files, on HDFS, we can simply use distcp to copy the HBase directory to another directory on the same HDFS, or to a different HDFS, in order to back up the source HBase cluster.
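The transcript in this section copies within a single HDFS. For the "different HDFS" case, a command along the following lines could be used. This is a sketch only: the NameNode addresses are hypothetical placeholders, port 8020 is taken from the distcp usage example, /hbase is assumed to be the HBase root directory, and -m (maximum number of simultaneous copies) is optional.

```shell
# Sketch: build a distcp command that copies /hbase to another HDFS cluster.
# The NameNode addresses passed in are placeholders, not real hosts.
distcp_backup_cmd() {
  # $1 = source NameNode (host:port), $2 = destination NameNode (host:port),
  # $3 = destination path, $4 = maximum number of simultaneous copies
  echo "hadoop distcp -m $4 hdfs://$1/hbase hdfs://$2$3"
}
```

For example, `distcp_backup_cmd nn-src:8020 nn-backup:8020 /backup/HBaseBackUp 4` prints a command that copies from the source cluster to the backup cluster with at most four parallel copies.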

landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop distcp /hbase /backup/HBaseBackUp
Warning: $HADOOP_HOME is deprecated.

13/12/10 15:33:09 INFO tools.DistCp: srcPaths=[/hbase]
13/12/10 15:33:09 INFO tools.DistCp: destPath=/backup/HBaseBackUp
13/12/10 15:33:10 INFO tools.DistCp: sourcePathsCount=46
13/12/10 15:33:10 INFO tools.DistCp: filesToCopyCount=17
13/12/10 15:33:10 INFO tools.DistCp: bytesToCopyCount=11.7k
13/12/10 15:33:11 INFO mapred.JobClient: Running job: job_201312042044_0029
13/12/10 15:33:12 INFO mapred.JobClient: map 0% reduce 0%
13/12/10 15:33:37 INFO mapred.JobClient: map 100% reduce 0%
13/12/10 15:33:42 INFO mapred.JobClient: Job complete: job_201312042044_0029
13/12/10 15:33:42 INFO mapred.JobClient: Counters: 22
13/12/10 15:33:42 INFO mapred.JobClient:   Job Counters
13/12/10 15:33:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=20465
13/12/10 15:33:42 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/10 15:33:42 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/10 15:33:42 INFO mapred.JobClient:     Launched map tasks=1
13/12/10 15:33:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/12/10 15:33:42 INFO mapred.JobClient:   File Input Format Counters
13/12/10 15:33:42 INFO mapred.JobClient:     Bytes Read=7904
13/12/10 15:33:42 INFO mapred.JobClient:   File Output Format Counters
13/12/10 15:33:42 INFO mapred.JobClient:     Bytes Written=0
13/12/10 15:33:42 INFO mapred.JobClient:   FileSystemCounters
13/12/10 15:33:42 INFO mapred.JobClient:     HDFS_BYTES_READ=20070
13/12/10 15:33:42 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=22644
13/12/10 15:33:42 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=11988
13/12/10 15:33:42 INFO mapred.JobClient:   distcp
13/12/10 15:33:42 INFO mapred.JobClient:     Files copied=17
13/12/10 15:33:42 INFO mapred.JobClient:     Bytes copied=11988
13/12/10 15:33:42 INFO mapred.JobClient:     Bytes expected=11988
13/12/10 15:33:42 INFO mapred.JobClient:   Map-Reduce Framework
13/12/10 15:33:42 INFO mapred.JobClient:     Map input records=45
13/12/10 15:33:42 INFO mapred.JobClient:     Physical memory (bytes) snapshot=36737024
13/12/10 15:33:42 INFO mapred.JobClient:     Spilled Records=0
13/12/10 15:33:42 INFO mapred.JobClient:     CPU time spent (ms)=470
13/12/10 15:33:42 INFO mapred.JobClient:     Total committed heap usage (bytes)=15925248
13/12/10 15:33:42 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=346537984
13/12/10 15:33:42 INFO mapred.JobClient:     Map input bytes=7804
13/12/10 15:33:42 INFO mapred.JobClient:     Map output records=0
13/12/10 15:33:42 INFO mapred.JobClient:     SPLIT_RAW_BYTES=178
