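HBase Backup and Restore
1. HBase Backup and Restore
2. CopyTable
CopyTable example
CopyTable runs as a MapReduce job, so first check that Hadoop YARN is running. The yarn-site.xml used here: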
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- Hostname of the YARN master (ResourceManager) -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<!-- Reducers fetch map output through the shuffle service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop-master:8088</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop-master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-master:8031</value>
</property>
</configuration>
Start YARN
cd /usr/local/hadoop/sbin
./start-yarn.sh
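One way to confirm that YARN actually came up is to look for the ResourceManager process in the output of `jps`. The following is a minimal sketch under that assumption; `has_resourcemanager` is a hypothetical helper name, not something shipped with Hadoop.

```shell
# Sketch: decide whether the YARN ResourceManager is up from `jps` output.
# has_resourcemanager is a hypothetical helper, not part of Hadoop.

# Reads jps-style lines ("<pid> <ClassName>") on stdin and returns 0
# only if one of them is the ResourceManager process.
has_resourcemanager() {
  awk '$2 == "ResourceManager" { found = 1 } END { exit !found }'
}

# Usage on the master node (commented out so the file can be sourced safely):
# if jps | has_resourcemanager; then
#   echo "YARN ResourceManager is running"
# else
#   echo "YARN ResourceManager is NOT running"
# fi
```

The check only covers the node it runs on; NodeManager processes on worker nodes would need the same kind of check there.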
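Back up CoprocessorTest to CoprocessorTestBak
# Create the backup table first; CopyTable needs the target table to exist with the same column family (run this in the HBase shell, not in bash)
hbase> create 'CoprocessorTestBak','cf'
# Run the copy
[root@hadoop-master bin]# ./hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=CoprocessorTestBak CoprocessorTest
# Check that the backup succeeded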
hbase(main):001:0> scan 'CoprocessorTestBak'
ROW COLUMN+CELL
rowKey2 column=cf:countCol, timestamp=1682237414998, value=100
rowKey2 column=cf:unDeleteCol, timestamp=1682237430291, value=true
1 row(s) in 0.1870 seconds
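The CopyTable run above copies the whole table. CopyTable also accepts `--starttime` and `--endtime` (epoch milliseconds), which can limit the copy to cells written in a time window. The sketch below only assembles and prints such a command; `copytable_cmd` is a hypothetical helper, and the timestamps in the usage comment are illustrative values, not taken from a real run.

```shell
# Sketch: build a CopyTable command line, optionally limited to a time range.
# copytable_cmd is a hypothetical helper; it prints the command and runs nothing.
#   $1 source table, $2 target table,
#   $3 optional start time (epoch ms), $4 optional end time (epoch ms)
copytable_cmd() {
  src="$1"
  dst="$2"
  start="${3:-}"
  end="${4:-}"
  cmd="hbase org.apache.hadoop.hbase.mapreduce.CopyTable"
  if [ -n "$start" ]; then
    cmd="$cmd --starttime=$start"
  fi
  if [ -n "$end" ]; then
    cmd="$cmd --endtime=$end"
  fi
  # The source table name always comes last, after all options.
  printf '%s\n' "$cmd --new.name=$dst $src"
}

# Usage: print the command, review it, then run it by hand.
# copytable_cmd CoprocessorTest CoprocessorTestBak 1682237400000 1682237500000
```

Printing instead of executing keeps the helper safe to try on a machine without a cluster.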
3. Export/Import
Example:
# Export the CoprocessorTest table to /bak/CoprocessorTest.db (a directory on the default filesystem, normally HDFS)
[root@hadoop-master bin]# ./hbase org.apache.hadoop.hbase.mapreduce.Export CoprocessorTest /bak/CoprocessorTest.db

# Clear the data in the CoprocessorTest table
hbase(main):003:0> truncate 'CoprocessorTest'
Truncating 'CoprocessorTest' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.2240 seconds
# Check that the table is now empty
hbase(main):004:0> scan 'CoprocessorTest'
ROW COLUMN+CELL
0 row(s) in 0.1270 seconds
# Import /bak/CoprocessorTest.db back into CoprocessorTest
[root@hadoop-master bin]# ./hbase org.apache.hadoop.hbase.mapreduce.Import CoprocessorTest /bak/CoprocessorTest.db
# Check that the data was imported successfully
hbase(main):001:0> scan 'CoprocessorTest'
ROW COLUMN+CELL
rowKey2 column=cf:countCol, timestamp=1682319581159, value=100
rowKey2 column=cf:unDeleteCol, timestamp=1682237430291, value=true
1 row(s) in 0.2050 seconds
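Export takes optional positional arguments after the output directory: the number of versions, then a start time, then an end time (both in epoch milliseconds). That makes a rough incremental export possible. The sketch below prints such a command without running it; `export_cmd` is a hypothetical helper, and the output path and timestamps in the usage comment are made up for illustration.

```shell
# Sketch: build an Export command line with the optional positional arguments
#   <table> <outputdir> [<versions> [<starttime> [<endtime>]]]
# export_cmd is a hypothetical helper; it prints the command and runs nothing.
export_cmd() {
  if [ "$#" -lt 2 ] || [ "$#" -gt 5 ]; then
    echo "usage: export_cmd <table> <outputdir> [<versions> [<starttime> [<endtime>]]]" >&2
    return 1
  fi
  printf 'hbase org.apache.hadoop.hbase.mapreduce.Export'
  # Append every argument in order, separated by single spaces.
  printf ' %s' "$@"
  printf '\n'
}

# Usage: export one version of each cell written inside the time window.
# export_cmd CoprocessorTest /bak/CoprocessorTest.inc 1 1682237400000 1682319600000
```

Because the arguments are positional, a time range cannot be given without also giving the version count.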
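4. Snapshot
- A snapshot captures the state of a table at a point in time. The feature is switched on in hbase-site.xml.
- A table can be restored quickly to the state recorded in a snapshot, which makes data repair fast (but any data written after the snapshot is lost).
Example:
Enable snapshots in hbase-site.xml.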
<property>
<name>hbase.snapshot.enabled</name>
<value>true</value>
</property>
# List existing snapshots
hbase(main):002:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
0 row(s) in 0.0750 seconds
=> []
hbase(main):003:0> snapshot 'CoprocessorTest','CoprocessorTest_Snapshot_0424'
0 row(s) in 0.6750 seconds
hbase(main):004:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
CoprocessorTest_Snapshot_0424 CoprocessorTest (Mon Apr 24 15:11:26 +0800 2023)
1 row(s) in 0.0220 seconds
=> ["CoprocessorTest_Snapshot_0424"]
# Clear the CoprocessorTest data
hbase(main):006:0> truncate 'CoprocessorTest'
Truncating 'CoprocessorTest' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 3.9280 seconds
# Check that CoprocessorTest is empty
hbase(main):007:0> scan 'CoprocessorTest'
ROW COLUMN+CELL
0 row(s) in 0.1770 seconds
hbase(main):008:0> disable 'CoprocessorTest'
0 row(s) in 2.2740 seconds
# Restore the data from the snapshot (the table must be disabled first, as done above)
hbase(main):009:0> restore_snapshot 'CoprocessorTest_Snapshot_0424'
0 row(s) in 1.3730 seconds
hbase(main):010:0> enable 'CoprocessorTest'
0 row(s) in 1.2750 seconds
hbase(main):011:0> scan 'CoprocessorTest'
ROW COLUMN+CELL
rowKey2 column=cf:countCol, timestamp=1682319581159, value=100
rowKey2 column=cf:unDeleteCol, timestamp=1682237430291, value=true
1 row(s) in 0.1320 seconds
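restore_snapshot overwrites the existing table. Two related operations are often useful: `clone_snapshot` in the HBase shell creates a new table from a snapshot and leaves the original untouched, and the `ExportSnapshot` tool copies a snapshot to another cluster. The sketch below only prints the statement and the command; both helper names are hypothetical, and the clone table name and destination URL in the usage comments are assumptions for illustration.

```shell
# Sketch: print an HBase shell statement and an ExportSnapshot command.
# Both helpers are hypothetical; they print text and run nothing.

# HBase shell statement that creates a new table from a snapshot.
#   $1 snapshot name, $2 name of the new table
clone_snapshot_stmt() {
  printf "clone_snapshot '%s', '%s'\n" "$1" "$2"
}

# Command that copies a snapshot to another cluster's HBase root directory.
#   $1 snapshot name, $2 destination URL, $3 number of mappers
export_snapshot_cmd() {
  printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s -mappers %s\n' "$1" "$2" "$3"
}

# Usage:
# clone_snapshot_stmt CoprocessorTest_Snapshot_0424 CoprocessorTestClone
# export_snapshot_cmd CoprocessorTest_Snapshot_0424 hdfs://backup-nn:8020/hbase 4
```

Cloning first and inspecting the clone is a cautious alternative when it is unclear whether a snapshot holds the wanted data.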
5. Replication
Enable replication in hbase-site.xml.
<property>
<name>hbase.replication</name>
<value>true</value>
</property>
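Example:
# Create a table with the same name on both the source cluster and the target cluster
# Register the target cluster as a peer, giving its ZooKeeper quorum, client port, and znode path
add_peer '1',"zk01:2181:/hbase_backup"
# Mark the column family that should be replicated
# REPLICATION_SCOPE is a scope flag, not the peer id from add_peer: 1 replicates the column family to the configured peers, 0 (the default) keeps it local
alter 'replication_source_table',{NAME=>'fl', REPLICATION_SCOPE=>'1'}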
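The second argument of add_peer is a cluster key of the form `<zookeeper quorum>:<client port>:<znode parent>`. The sketch below assembles that key and the matching shell statement from their parts; `cluster_key` and `add_peer_stmt` are hypothetical helper names. It follows the statement form used above; newer HBase releases document a `CLUSTER_KEY => "..."` form instead, so it is worth checking `help 'add_peer'` on the installed version.

```shell
# Sketch: assemble a replication cluster key and the add_peer statement.
# Both helpers are hypothetical; they print text and run nothing.

# Cluster key: <zookeeper quorum>:<client port>:<znode parent>
#   $1 quorum host(s), $2 client port, $3 znode parent path
cluster_key() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# HBase shell statement in the form used in this section.
#   $1 peer id, $2 cluster key
add_peer_stmt() {
  printf "add_peer '%s', \"%s\"\n" "$1" "$2"
}

# Usage:
# add_peer_stmt 1 "$(cluster_key zk01 2181 /hbase_backup)"
```

After the peer is added, `list_peers` in the HBase shell shows whether it was registered.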