Try hbase-shell with docker image
Summary
- Introduction
- Prerequisites
- Create docker-compose
- Build docker container and start
- Start hbase-shell
- Use hbase-shell
- Conclusion
Introduction
Apache HBase is a distributed, column-oriented key-value database.
It is developed as part of the Hadoop project and runs on HDFS (Hadoop Distributed File System).
Production use requires a Hadoop cluster consisting of multiple nodes, but you can also try HBase in your local environment.
hbase-shell is a command-line REPL (Read-Eval-Print Loop) for HBase. It is used for administrative tasks such as creating and deleting tables, and it can also read and write data.
This article uses an Apache HBase Docker image to run hbase-shell in a local environment.
HBase can run in the following three modes.
This article uses standalone mode, which is sufficient for trying hbase-shell.
- Standalone mode: Run on a local machine without HDFS
- Pseudo-distributed mode: Run on a local machine with HDFS
- Fully-distributed mode: Run on multiple machines with HDFS
Prerequisites
Docker and docker-compose must be installed.
For example, the versions used in this article are:
$ docker -v
Docker version 19.03.6, build 369ce74a3c
$ docker-compose -v
docker-compose version 1.17.1, build unknown
Machine spec
This article uses a machine with 8 GB of memory in total, but about 4 GB is enough.
$ cat /etc/issue
Ubuntu 18.04.2 LTS
$ cat /proc/meminfo | grep Mem
MemTotal: 8168284 kB
MemFree: 6812556 kB
MemAvailable: 7545960 kB
Create docker-compose
Prepare the following file (docker-compose.yml).
Here, “blueskyareahm/hbase-base:2.1.3” is specified as the Apache HBase Docker image, and “blueskyareahm/hbase-zookeeper:3.4.13” as the ZooKeeper image.
Both images include a script that starts the HBase-related services whenever a container starts.
They are based on alpine:3.10.
version: '2'
services:
  hbase-master:
    image: blueskyareahm/hbase-base:2.1.3
    command: master
    ports:
      - 16000:16000
      - 16010:16010
  hbase-regionserver:
    image: blueskyareahm/hbase-base:2.1.3
    command: regionserver
    ports:
      - 16030:16030
      - 16201:16201
      - 16301:16301
  zookeeper:
    image: blueskyareahm/hbase-zookeeper:3.4.13
    ports:
      - 2181:2181
From this docker-compose.yml file, one container will be created for each of hbase-master, hbase-regionserver, and zookeeper.
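Each ports entry in docker-compose.yml publishes a container port on the same port number of the host, so once the containers have been started (next section) the services should be reachable on the local machine. The following is a minimal Python sketch of such a reachability check, not part of the original setup. The names PUBLISHED_PORTS and port_open are made up for this illustration, and "localhost" assumes Docker runs on the same machine as the script.

```python
import socket

# Host ports published by docker-compose.yml (host port == container port).
PUBLISHED_PORTS = {
    "hbase-master": [16000, 16010],
    "hbase-regionserver": [16030, 16201, 16301],
    "zookeeper": [2181],
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: Docker is running on this machine, so the ports are on localhost.
    for service, ports in PUBLISHED_PORTS.items():
        for port in ports:
            state = "open" if port_open("localhost", port) else "closed"
            print(f"{service}:{port} {state}")
```

An open port only shows that something is listening; it does not prove that HBase has finished starting up.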
Build docker container and start
Run the following command in the directory where docker-compose.yml is located.
$ docker-compose up --build -d
Starting dockerhbase_hbase-regionserver_1 ...
Starting dockerhbase_zookeeper_1 ...
Starting dockerhbase_hbase-regionserver_1
Starting dockerhbase_zookeeper_1
Starting dockerhbase_hbase-master_1 ...
Starting dockerhbase_hbase-regionserver_1 ... done
You can confirm with the following command that the Docker images specified in docker-compose.yml have been downloaded locally.
$ docker images
REPOSITORY                      TAG      IMAGE ID       CREATED      SIZE
blueskyareahm/hbase-base        2.1.3    bc85bf9cde47   6 days ago   394MB
blueskyareahm/hbase-zookeeper   3.4.13   b84eed2da9c6   6 days ago   150MB
You can also see that containers are started for each of hbase-master, hbase-regionserver and zookeeper.
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
55d31613504a blueskyareahm/hbase-base:2.1.3 "/entrypoint.sh mast…" 5 days ago Up About a minute 2181/tcp, 8080/tcp, 8085/tcp, 9090/tcp, 0.0.0.0:16000->16000/tcp, 9095/tcp, 16030/tcp, 16201/tcp, 16301/tcp, 0.0.0.0:16010->16010/tcp dockerhbase_hbase-master_1
69b00dbc3b83 blueskyareahm/hbase-base:2.1.3 "/entrypoint.sh regi…" 5 days ago Up About a minute 2181/tcp, 8080/tcp, 8085/tcp, 9090/tcp, 9095/tcp, 16000/tcp, 0.0.0.0:16030->16030/tcp, 0.0.0.0:16201->16201/tcp, 16010/tcp, 0.0.0.0:16301->16301/tcp dockerhbase_hbase-regionserver_1
0a9ac6fc557c blueskyareahm/hbase-zookeeper:3.4.13 "/entrypoint.sh" 5 days ago Up About a minute 3181/tcp, 0.0.0.0:2181->2181/tcp, 4181/tcp dockerhbase_zookeeper_1
Start hbase-shell
Start hbase-shell with the following command, which runs hbase shell inside the hbase-master container.
$ docker-compose exec hbase-master hbase shell
--- (skip) ---
hbase(main):001:0>
Use hbase-shell
See help
hbase(main):001:0> help
HBase Shell, version 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
--- (skip) ---
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
Create table
You must specify a table name and at least one column family name.
hbase(main):002:0> create 'test', 'cf'
Created table test
Took 2.9629 seconds
=> Hbase::Table - test
Check whether a table exists
hbase(main):003:0> list 'test'
TABLE
test
1 row(s)
Took 0.1438 seconds
=> ["test"]
hbase(main):004:0> list 'dummy'
TABLE
0 row(s)
Took 0.0259 seconds
=> []
List tables
hbase(main):005:0> list
TABLE
t1
test
2 row(s)
Took 0.0276 seconds
=> ["t1", "test"]
Check a table definition
hbase(main):006:0> describe 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE
=> 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRI
TE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true
', BLOCKSIZE => '65536'}
1 row(s)
Took 0.4172 seconds
Put data into table
The syntax is: put 'table_name', 'row_key', 'column_family:qualifier', 'value'
hbase(main):007:0> put 'test', 'row1', 'cf:a', 'value1'
Took 0.3695 seconds
hbase(main):008:0> put 'test', 'row2', 'cf:b', 'value2'
Took 0.0187 seconds
hbase(main):009:0> put 'test', 'row1', 'cf:c', 'value3'
Took 0.0236 seconds
This example puts data into two columns ('cf:a' and 'cf:c') of row1.
For row2, it puts data into one column ('cf:b').
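To picture what these put commands store, the following Python sketch models a table as a map from row key to a map from column to cell. It is a toy illustration only, not how HBase is implemented: ToyTable and its methods are hypothetical names, only the latest value per column is kept (as with VERSIONS => '1'), and rows are sorted as Python strings, which matches HBase's byte order only for plain ASCII keys. Its scan and get methods mirror the shell commands used in the sections that follow.

```python
import time
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, str]  # (timestamp in milliseconds, value)


class ToyTable:
    """In-memory stand-in for one HBase table; for illustration only."""

    def __init__(self, name: str, families: List[str]) -> None:
        self.name = name
        self.families = set(families)
        self.rows: Dict[str, Dict[str, Cell]] = {}

    def put(self, row: str, column: str, value: str, ts: Optional[int] = None) -> None:
        """Store value under row and 'family:qualifier', replacing any old cell."""
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        ts = int(time.time() * 1000) if ts is None else ts
        self.rows.setdefault(row, {})[column] = (ts, value)

    def get(self, row: str) -> Dict[str, str]:
        """Return column -> value for one row (empty if the row is absent)."""
        return {col: cell[1] for col, cell in sorted(self.rows.get(row, {}).items())}

    def scan(self) -> List[Tuple[str, str, str]]:
        """Return (row, column, value) for every cell, ordered by row key."""
        return [
            (row, col, cell[1])
            for row in sorted(self.rows)
            for col, cell in sorted(self.rows[row].items())
        ]


if __name__ == "__main__":
    table = ToyTable("test", ["cf"])
    table.put("row1", "cf:a", "value1")
    table.put("row2", "cf:b", "value2")
    table.put("row1", "cf:c", "value3")
    for row, column, value in table.scan():
        print(row, column, value)
```

Although the puts arrive in the order row1, row2, row1, the scan groups both row1 cells together because cells are ordered by row key, not by insertion time.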
Scan table data
Scan all data in the 'test' table.
You can see that row1 has two columns and row2 has one.
hbase(main):010:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1596974012058, value=value1
row1 column=cf:c, timestamp=1596974230499, value=value3
row2 column=cf:b, timestamp=1596974204249, value=value2
2 row(s)
Took 0.0885 seconds
Get specific row data
You can get the data of a specific row by specifying its row key.
hbase(main):011:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1596974012058, value=value1
cf:c timestamp=1596974230499, value=value3
1 row(s)
Took 0.0780 seconds
Disable a table
Before you can delete a table, you must disable it.
hbase(main):012:0> disable 'test'
Took 0.9336 seconds
hbase(main):014:0> describe 'test'
Table test is DISABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE
=> 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRI
TE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true
', BLOCKSIZE => '65536'}
1 row(s)
Took 0.0598 seconds
The describe command shows that the table is disabled.
To enable the table again, use the enable command.
hbase(main):012:0> enable 'test'
Delete a table
Run the drop command while the table is disabled.
hbase(main):015:0> drop 'test'
Took 0.5828 seconds
hbase(main):016:0> list 'test'
TABLE
0 row(s)
Took 0.0087 seconds
=> []
Exit hbase-shell
hbase(main):017:0> exit
Conclusion
In production, data is mainly written and read by applications, for example ones written in Java.
For querying table data, it is convenient to have a dedicated application that retrieves data from HBase with filter conditions.
The hbase-shell is a handy tool for creating tables, entering test data during development, and checking the contents.
It is also helpful in many situations when learning HBase.
With a Docker image, you can prepare an hbase-shell environment in as little as 5 minutes.
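When entering more than a few rows of test data, typing each put by hand becomes tedious. As a rough sketch, the put commands can be generated from a small data structure instead. The function names quote and put_commands are hypothetical, and the quoting assumes that hbase-shell treats arguments as Ruby single-quoted strings, which follows from the shell being a JRuby IRB.

```python
from typing import Dict, List


def quote(text: str) -> str:
    """Quote text as a Ruby single-quoted string literal."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def put_commands(table: str, rows: Dict[str, Dict[str, str]]) -> List[str]:
    """Build one put command per cell, ordered by row key and then column."""
    commands = []
    for row_key in sorted(rows):
        for column in sorted(rows[row_key]):
            value = rows[row_key][column]
            commands.append(
                f"put {quote(table)}, {quote(row_key)}, {quote(column)}, {quote(value)}"
            )
    return commands


if __name__ == "__main__":
    # Same test data as the put example in this article.
    test_rows = {
        "row1": {"cf:a": "value1", "cf:c": "value3"},
        "row2": {"cf:b": "value2"},
    }
    for line in put_commands("test", test_rows):
        print(line)
```

The printed lines can be pasted into hbase-shell. The shell also has a non-interactive mode (hbase shell -n) that reads commands from standard input, though feeding it through docker-compose was not tried with the versions used in this article.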
