Batch Edit EXIF Metadata of Photographs

I spent the past few days exploring the scenic Chamba and Kangra valleys of Himachal Pradesh (more on that later.. time permitting :-)). I had been meaning to do this for a long time and was glad that I was finally able to cover most of the places as planned. Needless to say, I took hundreds of photographs.. trying to capture the natural beauty that the lovely place has to offer. Much to my chagrin, I found out later that the Date/Time settings of my camera were off. Well, I’m a bit finicky about such things and immediately googled around for freeware utilities that can batch-modify the EXIF metadata of multiple photographs. All I wanted was an application that would let me select a bunch of files and increment/decrement their Date/Time values by a fixed amount.
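For the curious, the arithmetic behind such a shift is simple. Here is a minimal Python sketch, assuming the timestamps are in the usual EXIF text form "YYYY:MM:DD HH:MM:SS". It only does the date math on strings and does not read or write image files; the function name and the sample values are made up for illustration.

```python
from datetime import datetime, timedelta

# EXIF stores timestamps such as DateTimeOriginal as "YYYY:MM:DD HH:MM:SS".
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def shift_exif_datetime(value: str, delta: timedelta) -> str:
    """Return the EXIF timestamp `value` moved by `delta` (which may be negative)."""
    shifted = datetime.strptime(value, EXIF_FORMAT) + delta
    return shifted.strftime(EXIF_FORMAT)


if __name__ == "__main__":
    # Hypothetical correction: pretend the camera clock was 5 hours 30 minutes behind.
    correction = timedelta(hours=5, minutes=30)
    for original in ["2012:10:20 21:45:10", "2012:10:21 06:02:33"]:
        print(original, "->", shift_exif_datetime(original, correction))
```

A real tool applies the same offset to every date tag in every selected file, which is exactly the tedious part I wanted automated.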

My search led me to a number of options and I finally settled on the combination of “ExifTool + ExifTool GUI”. You can get more details at the following links.

It’s simple.. powerful.. and gets the job done – highly recommended!

ZooKeeper: znodes

You can view the details of your ZooKeeper instance by running zk_dump from the HBase shell.

hbase(main):001:0> zk_dump
HBase is rooted at /hbase
Active master address: hbase2,52114,1352384965804
Backup master addresses:
Region server holding ROOT: hbase2,43876,1352384966172
Region servers:
 hbase2,43876,1352384966172
Quorum Server Statistics:
 localhost:2181
  Zookeeper version: 3.4.3-cdh4.1.0--1, built on 09/29/2012 17:54 GMT
  Clients:
   /127.0.0.1:43352[1](queued=0,recved=12,sent=12)
   /127.0.0.1:43146[1](queued=0,recved=2459,sent=2466)
   /127.0.0.1:43147[1](queued=0,recved=2283,sent=2284)
   /127.0.0.1:43354[0](queued=0,recved=1,sent=0)
   /127.0.0.1:43145[1](queued=0,recved=2551,sent=2645)

  Latency min/avg/max: 0/0/104
  Received: 7519
  Sent: 7620
  Outstanding: 0
  Zxid: 0xa9
  Mode: standalone
  Node count: 16

hbase(main):002:0>

HBase creates a set of znodes under its root znode (/hbase), each holding a piece of cluster state. Let us examine the values they hold using the zookeeper-client tool.

abhi@hbase2:~$ zookeeper-client 
Connecting to localhost:2181
2012-11-08 13:56:27,437 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.3-cdh4.1.0--1, built on 09/29/2012 17:54 GMT
2012-11-08 13:56:27,441 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=hbase2
2012-11-08 13:56:27,441 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.6.0_31
2012-11-08 13:56:27,442 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Sun Microsystems Inc.
2012-11-08 13:56:27,443 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/opt/java/jdk1.6.0_31/jre
2012-11-08 13:56:27,444 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.3-cdh4.1.0.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/etc/zookeeper/conf::/etc/zookeeper/conf:/usr/lib/zookeeper/zookeeper.jar:/usr/lib/zookeeper/zookeeper-3.4.3-cdh4.1.0.jar:/usr/lib/zookeeper/lib/log4j-1.2.15.jar:/usr/lib/zookeeper/lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/lib/jline-0.9.94.jar
2012-11-08 13:56:27,444 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/opt/java/jdk1.6.0_31/jre/lib/amd64/server:/opt/java/jdk1.6.0_31/jre/lib/amd64:/opt/java/jdk1.6.0_31/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2012-11-08 13:56:27,445 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2012-11-08 13:56:27,446 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2012-11-08 13:56:27,446 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2012-11-08 13:56:27,447 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2012-11-08 13:56:27,447 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.2.0-29-generic
2012-11-08 13:56:27,448 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=abhi
2012-11-08 13:56:27,449 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/abhi
2012-11-08 13:56:27,449 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/home/abhi
2012-11-08 13:56:27,452 [myid:] - INFO  [main:ZooKeeper@433] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@42b988a6
Welcome to ZooKeeper!
2012-11-08 13:56:27,515 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@958] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
JLine support is enabled
2012-11-08 13:56:27,534 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@850] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2012-11-08 13:56:27,576 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1187] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13ae06ce9780006, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] get /hbase/hbaseid
�
   1126@hbase21c3139f4-de24-4d59-9441-755a0e3f572e
cZxid = 0xc
ctime = Thu Nov 08 06:29:26 PST 2012
mZxid = 0xd
mtime = Thu Nov 08 06:29:26 PST 2012
pZxid = 0xc
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 0
[zk: localhost:2181(CONNECTED) 1] get /hbase/master
�
   1126@hbase2hbase2,52114,1352384965804
cZxid = 0x9
ctime = Thu Nov 08 06:29:26 PST 2012
mZxid = 0x9
mtime = Thu Nov 08 06:29:26 PST 2012
pZxid = 0x9
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x13ae06ce9780000
dataLength = 44
numChildren = 0
[zk: localhost:2181(CONNECTED) 2] get /hbase/replication
Node does not exist: /hbase/replication
[zk: localhost:2181(CONNECTED) 3] get /hbase/root-region-server
�
   1126@hbase2hbase2,43876,1352384966172
cZxid = 0x15
ctime = Thu Nov 08 06:29:33 PST 2012
mZxid = 0x15
mtime = Thu Nov 08 06:29:33 PST 2012
pZxid = 0x15
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 42
numChildren = 0
[zk: localhost:2181(CONNECTED) 4] ls /hbase/rs
[hbase2,43876,1352384966172]
[zk: localhost:2181(CONNECTED) 5] get /hbase/shutdown
�
   1126@hbase2Thu Nov 08 06:29:26 PST 2012
cZxid = 0xf
ctime = Thu Nov 08 06:29:26 PST 2012
mZxid = 0xf
mtime = Thu Nov 08 06:29:26 PST 2012
pZxid = 0xf
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 44
numChildren = 0
[zk: localhost:2181(CONNECTED) 6] ls /hbase/splitlog
[]
[zk: localhost:2181(CONNECTED) 7] ls /hbase/table
[]
[zk: localhost:2181(CONNECTED) 8] ls /hbase/unassigned
[]
[zk: localhost:2181(CONNECTED) 9]
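
One detail worth noticing in the output above is ephemeralOwner: ZooKeeper sets it to the owning session id for ephemeral znodes and to 0x0 for persistent ones. Here /hbase/master is ephemeral (it disappears when the master's session ends) while /hbase/hbaseid is persistent. A small Python sketch that parses the stat lines copied from the transcript; the helper names are my own and not part of any ZooKeeper client library.

```python
def parse_znode_stat(lines):
    """Parse the 'key = value' stat lines printed by the `get` command into a dict."""
    stat = {}
    for line in lines:
        key, sep, value = line.partition(" = ")
        if sep:  # ignore anything that is not a 'key = value' line
            stat[key.strip()] = value.strip()
    return stat


def is_ephemeral(stat):
    """A znode is ephemeral when ephemeralOwner holds a non-zero session id."""
    return int(stat["ephemeralOwner"], 16) != 0


# Stat lines copied from the transcript above.
MASTER_STAT = [
    "cZxid = 0x9",
    "ctime = Thu Nov 08 06:29:26 PST 2012",
    "ephemeralOwner = 0x13ae06ce9780000",
    "dataLength = 44",
]
HBASEID_STAT = [
    "cZxid = 0xc",
    "ctime = Thu Nov 08 06:29:26 PST 2012",
    "ephemeralOwner = 0x0",
    "dataLength = 52",
]

if __name__ == "__main__":
    print("/hbase/master ephemeral? ", is_ephemeral(parse_znode_stat(MASTER_STAT)))
    print("/hbase/hbaseid ephemeral?", is_ephemeral(parse_znode_stat(HBASEID_STAT)))
```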


HBase Major Compaction

This post is a continuation of my last two posts.

Each HBase table has

  • 1 or more Column-families – that group columns and determine the physical layout of the stored data
  • 1 or more Regions – that are akin to shards (in the RDBMS world), i.e. a contiguous range of the table’s rows delimited by a StartKey and an EndKey

For every Column-family of a table in a Region we have a Store, which has

  • 1 MemStore – a buffer that holds in-memory modifications (until it is flushed to store files)
  • 0 or more Store files (HFiles) – that get created whenever the MemStore is flushed, e.g. when it fills up.

These Store files are immutable: HBase creates a new file on every MemStore flush and never writes to an existing HFile.

Compaction combines the Store files of a Store into fewer Store files to optimize read performance. There are two types of compaction.

  • Minor Compaction – combines several Store files into fewer, larger Store files
  • Major Compaction – reads all the Store files of a Store and writes them to a single Store file.
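
To make the idea concrete, here is a toy Python model of a single Store. It is only a sketch of the concept and not how HBase is implemented: the class and method names are my own, and it ignores timestamps, versions, deletes and the on-disk HFile format.

```python
class ToyStore:
    """Toy model of one Store (one column family of one region)."""

    def __init__(self):
        self.memstore = {}      # in-memory edits: (row, column) -> value
        self.store_files = []   # every flush appends one immutable, sorted "file"

    def put(self, row, column, value):
        self.memstore[(row, column)] = value

    def flush(self):
        """Write the MemStore out as a new store file; existing files are never touched."""
        if self.memstore:
            self.store_files.append(tuple(sorted(self.memstore.items())))
            self.memstore = {}

    def major_compact(self):
        """Merge every store file into a single sorted store file."""
        merged = {}
        for store_file in self.store_files:  # oldest first, so newer values win
            merged.update(store_file)
        self.store_files = [tuple(sorted(merged.items()))]


if __name__ == "__main__":
    store = ToyStore()
    store.put("abhi", "info:name", "abhishek")
    store.put("abhi", "info:age", "30")
    store.flush()
    store.put("avi", "info:name", "avinash")
    store.flush()
    store.put("avi", "info:age", "20")
    store.flush()
    print("store files after three flushes:", len(store.store_files))
    store.major_compact()
    print("store files after major compaction:", len(store.store_files))
    for key, value in store.store_files[0]:
        print(key, value)
```

The puts and flushes in the demo mirror the ones performed on the real table in the rest of this post.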

Let us see how Major Compaction impacts HBase storage.

Create a table and insert data.


hbase(main):021:0> create 'users','info'
0 row(s) in 1.0540 seconds

hbase(main):022:0> list
TABLE
tbl1
users
2 row(s) in 0.0160 seconds

hbase(main):023:0> put 'users','abhi','info:name','abhishek'
0 row(s) in 0.0730 seconds

hbase(main):024:0> put 'users','abhi','info:age','30'
0 row(s) in 0.0120 seconds

Let us browse the HBase Root Directory and see how the data gets persisted physically on the filesystem.


abhi@hbase2:~$ ls -ltha /opt/hbase/data/
total 48K
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 14:50 users
drwxr-xr-x 8 hbase users 4.0K Nov  3 14:50 .
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 07:43 tbl1
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 05:35 .oldlogs
drwxrwxr-x 3 hbase hbase 4.0K Nov  3 05:34 .logs
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 -ROOT-
drwxrwxr-x 3 hbase hbase 4.0K Oct 30 12:00 .META.
-rwxr-xr-x 1 hbase hbase   38 Oct 30 12:00 hbase.id
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 .hbase.id.crc
-rwxr-xr-x 1 hbase hbase    3 Oct 30 12:00 hbase.version
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 .hbase.version.crc
drwxr-xr-x 3 abhi  users 4.0K Oct 11 08:10 ..
abhi@hbase2:~$
abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 14:50 6dda0024cbf8619a9c823e6ebbf78888
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 14:50 .
-rwxr-xr-x 1 hbase hbase  515 Nov  3 14:50 .tableinfo.0000000001
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:50 ..tableinfo.0000000001.crc
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:50 .tmp
drwxr-xr-x 8 hbase users 4.0K Nov  3 14:50 ..
abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 14:50 .
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:50 .oldlogs
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:50 info
-rwxr-xr-x 1 hbase hbase  222 Nov  3 14:50 .regioninfo
-rw-rw-r-- 1 hbase hbase   12 Nov  3 14:50 ..regioninfo.crc
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 14:50 ..
abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/
total 8.0K
drwxrwxr-x 4 hbase hbase 4.0K Nov  3 14:50 ..
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:50 .

As you can see above, HBase created

  • a directory ‘users’ for the table and under it
  • a sub-directory ‘6dda0024cbf8619a9c823e6ebbf78888’ for the Region and under it
  • a sub-directory ‘info’ for the Column-family

All modifications to table/region columns that belong to the ‘info’ column-family get stored as store files under ‘/opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/’
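
Put differently, the location of a column-family directory is just the root directory, the table name, the region directory name and the column-family name joined together. A tiny Python sketch of that layout as observed on my installation; the function names are hypothetical, and other HBase versions may lay things out differently.

```python
import posixpath


def column_family_dir(root_dir, table, region_dir, family):
    """Build <root>/<table>/<region directory>/<column family>."""
    return posixpath.join(root_dir, table, region_dir, family)


def store_file_path(root_dir, table, region_dir, family, store_file):
    """Build the full path of one store file inside a column-family directory."""
    return posixpath.join(column_family_dir(root_dir, table, region_dir, family), store_file)


if __name__ == "__main__":
    # Values taken from the directory listings in this post.
    print(column_family_dir("/opt/hbase/data", "users",
                            "6dda0024cbf8619a9c823e6ebbf78888", "info"))
```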

Although we entered data in the table, we don’t see any store files yet, as all the data is still in the MemStore and has not been flushed. So let us flush the MemStore and view the contents of the ‘info’ directory.


hbase(main):025:0> flush 'users'
0 row(s) in 0.0390 seconds

abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/
total 16K
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:52 .
drwxrwxr-x 5 hbase hbase 4.0K Nov  3 14:52 ..
-rwxrwxrwx 1 hbase hbase  660 Nov  3 14:52 32f19d12583a46b98211ee77311f48eb
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:52 .32f19d12583a46b98211ee77311f48eb.crc

Notice how the store file /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/32f19d12583a46b98211ee77311f48eb got created. Let us add some more data to our table and look at the filesystem again.


hbase(main):026:0> put 'users','avi','info:name','avinash'
0 row(s) in 0.0050 seconds

hbase(main):027:0> flush 'users'
0 row(s) in 0.0490 seconds
abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/
total 24K
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:52 .
-rwxrwxrwx 1 hbase hbase  623 Nov  3 14:52 ecc5f02da6234ac397d25bee6df0d019
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:52 .ecc5f02da6234ac397d25bee6df0d019.crc
drwxrwxr-x 5 hbase hbase 4.0K Nov  3 14:52 ..
-rwxrwxrwx 1 hbase hbase  660 Nov  3 14:52 32f19d12583a46b98211ee77311f48eb
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:52 .32f19d12583a46b98211ee77311f48eb.crc

Let us add some more data..

hbase(main):028:0> put 'users','avi','info:age','20'
0 row(s) in 0.0040 seconds

hbase(main):029:0> flush 'users'
0 row(s) in 0.1040 seconds
abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/
total 32K
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:53 .
-rwxrwxrwx 1 hbase hbase  615 Nov  3 14:53 ebda0cc0af9a4d9e803a10cce27c52b6
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:53 .ebda0cc0af9a4d9e803a10cce27c52b6.crc
-rwxrwxrwx 1 hbase hbase  623 Nov  3 14:52 ecc5f02da6234ac397d25bee6df0d019
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:52 .ecc5f02da6234ac397d25bee6df0d019.crc
drwxrwxr-x 5 hbase hbase 4.0K Nov  3 14:52 ..
-rwxrwxrwx 1 hbase hbase  660 Nov  3 14:52 32f19d12583a46b98211ee77311f48eb
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:52 .32f19d12583a46b98211ee77311f48eb.crc
abhi@hbase2:~$

Notice how for each flush, a new store file gets created. Let us view the contents of these store files.

abhi@hbase2:~$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/ebda0cc0af9a4d9e803a10cce27c52b6 -p
12/11/03 14:55:59 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
12/11/03 14:55:59 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/11/03 14:56:00 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
K: avi/info:age/1351979593884/Put/vlen=2 V: 20
Scanned kv count -> 1
abhi@hbase2:~$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/ecc5f02da6234ac397d25bee6df0d019 -p
12/11/03 14:56:19 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
12/11/03 14:56:19 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/11/03 14:56:20 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
K: avi/info:name/1351979559394/Put/vlen=7 V: avinash
Scanned kv count -> 1
abhi@hbase2:~$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/32f19d12583a46b98211ee77311f48eb -p
12/11/03 14:56:31 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
12/11/03 14:56:31 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/11/03 14:56:31 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
K: abhi/info:age/1351979477099/Put/vlen=2 V: 30
K: abhi/info:name/1351979467158/Put/vlen=8 V: abhishek
Scanned kv count -> 2
abhi@hbase2:~$
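
Each "K:" line printed above packs the row key, column family, qualifier, timestamp, operation type, value length and value into one string. The Python sketch below splits such a line apart. It assumes that row keys contain no "/" and that the timestamp is the default milliseconds-since-epoch value; the regular expression and function name are my own and are not part of HBase.

```python
import re
from datetime import datetime, timezone

# Matches lines such as: K: avi/info:age/1351979593884/Put/vlen=2 V: 20
KV_LINE = re.compile(
    r"^K: (?P<row>[^/]+)/(?P<family>[^:]+):(?P<qualifier>[^/]*)/"
    r"(?P<timestamp>\d+)/(?P<type>\w+)/vlen=(?P<vlen>\d+) V: (?P<value>.*)$"
)


def parse_kv_line(line):
    """Return the fields of one printed key/value line, or None for any other line."""
    match = KV_LINE.match(line.strip())
    if match is None:
        return None
    kv = match.groupdict()
    kv["timestamp"] = int(kv["timestamp"])
    kv["vlen"] = int(kv["vlen"])
    # Assumption: the cell timestamp is milliseconds since the Unix epoch.
    kv["written_at"] = datetime.fromtimestamp(kv["timestamp"] / 1000, tz=timezone.utc)
    return kv


if __name__ == "__main__":
    kv = parse_kv_line("K: avi/info:age/1351979593884/Put/vlen=2 V: 20")
    print(kv["row"], kv["family"], kv["qualifier"], kv["value"], kv["written_at"])
```

The decoded time is in UTC, so it will differ from the local times shown by ls by the timezone offset of the machine.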

An alternate way to view the store file contents, using the long options --printkv and --file in place of -p and -f..

abhi@hbase2:~$ hbase org.apache.hadoop.hbase.io.hfile.HFile --printkv --file /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/ebda0cc0af9a4d9e803a10cce27c52b6
12/11/03 14:56:57 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
12/11/03 14:56:57 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/11/03 14:56:58 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
K: avi/info:age/1351979593884/Put/vlen=2 V: 20
Scanned kv count -> 1
abhi@hbase2:~$

Let us invoke Major Compaction to combine these files into a single new file.

hbase(main):030:0> major_compact 'users'
0 row(s) in 0.1000 seconds

hbase(main):031:0>
abhi@hbase2:~$
abhi@hbase2:~$ ls -ltha /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/
total 16K
drwxrwxr-x 2 hbase hbase 4.0K Nov  3 14:57 .
-rwxrwxrwx 1 hbase hbase  731 Nov  3 14:57 6a65463fa2814751b255fdcf1542cd0d
-rw-rw-r-- 1 hbase hbase   16 Nov  3 14:57 .6a65463fa2814751b255fdcf1542cd0d.crc
drwxrwxr-x 5 hbase hbase 4.0K Nov  3 14:52 ..
abhi@hbase2:~$

Let us view the contents of the new file that got created as a result of major compaction.

abhi@hbase2:~$
abhi@hbase2:~$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f /opt/hbase/data/users/6dda0024cbf8619a9c823e6ebbf78888/info/6a65463fa2814751b255fdcf1542cd0d -p
12/11/03 14:58:23 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
12/11/03 14:58:23 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
12/11/03 14:58:23 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
K: abhi/info:age/1351979477099/Put/vlen=2 V: 30
K: abhi/info:name/1351979467158/Put/vlen=8 V: abhishek
K: avi/info:age/1351979593884/Put/vlen=2 V: 20
K: avi/info:name/1351979559394/Put/vlen=7 V: avinash
Scanned kv count -> 4
abhi@hbase2:~$
abhi@hbase2:~$

Understanding HBase files and directories

This post is a continuation of my last post – Getting started with HBase.

HBase physically stores its data under the configured Root Directory on the filesystem. The filesystem is typically HDFS, but since I have installed HBase in stand-alone mode, I am using the local filesystem.

Now let’s examine the contents of our HBase Root Directory.

Note: Flush your tables first so that the data gets written out as files on the filesystem.


abhi@hbase2:~$ ls -ltha /opt/hbase/data/
total 48K
drwxrwxr-x 2 hbase hbase 4.0K Oct 31 14:32 .oldlogs
drwxrwxr-x 3 hbase hbase 4.0K Oct 31 14:31 .logs
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 17:10 table1
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 .
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 15:36 users
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 -ROOT-
drwxrwxr-x 3 hbase hbase 4.0K Oct 30 12:00 .META.
-rwxr-xr-x 1 hbase hbase   38 Oct 30 12:00 hbase.id
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 .hbase.id.crc
-rwxr-xr-x 1 hbase hbase    3 Oct 30 12:00 hbase.version
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 .hbase.version.crc
drwxr-xr-x 3 abhi  users 4.0K Oct 11 08:10 ..
abhi@hbase2:~$
abhi@hbase2:~$
abhi@hbase2:~$ ls -lthaR /opt/hbase/data/
/opt/hbase/data/:
total 48K
drwxrwxr-x 2 hbase hbase 4.0K Oct 31 14:32 .oldlogs
drwxrwxr-x 3 hbase hbase 4.0K Oct 31 14:31 .logs
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 17:10 table1
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 .
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 15:36 users
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 -ROOT-
drwxrwxr-x 3 hbase hbase 4.0K Oct 30 12:00 .META.
-rwxr-xr-x 1 hbase hbase   38 Oct 30 12:00 hbase.id
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 .hbase.id.crc
-rwxr-xr-x 1 hbase hbase    3 Oct 30 12:00 hbase.version
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 .hbase.version.crc
drwxr-xr-x 3 abhi  users 4.0K Oct 11 08:10 ..

/opt/hbase/data/.oldlogs:
total 8.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 31 14:32 .
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 ..

/opt/hbase/data/.logs:
total 12K
drwxrwxr-x 2 hbase hbase 4.0K Oct 31 14:31 hbase2,54165,1351719115872
drwxrwxr-x 3 hbase hbase 4.0K Oct 31 14:31 .
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 ..

/opt/hbase/data/.logs/hbase2,54165,1351719115872:
total 8.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 31 14:31 .
-rwxr-xr-x 1 hbase hbase    0 Oct 31 14:31 .hbase2%2C54165%2C1351719115872.1351719119755.crc
-rwxr-xr-x 1 hbase hbase    0 Oct 31 14:31 hbase2%2C54165%2C1351719115872.1351719119755
drwxrwxr-x 3 hbase hbase 4.0K Oct 31 14:31 ..

/opt/hbase/data/table1:
total 24K
drwxrwxr-x 5 hbase hbase 4.0K Oct 31 14:32 9a35a1636b9d0639e2838c5a8ff180cf
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 17:10 .
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 ..
-rwxr-xr-x 1 hbase hbase  935 Oct 30 17:10 .tableinfo.0000000001
-rw-rw-r-- 1 hbase hbase   16 Oct 30 17:10 ..tableinfo.0000000001.crc
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:10 .tmp

/opt/hbase/data/table1/9a35a1636b9d0639e2838c5a8ff180cf:
total 28K
drwxrwxr-x 5 hbase hbase 4.0K Oct 31 14:32 .
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 cf2
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 cf1
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:10 .oldlogs
-rwxr-xr-x 1 hbase hbase  225 Oct 30 17:10 .regioninfo
-rw-rw-r-- 1 hbase hbase   12 Oct 30 17:10 ..regioninfo.crc
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 17:10 ..

/opt/hbase/data/table1/9a35a1636b9d0639e2838c5a8ff180cf/cf2:
total 16K
drwxrwxr-x 5 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 .
-rwxrwxrwx 1 hbase hbase  790 Oct 30 17:35 cbca7d4b4619453e95e313e54fd12649
-rw-rw-r-- 1 hbase hbase   16 Oct 30 17:35 .cbca7d4b4619453e95e313e54fd12649.crc

/opt/hbase/data/table1/9a35a1636b9d0639e2838c5a8ff180cf/cf1:
total 16K
drwxrwxr-x 5 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 .
-rwxrwxrwx 1 hbase hbase  848 Oct 30 17:35 c11f3c3fe30e437c907e7b4656bbb6a8
-rw-rw-r-- 1 hbase hbase   16 Oct 30 17:35 .c11f3c3fe30e437c907e7b4656bbb6a8.crc

/opt/hbase/data/table1/9a35a1636b9d0639e2838c5a8ff180cf/.oldlogs:
total 16K
drwxrwxr-x 5 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:10 .
-rwxr-xr-x 1 hbase hbase  124 Oct 30 17:10 hlog.1351642227144
-rwxr-xr-x 1 hbase hbase   12 Oct 30 17:10 .hlog.1351642227144.crc

/opt/hbase/data/table1/.tmp:
total 8.0K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 17:10 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:10 .

/opt/hbase/data/users:
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ecff3a77396cba69adea1b1f789ca5a2
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 ..
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 15:36 .
-rwxr-xr-x 1 hbase hbase  515 Oct 30 15:36 .tableinfo.0000000001
-rw-rw-r-- 1 hbase hbase   16 Oct 30 15:36 ..tableinfo.0000000001.crc
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 15:36 .tmp

/opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2:
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 .
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 info
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 15:36 .oldlogs
-rwxr-xr-x 1 hbase hbase  222 Oct 30 15:36 .regioninfo
-rw-rw-r-- 1 hbase hbase   12 Oct 30 15:36 ..regioninfo.crc
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 15:36 ..

/opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2/info:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 .
-rwxrwxrwx 1 hbase hbase 1.3K Oct 30 17:35 4080f890ac4449a2a151d5c4d79f8579
-rw-rw-r-- 1 hbase hbase   20 Oct 30 17:35 .4080f890ac4449a2a151d5c4d79f8579.crc

/opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2/.oldlogs:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 15:36 .
-rwxr-xr-x 1 hbase hbase  124 Oct 30 15:36 hlog.1351636576499
-rwxr-xr-x 1 hbase hbase   12 Oct 30 15:36 .hlog.1351636576499.crc

/opt/hbase/data/users/.tmp:
total 8.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 15:36 .
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 15:36 ..

/opt/hbase/data/-ROOT-:
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 70236052
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 ..
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 .
-rwxr-xr-x 1 hbase hbase  551 Oct 30 12:00 .tableinfo.0000000001
-rw-rw-r-- 1 hbase hbase   16 Oct 30 12:00 ..tableinfo.0000000001.crc
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:00 .tmp

/opt/hbase/data/-ROOT-/70236052:
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 .
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 info
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:00 .oldlogs
-rwxr-xr-x 1 hbase hbase  109 Oct 30 12:00 .regioninfo
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 ..regioninfo.crc

/opt/hbase/data/-ROOT-/70236052/info:
total 32K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 .
-rwxrwxrwx 1 hbase hbase  718 Oct 30 17:35 fb48fa0302be4d37a5b70ffbf039fe9a
-rw-rw-r-- 1 hbase hbase   16 Oct 30 17:35 .fb48fa0302be4d37a5b70ffbf039fe9a.crc
-rwxrwxrwx 1 hbase hbase  718 Oct 30 12:11 a913edee0ac34de490c46ee12175dc02
-rw-rw-r-- 1 hbase hbase   16 Oct 30 12:11 .a913edee0ac34de490c46ee12175dc02.crc
-rwxrwxrwx 1 hbase hbase  714 Oct 30 12:00 c6f09dc3ee6a4150b8e787a747a81707
-rw-rw-r-- 1 hbase hbase   16 Oct 30 12:00 .c6f09dc3ee6a4150b8e787a747a81707.crc

/opt/hbase/data/-ROOT-/70236052/.oldlogs:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:00 .
-rwxr-xr-x 1 hbase hbase  411 Oct 30 12:00 hlog.1351623609149
-rwxr-xr-x 1 hbase hbase   12 Oct 30 12:00 .hlog.1351623609149.crc

/opt/hbase/data/-ROOT-/.tmp:
total 8.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:00 .
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 ..

/opt/hbase/data/.META.:
total 12K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 1028785192
drwxr-xr-x 8 hbase users 4.0K Oct 30 17:10 ..
drwxrwxr-x 3 hbase hbase 4.0K Oct 30 12:00 .

/opt/hbase/data/.META./1028785192:
total 24K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 .
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 info
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:00 .oldlogs
-rwxr-xr-x 1 hbase hbase  111 Oct 30 12:00 .regioninfo
-rw-rw-r-- 1 hbase hbase   12 Oct 30 12:00 ..regioninfo.crc
drwxrwxr-x 3 hbase hbase 4.0K Oct 30 12:00 ..

/opt/hbase/data/.META./1028785192/info:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 17:35 .
-rwxrwxrwx 1 hbase hbase 2.8K Oct 30 17:35 de44bdf76ce6477ba3a7da1df0b159df
-rw-rw-r-- 1 hbase hbase   32 Oct 30 17:35 .de44bdf76ce6477ba3a7da1df0b159df.crc

/opt/hbase/data/.META./1028785192/.oldlogs:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 31 14:32 ..
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:00 .
-rwxr-xr-x 1 hbase hbase  124 Oct 30 12:00 hlog.1351623609390
-rwxr-xr-x 1 hbase hbase   12 Oct 30 12:00 .hlog.1351623609390.crc
abhi@hbase2:~$

In the above:
.logs and .oldlogs – contain the Write-Ahead Log (WAL) files. A RegionServer’s WAL is shared by all the regions hosted on that RegionServer

  • The .logs directory has a subdirectory for each RegionServer, e.g. /opt/hbase/data/.logs/hbase2,54165,1351719115872.
    The subdirectory name has the format [RegionServer Host],[Port],[Server Start Code].

    Each RegionServer subdirectory holds that server’s HLog files. You can view the contents of an HLog file using the org.apache.hadoop.hbase.regionserver.wal.HLog tool.

    abhi@hbase2:~$ hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump /opt/hbase/data/.logs/hbase2,54165,1351719115872/hbase2%2C54165%2C1351719115872.1351719119755
    12/11/05 13:45:37 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
    12/11/05 13:45:37 INFO wal.SequenceFileLogReader: Input stream class: org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker, not adjusting length
    Sequence 618 from region 70236052 in table -ROOT-
      Action:
        row: .META.,,1
        column: info:server
        at time: Mon Nov 05 02:01:20 PST 2012
      Action:
        row: .META.,,1
        column: info:serverstartcode
        at time: Mon Nov 05 02:01:20 PST 2012
    Sequence 619 from region 1028785192 in table .META.
      Action:
        row: tbl1,,1351953259243.5ab545fc59596f7784eb179df4654930.
        column: info:server
        at time: Mon Nov 05 02:01:21 PST 2012
      Action:
        row: tbl1,,1351953259243.5ab545fc59596f7784eb179df4654930.
        column: info:serverstartcode
        at time: Mon Nov 05 02:01:21 PST 2012
    
    ... ... ... ... ...
    
    Sequence 637 from region 5ab545fc59596f7784eb179df4654930 in table tbl1
      Action:
        row: row3
        column: cf2:col2
        at time: Mon Nov 05 03:00:53 PST 2012
    Sequence 638 from region 5ab545fc59596f7784eb179df4654930 in table tbl1
      Action:
        row: row3
        column: cf3:col1
        at time: Mon Nov 05 03:00:54 PST 2012
    abhi@hbase2:~$
    
    
  • The .oldlogs directory contains the old log files, i.e. the ones whose edits have already been persisted to store files
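
The naming scheme described above is easy to pick apart. The Python sketch below splits a RegionServer directory name into host, port and start code, and decodes the URL-encoded HLog file name (%2C is a comma). Treating the start code as a millisecond epoch timestamp is my assumption; it does decode to the time the directory was created. The function names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import unquote


def parse_server_name(name):
    """Split '<host>,<port>,<start code>' into its parts."""
    host, port, start_code = name.split(",")
    return {
        "host": host,
        "port": int(port),
        "start_code": int(start_code),
        # Assumption: the start code is milliseconds since the Unix epoch.
        "started_at": datetime.fromtimestamp(int(start_code) / 1000, tz=timezone.utc),
    }


def parse_wal_file_name(file_name):
    """Decode an HLog file name into (server name parts, numeric suffix)."""
    decoded = unquote(file_name)  # "%2C" becomes ","
    server_name, _, suffix = decoded.rpartition(".")
    return parse_server_name(server_name), int(suffix)


if __name__ == "__main__":
    print(parse_server_name("hbase2,54165,1351719115872"))
    print(parse_wal_file_name("hbase2%2C54165%2C1351719115872.1351719119755"))
```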

-ROOT- and .META. – contain the files of the catalog tables
hbase.id and hbase.version – hold the unique ID of the cluster and the file format version, respectively

users and table1 – hold the store files for the user-defined tables. Each table has its own directory which has the following contents:

  • .tableinfo file that contains the table and column family schemas
    /opt/hbase/data/users/.tableinfo.0000000001
    You can view the contents of the file as follows:

    abhi@hbase2:~$ cat /opt/hbase/data/users/.tableinfo.0000000001
    MIN_VERSIONS0TTL
    2147483647      BLOCKSIZE65536  IN_MEMORYfalse
    BLOCKCACHEtrue
    
    {NAME => 'users', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
    abhi@hbase2:~$
    
  • Region directories – a directory for each region of a table. The directory name is the MD5 hash of the region name
    /opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2

      A Region directory contains

    • .regioninfo – a file that contains the serialized information of a Region
      /opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2/.regioninfo
    • Column-Family directories – a directory for each Column-Family, holding the actual store files of the table
      /opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2/info

      The store files are in HFile format.
      /opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2/info/4080f890ac4449a2a151d5c4d79f8579

      The figure below describes the path to the actual storage file.

      [Figure: Path to the Actual Storage File]

      NOTE: If you have inserted data in your table and yet you don’t see any storage files under the column-family directory, do a flush on your table i.e. flush 'users'

      We can view the contents of a store file using the org.apache.hadoop.hbase.io.hfile.HFile tool.

      abhi@hbase2:~$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f /opt/hbase/data/users/ecff3a77396cba69adea1b1f789ca5a2/info/4080f890ac4449a2a151d5c4d79f8579 -p
      12/10/31 17:02:04 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
      12/10/31 17:02:04 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
      12/10/31 17:02:04 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
      K: abhi/info:age/1351726630581/Put/vlen=2 V: 30
      K: abhi/info:name/1351726623818/Put/vlen=8 V: abhishek
      Scanned kv count -> 2
      abhi@hbase2:~$
      

Tips: Posting source code in WordPress blogs

I haven’t been too happy with the WordPress blog editor as it doesn’t let me post source code snippets in my posts – especially JavaScript and XML. Although it allows Java, bash scripts and other code bits to be embedded within <code> </code> tags, the final display isn’t too visually appealing, as you can see below – probably something to do with the default CSS settings.


abhi@hbase2:~$ ls -lthR /opt/hbase/data/
/opt/hbase/data/:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:47 users
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 -ROOT-
-rwxr-xr-x 1 hbase hbase 38 Oct 30 12:00 hbase.id
-rwxr-xr-x 1 hbase hbase 3 Oct 30 12:00 hbase.version

/opt/hbase/data/users:
total 4.0K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:47 a070247328d9deec48d4e3cfa46b33a4

/opt/hbase/data/users/a070247328d9deec48d4e3cfa46b33a4:
total 4.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:47 info

/opt/hbase/data/users/a070247328d9deec48d4e3cfa46b33a4/info:
total 0

/opt/hbase/data/-ROOT-:
total 4.0K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:12 70236052

/opt/hbase/data/-ROOT-/70236052:
total 4.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:11 info

/opt/hbase/data/-ROOT-/70236052/info:
total 8.0K
-rwxrwxrwx 1 hbase hbase 718 Oct 30 12:11 a913edee0ac34de490c46ee12175dc02
-rwxrwxrwx 1 hbase hbase 714 Oct 30 12:00 c6f09dc3ee6a4150b8e787a747a81707
abhi@hbase2:~$

Luckily I chanced upon the following link http://en.support.wordpress.com/code/posting-source-code/ and it shares some neat tips that one can use.

The [sourcecode] [/sourcecode] tags definitely give a much better look as evident below.

abhi@hbase2:~$ ls -lthR /opt/hbase/data/
/opt/hbase/data/:
total 16K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:47 users
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:00 -ROOT-
-rwxr-xr-x 1 hbase hbase   38 Oct 30 12:00 hbase.id
-rwxr-xr-x 1 hbase hbase    3 Oct 30 12:00 hbase.version

/opt/hbase/data/users:
total 4.0K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:47 a070247328d9deec48d4e3cfa46b33a4

/opt/hbase/data/users/a070247328d9deec48d4e3cfa46b33a4:
total 4.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:47 info

/opt/hbase/data/users/a070247328d9deec48d4e3cfa46b33a4/info:
total 0

/opt/hbase/data/-ROOT-:
total 4.0K
drwxrwxr-x 4 hbase hbase 4.0K Oct 30 12:12 70236052

/opt/hbase/data/-ROOT-/70236052:
total 4.0K
drwxrwxr-x 2 hbase hbase 4.0K Oct 30 12:11 info

/opt/hbase/data/-ROOT-/70236052/info:
total 8.0K
-rwxrwxrwx 1 hbase hbase 718 Oct 30 12:11 a913edee0ac34de490c46ee12175dc02
-rwxrwxrwx 1 hbase hbase 714 Oct 30 12:00 c6f09dc3ee6a4150b8e787a747a81707
abhi@hbase2:~$

I intend to replace all the <code> </code> tags in my past entries with [sourcecode] [/sourcecode] the moment I get some free time. I wonder if there’s a simpler way to do this. Please do drop me a line if you know of one 🙂.
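
One possible shortcut, as a minimal sketch: if the text of a post is available as a plain file, sed can swap the tags in one pass. The function name convert_code_tags and the file names are made up for illustration, and the sketch only handles bare <code> tags (no attributes).

```shell
# Sketch: swap <code>...</code> for [sourcecode]...[/sourcecode].
# convert_code_tags is a hypothetical helper; it reads text on stdin
# and writes the converted text to stdout.
convert_code_tags() {
    sed -e 's|<code>|[sourcecode]|g' -e 's|</code>|[/sourcecode]|g'
}

# Example (file names are placeholders):
#   convert_code_tags < old-post.html > new-post.html
```

It is worth reviewing the result before pasting it back, since inline <code> fragments inside a sentence would be converted too.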

Getting started with HBase

There are quite a few HBase tutorials out there, but I wanted to add another one for two specific reasons:
1. To document the installation steps (HBase stand-alone mode) for my ready reference in future
2. To highlight the installation issues I faced and how I got around them so that anyone else facing the same can benefit.

Installation Steps

Download and install the latest version of Ubuntu from http://www.ubuntu.com/download

In my case I downloaded and installed Ubuntu 12.04.1 64-bit i.e. ubuntu-12.04.1-desktop-amd64.iso

Download and install Java SDK from http://www.oracle.com/technetwork/java/javase/downloads/index.html

Since Cloudera recommends JDK version 1.6.0_31 (https://ccp.cloudera.com/display/CDH4DOC/Java+Development+Kit+Installation)
I downloaded the same and installed it as follows:

root@ubuntu:~# mkdir /opt/java
root@ubuntu:~# cd /opt/java/
root@ubuntu:/opt/java# chmod +x jdk-6u31-linux-x64.bin 
root@ubuntu:/opt/java# ./jdk-6u31-linux-x64.bin 
... ... ... ...
... ... ... ...
root@ubuntu:/opt/java#
root@ubuntu:/opt/java# ls -lth jdk1.6.0_31/
total 19M
-r--r--r--  1 root root 4.7K Oct  6 14:32 register_zh_CN.html
-r--r--r--  1 root root 5.1K Oct  6 14:32 register.html
-r--r--r--  1 root root 6.5K Oct  6 14:32 register_ja.html
drwxr-xr-x  7 root root 4.0K Oct  6 14:32 jre
drwxr-xr-x  3 root root 4.0K Oct  6 14:32 lib
drwxr-xr-x  7 root root 4.0K Jan 20  2012 db
drwxr-xr-x  3 root root 4.0K Jan 20  2012 include
drwxr-xr-x  9 root root 4.0K Jan 20  2012 sample
drwxr-xr-x 10 root root 4.0K Jan 20  2012 demo
drwxr-xr-x  4 root root 4.0K Jan 20  2012 man
drwxr-xr-x  2 root root 4.0K Jan 20  2012 bin
-r--r--r--  1 root root 3.3K Jan 20  2012 COPYRIGHT
-r--r--r--  1 root root   40 Jan 20  2012 LICENSE
-r--r--r--  1 root root  115 Jan 20  2012 README.html
-r--r--r--  1 root root 165K Jan 20  2012 THIRDPARTYLICENSEREADME.txt
-rw-r--r--  1 root root  19M Jan 20  2012 src.zip
root@ubuntu:/opt/java# 

Set the JAVA_HOME environment variable

root@ubuntu:~# echo $JAVA_HOME
root@ubuntu:~#
root@ubuntu:~# vi .bashrc 

Add the following lines, save and exit:

export JAVA_HOME=/opt/java/jdk1.6.0_31
export PATH=$JAVA_HOME/bin:$PATH

Check

root@ubuntu:~# source .bashrc 
root@ubuntu:~# echo $JAVA_HOME
/opt/java/jdk1.6.0_31

root@ubuntu:~#
root@ubuntu:~# java -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
root@ubuntu:~# 

Configure the CDH repositories

root@ubuntu:~# mkdir /opt/hbase
root@ubuntu:~# cd /opt/hbase/

Download http://archive.cloudera.com/cdh4/one-clickinstall/precise/amd64/cdh4-repository_1.0_all.deb into /opt/hbase

root@ubuntu:/opt/hbase# ls -lth
total 4.0K
-rw-r--r-- 1 root root 3.3K Oct  6 14:38 cdh4-repository_1.0_all.deb
root@ubuntu:/opt/hbase# 
root@ubuntu:/opt/hbase# sudo dpkg -i cdh4-repository_1.0_all.deb 
Selecting previously unselected package cdh4-repository.
(Reading database ... 140999 files and directories currently installed.)
Unpacking cdh4-repository (from cdh4-repository_1.0_all.deb) ...
Setting up cdh4-repository (1.0) ...
gpg: keyring `/etc/apt/secring.gpg' created
gpg: keyring `/etc/apt/trusted.gpg.d/cloudera-cdh4.gpg' created
gpg: key 02A818DD: public key "Cloudera Apt Repository" imported
gpg: Total number processed: 1
gpg:               imported: 1
root@ubuntu:/opt/hbase# 

Install HBase

root@ubuntu:/opt/hbase# sudo apt-get update
Ign http://archive.cloudera.com precise-cdh4 InRelease                                                          
Ign http://security.ubuntu.com precise-security InRelease                                                       
Ign http://extras.ubuntu.com precise InRelease                                                                  
Get:1 http://archive.cloudera.com precise-cdh4 Release.gpg [198 B]                                  
Ign http://us.archive.ubuntu.com precise InRelease                                               
Ign http://us.archive.ubuntu.com precise-updates InRelease           
Ign http://us.archive.ubuntu.com precise-backports InRelease         
Hit http://extras.ubuntu.com precise Release.gpg                     
Hit http://security.ubuntu.com precise-security Release.gpg          
Get:2 http://archive.cloudera.com precise-cdh4 Release [1,682 B]     
Hit http://us.archive.ubuntu.com precise Release.gpg                                                   
Hit http://extras.ubuntu.com precise Release                                                
Hit http://security.ubuntu.com precise-security Release                                      
Get:3 http://archive.cloudera.com precise-cdh4/contrib Sources [6,382 B]                     
Hit http://us.archive.ubuntu.com precise-updates Release.gpg                                          
Hit http://us.archive.ubuntu.com precise-backports Release.gpg                              
Hit http://extras.ubuntu.com precise/main Sources                                           
Get:4 http://archive.cloudera.com precise-cdh4/contrib amd64 Packages [16.8 kB]             
Hit http://security.ubuntu.com precise-security/main Sources                                          
Hit http://us.archive.ubuntu.com precise Release                                                      
Ign http://archive.cloudera.com precise-cdh4/contrib TranslationIndex                                           
Hit http://extras.ubuntu.com precise/main amd64 Packages                                    
Hit http://extras.ubuntu.com precise/main i386 Packages                                     
Hit http://us.archive.ubuntu.com precise-updates Release                                                        
Ign http://extras.ubuntu.com precise/main TranslationIndex                                                      
Hit http://us.archive.ubuntu.com precise-backports Release                                                      
Ign http://archive.cloudera.com precise-cdh4/contrib Translation-en_US                                          
Ign http://archive.cloudera.com precise-cdh4/contrib Translation-en                         
Ign http://extras.ubuntu.com precise/main Translation-en_US                                                     
Ign http://extras.ubuntu.com precise/main Translation-en                                                        
Hit http://us.archive.ubuntu.com precise/main Sources                                                           
Hit http://us.archive.ubuntu.com precise/restricted Sources                                                     
Hit http://us.archive.ubuntu.com precise/universe Sources                                                       
Hit http://us.archive.ubuntu.com precise/multiverse Sources                                                     
Hit http://us.archive.ubuntu.com precise/main amd64 Packages                                                    
Hit http://security.ubuntu.com precise-security/restricted Sources                                              
Hit http://security.ubuntu.com precise-security/universe Sources                                                
Hit http://us.archive.ubuntu.com precise/restricted amd64 Packages                                              
Hit http://security.ubuntu.com precise-security/multiverse Sources                                              
Hit http://security.ubuntu.com precise-security/main amd64 Packages                                             
Hit http://security.ubuntu.com precise-security/restricted amd64 Packages                                       
Hit http://security.ubuntu.com precise-security/universe amd64 Packages                                         
Hit http://us.archive.ubuntu.com precise/universe amd64 Packages                                                
Hit http://security.ubuntu.com precise-security/multiverse amd64 Packages                                       
Hit http://security.ubuntu.com precise-security/main i386 Packages                                              
Hit http://security.ubuntu.com precise-security/restricted i386 Packages                                        
Hit http://us.archive.ubuntu.com precise/multiverse amd64 Packages                                              
Hit http://security.ubuntu.com precise-security/universe i386 Packages                                          
Hit http://security.ubuntu.com precise-security/multiverse i386 Packages                                        
Hit http://security.ubuntu.com precise-security/main TranslationIndex                                           
Hit http://security.ubuntu.com precise-security/multiverse TranslationIndex                                     
Hit http://security.ubuntu.com precise-security/restricted TranslationIndex                                     
Hit http://security.ubuntu.com precise-security/universe TranslationIndex                                       
Hit http://us.archive.ubuntu.com precise/main i386 Packages                                                     
Hit http://security.ubuntu.com precise-security/main Translation-en                                             
Hit http://security.ubuntu.com precise-security/multiverse Translation-en                                       
Hit http://security.ubuntu.com precise-security/restricted Translation-en                                       
Hit http://us.archive.ubuntu.com precise/restricted i386 Packages                                               
Hit http://security.ubuntu.com precise-security/universe Translation-en                                         
Hit http://us.archive.ubuntu.com precise/universe i386 Packages                                                 
Hit http://us.archive.ubuntu.com precise/multiverse i386 Packages
Hit http://us.archive.ubuntu.com precise/main TranslationIndex
Hit http://us.archive.ubuntu.com precise/multiverse TranslationIndex
Hit http://us.archive.ubuntu.com precise/restricted TranslationIndex
Hit http://us.archive.ubuntu.com precise/universe TranslationIndex
Hit http://us.archive.ubuntu.com precise-updates/main Sources
Hit http://us.archive.ubuntu.com precise-updates/restricted Sources
Hit http://us.archive.ubuntu.com precise-updates/universe Sources
Hit http://us.archive.ubuntu.com precise-updates/multiverse Sources
Hit http://us.archive.ubuntu.com precise-updates/main amd64 Packages
Hit http://us.archive.ubuntu.com precise-updates/restricted amd64 Packages
Hit http://us.archive.ubuntu.com precise-updates/universe amd64 Packages
Hit http://us.archive.ubuntu.com precise-updates/multiverse amd64 Packages
Hit http://us.archive.ubuntu.com precise-updates/main i386 Packages
Hit http://us.archive.ubuntu.com precise-updates/restricted i386 Packages
Hit http://us.archive.ubuntu.com precise-updates/universe i386 Packages
Hit http://us.archive.ubuntu.com precise-updates/multiverse i386 Packages
Hit http://us.archive.ubuntu.com precise-updates/main TranslationIndex
Hit http://us.archive.ubuntu.com precise-updates/multiverse TranslationIndex
Hit http://us.archive.ubuntu.com precise-updates/restricted TranslationIndex
Hit http://us.archive.ubuntu.com precise-updates/universe TranslationIndex
Hit http://us.archive.ubuntu.com precise-backports/main Sources
Hit http://us.archive.ubuntu.com precise-backports/restricted Sources
Hit http://us.archive.ubuntu.com precise-backports/universe Sources
Hit http://us.archive.ubuntu.com precise-backports/multiverse Sources
Hit http://us.archive.ubuntu.com precise-backports/main amd64 Packages
Hit http://us.archive.ubuntu.com precise-backports/restricted amd64 Packages
Hit http://us.archive.ubuntu.com precise-backports/universe amd64 Packages
Hit http://us.archive.ubuntu.com precise-backports/multiverse amd64 Packages
Hit http://us.archive.ubuntu.com precise-backports/main i386 Packages
Hit http://us.archive.ubuntu.com precise-backports/restricted i386 Packages
Hit http://us.archive.ubuntu.com precise-backports/universe i386 Packages
Hit http://us.archive.ubuntu.com precise-backports/multiverse i386 Packages
Hit http://us.archive.ubuntu.com precise-backports/main TranslationIndex
Hit http://us.archive.ubuntu.com precise-backports/multiverse TranslationIndex
Hit http://us.archive.ubuntu.com precise-backports/restricted TranslationIndex
Hit http://us.archive.ubuntu.com precise-backports/universe TranslationIndex
Hit http://us.archive.ubuntu.com precise/main Translation-en
Hit http://us.archive.ubuntu.com precise/multiverse Translation-en
Hit http://us.archive.ubuntu.com precise/restricted Translation-en
Hit http://us.archive.ubuntu.com precise/universe Translation-en
Hit http://us.archive.ubuntu.com precise-updates/main Translation-en
Hit http://us.archive.ubuntu.com precise-updates/multiverse Translation-en
Hit http://us.archive.ubuntu.com precise-updates/restricted Translation-en
Hit http://us.archive.ubuntu.com precise-updates/universe Translation-en
Hit http://us.archive.ubuntu.com precise-backports/main Translation-en
Hit http://us.archive.ubuntu.com precise-backports/multiverse Translation-en
Hit http://us.archive.ubuntu.com precise-backports/restricted Translation-en
Hit http://us.archive.ubuntu.com precise-backports/universe Translation-en
Fetched 25.1 kB in 30s (827 B/s)
Reading package lists... Done
root@ubuntu:/opt/hbase# 
root@ubuntu:/opt/hbase# sudo apt-get install hbase hbase-master
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  bigtop-jsvc bigtop-utils hadoop hadoop-hdfs libopts25 ntp zookeeper
Suggested packages:
  ntp-doc
The following NEW packages will be installed:
  bigtop-jsvc bigtop-utils hadoop hadoop-hdfs hbase hbase-master libopts25 ntp zookeeper
0 upgraded, 9 newly installed, 0 to remove and 119 not upgraded.
Need to get 70.1 MB of archives.
After this operation, 82.3 MB of additional disk space will be used.
Do you want to continue [Y/n]? Y
Get:1 http://us.archive.ubuntu.com/ubuntu/ precise/main libopts25 amd64 1:5.12-0.1ubuntu1 [59.9 kB]
Get:2 http://us.archive.ubuntu.com/ubuntu/ precise-updates/main ntp amd64 1:4.2.6.p3+dfsg-1ubuntu3.1 [612 kB]
Get:3 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib bigtop-jsvc amd64 0.4+352-1.cdh4.1.0.p0.29~precise-cdh4.1.0 [53.2 kB]
Get:4 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib bigtop-utils all 0.4+352-1.cdh4.1.0.p0.28~precise-cdh4.1.0 [2,004 B]
Get:5 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib zookeeper all 3.4.3+25-1.cdh4.1.0.p0.28~precise-cdh4.1.0 [4,087 kB]
Get:6 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib hadoop all 2.0.0+541-1.cdh4.1.0.p0.27~precise-cdh4.1.0 [16.6 MB]
Get:7 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib hadoop-hdfs all 2.0.0+541-1.cdh4.1.0.p0.27~precise-cdh4.1.0 [12.7 MB]
Get:8 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib hbase all 0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0 [35.9 MB]
Get:9 http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4/contrib hbase-master all 0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0 [19.2 kB]
Fetched 70.1 MB in 8min 41s (134 kB/s)                                                                          
Selecting previously unselected package libopts25.
(Reading database ... 141003 files and directories currently installed.)
Unpacking libopts25 (from .../libopts25_1%3a5.12-0.1ubuntu1_amd64.deb) ...
Selecting previously unselected package ntp.
Unpacking ntp (from .../ntp_1%3a4.2.6.p3+dfsg-1ubuntu3.1_amd64.deb) ...
Selecting previously unselected package bigtop-jsvc.
Unpacking bigtop-jsvc (from .../bigtop-jsvc_0.4+352-1.cdh4.1.0.p0.29~precise-cdh4.1.0_amd64.deb) ...
Selecting previously unselected package bigtop-utils.
Unpacking bigtop-utils (from .../bigtop-utils_0.4+352-1.cdh4.1.0.p0.28~precise-cdh4.1.0_all.deb) ...
Selecting previously unselected package zookeeper.
Unpacking zookeeper (from .../zookeeper_3.4.3+25-1.cdh4.1.0.p0.28~precise-cdh4.1.0_all.deb) ...
Selecting previously unselected package hadoop.
Unpacking hadoop (from .../hadoop_2.0.0+541-1.cdh4.1.0.p0.27~precise-cdh4.1.0_all.deb) ...
Selecting previously unselected package hadoop-hdfs.
Unpacking hadoop-hdfs (from .../hadoop-hdfs_2.0.0+541-1.cdh4.1.0.p0.27~precise-cdh4.1.0_all.deb) ...
Selecting previously unselected package hbase.
Unpacking hbase (from .../hbase_0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0_all.deb) ...
Selecting previously unselected package hbase-master.
Unpacking hbase-master (from .../hbase-master_0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0_all.deb) ...
Processing triggers for ureadahead ...
Processing triggers for man-db ...
Setting up libopts25 (1:5.12-0.1ubuntu1) ...
Setting up ntp (1:4.2.6.p3+dfsg-1ubuntu3.1) ...
 * Starting NTP server ntpd                                                                               [ OK ] 
Setting up bigtop-jsvc (0.4+352-1.cdh4.1.0.p0.29~precise-cdh4.1.0) ...
Setting up bigtop-utils (0.4+352-1.cdh4.1.0.p0.28~precise-cdh4.1.0) ...
Setting up zookeeper (3.4.3+25-1.cdh4.1.0.p0.28~precise-cdh4.1.0) ...
update-alternatives: using /etc/zookeeper/conf.dist to provide /etc/zookeeper/conf (zookeeper-conf) in auto mode.
Setting up hadoop (2.0.0+541-1.cdh4.1.0.p0.27~precise-cdh4.1.0) ...
update-alternatives: using /etc/hadoop/conf.empty to provide /etc/hadoop/conf (hadoop-conf) in auto mode.
Setting up hadoop-hdfs (2.0.0+541-1.cdh4.1.0.p0.27~precise-cdh4.1.0) ...
Setting up hbase (0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0) ...
update-alternatives: using /etc/hbase/conf.dist to provide /etc/hbase/conf (hbase-conf) in auto mode.
Setting up hbase-master (0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0) ...
Starting Hadoop HBase master daemon: +======================================================================+
|      Error: JAVA_HOME is not set and Java could not be found         |
+----------------------------------------------------------------------+
| Please download the latest Sun JDK from the Sun Java web site        |
|       > http://java.sun.com/javase/downloads/ <                      |
|                                                                      |
| HBase requires Java 1.6 or later.                                    |
| NOTE: This script will find Sun Java whether you install using the   |
|       binary or the RPM based installer.                             |
+======================================================================+
invoke-rc.d: initscript hbase-master, action "start" failed.
dpkg: error processing hbase-master (--configure):
 subprocess installed post-installation script returned error exit status 1
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
Errors were encountered while processing:
 hbase-master
E: Sub-process /usr/bin/dpkg returned an error code (1)
root@ubuntu:/opt/hbase# 

I got the above error the first time, so I checked whether JAVA_HOME was set properly.

root@ubuntu:/opt/hbase# echo $JAVA_HOME
/opt/java/jdk1.6.0_31

Since it seemed OK, I decided to set JAVA_HOME directly in the hbase-master init script.

root@ubuntu:/opt/hbase# vi /etc/init.d/hbase-master

# Add this
export JAVA_HOME=/opt/java/jdk1.6.0_31
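
One plausible cause, offered as an assumption rather than something verified on this setup, is that sudo resets most environment variables by default, so the JAVA_HOME exported in .bashrc may not reach the init script started by the package installer. Before hard-coding a path, a small check like the one below can confirm that it really points at a Java installation. The function name java_home_ok is hypothetical.

```shell
# java_home_ok DIR: succeed if DIR looks like a usable Java home,
# i.e. DIR is non-empty and contains an executable bin/java.
java_home_ok() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example:
#   java_home_ok /opt/java/jdk1.6.0_31 && echo "JAVA_HOME looks fine"
# To see what a command run through sudo actually receives:
#   sudo sh -c 'echo "JAVA_HOME under sudo: $JAVA_HOME"'
```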

Let’s try installing again.

root@ubuntu:/opt/hbase# sudo apt-get install hbase hbase-master
Reading package lists... Done
Building dependency tree       
Reading state information... Done
hbase is already the newest version.
hbase-master is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 119 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue [Y/n]? Y
Setting up hbase-master (0.92.1+154-1.cdh4.1.0.p0.23~precise-cdh4.1.0) ...
Starting Hadoop HBase master daemon: starting master, logging to /var/log/hbase/hbase-hbase-master-ubuntu.out
hbase-master.
root@ubuntu:/opt/hbase# 

This time everything went well.
View the list of HBase configuration files.

root@ubuntu:/opt/hbase# ls -lth /etc/hbase/conf/
total 28K
-rw-r--r-- 1 root root 1.1K Oct 30 12:06 hbase-site.xml
-rw-r--r-- 1 root root 2.3K Sep 29 11:54 hadoop-metrics.properties
-rw-r--r-- 1 root root 4.2K Sep 29 11:54 hbase-env.sh
-rw-r--r-- 1 root root 2.2K Sep 29 11:54 hbase-policy.xml
-rw-r--r-- 1 root root 2.5K Sep 29 11:54 log4j.properties
-rw-r--r-- 1 root root   10 Sep 29 11:54 regionservers
root@ubuntu:/opt/hbase# cat /etc/hbase/conf/regionservers
localhost
root@ubuntu:/opt/hbase#

Let’s invoke the HBase shell and test.

root@ubuntu:/opt/hbase# hbase shell
12/10/06 15:07:20 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.92.1-cdh4.1.0, rUnknown, Sat Sep 29 11:55:59 PDT 2012

hbase(main):001:0> status
1 servers, 0 dead, 3.0000 average load

hbase(main):001:0> list
TABLE                                                                                                            
0 row(s) in 0.6370 seconds

hbase(main):002:0> 

hbase(main):002:0> create 'table1','cf1'


^Croot@ubuntu:/opt/hbase# 
root@ubuntu:/opt/hbase# 

Here I encountered the second issue. For some reason the HBase shell would hang.
After googling around for quite some time I found a fix.

Update the /etc/hosts file: ensure that there is no 127.0.1.1 entry pointing to localhost or ubuntu, and comment out the IPv6 lines.

root@ubuntu:/opt/hbase# vi /etc/hosts

192.168.38.137  hbase2

127.0.0.1       localhost ubuntu

# The following lines are desirable for IPv6 capable hosts
#::1     ip6-localhost ip6-loopback
#fe00::0 ip6-localnet
#ff00::0 ip6-mcastprefix
#ff02::1 ip6-allnodes
#ff02::2 ip6-allrouters
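
To double-check the edit, a quick grep can list any 127.0.1.1 mapping that is still active. This is only a sketch; the function name list_127_0_1_1 is made up for illustration.

```shell
# list_127_0_1_1 FILE: print the uncommented lines of FILE that map 127.0.1.1.
# The exit status is non-zero when no such line exists, which is the
# desired state here.
list_127_0_1_1() {
    grep -E '^[[:space:]]*127\.0\.1\.1([[:space:]]|$)' "$1"
}

# Example:
#   list_127_0_1_1 /etc/hosts || echo "no 127.0.1.1 entry left"
```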

Also disable IPv6 as follows:

root@ubuntu:~# vi /etc/sysctl.conf 

and add the following lines to the end of it:

# Abhi: Disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
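
As a sanity check, the following sketch verifies that a sysctl configuration file sets all three keys to 1. The function name ipv6_disabled_in is hypothetical; the key names are the ones listed above.

```shell
# ipv6_disabled_in FILE: succeed only if FILE sets the all, default and lo
# disable_ipv6 keys to 1 (commented-out lines do not count).
ipv6_disabled_in() {
    for key in all default lo; do
        grep -Eq "^[[:space:]]*net\.ipv6\.conf\.${key}\.disable_ipv6[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$1" || return 1
    done
}

# Example:
#   ipv6_disabled_in /etc/sysctl.conf && echo "IPv6 is disabled in the config"
# Once the settings are loaded, the running kernel reports 1 here:
#   cat /proc/sys/net/ipv6/conf/all/disable_ipv6
```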

Now restart the system so that the changes take effect, and then restart hbase-master:

root@ubuntu:/opt/hbase# /etc/init.d/hbase-master restart
Restarting Hadoop HBase master daemon: stopping master...
Starting Hadoop HBase master daemon: starting master, logging to /var/log/hbase/hbase-hbase-master-ubuntu.out
hbase-master.
root@ubuntu:/opt/hbase# 

Check if it’s running

root@ubuntu:/opt/hbase# jps
16353 Jps
15654 HMaster

Open the HBase shell and let’s play around with a few commands

root@ubuntu:/opt/hbase# 
root@ubuntu:/opt/hbase# hbase shell
12/10/06 15:35:18 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.92.1-cdh4.1.0, rUnknown, Sat Sep 29 11:55:59 PDT 2012

hbase(main):007:0* list
TABLE                                                                                                            
0 row(s) in 0.0040 seconds

hbase(main):008:0> create 'table1','cf1'
0 row(s) in 1.0880 seconds

hbase(main):009:0> list
TABLE                                                                                                            
table1                                                                                                           
1 row(s) in 0.0160 seconds

hbase(main):010:0> scan 'table1'
ROW                           COLUMN+CELL                                                                        
0 row(s) in 0.0200 seconds

hbase(main):012:0> put 'table1','row1','cf1:greeting','Hello'
0 row(s) in 0.0590 seconds

hbase(main):013:0> put 'table1','row1','cf1:name','World'
0 row(s) in 0.0110 seconds

hbase(main):014:0> scan 'table1'
ROW                           COLUMN+CELL                                                                        
 row1                         column=cf1:greeting, timestamp=1349563833359, value=Hello                          
 row1                         column=cf1:name, timestamp=1349563858582, value=World                              
1 row(s) in 0.0350 seconds

hbase(main):016:0> get 'table1','row1'
COLUMN                        CELL                                                                               
 cf1:greeting                 timestamp=1349563833359, value=Hello                                               
 cf1:name                     timestamp=1349563858582, value=World                                               
2 row(s) in 0.0300 seconds

hbase(main):017:0> put 'table1','row2','cf1:greeting','Hi'
0 row(s) in 0.0140 seconds

hbase(main):018:0> put 'table1','row2','cf1:name','Abhi'
0 row(s) in 0.0080 seconds

hbase(main):019:0> scan 'table1'
ROW                           COLUMN+CELL                                                                        
 row1                         column=cf1:greeting, timestamp=1349563833359, value=Hello                          
 row1                         column=cf1:name, timestamp=1349563858582, value=World                              
 row2                         column=cf1:greeting, timestamp=1349563961204, value=Hi                             
 row2                         column=cf1:name, timestamp=1349563973437, value=Abhi                               
2 row(s) in 0.0730 seconds

hbase(main):020:0> get 'table1','row2'
COLUMN                        CELL                                                                               
 cf1:greeting                 timestamp=1349563961204, value=Hi                                                  
 cf1:name                     timestamp=1349563973437, value=Abhi                                                
2 row(s) in 0.0080 seconds

hbase(main):021:0> 

You can open a browser and go to http://localhost:60010/ to access the HBase monitoring WebUI

HBase WebUI

Updated on October 31, 2012
Note: I have changed the hostname of my system from ‘ubuntu’ to ‘hbase2’

Set the HBase Root Directory

Although we are able to play around with HBase (create tables, put and get data, etc.), the data is transient and will be deleted once we restart the system. In stand-alone mode everything runs within a single Java process, and the data files are stored under /tmp by default. Most operating systems clear /tmp on reboot, thereby removing all the data. To make the data persistent, we need to edit the hbase-site.xml file and set the root directory.

abhi@hbase2:~$ sudo mkdir /opt/hbase/data/
abhi@hbase2:~$ sudo chown -cRvf hbase:users /opt/hbase/data/
abhi@hbase2:~$ ls -lth /opt/hbase/
total 8.0K
drwxr-xr-x 7 hbase users 4.0K Oct 30 12:47 data
abhi@hbase2:~$
abhi@hbase2:~$ sudo vi /etc/hbase/conf/hbase-site.xml

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>file:///opt/hbase/data</value>
    </property>
</configuration>
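
To confirm which root directory the file ends up declaring, a line-based lookup is usually enough. This is a rough sketch, not a real XML parser: it assumes <name> and <value> sit on separate lines as in the snippet above, and the function name hbase_rootdir is made up for illustration.

```shell
# hbase_rootdir FILE: print the <value> that follows
# <name>hbase.rootdir</name> in FILE.
hbase_rootdir() {
    awk '
        /<name>hbase\.rootdir<\/name>/ { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")     # drop everything up to the opening tag
            sub(/<\/value>.*/, "")   # drop the closing tag and anything after
            print
            exit
        }
    ' "$1"
}

# Example:
#   hbase_rootdir /etc/hbase/conf/hbase-site.xml
```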

Restart hbase-master.

abhi@hbase2:~$ sudo /etc/init.d/hbase-master restart

Now when you create tables and enter data, the data will persist even after a system restart.
Let’s create a table and enter data.

hbase(main):005:0> create 'users','info'
0 row(s) in 1.1180 seconds

hbase(main):006:0> list
TABLE
users
1 row(s) in 0.0090 seconds

hbase(main):007:0> put 'users','abhi','info:name','abhishek'
0 row(s) in 0.0660 seconds

hbase(main):008:0> put 'users','abhi','info:age','30'
0 row(s) in 0.0110 seconds

hbase(main):009:0> scan 'users'
ROW                                         COLUMN+CELL
 abhi                                       column=info:age, timestamp=1351626512340, value=30
 abhi                                       column=info:name, timestamp=1351626501011, value=abhishek
1 row(s) in 0.0350 seconds

hbase(main):010:0> flush 'users'
0 row(s) in 0.0750 seconds

hbase(main):011:0>

Note: Always flush your tables so that the data gets written out as files on your filesystem.
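
If there are several tables, the flush commands can be generated and piped into the HBase shell instead of being typed one by one. This is a sketch under the assumption that the shell reads commands from standard input; the helper name flush_commands is hypothetical.

```shell
# flush_commands TABLE...: print one HBase shell flush command per table.
flush_commands() {
    for table in "$@"; do
        printf "flush '%s'\n" "$table"
    done
}

# Example (assumes the hbase command is on the PATH):
#   flush_commands users table1 | hbase shell
```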

In my next post we’ll see how HBase persists the data physically on the filesystem as files and directories.

Enable Nautilus Toolbar, Statusbar and Sidebar with Tree view

I installed CentOS 5.6 and was a bit uncomfortable with the way the file browser Nautilus was behaving: I couldn’t find the Toolbar, Statusbar or the Sidebar/Sidepane with the folder tree that I have gotten used to over the years. Moreover, every time I double-clicked on a folder it would open in a new window, and my screen was simply getting cluttered. I tried to enable the Sidepane by pressing ‘F9’ but had no luck. After quite a bit of looking around I managed to fix it.

Basically open gconf-editor i.e. press Alt+F2, enter gconf-editor and click on Run.

Run Application

Go to / -> apps -> nautilus -> preferences, select always_use_browser and enable the checkbox as shown below.

Configuration Editor

Now when you open Nautilus you can view the Toolbar, Status bar and the Sidebar with the tree.

Mozilla Thunderbird: Subscribe nested folders and subfolders

Well, I’ve been meaning to organize my mailbox (basically create folders/subfolders and sort my emails into them) for quite some time now. Today I finally managed to get some time and decided to go ahead with this seemingly simple activity. I mean, how difficult is it to create folders/subfolders and move emails, right?

Now I have MS Exchange Server at work which has been IMAP-enabled so that I can access my mailbox from my phone and from Thunderbird (on my Linux box). I organized all my emails in folders and subfolders (nested up to 2-3 levels) using MS Outlook at work, and later when I tried to access my mailbox using Thunderbird, I realized that it was not displaying all my folders and subfolders.

So I went to my Mail Account Settings to explicitly subscribe to the folders/subfolders. That’s when I found out that it was fetching and displaying only the top-level folders and their immediate subfolders (nested 1 level). It was not displaying the subfolders nested within other subfolders. So in the example below, only folders X and A were being displayed while B and C were not.

X
 |___ A
      |___ B
           |___ C

So in the time-tested manner I googled for some answer and came across quite a few forums where people have faced the same problem.

http://getsatisfaction.com/mozilla_messaging/topics/imap_sub_sub_folders_do_not_appear_in_subscribe_window

http://www.emaildiscussions.com/showthread.php?t=55577

http://www.emaildiscussions.com/showthread.php?t=55134

http://www.emaildiscussions.com/showpost.php?p=462865&postcount=27

I didn’t find much help till I came across – http://kb.mozillazine.org/IMAP:_advanced_account_configuration

So this is how I fixed the problem:

  • Remove your mail account (e.g. mail.xxx.com) and recreate it just to be on the safe side. You can skip this too.. I just didn’t
  • Right click on your mail account folder and go to the Mail Server Settings i.e.
    Account Settings –> mail.xxx.com –> Server Settings
    Mozilla Thunderbird Account Settings

  • Now press the “Advanced” button to view the “Advanced Account Settings”. Uncheck the option “Show only subscribed folders” and press “OK”.

    Mozilla Thunderbird Advanced Account Settings

    Now the moment you connect to your mail server, this will fix two things:

    1. You won’t have to explicitly subscribe each folder/subfolder
    2. All folders and nested subfolders get fetched and displayed automatically
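To see what the server itself exposes, independent of what Thunderbird shows, here is a rough sketch using Python's standard imaplib module. It compares the IMAP LIST response (every folder) with LSUB (subscribed folders only). The helper names are made up for this illustration, and it assumes the server accepts IMAP over SSL and returns simple quoted folder names, so treat it as a starting point rather than a robust parser.

```python
import imaplib
import re

# One LIST/LSUB response line looks like: (\HasChildren) "/" "X/A/B"
LIST_LINE = re.compile(r'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.*)')


def parse_list_line(line):
    """Return (delimiter, folder_name) parsed from one raw LIST response line."""
    if isinstance(line, bytes):
        line = line.decode("utf-8", errors="replace")
    match = LIST_LINE.match(line)
    if match is None:
        raise ValueError("unrecognised LIST line: %r" % (line,))
    delim = match.group("delim")
    delim = None if delim == "NIL" else delim.strip('"')
    name = match.group("name").strip().strip('"')
    return delim, name


def folder_depth(delim, name):
    """Nesting level: 0 for a top-level folder, 1 for an immediate subfolder, etc."""
    return name.count(delim) if delim else 0


def folder_names(response_lines):
    """Set of folder names from the data part of an imaplib list()/lsub() result."""
    return {parse_list_line(item)[1]
            for item in response_lines
            if isinstance(item, (bytes, str))}


def unsubscribed_folders(host, user, password):
    """Folders that exist on the server (LIST) but are not subscribed (LSUB)."""
    conn = imaplib.IMAP4_SSL(host)  # assumption: IMAP over SSL on the default port
    try:
        conn.login(user, password)
        _, all_lines = conn.list()
        _, sub_lines = conn.lsub()
        return sorted(folder_names(all_lines) - folder_names(sub_lines))
    finally:
        conn.logout()
```

With the example tree above, folder_depth("/", "X/A/B/C") gives 3, so anything with a depth of 2 or more is the kind of folder that was missing from my subscribe window.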

Watching movies on TV using HDMI support in Ubuntu 10.04

The thing I love the most in Ubuntu 10.04 is the support for HDMI audio. I like to connect my laptop (Dell Studio 1555) to my TV with an HDMI cable when I want to watch a movie. Earlier, when I had openSUSE 11.1, the display worked fine but the audio would still come out of my laptop speakers instead of the TV’s. This was quite a pain, as I couldn’t sit back with my TV remote control and had to walk up to my laptop to increase/decrease the volume every now and then.
However, this was the case only with the Linux distros, as HDMI audio worked fine with Windows Vista. The good news is that with Ubuntu 10.04, HDMI audio works perfectly. All you need to do is select “HDMI Stereo” as the Output option in the Sound Preferences dialog as shown below.


HDMI Stereo Output in Sound Preferences Dialog

I’ve set up a Bluetooth mouse, and now I can sit back and use the mouse as a remote control.

Bluetooth Mouse Setup on Ubuntu 10.04

I’ve also installed a virtual keyboard “onBoard” to key in whatever text I need to run a command from the terminal.

GNOME Virtual Keyboard - onBoard

I’m simply loving it 🙂

PostgreSQL 8.4 on Ubuntu 10.04

I’ve been struggling to install PostgreSQL Server 8.4 on Ubuntu 10.04.

I created a new user “postgres” using the Users Settings Tool (System > Administration > Users and Groups) and tried to install the PostgreSQL database server and the pgAdmin III tool using the “Software Update Center”. Although the server seemed to have been installed, I was unable to start it.
I tried the instructions at https://help.ubuntu.com/community/PostgreSQL to no avail.

I finally found the cause to be a bug: https://bugs.launchpad.net/ubuntu/+source/postgresql-8.4/+bug/558319

Anyway, since I had to create a small Proof-Of-Concept for a project I’m working on, I really wanted to install PostgreSQL as the database.

So I went to the PostgreSQL site and downloaded the one-click installer from http://www.postgresql.org/download/linux. I downloaded postgresql-8.4.3-1-linux-x64.bin and followed the instructions at http://www.enterprisedb.com/learning/pginst_guide.do

This was much simpler, and I was able to install the server and all the necessary tools in less than 10 minutes. I now have a nice PostgreSQL 8.4 menu under the Applications menu with a number of options, viz. start server, stop server, etc. These menu items, along with the pgAdmin III tool, definitely make working with PostgreSQL a lot easier.


PostgreSQL 8.4 on Ubuntu 10.04
