Basic Hadoop/ZooKeeper/HBase configurations
Setting up multiple High Availability (HA) masters
Importing data from MySQL via a single client
Importing data from TSV files using the bulk load tool
Writing your own MapReduce job to import data
Precreating regions before moving data into HBase
Using HBase Shell to manage tables
Using HBase Shell to access data in HBase
Using HBase Shell to manage the cluster
Executing Java methods from HBase Shell
WAL tool—manually splitting and dumping WALs
HFile tool—viewing textualized HFile content
HBase hbck—checking the consistency of an HBase cluster
Hive on HBase—querying HBase using a SQL-like language
Backing Up and Restoring HBase Data
Full shutdown backup using distcp
Using CopyTable to copy data from one table to another
Exporting an HBase table to dump files on HDFS
Restoring HBase data by importing dump files from HDFS
Backing up region starting keys
Showing the disk utilization of HBase tables
Setting up Ganglia to monitor an HBase cluster
OpenTSDB—using HBase to monitor an HBase cluster
Setting up Nagios to monitor HBase processes
Using Nagios to check Hadoop/HBase logs
Simple scripts to report the status of the cluster
Enabling HBase RPC DEBUG-level logging
Simple script for managing HBase processes
Simple script for making deployment easier
Kerberos authentication for Hadoop and HBase
Configuring HDFS security with Kerberos
Handling the XceiverCount error
Handling the "too many open files" error
Handling the "unable to create new native thread" error
Handling the "HBase ignores HDFS client configuration" issue
Handling the ZooKeeper client connection error
Handling the ZooKeeper session expired error
Handling the HBase startup error on EC2
Setting up Hadoop to spread disk I/O
Using a network topology script to make Hadoop rack-aware
Mounting disks with noatime and nodiratime
Setting vm.swappiness to 0 to avoid swap
Java GC and HBase heap settings
Advanced Configurations and Tuning
Benchmarking an HBase cluster with YCSB
Increasing region server handler count
Precreating regions using your own algorithm
Avoiding update blocking on write-heavy clusters
Tuning memory size for MemStores
Client-side tuning for low latency systems
Configuring block cache for column families