05_尚硅谷大数据之HBase API操作

格式：pdf
大小：475.46 KB
文档页数：13

下载文档原格式

大数据-05-Spark之读写HBase数据

⼤数据-05-Spark之读写HBase数据本⽂主要来⾃于谢谢原作者准备⼯作⼀：创建⼀个HBase表这⾥依然是以student表为例进⾏演⽰。

这⾥假设你已经成功安装了HBase数据库，如果你还没有安装，可以参考,进⾏安装，安装好以后，不要创建数据库和表，只要跟着本节后⾯的内容操作即可。

因为hbase依赖于hadoop，因此启动和停⽌都是需要按照顺序进⾏如果安装了独⽴的zookeeper启动顺序: hadoop-> zookeeper-> hbase停⽌顺序：hbase-> zookeeper-> hadoop使⽤⾃带的zookeeper启动顺序: hadoop-> hbase停⽌顺序：hbase-> hadoop如下所⽰：cd /usr/local/hadoop./sbin/start-all.shcd /usr/local/hbase./bin/start-hbase.sh //启动HBase./bin/hbase shell //启动hbase shell这样就可以进⼊hbase shell命令提⽰符状态。

下⾯我们在HBase数据库中创建student表（注意：在关系型数据库MySQL中，需要⾸先创建数据库，然后再创建表，但是，在HBase数据库中，不需要创建数据库，只要直接创建表就可以）：hbase> list # 查看所有表hbase> disable 'student' # 禁⽤表hbase> drop 'student' # 删除表下⾯让我们⼀起来创建⼀个student表，我们可以在hbase shell中使⽤下⾯命令创建：hbase> create 'student','info'hbase> describe 'student'//⾸先录⼊student表的第⼀个学⽣记录hbase> put 'student','1','info:name','Xueqian'hbase> put 'student','1','info:gender','F'hbase> put 'student','1','info:age','23'//然后录⼊student表的第⼆个学⽣记录hbase> put 'student','2','info:name','Weiliang'hbase> put 'student','2','info:gender','M'hbase> put 'student','2','info:age','24'数据录⼊结束后，可以⽤下⾯命令查看刚才已经录⼊的数据：//如果每次只查看⼀⾏，就⽤下⾯命令hbase> get 'student','1'//如果每次查看全部数据，就⽤下⾯命令hbase> scan 'student'准备⼯作⼆：配置Spark在开始编程操作HBase数据库之前，需要对做⼀些准备⼯作。

HBASEJAVAAPI介绍及使用示例

HBASEJAVAAPI介绍及使用示例HBase是一个基于列存储的分布式数据库，它能够处理大规模数据，并提供高可靠性、高性能和高可扩展性。

HBase提供了Java API用来与HBase进行交互，通过这个API可以实现对HBase的管理和操作。

下面将介绍HBase Java API的基本用法，并给出一个简单的示例来演示如何使用HBase Java API进行数据操作。

### HBase Java API介绍HBase Java API是HBase的一个客户端库，我们可以使用这个API来连接HBase集群，创建、修改和查询数据。

在使用HBase Java API时，我们需要首先创建一个HBase的配置对象，然后创建一个HBase的连接对象。

HBase Java API提供了一系列的接口和类，用来实现对HBase的数据操作，比如表的创建、删除，数据的插入、查询和删除等。

### HBase Java API使用示例以下是一个简单的示例，演示了如何使用HBase Java API进行数据操作：1. 首先，创建一个HBase配置对象和一个HBase连接对象：```javaConfiguration conf = HBaseConfiguration.create(;conf.set("hbase.zookeeper.quorum", "localhost");conf.set("hbase.zookeeper.property.clientPort", "2181");Connection connection =ConnectionFactory.createConnection(conf);```2. 接下来，我们可以使用连接对象来创建一个HBase表：```javaAdmin admin = connection.getAdmin(;HTableDescriptor tableDescriptor = newHTableDescriptor(TableName.valueOf("my_table"));HColumnDescriptor cf = new HColumnDescriptor("cf");tableDescriptor.addFamily(cf);admin.createTable(tableDescriptor);```3.然后，我们可以向表中插入数据：```javaTable table =connection.getTable(TableName.valueOf("my_table"));Put put = new Put(Bytes.toBytes("row1"));put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes("value1"));table.put(put);```4.最后，我们可以查询数据并输出结果：```javaGet get = new Get(Bytes.toBytes("row1"));Result result = table.get(get);byte[] value = result.getValue(Bytes.toBytes("cf"),Bytes.toBytes("col1"));System.out.println(Bytes.toString(value));```以上就是一个简单的示例，展示了如何使用HBase Java API进行数据操作。

HBase之API简单操作

HBase之API简单操作HBase版本：Hadoop-2.2.0与Hbase-0.96.0public abstract class AbstrUtils {protected static Logger logger = Logger.getLogger(AbstrUtils.class);protected static Configuration configuration = null;/** 初始化配置 **/static {System.setProperty("hadoop.home.dir", "D:/develop/data/hadoop/hadoop-2.2.0");System.setProperty("HADOOP_MAPRED_HOME", "D:/develop/data/hadoop/hadoop-2.2.0");System.setProperty("SQOOP_CONF_DIR", "D:/develop/data/sqoop/sqoop-1.4.4-hadoop-2.0.4");configuration = new Configuration();/** 与hbase/conf/hbase-site.xml中hbase.master配置的值相同 */configuration.set("hbase.master", "192.168.10.10:60000");/** 与hbase/conf/hbase-site.xml中hbase.zookeeper.quorum配置的值相同 */configuration.set("hbase.zookeeper.quorum", "192.168.10.10");/** 与hbase/conf/hbase-site.xml中hbase.zookeeper.property.clientPort配置的值相同*/configuration.set("hbase.zookeeper.property.clientPort", "2181");//configuration = HBaseConfiguration.create(configuration);}}public class HBaseUtils extends AbstrUtils {private static HBaseAdmin admin = null;private static HTablePool tablePool = null;static {try {admin = new HBaseAdmin(configuration);tablePool = new HTablePool(configuration, 10);tablePool.close();} catch (IOException e) {(e.getMessage(), e);}}/** 创建一张表*/public static void creatTable(String tableName, String[] familys) {try {if (admin.tableExists(tableName)) {("table "+ tableName + " already exists!");} else {HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf(tableName));for (int i = 0; i < familys.length; i++) {tableDesc.addFamily(new HColumnDescriptor(familys[i]));}admin.createTable(tableDesc);("create table " + tableName + " success.");}} catch (Exception e) {(e.getMessage(), e);}}/** 删除表*/public static void deleteTable(String tableName) {try {admin.disableTable(tableName);admin.deleteTable(tableName);("delete table " + tableName + " success.");} catch (Exception e) {(e.getMessage(), e);}}/** 插入一行记录*/@SuppressWarnings("resource")public static void putRecord(String tableName, String rowKey, String family, String qualifier, String value) {try {HTable table = new HTable(configuration, tableName);Put put = new Put(Bytes.toBytes(rowKey));put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));table.put(put);("insert recored " + rowKey + " to table " + tableName + " success.");} catch (IOException e) {(e.getMessage(), e);}/** 批量插入记录*/public static void putRecords(String tableName, List<Put> puts) { HTable table = null;try {table = new HTable(configuration, tableName);table.put(puts);} catch (IOException e) {(e.getMessage(), e);try {table.flushCommits();} catch (Exception e1) {(e1.getMessage(), e1);}}}/** 删除一行记录*/@SuppressWarnings("resource")public static void deleteRecord(String tableName, String... rowKeys) { try {HTable table = new HTable(configuration, tableName);List<Delete> list = new ArrayList<Delete>();Delete delete = null;for (String rowKey : rowKeys) {delete = new Delete(rowKey.getBytes());list.add(delete);}if (list.size() > 0) {table.delete(list);}("delete recoreds " + rowKeys + " success.");} catch (IOException e) {(e.getMessage(), e);}}/** 查找一行记录*/@SuppressWarnings({ "resource"})public static Result getRecord(String tableName, String rowKey) { try {HTable table = new HTable(configuration, tableName);Get get = new Get(rowKey.getBytes());get.setMaxVersions();return table.get(get);} catch (IOException e) {(e.getMessage(), e);}return null;}/** 查找所有记录*/@SuppressWarnings({"resource" })public static ResultScanner getRecords(String tableName) {try {HTable table = new HTable(configuration, tableName);return table.getScanner(new Scan());} catch (IOException e) {(e.getMessage(), e);}return null;}public static void printRecord(Result result) {for (Cell cell : result.rawCells()) {("cell row: " + new String(cell.getRowArray()));("cell family: " + new String(cell.getFamilyArray()));("cell qualifier: " + new String(cell.getQualifierArray()));("cell value: " + new String(cell.getValueArray()));("cell timestamp: " + cell.getTimestamp());}/** 之前版本*//**for (KeyValue kv : rs.raw()) {System.out.print(new String(kv.getRow()) + " ");System.out.print(new String(kv.getFamily()) + ":");System.out.print(new String(kv.getQualifier()) + " ");System.out.print(kv.getTimestamp() + " ");System.out.println(new String(kv.getValue()));}*/}public static void printRecords(ResultScanner resultScanner) {for (Result result : resultScanner) {printRecord(result);}}}版本：Hadoop-2.7.2与Hbase-1.2.3public class HBaseUtils extends AbstrUtils {private static HBaseAdmin admin = null;private static Connection connection = null;static {try {ExecutorService pool = Executors.newCachedThreadPool();connection = ConnectionFactory.createConnection(configuration, pool);admin = (HBaseAdmin) connection.getAdmin();} catch (IOException e) {LOG.error(e.getMessage(), e);}}/** 创建一张表*/public static void creatTable(String tableName, String[] familys) {try {if (admin.tableExists(tableName)) {("table "+ tableName + " already exists!");} else {HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf(tableName));for (int i = 0; i < familys.length; i++) {tableDesc.addFamily(new HColumnDescriptor(familys[i]));}admin.createTable(tableDesc);("create table " + tableName + " success.");}} catch (Exception e) {LOG.error(e.getMessage(), e);}}/** 表注册Coprocessor*/public static void addTableCoprocessor(String tableName, String coprocessorClassName) { try {admin.disableTable(tableName);HTableDescriptor htd = admin.getTableDescriptor(Bytes.toBytes(tableName));htd.addCoprocessor(coprocessorClassName);admin.modifyTable(Bytes.toBytes(tableName), htd);admin.enableTable(tableName);} catch (IOException e) {LOG.error(e.getMessage(), e);}}/** 统计表行数*/public static long rowCount(String tableName) {long rowCount = 0;try {HTable table = (HTable) connection.getTable(TableName.valueOf(tableName));Scan scan = new Scan();// scan.setFilter(new KeyOnlyFilter());scan.setFilter(new FirstKeyOnlyFilter());ResultScanner resultScanner = table.getScanner(scan);for (Result result : resultScanner) {rowCount += result.size();}} catch (IOException e) {LOG.error(e.getMessage(), e);}return rowCount;}/** 插入一行记录*/public static void insertRecord(String tableName, String rowKey, String family, String qualifier, String value) {try {HTable table = (HTable) connection.getTable(TableName.valueOf(tableName));Put put = new Put(Bytes.toBytes(rowKey));put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));table.put(put);("insert recored " + rowKey + " to table " + tableName + " success.");} catch (IOException e) {(e.getMessage(), e);}}/** 批量插入记录*/public static void insertRecords(String tableName, List<Put> puts) { HTable table = null;try {table = (HTable) connection.getTable(TableName.valueOf(tableName));table.put(puts);} catch (IOException e) {(e.getMessage(), e);try {table.flushCommits();} catch (Exception e1) {(e1.getMessage(), e1);}}}/** 删除一行记录*/public static void deleteRecord(String tableName, String... rowKeys) { try {HTable table = (HTable) connection.getTable(TableName.valueOf(tableName));List<Delete> list = new ArrayList<Delete>();Delete delete = null;for (String rowKey : rowKeys) {delete = new Delete(rowKey.getBytes());list.add(delete);}if (list.size() > 0) {table.delete(list);}("delete recoreds " + rowKeys + " success.");} catch (IOException e) {LOG.error(e.getMessage(), e);}}/** 删除一个列族*/public static void deleteFamily(String tableName, String columnName) { try {admin.deleteColumn(tableName, columnName);} catch (Exception e) {LOG.error(e.getMessage(), e);}}/** 删除表*/public static void deleteTable(String tableName) {try {admin.disableTable(tableName);admin.deleteTable(tableName);("delete table " + tableName + " success.");} catch (Exception e) {LOG.error(e.getMessage(), e);}}/** 查找一行记录*/public static Result getRecord(String tableName, String rowKey) {try {HTable table = (HTable) connection.getTable(TableName.valueOf(tableName));Get get = new Get(rowKey.getBytes());get.setMaxVersions();return table.get(get);} catch (IOException e) {LOG.error(e.getMessage(), e);}return null;}/** 查找所有记录*/public static ResultScanner getRecords(String tableName) {return getRecords(tableName, null, null, null, null, null);}/** 查找所有记录*/public static ResultScanner getRecords(String tableName, String family) {return getRecords(tableName, family, null, null, null, null);}/** 查找所有记录*/public static ResultScanner getRecords(String tableName, String family, String qualifier) {return getRecords(tableName, family, qualifier, null, null, null);}/** 查找所有记录*/public static ResultScanner getRecords(String tableName, Filter filter) {return getRecords(tableName, null, null, null, null, filter);/** 查找所有记录*/public static ResultScanner getRecords(String tableName, String family, String qualifier, String startRow, String stopRow, Filter filter) {try {HTable table = (HTable) connection.getTable(TableName.valueOf(tableName));Scan scan = new Scan();if (!StringUtils.isBlank(family)) scan.addFamily(Bytes.toBytes(family));if (!StringUtils.isBlank(family) && !StringUtils.isBlank(qualifier))scan.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier));if (!StringUtils.isBlank(startRow)) scan.setStartRow(Bytes.toBytes(startRow));if (!StringUtils.isBlank(stopRow)) scan.setStopRow(Bytes.toBytes(stopRow));if (null != filter) scan.setFilter(filter);return table.getScanner(scan);} catch (IOException e) {LOG.error(e.getMessage(), e);}return null;}/** 查找所有记录*/public static ResultScanner getRecords(String tableName, Scan scan) {try {HTable table = (HTable) connection.getTable(TableName.valueOf(tableName));return table.getScanner(scan);} catch (IOException e) {LOG.error(e.getMessage(), e);}return null;}public static void printRecord(Result result) {try {List<Cell> cells= result.listCells();for (Cell cell : cells) {String row = new String(result.getRow(), "UTF-8");String family = new String(CellUtil.cloneFamily(cell), "UTF-8");String qualifier = new String(CellUtil.cloneQualifier(cell), "UTF-8");String value = new String(CellUtil.cloneValue(cell), "UTF-8");System.out.println("[row:"+row+"],[family:"+family+"],[qualifier:"+qualifier+"],[value: "+value+"]");}} catch (UnsupportedEncodingException e) {LOG.error(e.getMessage(), e);}}public static void printRecords(ResultScanner resultScanner) { for (Result result : resultScanner) {printRecord(result);}}}。

HBaseAPI详解

HBaseAPI详解1 HBaseAPI1.1 获取Configuration对象public static Configuration conf;static{//使用HBaseConfiguration的单例方法实例化conf = HBaseConfiguration.create();conf.set("hbase.zookeeper.quorum", "192.168.9.102");conf.set("hbase.zookeeper.property.clientPort", "2181");}1.2 判断表是否存在public static boolean isTableExist(String tableName) throws MasterNotRunningException,ZooKeeperConnectionException, IOException{//在HBase中管理、访问表需要先创建HBaseAdmin对象//Connection connection = ConnectionFactory.createConnection(conf); //HBaseAdmin admin = (HBaseAdmin) connection.getAdmin();HBaseAdmin admin = new HBaseAdmin(conf);return admin.tableExists(tableName);}1.3 创建表public static void createTable(String tableName, String... columnFamily) throwsMasterNotRunningException, ZooKeeperConnectionException, IOException{HBaseAdmin admin = new HBaseAdmin(conf);//判断表是否存在if(isTableExist(tableName)){System.out.println("表" + tableName + "已存在");//System.exit(0);}else{//创建表属性对象,表名需要转字节HTableDescriptor descriptor = new HTableDescriptor(TableName.valueOf(tableName));//创建多个列族for(String cf : columnFamily){descriptor.addFamily(new HColumnDescriptor(cf));}//根据对表的配置，创建表admin.createTable(descriptor);System.out.println("表" + tableName + "创建成功！");}}1.4 删除表public static void dropTable(String tableName) throws MasterNotRunningException,ZooKeeperConnectionException, IOException{HBaseAdmin admin = new HBaseAdmin(conf);if(isTableExist(tableName)){admin.disableTable(tableName);admin.deleteTable(tableName);System.out.println("表" + tableName + "删除成功！");}else{System.out.println("表" + tableName + "不存在！");}}1.5 向表中插入数据public static void addRowData(String tableName, String rowKey, String columnFamily, Stringcolumn, String value) throws IOException{//创建HTable对象HTable hTable = new HTable(conf, tableName);//向表中插入数据Put put = new Put(Bytes.toBytes(rowKey));//向Put对象中组装数据put.add(Bytes.toBytes(columnFamily), Bytes.toBytes(column), Bytes.toBytes(value));hTable.put(put);hTable.close();System.out.println("插入数据成功");}1.6 删除多行数据public static void deleteMultiRow(String tableName, String... rows) throws IOException{HTable hTable = new HTable(conf, tableName);List<Delete> deleteList = new ArrayList<Delete>();for(String row : rows){Delete delete = new Delete(Bytes.toBytes(row));deleteList.add(delete);}hTable.delete(deleteList);hTable.close();}1.7 获取所有数据public static void getAllRows(String tableName) throws IOException{ HTable hTable = new HTable(conf, tableName);//得到用于扫描region的对象Scan scan = new Scan();//使用HTable得到resultcanner实现类的对象ResultScanner resultScanner = hTable.getScanner(scan);for(Result result : resultScanner){Cell[] cells = result.rawCells();for(Cell cell : cells){//得到rowkeySystem.out.println("行键:" + Bytes.toString(CellUtil.cloneRow(cell)));//得到列族System.out.println("列族" +Bytes.toString(CellUtil.cloneFamily(cell)));System.out.println("列:" + Bytes.toString(CellUtil.cloneQualifier(cell)));System.out.println("值:" + Bytes.toString(CellUtil.cloneValue(cell)));}}}1.8 获取某一行数据public static void getRow(String tableName, String rowKey) throws IOException{HTable table = new HTable(conf, tableName);Get get = new Get(Bytes.toBytes(rowKey));//get.setMaxVersions();显示所有版本//get.setTimeStamp();显示指定时间戳的版本Result result = table.get(get);for(Cell cell : result.rawCells()){System.out.println("行键:" + Bytes.toString(result.getRow()));System.out.println("列族" + Bytes.toString(CellUtil.cloneFamily(cell)));System.out.println("列:" + Bytes.toString(CellUtil.cloneQualifier(cell)));System.out.println("值:" + Bytes.toString(CellUtil.cloneValue(cell)));System.out.println("时间戳:" + cell.getTimestamp());}}1.9 获取某一行指定“列族:列”的数据public static void getRowQualifier(String tableName, String rowKey, String family, Stringqualifier) throws IOException{HTable table = new HTable(conf, tableName);Get get = new Get(Bytes.toBytes(rowKey));get.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier));Result result = table.get(get);for(Cell cell : result.rawCells()){System.out.println("行键:" + Bytes.toString(result.getRow()));System.out.println("列族" + Bytes.toString(CellUtil.cloneFamily(cell)));System.out.println("列:" + Bytes.toString(CellUtil.cloneQualifier(cell)));System.out.println("值:" + Bytes.toString(CellUtil.cloneValue(cell)));}}。

Hbase学习02-API操作

Hbase学习02-API操作重要的部分1.创建hbase连接以及admin管理对象要操作hbase也需要建⽴hbase的连接，此处我们仍然使⽤TestNG来进⾏测试，使⽤@BeforeTest初始化Hbase的连接，然后创建admin的对象，@AfterTest来关闭连接实现步骤:“1.1使⽤HbaseConfiguration.create（）创建Hbase配置~1.2使⽤ConnectionFactory.createConnection（）创建Hbase连接1.3要创建表，需要基于Hbase连接获取admin管理对象1.4使⽤admin.close.connection.close关闭连接其中重要的before和after操作：//开始前都要执⾏的@BeforeTestpublic void beforeTest() throws IOException {//1使⽤HbaseConfiguration.create（）创建Hbase配置~Configuration configuration = HBaseConfiguration.create();//2使⽤ConnectionFactory.createConnection（）创建Hbase连接connection = ConnectionFactory.createConnection(configuration);//3要创建表，需要基于Hbase连接获取admin管理对象//要创建表和删除表都需要hmaster创建连接就需要⼀个admin对象admin = connection.getAdmin();}//结束之前都要执⾏的@AfterTestpublic void afterTest() throws IOException {//4使⽤admin.close.connection.close关闭连接admin.close();connection.close();}2、进⾏创建表的操作@Testpublic void createTable() throws IOException {TableName tableName = TableName.valueOf("WATER_BILL");//1、判断这个表是否存在if (admin.tableExists(tableName)){System.out.println("表存在");return;}//构建表//2、使⽤TableDescriptorBuilder.newBuilder构建表描述构造器//TableDescriptorBuilder:表描述器他是⽤来描述这个表有⼏个列蔟和其他属性TableDescriptorBuilder tableDescriptorBuilder = TableDescriptorBuilder.newBuilder(tableName);//3、使⽤ColumnFamilyDescriptorBuilder.newBuilder()构建列蔟描述构造器//创建列蔟也需要有列蔟的描述器，需要⽤⼀个构建器来构建ColumnFamilyDescriptor//经常会使⽤到⼀个⼯具类:Bytes (hbase包下的Bytes⼯具类）//这个⼯具类可以将字符串、long.double类型转换成byte[]数组//也可以将byte[]数组转换为指定类型ColumnFamilyDescriptorBuilder columnFamilyDescriptorBuilder = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("C1"));//4、构建列蔟描述，构建表描述ColumnFamilyDescriptor cfDes = columnFamilyDescriptorBuilder.build();//建⽴表和列蔟的关联tableDescriptorBuilder.setColumnFamily(cfDes);TableDescriptor tableDescriptor = tableDescriptorBuilder.build();//5、创建表admin.createTable(tableDescriptor);}3、写⼊数据操作：//写⼊数据@Testpublic void putTest() throws IOException {//1使⽤hbase连接获取htableTable table = connection.getTable(TABLE_NAME);//2都贱ROWKEY、列蔟、列名String rowkey="4944191";String columnFamily="C1";String columnName="NAME";String columnNameADDRESS ="ADDRESS ";String columnNameSEX="SEX";String columnNamePAR_DATE="PAR_DATE";String columnNameNUM_CURRENT="NUM_CURRENT";String columnNameNUM_PERVIOUS="NUM_PERVIOUS";String columnNameNUM_USAGE="NUM_USAGE";String columnNameTOTAL_MONEY="TOTAL_MONEY";String columnNameRECORD_DATE ="RECORD_DATE ";String columnNameLATEST_DATE="LATEST_DATE";//value为温学智//3构造put对象（对应put命令）Put put = new Put(Bytes.toBytes(rowkey));//4添加性名列put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes(columnName),Bytes.toBytes("温学智"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameADDRESS "),Bytes.toBytes("河北省"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameSEX"),Bytes.toBytes("男"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNamePAR_DATE"),Bytes.toBytes("2020-05-19"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameNUM_CURRENT"),Bytes.toBytes("308.1"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameNUM_PERVIOUS"),Bytes.toBytes("308.1"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameNUM_USAGE"),Bytes.toBytes("25"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameTOTAL_MONEY"),Bytes.toBytes("150"));put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameRECORD_DATE "),Bytes.toBytes("2020-04-25")); put.addColumn(Bytes.toBytes(columnFamily),Bytes.toBytes("columnNameLATEST_DATE"),Bytes.toBytes("2020-06-09"));//5、使⽤htable表对象执⾏put操作table.put(put);//6、关闭table表对象//htable是⼀个轻量级的对象可以经常创建//htable他是⼀个⾮线程安全的table.close();4、通过rowkey获取数据：//获取数据/*获取rowkey为4944191的所有的列的信息*/@Testpublic void getTest() throws IOException {//实现步骤://1获取HTablewTable table = connection.getTable(TABLE_NAME);//2使⽤rowkey构建Get对象-Get get = new Get(Bytes.toBytes("4944191"));//3执⾏get请求-Result result = table.get(get);//4.获取所有单元格List<Cell> cellList = result.listCells();//5打印rowkey-byte[] rowkey = result.getRow();System.out.println(Bytes.toString(rowkey));//6．迭代单元格列表-for (Cell cell : cellList) {//将字节数组转换换位字符串//获取列蔟的名称String cf=Bytes.toString(cell.getFamilyArray(),cell.getFamilyOffset(),cell.getFamilyLength());//获取列的名称String columnName = Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength());//获取数值String value = Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength());System.out.println(cf+" : "+columnName+" -> "+value);}//7.关闭表~table.close();}5、根据rowkey进⾏数据的删除//删除数据@Testpublic void deleteTest() throws IOException {//1.获取HTable对象-Table table = connection.getTable(TABLE_NAME);//2.根据rowkey构建delete对象-Delete delete = new Delete(Bytes.toBytes("4944191"));//3.执⾏delete请求-table.delete(delete);//4.关闭表table.close();}6、给定⼀个⽇期区间。

hbase的基本使用流程

HBase的基本使用流程1. 概述HBase是一个开源的、分布式的、面向列的非关系型数据库，基于Hadoop架构。

它主要用于高可靠性、高性能的大规模数据存储和实时读写操作。

本文将介绍HBase的基本使用流程。

2. 安装和配置2.1 安装HBase1.下载HBase安装包；2.解压安装包到指定目录；3.配置环境变量。

2.2 配置HBase1.打开HBase的配置文件hbase-site.xml；2.配置HBase相关参数，如hbase.rootdir、hbase.zookeeper.quorum等；3.保存配置文件。

3. 启动和停止HBase3.1 启动HBase1.打开命令行窗口，切换到HBase安装目录的bin目录下；2.执行命令start-hbase.sh（Linux）或start-hbase.bat（Windows）启动HBase。

3.2 停止HBase1.打开命令行窗口，切换到HBase安装目录的bin目录下；2.执行命令stop-hbase.sh（Linux）或stop-hbase.bat（Windows）停止HBase。

4. HBase基本概念4.1 表（Table）表是HBase中最基本的数据存储单元，类似于关系型数据库中的表。

每个表由多行组成，每行又包含多个列。

4.2 列族（Column Family）列族是表中列的分组，所有的列必须隶属于一个列族。

列族需要在创建表的时候指定，一旦创建后不能修改。

4.3 行（Row）行是表中的数据记录，每一行由行键（Row Key）唯一标识。

4.4 列（Column）和单元格（Cell）列是行中的属性，由列族和列修饰符唯一标识。

单元格是行和列的交叉点，用于存储具体的数据。

5. HBase基本操作5.1 创建表1.打开HBase Shell；2.执行命令create 'table_name', 'column_family'创建一张表。

HBaseJavaAPI操作示例

HBaseJavaAPI操作⽰例package testHBase;import org.apache.hadoop.conf.Configuration;import org.apache.hadoop.hbase.HBaseConfiguration;import org.apache.hadoop.hbase.HColumnDescriptor;import org.apache.hadoop.hbase.HTableDescriptor;import org.apache.hadoop.hbase.client.Get;import org.apache.hadoop.hbase.client.HBaseAdmin;import org.apache.hadoop.hbase.client.HTable;import org.apache.hadoop.hbase.client.Put;import org.apache.hadoop.hbase.client.Result;import org.apache.hadoop.hbase.client.ResultScanner;import org.apache.hadoop.hbase.client.Scan;import org.apache.hadoop.hbase.util.Bytes;public class TestHBase {static Configuration cfg = HBaseConfiguration.create(); //得到配置对象public static void create(String tableName ,String columnFamily) throws Exception{HBaseAdmin admin = new HBaseAdmin(cfg); //得到HBaseAdmin对象，⽤于操作HBaseif(admin.tableExists(tableName)){System.out.println("Table" + tableName + " Exists!");System.exit(0);}else{HTableDescriptor tableDesc = new HTableDescriptor(tableName); //得到表描述对象创建表时⽤到tableDesc.addFamily(new HColumnDescriptor(columnFamily)); //HColumnDescriptor 列族描述对象admin.createTable(tableDesc); //创建表System.out.println("Table " + tableName + " Success!");}}static void put(String tableName,String row,String columnFamily,String column,String data) throws Exception{HTable table = new HTable(cfg,tableName); //得到表对象Put put = new Put(Bytes.toBytes(row)); //插⼊数据put.add(Bytes.toBytes(columnFamily), Bytes.toBytes(column),Bytes.toBytes(data));table.put(put);System.out.println("put " + row +"," + "columnFamily:" + column + "," + data);}static void get(String tableName,String row) throws Exception{HTable table = new HTable(cfg,tableName);Get get = new Get(Bytes.toBytes(row));//获取⾏数据Result result = table.get(get);System.out.println("Get: " + result);}static void scan(String tableName) throws Exception{HTable table = new HTable(cfg,tableName);Scan scan = new Scan(Bytes.toBytes(tableName)); ///扫描全表ResultScanner rs = table.getScanner(scan);for(Result r : rs){System.out.println("Scan: " + r);}}static boolean delete(String tableName) throws Exception{HBaseAdmin admin = new HBaseAdmin(cfg);if(admin.tableExists(tableName)){admin.disableTable(tableName); //先disableSystem.out.println("Disable Table " + tableName + " Success!");admin.deleteTable(tableName); //再删除System.out.println("Delete Table " + tableName + " Success!");}return true;}public static void main(String[] args) throws Exception {cfg.set("hbase.master", "hadoop:60000");cfg.set("hbase.rootdir", "hdfs://hadoop:9000/hbase");cfg.set("hbase.zookeeper.quorum", "hadoop");String tableName = "hbase_test";String columnFamily = "cf";create(tableName, columnFamily);put(tableName,"row1",columnFamily,"c1","data");scan(tableName);get(tableName,"row1");delete(tableName); }}。

hbase常用命令及使用方法

hbase常用命令及使用方法一、HBase简介HBase是一个基于Hadoop的分布式列存储系统，可以用来存储海量的结构化数据。

它是一个开源的、高可靠性、高性能、可伸缩的分布式数据库，具有强大的数据处理能力和卓越的扩展性。

二、HBase常用命令1.启动和停止HBase服务启动HBase服务：在终端输入start-hbase.sh命令即可启动HBase 服务。

停止HBase服务：在终端输入stop-hbase.sh命令即可停止HBase 服务。

2.创建表创建表：在HBase shell中使用create命令来创建表，语法如下：create 'table_name', 'column_family'3.删除表删除表：在HBase shell中使用disable和drop命令来删除表，语法如下：disable 'table_name'drop 'table_name'4.添加数据添加数据：在HBase shell中使用put命令来添加数据，语法如下：put 'table_name', 'row_key', 'column_family:column_qualifier','value'5.查询数据查询数据：在HBase shell中使用get命令来查询数据，语法如下：get 'table_name', 'row_key'6.扫描全表扫描全表：在HBase shell中使用scan命令来扫描全表，语法如下：scan 'table_name'7.删除数据删除数据：在HBase shell中使用delete命令来删除数据，语法如下：delete 'table_name', 'row_key', 'column_family:column_qualifier'8.修改表修改表：在HBase shell中使用alter命令来修改表，语法如下：alter 'table_name', {NAME => 'column_family', VERSIONS =>version_num}9.查看表结构查看表结构：在HBase shell中使用describe命令来查看表结构，语法如下：describe 'table_name'10.退出HBase shell退出HBase shell：在HBase shell中输入exit命令即可退出。

hbase 读写流程

hbase 读写流程
HBase的读写流程大致如下：
1. 读流程：
客户端通过HBase的API发起读请求。

HBase Master收到请求后，将请求转发给一个RegionServer。

RegionServer根据请求中的rowkey定位到具体的HRegion，并从HRegion中读取数据。

如果数据在内存中（即缓存中），则直接返回给客户端；如果数据在磁盘上，则从磁盘读取数据并放入内存缓存。

返回给客户端的数据可能是多个数据块（block）的组合，客户端需要对这些数据进行整合。

2. 写流程：
客户端通过HBase的API发起写请求。

HBase Master将写请求转发给一个RegionServer。

RegionServer根据rowkey判断该数据应该写入哪个HRegion，然后将数据写入HRegion的MemStore中。

当MemStore达到一定大小时，HRegion会将MemStore中的数据flush到磁盘上，形成StoreFile。

随着时间的推移，StoreFile可能会进行合并、分裂等操作，但最终目的是为了提高数据的读写效率。

在数据真正写入HBase之前，客户端可能会收到一个“写入完成”的响应，但这并不意味着数据已经持久化到磁盘上。

以上是HBase读写的大致流程，实际过程可能更加复杂，涉及到很多优化和细节处理。

hbase相关操作

hbase相关操作
HBase是一种分布式、可伸缩、大数据存储的列式数据库，以下是HBase 的一些常用操作：
1. 创建表：使用HBase Shell或HBase API创建表。

创建表时需要指定表名和列族。

2. 插入数据：使用HBase Shell或HBase API插入数据。

插入数据时需要指定表名、行键和列族。

3. 查询数据：使用HBase Shell或HBase API查询数据。

查询数据时需要指定表名、行键和列族。

4. 删除数据：使用HBase Shell或HBase API删除数据。

删除数据时需要指定表名、行键和列族。

5. 扫描表：使用HBase Shell或HBase API扫描整个表，获取所有数据。

6. 分区表：根据业务需求，将表分区存储，提高数据访问效率。

7. 调整表配置：根据业务需求，调整表的相关配置，如列族数量、存储格式等。

8. 备份和恢复数据：使用HBase Shell或HBase API备份和恢复数据，确保数据安全。

9. 监控和维护：使用HBase监控工具监控表的状态和性能，及时发现和解决问题。

以上是HBase的一些常用操作，根据实际业务需求，可以选择适合的操作来处理数据。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

第5章HBase API操作
5.1 环境准备
新建项目后在pom.xml中添加依赖：
5.2 HBaseAPI
5.2.1 获取Configuration对象：
5.2.2 判断表是否存在：
5.2.3 创建表
5.2.4 删除表
5.2.5 向表中插入数据
5.2.6 删除多行数据
5.2.7 获取所有数据
5.2.8 获取某一行数据
5.2.9 获取某一行指定“列族:列”的数据
5.3 MapReduce
通过HBase的相关JavaAPI，我们可以实现伴随HBase操作的MapReduce过程，比如使用MapReduce将数据从本地文件系统导入到HBase的表中，比如我们从HBase中读取一些原始数据后使用MapReduce做数据分析。

5.3.1 官方HBase-MapReduce
1）查看HBase的MapReduce任务的执行
$ bin/hbase mapredcp
2）执行环境变量的导入
$ export HBASE_HOME=/home/admin/modules/hbase-1.3.1
$ export HADOOP_HOME=/home/admin/modules/hadoop-2.7.2
$ export HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp`
3）运行官方的MapReduce任务
-- 案例一：统计Student表中有多少行数据
$ ~/modules/hadoop-2.7.2/bin/yarn jar lib/hbase-server-1.3.1.jar rowcounter student -- 案例二：使用MapReduce将本地数据导入到HBase
2）创建HBase表
hbase(main):001:0> create 'fruit','info'
3）在HDFS中创建input_fruit文件夹并上传fruit.tsv文件
$ ~/modules/hadoop-2.7.2/bin/hdfs dfs -mkdir /input_fruit/
$ ~/modules/hadoop-2.7.2/bin/hdfs dfs -put fruit.tsv /input_fruit/
4）执行MapReduce到HBase的fruit表中
$ ~/modules/hadoop-2.7.2/bin/yarn jar lib/hbase-server-1.3.1.jar importtsv \
-Dimporttsv.columns=HBASE_ROW_KEY,info:name,info:color fruit \
hdfs://linux01:8020/input_fruit
5）使用scan命令查看导入后的结果
hbase(main):001:0> scan ‘fruit’
5.3.2 自定义HBase-MapReduce1
目标：将fruit表中的一部分数据，通过MR迁入到fruit_mr表中。

分步实现：
1) 构建ReadFruitMapper类，用于读取fruit表中的数据
2) 构建WriteFruitMRReducer类，用于将读取到的fruit表中的数据写入到fruit_mr表中
3) 构建Fruit2FruitMRRunner extends Configured implements Tool用于组装运行Job任务
4）主函数中调用运行该Job任务
5）打包运行任务
$ ~/modules/hadoop-2.7.2/bin/yarn jar ~/softwares/jars/hbase-0.0.1-SNAPSHOT.jar com.z.hbase.mr1.Fruit2FruitMRRunner
提示：运行任务前，如果待数据导入的表不存在，则需要提前创建。

提示：maven打包命令：-P local clean package或-P dev clean package install（将第三方jar包一同打包，需要插件：maven-shade-plugin）
5.3.3 自定义HBase-MapReduce2
目标：实现将HDFS中的数据写入到HBase表中。

分步实现：
1）构建ReadFruitFromHDFSMapper于读取HDFS中的文件数据
2）构建WriteFruitMRFromTxtReducer类
3）创建Txt2FruitRunner组装Job
4）调用执行Job
5）打包运行
$ ~/modules/hadoop-2.7.2/bin/yarn jar ~/softwares/jars/hbase-0.0.1-SNAPSHOT.jar com.z.hbase.mr2.Txt2FruitRunner
提示：运行任务前，如果待数据导入的表不存在，则需要提前创建之。

提示：maven打包命令：-P local clean package或-P dev clean package install（将第三方jar包一同打包，需要插件：maven-shade-plugin）
5.4 与Hive的集成
5.4.1 HBase与Hive的对比
1) Hive
(1) 数据仓库
Hive的本质其实就相当于将HDFS中已经存储的文件在Mysql中做了一个双射关系，以方便使用HQL去管理查询。

(2) 用于数据分析、清洗
Hive适用于离线的数据分析和清洗，延迟较高。

(3) 基于HDFS、MapReduce
Hive存储的数据依旧在DataNode上，编写的HQL语句终将是转换为MapReduce代码执行。

2) HBase
(1) 数据库
是一种面向列存储的非关系型数据库。

(2) 用于存储结构化和非结构话的数据
适用于单表非关系型数据的存储，不适合做关联查询，类似JOIN等操作。

(3) 基于HDFS
数据持久化存储的体现形式是Hfile，存放于DataNode中，被ResionServer以region的形式进行管理。

(4) 延迟较低，接入在线业务使用
面对大量的企业数据，HBase可以直线单表大量数据的存储，同时提供了高效的数据访问速度。

5.4.2 HBase与Hive集成使用
尖叫提示：HBase与Hive的集成在最新的两个版本中无法兼容。

所以，我们只能含着泪勇敢的重新编译：hive-hbase-handler-1.2.2.jar！！好气！！
环境准备
因为我们后续可能会在操作Hive的同时对HBase也会产生影响，所以Hive需要持有操作HBase的Jar，那么接下来拷贝Hive所依赖的Jar包（或者使用软连接的形式）。

1) 案例一
目标：建立Hive表，关联HBase表，插入数据到Hive表的同时能够影响HBase表。

分步实现：
(1) 在Hive中创建表同时关联HBase
尖叫提示：完成之后，可以分别进入Hive和HBase查看，都生成了对应的表
(2) 在Hive中创建临时中间表，用于load文件中的数据
尖叫提示：不能将数据直接load进Hive所关联HBase的那张表中
(5) 查看Hive以及关联的HBase表中是否已经成功的同步插入了数据
2) 案例二
目标：在HBase中已经存储了某一张表hbase_emp_table，然后在Hive中创建一个外部表来关联HBase中的hbase_emp_table这张表，使之可以借助Hive来分析HBase这张表中的数据。

注：该案例2紧跟案例1的脚步，所以完成此案例前，请先完成案例1。

分步实现：。