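The HDP 2.4 installation series described how to create an HBase cluster with Ambari. At work, however, we have always been on a .NET stack, so how do we access the Java-based HBase? HBase offers a native Java API and also exposes web-accessible APIs through Thrift and REST. So I decided to build a .NET SDK that talks to HBase through its REST web API, with protobuf as the data-exchange protocol between C# and Java.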

Contents:

  • References
  • Basic principles
  • Exchanging data between C# and Java
  • HBase filter entities
  • WebRequester
  • hbaseClient

References:

Basic principles:

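  • HBase REST is a web service built on top of the HBase Java client.
    [Figure: HBase REST architecture diagram]
  • The HBase REST server is started and stopped with the start/stop commands, for example:
    1. hbase rest start   (starts the REST service the default way, on port 8080)
    2. hbase rest start -p 9000   (starts it on port 9000)
    3. hbase-daemon.sh start rest -p 9000   (the same, run as a background daemon)
  • When the service starts, the embedded Jetty servlet container starts and deploys the servlet. The service listens on port 8080 by default; a different port can be set in the HBase configuration file (the hbase.rest.port property in hbase-site.xml).
  • The requirement in brief: wrap the following URL parts, and the parameters passed with them, in C#:
    1. http://192.168.2.21: the IP address of the HBase master
    2. 8080: the port of the HBase REST server
    3. yourTable: the name of the HBase table being operated on
    4. schema/regions/scanner: reserved keywords that select the operation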
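To make the URL layout concrete, here is a minimal sketch of how the four parts above could be combined. `RestUrl` and `Build` are hypothetical names used only for illustration; they are not part of the SDK developed in this post.

```csharp
using System;

// Hypothetical helper (not part of the original SDK): composes HBase REST URLs
// from host, port, table name and operation keyword.
public static class RestUrl
{
    public static string Build(string host, int port, string table, string keyword)
    {
        if (string.IsNullOrEmpty(host))
            throw new ArgumentException("host is required", nameof(host));

        string baseUrl = string.Format("{0}:{1}", host.TrimEnd('/'), port);

        // Without a table the URL addresses the server root.
        if (string.IsNullOrEmpty(table))
            return baseUrl + "/";

        string path = Uri.EscapeDataString(table);
        return string.IsNullOrEmpty(keyword)
            ? string.Format("{0}/{1}", baseUrl, path)
            : string.Format("{0}/{1}/{2}", baseUrl, path, keyword);
    }

    public static void Main()
    {
        foreach (string keyword in new[] { "schema", "regions", "scanner" })
            Console.WriteLine(Build("http://192.168.2.21", 8080, "yourTable", keyword));
    }
}
```

Keeping the composition in one place means the client classes only ever deal with the relative part (`yourTable/schema` and so on), which is how the code later in this post is organized.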

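Exchanging data between C# and Java via protobuf:

  • For exchanging data between Java and other languages over protobuf, HBase defines a fixed set of data structures (see the REST Protobufs Schema description on the HBase website). Plenty of tools can generate Java or C# source code from these protobuf schema files. Both sides stick to the same schema, and that is what makes cross-platform data exchange and sharing possible. In effect, the tooling adapts a platform-neutral format to platform- and language-specific data objects, much as an XML parser does.
  • Protocol files carry the .proto extension; the format looks like this: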
    package org.apache.hadoop.hbase.rest.protobuf.generated;

    message TableInfo {
      required string name = 1;
      message Region {
        required string name = 1;
        optional bytes startKey = 2;
        optional bytes endKey = 3;
        optional int64 id = 4;
        optional string location = 5;
      }
      repeated Region regions = 2;
    }
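    1. package: the package the file belongs to in Java; in C# it becomes the namespace
    2. message: represents a class
    3. required: the field must be set
    4. optional: the field may be omitted, and can be given a default value
  • Download the Windows build of the conversion tools from GitHub, then copy ProtoGen.exe.config, protoc.exe, ProtoGen.exe and Google.ProtocolBuffers.dll from the unpacked archive into a new folder (for example c:\zhu).
  • Copy the protocol files defined by HBase into the same folder (the files under \hbase\hbase-rest\src\main\resources\org\apache\hadoop\hbase\rest\protobuf in the HBase source package).
  • Taking TableInfoMessage.proto as the example: open a command prompt on Windows and change to the c:\zhu folder.
  • Run: protoc --descriptor_set_out=TableInfoMessage.protobin --include_imports TableInfoMessage.proto
  • After this command, a TableInfoMessage.protobin file appears in c:\zhu.
  • Run: protogen TableInfoMessage.protobin  (this generates TableInfoMessage.cs in the folder, which is the generated C# source)
  • You can of course put these commands in a batch script. Once it has run, add the 9 generated files to your Visual Studio project and they are ready to use.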
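The batch step can also be done from a small C# console program rather than a .bat file, which keeps everything in one language. The sketch below simply repeats the two commands shown above for every .proto file in the folder. `ProtoBatch`, `CommandsFor` and `Run` are hypothetical names, and the code assumes protoc.exe and ProtoGen.exe sit in the working folder as described.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical batch runner: repeats the protoc + protogen steps for each .proto file.
public static class ProtoBatch
{
    // Returns (tool name, arguments) pairs for one .proto file,
    // mirroring the two manual commands from the text.
    public static List<KeyValuePair<string, string>> CommandsFor(string protoFile)
    {
        string fileName = Path.GetFileName(protoFile);
        string protobin = Path.GetFileNameWithoutExtension(protoFile) + ".protobin";
        return new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("protoc",
                string.Format("--descriptor_set_out={0} --include_imports {1}", protobin, fileName)),
            new KeyValuePair<string, string>("protogen", protobin)
        };
    }

    // Runs one tool from workDir and fails loudly on a non-zero exit code.
    private static void Run(string tool, string args, string workDir)
    {
        var startInfo = new ProcessStartInfo(Path.Combine(workDir, tool + ".exe"), args)
        {
            WorkingDirectory = workDir,
            UseShellExecute = false
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    string.Format("{0} {1} failed with exit code {2}", tool, args, process.ExitCode));
        }
    }

    public static void Main(string[] args)
    {
        // The folder defaults to the example path used in the text.
        string workDir = args.Length > 0 ? args[0] : @"c:\zhu";
        if (!Directory.Exists(workDir))
        {
            Console.WriteLine("Folder not found: " + workDir);
            return;
        }

        foreach (string proto in Directory.GetFiles(workDir, "*.proto"))
            foreach (KeyValuePair<string, string> command in CommandsFor(proto))
                Run(command.Key, command.Value, workDir);
    }
}
```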

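HBase filter entities:

  • These are the filter parameters set when reading data from HBase. They were translated into C# by following the Java sources under hbase\hbase-client\src\main\java\org\apache\hadoop\hbase\filter.
  • The finished result:
    [Figure: the resulting C# filter classes]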
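The post does not list the translated filter classes, so here is a minimal sketch of what one of them might look like. It assumes the REST scanner takes its filter as a JSON string of the form `{"type":"PrefixFilter","value":"<base64>"}`. Both that format and the `ToEncodedString` method name are assumptions here and should be checked against the HBase REST documentation for your version.

```csharp
using System;
using System.Text;

// Hypothetical base class for the C# filter entities.
public abstract class Filter
{
    // JSON form intended for the scanner definition's filter field (assumed format).
    public abstract string ToEncodedString();
}

// Hypothetical C# counterpart of org.apache.hadoop.hbase.filter.PrefixFilter:
// passes only rows whose key starts with the given prefix.
public sealed class PrefixFilter : Filter
{
    private readonly byte[] _prefix;

    public PrefixFilter(byte[] prefix)
    {
        if (prefix == null) throw new ArgumentNullException(nameof(prefix));
        _prefix = (byte[])prefix.Clone();
    }

    public override string ToEncodedString()
    {
        // Base64 output contains only [A-Za-z0-9+/=], so no JSON escaping is needed.
        return string.Format("{{\"type\":\"PrefixFilter\",\"value\":\"{0}\"}}",
            Convert.ToBase64String(_prefix));
    }
}

public static class FilterDemo
{
    public static void Main()
    {
        var filter = new PrefixFilter(Encoding.UTF8.GetBytes("row"));
        Console.WriteLine(filter.ToEncodedString());
    }
}
```

Other filters from the Java package would follow the same pattern: one class per filter, holding its parameters and knowing how to serialize itself.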

WebRequester:

  • The WebRequester class wraps the HTTP requests:

    using System.IO;
    using System.Net;
    using System.Threading.Tasks;

    public class WebRequester
    {
        private readonly string url;

        /// <summary>
        /// Creates a requester for the REST server at the given base URL.
        /// </summary>
        /// <param name="urlString">Base URL, e.g. http://192.168.2.21:8080</param>
        public WebRequester(string urlString)
        {
            this.url = urlString;
        }

        /// <summary>
        /// Issues the web request, blocking the calling thread until it completes.
        /// </summary>
        /// <param name="endpoint">The endpoint.</param>
        /// <param name="method">The method.</param>
        /// <param name="input">The input.</param>
        /// <param name="options">request options</param>
        /// <returns>The HTTP response.</returns>
        public HttpWebResponse IssueWebRequest(string endpoint, string method, Stream input, RequestOptions options)
        {
            return IssueWebRequestAsync(endpoint, method, input, options).Result;
        }

        /// <summary>
        /// Issues the web request asynchronous.
        /// </summary>
        /// <param name="endpoint">The endpoint.</param>
        /// <param name="method">The method.</param>
        /// <param name="input">The input.</param>
        /// <param name="options">request options</param>
        /// <returns>The HTTP response.</returns>
        public async Task<HttpWebResponse> IssueWebRequestAsync(string endpoint, string method, Stream input, RequestOptions options)
        {
            string uri = string.Format("{0}/{1}", this.url, endpoint);
            HttpWebRequest httpWebRequest = HttpWebRequest.CreateHttp(uri);
            httpWebRequest.Timeout = options.TimeoutMillis;
            httpWebRequest.PreAuthenticate = true;
            httpWebRequest.Method = method;
            httpWebRequest.ContentType = options.ContentType;
            // Ask for the same media type in the response so that it can be
            // deserialized as protobuf.
            httpWebRequest.Accept = options.ContentType;

            if (options.AdditionalHeaders != null)
            {
                foreach (var kv in options.AdditionalHeaders)
                {
                    httpWebRequest.Headers.Add(kv.Key, kv.Value);
                }
            }

            // Copy the serialized request body, if there is one.
            if (input != null)
            {
                using (Stream req = await httpWebRequest.GetRequestStreamAsync())
                {
                    await input.CopyToAsync(req);
                }
            }

            return (await httpWebRequest.GetResponseAsync()) as HttpWebResponse;
        }
    }
  • The entity class that carries the HTTP request options:

    using System.Collections.Generic;

    public class RequestOptions
    {
        public string AlternativeEndpoint { get; set; }
        public bool KeepAlive { get; set; }
        public int TimeoutMillis { get; set; }
        public int SerializationBufferSize { get; set; }
        public int ReceiveBufferSize { get; set; }
        public bool UseNagle { get; set; }
        public int Port { get; set; }
        public Dictionary<string, string> AdditionalHeaders { get; set; }
        public string AlternativeHost { get; set; }
        public string ContentType { get; set; }

        public static RequestOptions GetDefaultOptions()
        {
            return new RequestOptions()
            {
                KeepAlive = true,
                // Assumed values: a 30 s timeout and 1 MB buffers. Tune them for your cluster.
                TimeoutMillis = 30000,
                ReceiveBufferSize = 1024 * 1024 * 1,
                SerializationBufferSize = 1024 * 1024 * 1,
                UseNagle = false,
                //AlternativeEndpoint = Constants.RestEndpointBase,
                //Port = 443,
                AlternativeEndpoint = string.Empty,
                Port = 8080, // assumed: the default REST server port
                AlternativeHost = null,
                ContentType = "application/x-protobuf"
            };
        }
    }
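For comparison, the same kind of request can be composed with `HttpClient` from `System.Net.Http`. This swaps in a different HTTP API from the one used in this post and is only a sketch. `ProtobufRequest.Build` is a hypothetical helper, and the `version` endpoint in the demo is an assumed example path. The point it illustrates is that the protobuf media type goes into both `Content-Type` (for a request body) and `Accept` (for the response format).

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper: builds (but does not send) a protobuf request for the REST server.
public static class ProtobufRequest
{
    public const string MediaType = "application/x-protobuf";

    public static HttpRequestMessage Build(string baseUrl, string endpoint, HttpMethod method, byte[] body)
    {
        var request = new HttpRequestMessage(method, baseUrl.TrimEnd('/') + "/" + endpoint);

        // Accept controls the format of the response.
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue(MediaType));

        // Content-Type describes the request body, when there is one.
        if (body != null)
        {
            request.Content = new ByteArrayContent(body);
            request.Content.Headers.ContentType = new MediaTypeHeaderValue(MediaType);
        }
        return request;
    }

    public static void Main()
    {
        using (HttpRequestMessage request = Build("http://192.168.2.21:8080", "version", HttpMethod.Get, null))
        {
            Console.WriteLine(request.Method + " " + request.RequestUri);
            Console.WriteLine("Accept: " + request.Headers.Accept);
            // Sending would be: using (var client = new HttpClient()) { await client.SendAsync(request); }
        }
    }
}
```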

hbaseClient:

  • Define the interface IHBaseClient for the common HBase operations (table-level operations plus reading and writing data), for example:

    public interface IHBaseClient
    {
        /// <summary>
        /// Gets the version information reported by the REST server.
        /// </summary>
        Task<org.apache.hadoop.hbase.rest.protobuf.generated.Version> GetVersionAsync(RequestOptions options = null);

        /// <summary>
        /// Creates a table from the given schema. Returns false if the table already exists.
        /// </summary>
        Task<bool> CreateTableAsync(TableSchema schema, RequestOptions options = null);

        /// <summary>
        /// Deletes the named table.
        /// </summary>
        Task DeleteTableAsync(string table, RequestOptions options = null);

        /// <summary>
        /// Gets the region information of the named table.
        /// </summary>
        Task<TableInfo> GetTableInfoAsync(string table, RequestOptions options = null);

        /// <summary>
        /// Gets the schema of the named table.
        /// </summary>
        Task<TableSchema> GetTableSchemaAsync(string table, RequestOptions options = null);

        /// <summary>
        /// Lists all tables.
        /// </summary>
        Task<TableList> ListTablesAsync(RequestOptions options = null);

        /// <summary>
        /// Creates a scanner on the named table with the given settings.
        /// </summary>
        Task<ScannerInformation> CreateScannerAsync(string tableName, Scanner scannerSettings, RequestOptions options);

        /// <summary>
        /// Fetches the next batch of cells from an open scanner.
        /// </summary>
        Task<CellSet> ScannerGetNextAsync(ScannerInformation scannerInfo, RequestOptions options);

        /// <summary>
        /// Writes the given cells into the named table.
        /// </summary>
        Task<bool> StoreCellsAsync(string table, CellSet cells, RequestOptions options = null);
    }
  • The HBaseClient class implements the interface:

    public class HBaseClient : IHBaseClient
    {
        private readonly WebRequester _requester;
        private readonly RequestOptions _globalRequestOptions;

        /// <summary>
        /// Creates a client for the REST server at the given base URL.
        /// </summary>
        /// <param name="url">Base URL of the HBase REST server.</param>
        /// <param name="globalRequestOptions">Options used when a call passes none; defaults apply if null.</param>
        public HBaseClient(string url, RequestOptions globalRequestOptions = null)
        {
            _globalRequestOptions = globalRequestOptions ?? RequestOptions.GetDefaultOptions();
            _requester = new WebRequester(url);
        }

        /// <summary>
        /// Gets the version information reported by the REST server.
        /// </summary>
        public async Task<org.apache.hadoop.hbase.rest.protobuf.generated.Version> GetVersionAsync(RequestOptions options = null)
        {
            var optionToUse = options ?? _globalRequestOptions;
            return await GetRequestAndDeserializeAsync<org.apache.hadoop.hbase.rest.protobuf.generated.Version>(EndPointType.Version, optionToUse);
        }

        /// <summary>
        /// Creates a table from the given schema.
        /// </summary>
        /// <returns>true if the table was created (201); false if it already existed (200).</returns>
        public async Task<bool> CreateTableAsync(TableSchema schema, RequestOptions options = null)
        {
            if (string.IsNullOrEmpty(schema.name))
                throw new ArgumentException("schema.name was either null or empty!", "schema");

            var optionToUse = options ?? _globalRequestOptions;
            string endpoint = string.Format("{0}/{1}", schema.name, EndPointType.Schema);
            using (HttpWebResponse webResponse = await PutRequestAsync(endpoint, schema, optionToUse))
            {
                if (webResponse.StatusCode == HttpStatusCode.Created)
                {
                    return true;
                }

                // table already exists
                if (webResponse.StatusCode == HttpStatusCode.OK)
                {
                    return false;
                }

                // throw the exception otherwise
                using (var output = new StreamReader(webResponse.GetResponseStream()))
                {
                    string message = output.ReadToEnd();
                    throw new WebException(
                        string.Format("Couldn't create table {0}! Response code was: {1}, expected either 200 or 201! Response body was: {2}",
                            schema.name, webResponse.StatusCode, message));
                }
            }
        }

        /// <summary>
        /// Deletes the named table; throws if the server does not answer 200.
        /// </summary>
        public async Task DeleteTableAsync(string table, RequestOptions options = null)
        {
            var optionToUse = options ?? _globalRequestOptions;
            string endPoint = string.Format("{0}/{1}", table, EndPointType.Schema);

            using (HttpWebResponse webResponse = await ExecuteMethodAsync<HttpWebResponse>(WebMethod.Delete, endPoint, null, optionToUse))
            {
                if (webResponse.StatusCode != HttpStatusCode.OK)
                {
                    using (var output = new StreamReader(webResponse.GetResponseStream()))
                    {
                        string message = output.ReadToEnd();
                        throw new WebException(
                            string.Format("Couldn't delete table {0}! Response code was: {1}, expected 200! Response body was: {2}",
                                table, webResponse.StatusCode, message));
                    }
                }
            }
        }

        /// <summary>
        /// Gets the region information of the named table.
        /// </summary>
        public async Task<TableInfo> GetTableInfoAsync(string table, RequestOptions options = null)
        {
            var optionToUse = options ?? _globalRequestOptions;
            string endPoint = string.Format("{0}/{1}", table, EndPointType.Regions);
            return await GetRequestAndDeserializeAsync<TableInfo>(endPoint, optionToUse);
        }

        /// <summary>
        /// Gets the schema of the named table.
        /// </summary>
        public async Task<TableSchema> GetTableSchemaAsync(string table, RequestOptions options = null)
        {
            var optionToUse = options ?? _globalRequestOptions;
            string endPoint = string.Format("{0}/{1}", table, EndPointType.Schema);
            return await GetRequestAndDeserializeAsync<TableSchema>(endPoint, optionToUse);
        }

        /// <summary>
        /// Lists all tables (a GET on the server root).
        /// </summary>
        public async Task<TableList> ListTablesAsync(RequestOptions options = null)
        {
            var optionToUse = options ?? _globalRequestOptions;
            return await GetRequestAndDeserializeAsync<TableList>("", optionToUse);
        }

        /// <summary>
        /// Creates a scanner on the named table; the server returns its address in the Location header.
        /// </summary>
        public async Task<ScannerInformation> CreateScannerAsync(string tableName, Scanner scannerSettings, RequestOptions options)
        {
            // Fall back to the global options so that a null argument does not fail later.
            var optionToUse = options ?? _globalRequestOptions;
            string endPoint = string.Format("{0}/{1}", tableName, EndPointType.Scanner);

            using (HttpWebResponse response = await ExecuteMethodAsync(WebMethod.Post, endPoint, scannerSettings, optionToUse))
            {
                if (response.StatusCode != HttpStatusCode.Created)
                {
                    using (var output = new StreamReader(response.GetResponseStream()))
                    {
                        string message = output.ReadToEnd();
                        throw new WebException(
                            string.Format("Couldn't create a scanner for table {0}! Response code was: {1}, expected 201! Response body was: {2}",
                                tableName, response.StatusCode, message));
                    }
                }

                string location = response.Headers.Get("Location");
                if (location == null)
                {
                    throw new ArgumentException("Couldn't find header 'Location' in the response!");
                }

                return new ScannerInformation(new Uri(location), tableName, response.Headers);
            }
        }

        /// <summary>
        /// Fetches the next batch of cells from an open scanner.
        /// </summary>
        /// <returns>The next cells, or null when the response status is not 200.</returns>
        public async Task<CellSet> ScannerGetNextAsync(ScannerInformation scannerInfo, RequestOptions options)
        {
            var optionToUse = options ?? _globalRequestOptions;
            string endPoint = string.Format("{0}/{1}/{2}", scannerInfo.TableName, EndPointType.Scanner, scannerInfo.ScannerId);
            using (HttpWebResponse webResponse = await GetRequestAsync(endPoint, optionToUse))
            {
                if (webResponse.StatusCode == HttpStatusCode.OK)
                {
                    return Serializer.Deserialize<CellSet>(webResponse.GetResponseStream());
                }

                return null;
            }
        }

        /// <summary>
        /// Writes the given cells into the named table.
        /// </summary>
        /// <returns>true on success (200); false if the server answers 304 Not Modified.</returns>
        public async Task<bool> StoreCellsAsync(string table, CellSet cells, RequestOptions options = null)
        {
            var optionToUse = options ?? _globalRequestOptions;
            // The URL needs a row-key segment; the real row keys travel inside the CellSet.
            string path = table + "/somefalsekey";
            using (HttpWebResponse webResponse = await PutRequestAsync(path, cells, optionToUse))
            {
                if (webResponse.StatusCode == HttpStatusCode.NotModified)
                {
                    return false;
                }

                if (webResponse.StatusCode != HttpStatusCode.OK)
                {
                    using (var output = new StreamReader(webResponse.GetResponseStream()))
                    {
                        string message = output.ReadToEnd();
                        throw new WebException(
                            string.Format("Couldn't insert into table {0}! Response code was: {1}, expected 200! Response body was: {2}",
                                table, webResponse.StatusCode, message));
                    }
                }
            }
            return true;
        }

        /// <summary>
        /// Issues a GET and deserializes the protobuf response body into T.
        /// </summary>
        private async Task<T> GetRequestAndDeserializeAsync<T>(string endpoint, RequestOptions options)
        {
            using (WebResponse response = await _requester.IssueWebRequestAsync(endpoint, WebMethod.Get, null, options))
            {
                using (Stream responseStream = response.GetResponseStream())
                {
                    return Serializer.Deserialize<T>(responseStream);
                }
            }
        }

        /// <summary>
        /// Sends a serialized request object. Despite the name, the request goes out as POST.
        /// </summary>
        private async Task<HttpWebResponse> PutRequestAsync<TReq>(string endpoint, TReq request, RequestOptions options)
            where TReq : class
        {
            return await ExecuteMethodAsync(WebMethod.Post, endpoint, request, options);
        }

        /// <summary>
        /// Serializes the request object (if any) to protobuf and issues it with the given HTTP method.
        /// </summary>
        private async Task<HttpWebResponse> ExecuteMethodAsync<TReq>(string method, string endpoint, TReq request, RequestOptions options)
            where TReq : class
        {
            using (var input = new MemoryStream(options.SerializationBufferSize))
            {
                if (request != null)
                {
                    Serializer.Serialize(input, request);
                }
                // Rewind so the requester copies the body from the start.
                input.Seek(0, SeekOrigin.Begin);
                return await _requester.IssueWebRequestAsync(endpoint, method, input, options);
            }
        }

        /// <summary>
        /// Issues a GET and returns the raw response.
        /// </summary>
        private async Task<HttpWebResponse> GetRequestAsync(string endpoint, RequestOptions options)
        {
            return await _requester.IssueWebRequestAsync(endpoint, WebMethod.Get, null, options);
        }
    }
  • Work through the code above step by step; once it compiles, you are done. The next post moves on to testing and validating the SDK.
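The client uses a `ScannerInformation` type that the post does not list. The sketch below is a hypothetical reconstruction, not the author's class. It assumes the scanner id is the last path segment of the `Location` URL returned when the scanner is created, and it leaves out the response-headers argument that the client code passes.

```csharp
using System;

// Hypothetical reconstruction of ScannerInformation (simplified: no headers argument).
public sealed class ScannerInformation
{
    public Uri Location { get; private set; }
    public string TableName { get; private set; }
    public string ScannerId { get; private set; }

    public ScannerInformation(Uri location, string tableName)
    {
        if (location == null) throw new ArgumentNullException(nameof(location));
        if (string.IsNullOrEmpty(tableName))
            throw new ArgumentException("tableName is required", nameof(tableName));

        Location = location;
        TableName = tableName;

        // Assumption: the id is the last non-empty segment of the Location path,
        // e.g. .../yourTable/scanner/<id>.
        string[] parts = location.AbsolutePath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0)
            throw new ArgumentException("Location has no path segments", nameof(location));
        ScannerId = parts[parts.Length - 1];
    }
}

public static class ScannerDemo
{
    public static void Main()
    {
        // The id "12345abc" is made up for the demo.
        var info = new ScannerInformation(
            new Uri("http://192.168.2.21:8080/yourTable/scanner/12345abc"), "yourTable");
        Console.WriteLine(info.ScannerId);
    }
}
```

With the id extracted once, each later call only has to rebuild the relative path `table/scanner/id`, which is what `ScannerGetNextAsync` does.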
