There are plenty of examples online of using the newAPIHadoopRDD interface to read HBase data, but because our environment is hardened with Kerberos, there is far less material on using Spark against a Kerberos-authenticated HBase, and there are a few things to watch out for. This post briefly records the final working solution and the pitfalls hit along the way. The blog post 有kerberos认证hbase在spark环境下的使用 was a huge help!

Environment and version information

CDH 6.2.1 big data cluster (including YARN, Spark, HDFS, and other components)

Project pom file

First of all, Scala does not need to be installed: for running locally in local mode it is enough to add the Scala runtime dependency to the pom. The application is ultimately run on the cluster, and the CDH Spark lib directory already contains scala, spark-core, spark-sql, and the other related jars, so these dependencies are declared with provided scope in the pom, i.e. they are used at compile time only.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion> <groupId>com.css.bigdata</groupId>
<artifactId>data-compare</artifactId>
<version>1.0-SNAPSHOT</version> <properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<java.version>1.8</java.version>
<version.hbase>2.1.0-cdh6.2.1</version.hbase>
<version.hadoop>3.0.0-cdh6.2.1</version.hadoop>
<maven.compiler.source>1.8</maven.compiler.source>
<version.scala>2.11</version.scala>
<version.scala.libray>2.11.12</version.scala.libray>
<version.spark>2.4.0-cdh6.2.1</version.spark>
</properties> <dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${version.scala.libray}</version>
<scope>provided</scope>
</dependency> <dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${version.scala}</artifactId>
<version>${version.spark}</version>
<scope>provided</scope>
</dependency> <dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${version.scala}</artifactId>
<version>${version.spark}</version>
<scope>provided</scope>
</dependency> <!--HBase -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>${version.hbase}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>${version.hbase}</version>
</dependency> <dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-mapreduce</artifactId>
<version>${version.hbase}</version>
</dependency>
</dependencies> <build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
<encoding>UTF-8</encoding>
</configuration>
</plugin> <!-- 分离资源文件 -->
<plugin>
<artifactId>maven-resources-plugin</artifactId>
<executions>
<execution>
<id>copy-resources</id>
<phase>package</phase>
<goals>
<goal>copy-resources</goal>
</goals>
<configuration>
<resources>
<resource>
<directory>src/main/resources</directory>
</resource>
</resources>
<outputDirectory>${project.build.directory}/conf</outputDirectory>
</configuration>
</execution>
</executions>
</plugin> <plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifestEntries>
<Class-Path>../conf/</Class-Path>
</manifestEntries>
<manifest>
<addClasspath>true</addClasspath>
<classpathPrefix>../lib/</classpathPrefix>
<mainClass>com.css.bigdata.dataCompare.HBaseCompare</mainClass>
</manifest>
</archive>
</configuration>
</plugin> <plugin>
<!--这个插件就是把依赖的jar包复制出来放到编译后的target/lib目录,并且在打包时候排除内部依赖 -->
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<execution>
<id>copy-dependencies</id>
<phase>prepare-package</phase>
<goals>
<goal>copy-dependencies</goal>
</goals>
<configuration>
<outputDirectory>${project.build.directory}/lib</outputDirectory>
<overWriteReleases>false</overWriteReleases>
<overWriteSnapshots>false</overWriteSnapshots>
<overWriteIfNewer>true</overWriteIfNewer>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>

HBaseUtil class

package com.css.bigdata.dataCompare.hbase;

import com.css.bigdata.dataCompare.util.KerberosCheckUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.security.User;
import org.apache.hadoop.security.UserGroupInformation;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;

public class HBaseUtil {

    public static Logger logger = LoggerFactory.getLogger(HBaseUtil.class);

    public static Configuration getHbaseConfiguration(String cluster){
        Configuration hbaseConf = HBaseConfiguration.create();
        // adjust a few settings
        String hbaseIp = cluster;
        hbaseConf.set("hbase.zookeeper.quorum", hbaseIp + ":2181");
        hbaseConf.set("hbase.master", hbaseIp + ":60000");
        // keep timeouts short
        hbaseConf.set("hbase.rpc.timeout", "10000"); // 10s
        hbaseConf.set("hbase.client.retries.number", "2");
        hbaseConf.set("hbase.client.operation.timeout", "10000");
        return hbaseConf;
    }

    public static void kerberosLogin(Configuration hbConf){
        // kerberos settings
        hbConf.set("hadoop.security.authentication", "Kerberos");
        hbConf.set("hbase.security.authentication", "kerberos");
        hbConf.set("hbase.master.kerberos.principal", "hbase/_HOST@CVBG.COM");
        hbConf.set("hbase.regionserver.kerberos.principal", "hbase/_HOST@CVBG.COM");
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
        System.setProperty("java.security.krb5.conf", KerberosCheckUtil.getKrb5Conf());
        try{
            UserGroupInformation.setConfiguration(hbConf);
            if (UserGroupInformation.isLoginKeytabBased()
                    && UserGroupInformation.getLoginUser().getUserName().equals(KerberosCheckUtil.principal)) {
                logger.info("hbase:" + hbConf.get("hbase.master") + ", user [{}] is logged in already!", KerberosCheckUtil.principal);
            } else {
                UserGroupInformation.loginUserFromKeytab(KerberosCheckUtil.principal, KerberosCheckUtil.getKeyTabFile());
                logger.info("hbase:" + hbConf.get("hbase.master") + ", user [{}] login succeeded!", KerberosCheckUtil.principal);
            }
        }catch (IOException e){
            e.printStackTrace();
            logger.error("kerberos login failed, keytab: " + KerberosCheckUtil.getKeyTabFile());
            System.exit(1);
        }
    }

    public static User getAuthenticatedUser(){
        User loginedUser = null;
        try {
            logger.info("=====wrap the logged-in UGI user into an HBase User====");
            loginedUser = User.create(UserGroupInformation.getLoginUser());
        } catch (IOException e) {
            logger.error("===failed to wrap the logged-in UGI user into an HBase User===", e);
        }
        return loginedUser;
    }
}

KerberosCheckUtil class

package com.css.bigdata.dataCompare.util;

public class KerberosCheckUtil {

    // principal
    public static String principal = "dw_hbkal@CVBG.COM";

    // keytab file
    public static String keyTabFileName = "dw_hbkal.tab";

    // default krb5 config file
    public static String krb5Conf = "krb5.conf";

    public static String getKeyTabFile() {
        String runPath = KerberosCheckUtil.class.getResource("/").getPath();
        return runPath + keyTabFileName;
        //return "file:///root/przhang/dw_hbkal.keytab";
    }

    public static String getKrb5Conf() {
        String runPath = KerberosCheckUtil.class.getResource("/").getPath();
        return runPath + krb5Conf;
        //return "file:///root/przhang/krb5.conf";
    }
}

KerberosTableInputFormat class

This class is a straight copy of org.apache.hadoop.hbase.mapreduce.TableInputFormat with two modifications: 1. the setConf method performs the Kerberos login and obtains the authenticated user; 2. wherever an HBase connection is created, that authenticated user is passed into the connection, and the resulting connection is then used for all reads and writes against HBase.

package com.css.bigdata.dataCompare.hbase;

import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.security.User;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.util.StringUtils;
import org.apache.yetus.audience.InterfaceAudience;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Convert HBase tabular data into a format that is consumable by Map/Reduce.
 */
@InterfaceAudience.Public
public class KerberosTableInputFormat extends TableInputFormatBase
    implements Configurable {

  @SuppressWarnings("hiding")
  private static final Logger LOG = LoggerFactory.getLogger(KerberosTableInputFormat.class);

  /** Job parameter that specifies the input table. */
  public static final String INPUT_TABLE = "hbase.mapreduce.inputtable";
  /**
   * If specified, use start keys of this table to split.
   * This is useful when you are preparing data for bulkload.
   */
  private static final String SPLIT_TABLE = "hbase.mapreduce.splittable";
  /**
   * Base-64 encoded scanner. All other SCAN_ confs are ignored if this is specified.
   * See {@link TableMapReduceUtil#convertScanToString(Scan)} for more details.
   */
  public static final String SCAN = "hbase.mapreduce.scan";
  /** Scan start row */
  public static final String SCAN_ROW_START = "hbase.mapreduce.scan.row.start";
  /** Scan stop row */
  public static final String SCAN_ROW_STOP = "hbase.mapreduce.scan.row.stop";
  /** Column Family to Scan */
  public static final String SCAN_COLUMN_FAMILY = "hbase.mapreduce.scan.column.family";
  /** Space delimited list of columns and column families to scan. */
  public static final String SCAN_COLUMNS = "hbase.mapreduce.scan.columns";
  /** The timestamp used to filter columns with a specific timestamp. */
  public static final String SCAN_TIMESTAMP = "hbase.mapreduce.scan.timestamp";
  /** The starting timestamp used to filter columns with a specific range of versions. */
  public static final String SCAN_TIMERANGE_START = "hbase.mapreduce.scan.timerange.start";
  /** The ending timestamp used to filter columns with a specific range of versions. */
  public static final String SCAN_TIMERANGE_END = "hbase.mapreduce.scan.timerange.end";
  /** The maximum number of version to return. */
  public static final String SCAN_MAXVERSIONS = "hbase.mapreduce.scan.maxversions";
  /** Set to false to disable server-side caching of blocks for this scan. */
  public static final String SCAN_CACHEBLOCKS = "hbase.mapreduce.scan.cacheblocks";
  /** The number of rows for caching that will be passed to scanners. */
  public static final String SCAN_CACHEDROWS = "hbase.mapreduce.scan.cachedrows";
  /** Set the maximum number of values to return for each call to next(). */
  public static final String SCAN_BATCHSIZE = "hbase.mapreduce.scan.batchsize";
  /** Specify if we have to shuffle the map tasks. */
  public static final String SHUFFLE_MAPS = "hbase.mapreduce.inputtable.shufflemaps";

  /** The configuration. */
  private Configuration conf = null;

  /** The Kerberos-authenticated user. */
  private User user;

  /**
   * Returns the current configuration.
   *
   * @return The current configuration.
   * @see org.apache.hadoop.conf.Configurable#getConf()
   */
  @Override
  public Configuration getConf() {
    return conf;
  }

  /**
   * Sets the configuration. This is used to set the details for the table to
   * be scanned.
   *
   * @param configuration The configuration to set.
   * @see org.apache.hadoop.conf.Configurable#setConf(
   *   org.apache.hadoop.conf.Configuration)
   */
  @Override
  @edu.umd.cs.findbugs.annotations.SuppressWarnings(value="REC_CATCH_EXCEPTION",
    justification="Intentional")
  public void setConf(Configuration configuration) {
    this.conf = configuration;

    //========== perform the Kerberos login before any HBase connection is created ==========
    HBaseUtil.kerberosLogin(conf);
    user = HBaseUtil.getAuthenticatedUser();

    Scan scan = null;

    if (conf.get(SCAN) != null) {
      try {
        scan = TableMapReduceUtil.convertStringToScan(conf.get(SCAN));
      } catch (IOException e) {
        LOG.error("An error occurred.", e);
      }
    } else {
      try {
        scan = createScanFromConfiguration(conf);
      } catch (Exception e) {
        LOG.error(StringUtils.stringifyException(e));
      }
    }

    setScan(scan);
  }

  /**
   * Sets up a {@link Scan} instance, applying settings from the configuration property
   * constants defined in {@code TableInputFormat}. This allows specifying things such as:
   * <ul>
   *   <li>start and stop rows</li>
   *   <li>column qualifiers or families</li>
   *   <li>timestamps or timerange</li>
   *   <li>scanner caching and batch size</li>
   * </ul>
   */
  public static Scan createScanFromConfiguration(Configuration conf) throws IOException {
    Scan scan = new Scan();

    if (conf.get(SCAN_ROW_START) != null) {
      scan.setStartRow(Bytes.toBytesBinary(conf.get(SCAN_ROW_START)));
    }

    if (conf.get(SCAN_ROW_STOP) != null) {
      scan.setStopRow(Bytes.toBytesBinary(conf.get(SCAN_ROW_STOP)));
    }

    if (conf.get(SCAN_COLUMNS) != null) {
      addColumns(scan, conf.get(SCAN_COLUMNS));
    }

    for (String columnFamily : conf.getTrimmedStrings(SCAN_COLUMN_FAMILY)) {
      scan.addFamily(Bytes.toBytes(columnFamily));
    }

    if (conf.get(SCAN_TIMESTAMP) != null) {
      scan.setTimestamp(Long.parseLong(conf.get(SCAN_TIMESTAMP)));
    }

    if (conf.get(SCAN_TIMERANGE_START) != null && conf.get(SCAN_TIMERANGE_END) != null) {
      scan.setTimeRange(
          Long.parseLong(conf.get(SCAN_TIMERANGE_START)),
          Long.parseLong(conf.get(SCAN_TIMERANGE_END)));
    }

    if (conf.get(SCAN_MAXVERSIONS) != null) {
      scan.setMaxVersions(Integer.parseInt(conf.get(SCAN_MAXVERSIONS)));
    }

    if (conf.get(SCAN_CACHEDROWS) != null) {
      scan.setCaching(Integer.parseInt(conf.get(SCAN_CACHEDROWS)));
    }

    if (conf.get(SCAN_BATCHSIZE) != null) {
      scan.setBatch(Integer.parseInt(conf.get(SCAN_BATCHSIZE)));
    }

    // false by default, full table scans generate too much BC churn
    scan.setCacheBlocks((conf.getBoolean(SCAN_CACHEBLOCKS, false)));

    return scan;
  }

  @Override
  protected void initialize(JobContext context) throws IOException {
    // Do we have to worry about mis-matches between the Configuration from setConf and the one
    // in this context?
    TableName tableName = TableName.valueOf(conf.get(INPUT_TABLE));
    try {
      //==================== create the connection with the authenticated user ====================
      initializeTable(ConnectionFactory.createConnection(new Configuration(conf), user), tableName);
    } catch (Exception e) {
      LOG.error(StringUtils.stringifyException(e));
    }
  }

  /**
   * Parses a combined family and qualifier and adds either both or just the
   * family in case there is no qualifier. This assumes the older colon
   * divided notation, e.g. "family:qualifier".
   *
   * @param scan The Scan to update.
   * @param familyAndQualifier family and qualifier
   * @throws IllegalArgumentException When familyAndQualifier is invalid.
   */
  private static void addColumn(Scan scan, byte[] familyAndQualifier) {
    byte[][] fq = CellUtil.parseColumn(familyAndQualifier);
    if (fq.length == 1) {
      scan.addFamily(fq[0]);
    } else if (fq.length == 2) {
      scan.addColumn(fq[0], fq[1]);
    } else {
      throw new IllegalArgumentException("Invalid familyAndQualifier provided.");
    }
  }

  /**
   * Adds an array of columns specified using old format, family:qualifier.
   * <p>
   * Overrides previous calls to {@link Scan#addColumn(byte[], byte[])}for any families in the
   * input.
   *
   * @param scan The Scan to update.
   * @param columns array of columns, formatted as <code>family:qualifier</code>
   * @see Scan#addColumn(byte[], byte[])
   */
  public static void addColumns(Scan scan, byte[][] columns) {
    for (byte[] column : columns) {
      addColumn(scan, column);
    }
  }

  /**
   * Calculates the splits that will serve as input for the map tasks. The
   * number of splits matches the number of regions in a table. Splits are shuffled if
   * required.
   * @param context The current job context.
   * @return The list of input splits.
   * @throws IOException When creating the list of splits fails.
   * @see org.apache.hadoop.mapreduce.InputFormat#getSplits(
   *   org.apache.hadoop.mapreduce.JobContext)
   */
  @Override
  public List<InputSplit> getSplits(JobContext context) throws IOException {
    List<InputSplit> splits = super.getSplits(context);
    if ((conf.get(SHUFFLE_MAPS) != null) && "true".equals(conf.get(SHUFFLE_MAPS).toLowerCase(Locale.ROOT))) {
      Collections.shuffle(splits);
    }
    return splits;
  }

  /**
   * Convenience method to parse a string representation of an array of column specifiers.
   *
   * @param scan The Scan to update.
   * @param columns The columns to parse.
   */
  private static void addColumns(Scan scan, String columns) {
    String[] cols = columns.split(" ");
    for (String col : cols) {
      addColumn(scan, Bytes.toBytes(col));
    }
  }

  @Override
  protected Pair<byte[][], byte[][]> getStartEndKeys() throws IOException {
    if (conf.get(SPLIT_TABLE) != null) {
      TableName splitTableName = TableName.valueOf(conf.get(SPLIT_TABLE));
      //==================== create the connection with the authenticated user ====================
      try (Connection conn = ConnectionFactory.createConnection(getConf(), user)) {
        try (RegionLocator rl = conn.getRegionLocator(splitTableName)) {
          return rl.getStartEndKeys();
        }
      }
    }
    return super.getStartEndKeys();
  }

  /**
   * Sets split table in map-reduce job.
   */
  public static void configureSplitTable(Job job, TableName tableName) {
    job.getConfiguration().set(SPLIT_TABLE, tableName.getNameAsString());
  }
}

Main program example class

package com.css.bigdata.dataCompare;

import com.css.bigdata.dataCompare.hbase.HBaseUtil;
import com.css.bigdata.dataCompare.hbase.KerberosTableInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class HBaseCompare {

    private static Configuration getKerberosLoginConf(String cluster){
        Configuration conf = HBaseUtil.getHbaseConfiguration(cluster);
        //HBaseUtil.kerberosLogin(conf);
        return conf;
    }

    // fetch the HBase table data and convert it
    private static JavaPairRDD<String, Map<String,String>> getTableDataRDD(Configuration hconf, String tableName, JavaSparkContext sc) throws IOException {
        hconf.set(KerberosTableInputFormat.INPUT_TABLE, tableName);
        // add the scan
        String scanToString = TableMapReduceUtil.convertScanToString(new Scan());
        hconf.set(KerberosTableInputFormat.SCAN, scanToString);
        // read the HBase table into an RDD
        JavaPairRDD<ImmutableBytesWritable, Result> dataRDD = sc.newAPIHadoopRDD(hconf, KerberosTableInputFormat.class, ImmutableBytesWritable.class, Result.class);
        // HBase's Result object is not serializable, so convert each row into a map
        JavaPairRDD<String, Map<String,String>> dataRowsRDD = dataRDD.mapToPair(new PairFunction<Tuple2<ImmutableBytesWritable, Result>, String, Map<String,String>>() {
            @Override
            public Tuple2<String, Map<String,String>> call(Tuple2<ImmutableBytesWritable, Result> immutableBytesWritableResultTuple2) throws Exception {
                Result result = immutableBytesWritableResultTuple2._2;
                HashMap<String,String> resultMap = new HashMap<String, String>();
                for (Cell cell : result.rawCells()) {
                    resultMap.put(new String(CellUtil.cloneQualifier(cell)).toLowerCase(), new String(CellUtil.cloneValue(cell)));
                }
                return new Tuple2<>(Bytes.toString(result.getRow()), resultMap);
            }
        });
        return dataRowsRDD;
    }

    public static void main(String[] args) {
        String ip = args[0];
        String table = args[1];
        //SparkSession session = SparkSession.builder().appName("hbase example").master("local").getOrCreate();
        SparkSession session = SparkSession.builder().appName("hbase example").getOrCreate();
        JavaSparkContext sc = JavaSparkContext.fromSparkContext(session.sparkContext());
        Configuration srcConf = getKerberosLoginConf(ip);
        try{
            JavaPairRDD<String, Map<String,String>> srcRowsRDD = getTableDataRDD(srcConf, table, sc);
            // use the data
            //...
        } catch (Exception e){
            e.printStackTrace();
        }
    }
}

Packaging and submitting to the YARN cluster

When packaging, the dependencies go into the lib directory, the Kerberos configuration file krb5.conf and the login keytab dw_hbkal.tab go into conf, and the application itself is built as a jar and placed in bin. The job is then submitted in yarn-client mode:

spark-submit --keytab ../conf/kerberos/dw_hbkal.keytab \
    --principal dw_hbkal@CVBG.COM \
    --files ../conf/kerberos/dw_hbkal.keytab,../conf/kerberos/krb5.conf \
    --master yarn \
    --jars ../lib/hbase-client-2.1.0-cdh6.2.1.jar,../lib/hbase-server-2.1.0-cdh6.2.1.jar,../lib/hbase-mapreduce-2.1.0-cdh6.2.1.jar \
    --class com.css.bigdata.dataCompare.HBaseCompare \
    data-compare-1.0-SNAPSHOT.jar 172.xxx.xxx.xxx testtable
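
For orientation, here is a rough sketch of the deployment layout implied by the pom plugins (dependency jars copied to lib, resources to conf, manifest Class-Path pointing at ../conf/ and ../lib/) and by the relative paths in the command above, which is evidently run from the bin directory. The conf/kerberos subdirectory and the exact file names are assumptions inferred from the --files arguments, not something stated elsewhere in this post.

data-compare/
  bin/
    data-compare-1.0-SNAPSHOT.jar      -- application jar; spark-submit is run from here
  conf/
    kerberos/
      dw_hbkal.keytab                  -- keytab shipped to executors via --files
      krb5.conf                        -- Kerberos client config shipped via --files
  lib/
    hbase-client-2.1.0-cdh6.2.1.jar
    hbase-server-2.1.0-cdh6.2.1.jar
    hbase-mapreduce-2.1.0-cdh6.2.1.jar
    ...                                -- remaining dependency jars copied by maven-dependency-plugin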

Pitfalls encountered

  1. At first the custom KerberosTableInputFormat class was not used; the Kerberos login was done in the getKerberosLoginConf method of the main class HBaseCompare. Running in local mode from IDEA this worked fine, but when the job was submitted to the YARN cluster it failed, with errors saying the executors could not reach the HBase cluster. After a long search it suddenly dawned on me that the non-RDD code in the main class runs on the driver, so only the driver had authenticated, while the executors running the RDD operations had not. I then found the blog post 有kerberos认证hbase在spark环境下的使用, rewrote the KerberosTableInputFormat class, and moved the Kerberos login into it.

  2. With the above solved, and knowing that the application jar is shipped to every executor node, I packed dw_hbkal.keytab and krb5.conf into the jar and had KerberosCheckUtil return their paths, but after submitting, the program kept complaining that the files could not be found. I then tried placing a copy of the two files on every node of the cluster and returning their absolute paths from KerberosCheckUtil, but the program still could not find them at runtime.

  3. Looking at the spark-submit options again, I noticed the --files parameter, which distributes the listed files into the working directory of each executor, so I gave it a try: the Kerberos files were left out of the jar and passed via --files at submit time, and the program finally ran correctly. In hindsight, the tasks run inside YARN containers whose actual paths are not known in advance, so neither absolute paths nor files packed into the jar could be found during Kerberos authentication, whereas with spark-submit --files Spark itself takes care of shipping the files and making them readable at runtime. I am not sure whether that is exactly the right way to understand it. A minimal sketch of how such distributed files can be located at runtime follows after this list.
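
To make pitfall 3 concrete, below is a minimal, hypothetical sketch of how files shipped with --files could be resolved at runtime. It is not the code used above (KerberosCheckUtil resolves the files from the classpath root); the class name KerberosFileResolver and the SparkFiles.get fallback are illustrative assumptions. The essential point is only that --files localizes the files into each YARN container's working directory, so a bare file name normally resolves there.

package com.css.bigdata.dataCompare.util;

import java.io.File;
import org.apache.spark.SparkFiles;

// Hypothetical helper (not the version used in this post): resolve a file distributed via
// spark-submit --files, first by bare name in the container's working directory, then via
// SparkFiles.get(), which knows where Spark placed files added through --files / addFile().
public class KerberosFileResolver {

    public static String resolve(String fileName) {
        // Files listed in --files are localized into each YARN container's working directory,
        // so a relative name is usually enough.
        File local = new File(fileName);
        if (local.exists()) {
            return local.getAbsolutePath();
        }
        // Fallback: ask Spark where it placed the distributed file.
        return SparkFiles.get(fileName);
    }

    public static String getKeyTabFile() {
        return resolve("dw_hbkal.keytab");
    }

    public static String getKrb5Conf() {
        return resolve("krb5.conf");
    }
}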
