4. Note 1: getting a FileSystem for HDFS. Note 2: matching paths by regex on Linux and HDFS

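    /**
     * Builds a Hadoop Configuration from the cluster's config files.
     * @param hadoopConfPath directory containing core-site.xml and hdfs-site.xml;
     *                       for now the caller has to supply it
     * @return a Configuration with both files added as resources
     */
    public static Configuration getHadoopConfig(String hadoopConfPath){
        Configuration conf=new Configuration();
        conf.addResource(new Path(hadoopConfPath+"/core-site.xml"));
        conf.addResource(new Path(hadoopConfPath+"/hdfs-site.xml"));
        return conf;
    }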

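    /**
     * Opens a connection to the HDFS file system.
     * @param hadoopConfPath directory containing core-site.xml and hdfs-site.xml;
     *                       for now the caller has to supply it
     * @return the FileSystem, or null if it could not be obtained
     */
    public static FileSystem getFileSystem(String hadoopConfPath) {
        // Reuse getHadoopConfig rather than loading the same two files a second time
        Configuration conf=getHadoopConfig(hadoopConfPath);
        FileSystem fs = null;
        try {
            fs=FileSystem.get(conf);
        } catch (IOException e) {
            // The exception is passed as the last argument so the logger records the stack trace
            LOGGER.error("Failed to get the HDFS FileSystem from the Hadoop config under path={}: {}", hadoopConfPath, e.getMessage(), e);
        }
        return fs;
    }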

Methods for matching paths by regex:

    /**
     * Uses a regex to collect all matching subdirectories of a local directory.
     * @param luceneFilePathRegular  directory plus regex, e.g. /user/solrindex/<regex>
     * @return list of the directories that match the regex
     */
    public static List<String> fetchDirByRegularLinux(String luceneFilePathRegular){
        List<String> list=new ArrayList<>();
        // Split at the last path separator: parent directory before it, regex after it.
        // The regex itself therefore must not contain the separator.
        int len= luceneFilePathRegular.lastIndexOf(EtlConstants.LINUX_ROUTE_SEGMENT)+1;
        String mainDir=luceneFilePathRegular.substring(0, len);
        String regular=luceneFilePathRegular.substring(len);
        File dir=new File(mainDir);
        // listFiles() returns null when dir is not a directory or cannot be read
        File [] arr= dir.listFiles();
        if(arr!=null){
            for (File file : arr) {
                if (file.isDirectory() && matchStr(file.getName(), regular)) {
                    list.add(file.getAbsolutePath()+SolrUtil.INDEX_DIR_SUFFIX);
                }
            }
        }
        if(!list.isEmpty()){
            LOGGER.info("Solr directories matched by the regex:");
            for (String s : list) {
                LOGGER.info(s);
            }
        }else{
            LOGGER.error("No directory under path {} matches the regex {}", dir, regular);
        }
        return  list;
    }
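The split-then-filter idea in fetchDirByRegularLinux can be tried without the project classes. The following is only a sketch using the standard library: the class and method names are made up for illustration, the separator is assumed to be "/" (the value of EtlConstants.LINUX_ROUTE_SEGMENT is not shown in these notes), and the SolrUtil.INDEX_DIR_SUFFIX suffix is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch, not the original utility class.
public class LocalDirRegexSketch {

    /** Splits "/parent/dir/<regex>" at the last '/' into {parent directory, regex}. */
    public static String[] splitParentAndRegex(String pathWithRegex) {
        int cut = pathWithRegex.lastIndexOf('/') + 1;
        return new String[] { pathWithRegex.substring(0, cut), pathWithRegex.substring(cut) };
    }

    /** Returns the sorted absolute paths of subdirectories whose names fully match the regex. */
    public static List<String> matchingSubdirs(String pathWithRegex) {
        String[] parts = splitParentAndRegex(pathWithRegex);
        Path parent = Paths.get(parts[0]);
        Pattern pattern = Pattern.compile(parts[1]); // compiled once, reused for every entry
        List<String> result = new ArrayList<>();
        if (!Files.isDirectory(parent)) {
            return result; // nothing to list
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(parent)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (Files.isDirectory(entry) && pattern.matcher(name).matches()) {
                    result.add(entry.toAbsolutePath().toString());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(result); // directory listing order is not guaranteed
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory tree to match against
        Path root = Files.createTempDirectory("solrindex");
        Files.createDirectory(root.resolve("index_01"));
        Files.createDirectory(root.resolve("index_02"));
        Files.createDirectory(root.resolve("backup"));
        for (String dir : matchingSubdirs(root + "/index_\\d+")) {
            System.out.println(dir); // only the two index_NN directories are printed
        }
    }
}
```

Compiling the pattern once outside the loop avoids recompiling it for every directory entry, which is what happens when matchStr is called per file.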

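    /**
     * Uses a regex to collect all matching subdirectories of an HDFS directory.
     * @param luceneFilePathRegular directory plus regex, e.g. hdfs:/user/solrindex/<regex>
     * @param nameNodeConfigPath directory holding the NameNode (Hadoop) config files
     * @return list of the directories that match the regex
     */
    public static List<String> fetchDirByRegularHdfs(String luceneFilePathRegular,String nameNodeConfigPath){
        List<String> list=new ArrayList<>();
        FileSystem fs=HdfsUtil.getFileSystem(nameNodeConfigPath);
        if(fs==null){
            // getFileSystem returns null on failure and has already logged the cause
            return list;
        }
        // Split once, at the first ':', so a ':' inside the regex stays in the path part
        String[] parts=luceneFilePathRegular.split(":",2);
        String prefixHdfs=parts[0];
        String hdfsPath=parts[1];
        // Split at the last path separator: parent directory before it, regex after it
        int len= hdfsPath.lastIndexOf(EtlConstants.LINUX_ROUTE_SEGMENT)+1;
        String mainDir=hdfsPath.substring(0, len);
        String regular=hdfsPath.substring(len);
        try {
            // List every child of the parent directory, then filter by regex
            FileStatus[] fileStatuses = fs.globStatus(new Path(mainDir+"*"));
            // Guard against a null result instead of assuming an empty array
            if (fileStatuses != null) {
                for (FileStatus fileStatus : fileStatuses){
                    if (fileStatus.isDirectory() && matchStr(fileStatus.getPath().getName(), regular)) {
                        list.add(prefixHdfs+":"+mainDir+fileStatus.getPath().getName()+SolrUtil.INDEX_DIR_SUFFIX);
                    }
                }
            }
        } catch (IOException e) {
            // The exception is passed as the last argument instead of calling e.printStackTrace()
            LOGGER.error("Failed to list the HDFS directory, path: {}, error: {}",luceneFilePathRegular,e.getMessage(),e);
        }
        if(!list.isEmpty()){
            LOGGER.info("Solr directories matched by the regex:");
            for (String s : list) {
                LOGGER.info(s);
            }
        }else{
            LOGGER.error("No directory under path {} matches the regex {}", luceneFilePathRegular, regular);
        }
        return  list;
    }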
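One detail of the HDFS variant is how the input string is taken apart. A plain split(":") cuts at every colon, so a regex containing one (for example a non-capturing group such as (?:a|b)) gets truncated. The sketch below shows one way to split only at the first colon. It uses the standard library alone, the class and method names are hypothetical, and "/" is again assumed as the path separator.

```java
// Hypothetical stand-alone sketch, not the original utility class.
public class HdfsPathSplitSketch {

    /** Returns {scheme, parent directory, regex} for input like "hdfs:/user/solrindex/<regex>". */
    public static String[] split(String pathWithRegex) {
        int colon = pathWithRegex.indexOf(':'); // first colon only
        if (colon < 0) {
            throw new IllegalArgumentException("missing scheme prefix: " + pathWithRegex);
        }
        String scheme = pathWithRegex.substring(0, colon);
        String path = pathWithRegex.substring(colon + 1);
        int cut = path.lastIndexOf('/') + 1; // the regex is everything after the last '/'
        return new String[] { scheme, path.substring(0, cut), path.substring(cut) };
    }

    public static void main(String[] args) {
        String input = "hdfs:/user/solrindex/idx_(?:a|b)\\d+";
        String[] parts = split(input);
        System.out.println("scheme = " + parts[0]);
        System.out.println("parent = " + parts[1]);
        System.out.println("regex  = " + parts[2]);
        // For comparison: splitting at every ':' loses the tail of the regex
        System.out.println("naive  = " + input.split(":")[1]);
    }
}
```

A matched directory name can then be glued back together as scheme + ":" + parent + name, which mirrors how the method above builds its result strings.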

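    /**
     * Tests whether a string matches a regular expression in full.
     * @param str the string to test
     * @param regular the regular expression
     * @return true if the whole string matches the expression
     * @author: libingjie
     */
    public static Boolean matchStr(String str, String regular) {
        Pattern pattern = Pattern.compile(regular);
        Matcher matcher = pattern.matcher(str);
        // matches() requires the entire string to match, not just a part of it
        return matcher.matches();
    }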
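Because matchStr relies on Matcher.matches(), a directory name is accepted only when the regex covers the whole name. The short sketch below contrasts that with Matcher.find(), which accepts a partial match. The class and method names are illustrative only.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch contrasting full and partial regex matching.
public class FullMatchSketch {

    /** True only if the regex matches the entire string (same semantics as matchStr). */
    public static boolean fullMatch(String str, String regex) {
        return Pattern.compile(regex).matcher(str).matches();
    }

    /** True if the regex matches any substring. */
    public static boolean partialMatch(String str, String regex) {
        return Pattern.compile(regex).matcher(str).find();
    }

    public static void main(String[] args) {
        String regex = "index_\\d+";
        System.out.println(fullMatch("index_2019", regex));        // true
        System.out.println(fullMatch("index_2019_bak", regex));    // false: "_bak" is left over
        System.out.println(partialMatch("index_2019_bak", regex)); // true: a substring matches
    }
}
```

With full-match semantics, a directory such as index_2019_bak is skipped unless the regex is widened explicitly, for example to index_\d+.*.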
