原文地址:http://zh.hortonworks.com/hadoop-tutorial/using-commandline-manage-files-hdfs/

In this tutorial we will walk through some of the basic HDFS commands you will need to manage files on HDFS. To complete this tutorial you will need a working HDP cluster. The easiest way to have a Hadoop cluster is to download the Hortonworks Sandbox.

Let’s get started.

Step 1: Let’s create a directory in HDFS, upload a file and list.

Let’s look at the syntax first:

hadoop fs -mkdir:
  • It will take path uri’s as argument and creates directory or directories.
    Usage:
hadoop fs -mkdir <paths>
Example:
hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir
hadoop fs -ls:
  • Lists the contents of a directory
  • For a file returns stats of a file
    Usage:
hadoop fs -ls <args>
Example:
hadoop fs -ls /user/hadoop/dir1 /user/hadoop/dir2
hadoop fs -ls /user/hadoop/dir1/filename.txt
hadoop fs -ls hdfs://<hostname>:9000/user/hadoop/dir1/

Let’s use the following commands as follows and execute. You can ssh to the sandbox using Tools like Putty. You could download putty.exe from the internet.

Let’s touch a file locally.

$ touch filename.txt

Step 2: Now, let’s check how to find out space utilization in a HDFS dir.

hadoop fs -du:
  • Displays sizes of files and directories contained in the given directory or the size of a file if its just a file.
    Usage:
hadoop fs -du URI
Example:
hadoop fs -du /user/hadoop/ /user/hadoop/dir1/Sample.txt

Step 4:

Now let’s see how to upload and download files from and to Hadoop Data File System(HDFS)
Upload: ( we have already tried this earlier)

hadoop fs -put:
  • Copy single src file, or multiple src files from local file system to the Hadoop data file system
    Usage:
hadoop fs -put <localsrc> ... <HDFS_dest_Path>
Example:
hadoop fs -put /home/ec2-user/Samplefile.txt ./ambari.repo /user/hadoop/dir3/

Download:
hadoop fs -get:

  • Copies/Downloads files to the local file system
    Usage:
hadoop fs -get <hdfs_src> <localdst>
Example:
hadoop fs -get /user/hadoop/dir3/Samplefile.txt /home/

Step 5: Let’s look at quickly two advanced features.

hadoop fs -getmerge
  • Takes a source directory files as input and concatenates files in src into the destination local file.
    Usage:
hadoop fs -getmerge <src> <localdst> [addnl]
Example:
hadoop fs -getmerge /user/hadoop/dir1/ ./Samplefile2.txt
Option:
addnl: can be set to enable adding a newline on end of each file
hadoop distcp:
  • Copy file or directories recursively
  • It is a tool used for large inter/intra-cluster copying
  • It uses MapReduce to effect its distribution copy, error handling and recovery, and reporting
    Usage:
hadoop distcp <srcurl> <desturl>
Example:
hadoop distcp hdfs://<NameNode1>:8020/user/hadoop/dir1/ \
hdfs://<NameNode2>:8020/user/hadoop/dir2/

You could use the following steps to perform getmerge and discp.
Let’s upload two files for this exercise first:

# touch txt1 txt2
# hadoop fs -put txt1 txt2 /user/hadoop/dir2/
# hadoop fs -ls /user/hadoop/dir2/

Step 6:Getting help

You can use Help command to get list of commands supported by Hadoop Data File System(HDFS)

    Example:
hadoop fs -help

Hope this short tutorial was useful to get the basics of file management.

Using the command line to manage files on HDFS--转载的更多相关文章

  1. How to Use Android ADB Command Line Tool

    Android Debug Bridge (adb) is a tool that lets you manage the state of an emulator instance or Andro ...

  2. 18 Command Line Tools to Monitor Linux Performance

    By Ravi Saive Under: Linux Commands, Monitoring Tools On: December 26, 2013 http://www.tecmint.com/c ...

  3. How to build .apk file from command line(转)

    How to build .apk file from command line Created on Wednesday, 29 June 2011 14:32 If you don’t want ...

  4. Can't use Subversion command line client: svn Probably the path to Subversion executable is wrong. Fix it.

    1.最近使用SVN工具时,Checkout出项目到本地后后,然后将其导入到Intellij idea中开发,在提交svn代码的时候,出现这样的错误:Can't use Subversion comma ...

  5. 使用intellij的svn时提示出错: Can't use Subversion command line client: svn.Errors found while svn working copies detection.

    使用Intellij的svn时提示出错:Can't use Subversion command line client: svn. Errors found while svn working co ...

  6. MySQL 5.7 Command Line Client输入密码后闪退和windows下mysql忘记root密码的解决办法

    MySQL 5.7 Command Line Client输入密码后闪退的问题: 问题分析: 1.查看mysql command line client默认执行的一些参数.方法:开始->所有程序 ...

  7. 《The Linux Command Line》 读书笔记01 基本命令介绍

    <The Linux Command Line> 读书笔记01 基本命令介绍 1. What is the Shell? The Shell is a program that takes ...

  8. Linux Command Line Basics

    Most of this note comes from the Beginning the Linux Command Line, Second Edition by Sander van Vugt ...

  9. python click module for command line interface

    Click Module(一)                                                  ----xiaojikuaipao The following mat ...

随机推荐

  1. 20155304 2016-2017-2 《Java程序设计》第十周学习总结

    20155304 2016-2017-2 <Java程序设计>第十周学习总结 教材学习内容总结 网络编程 网络编程就是在两个或两个以上的设备(例如计算机)之间传输数据.程序员所作的事情就是 ...

  2. 20155321 2016-2017-2 《Java程序设计》第十周学习总结

    20155321 2016-2017-2 <Java程序设计>第十周学习总结 教材学习内容总结 网络概览 局域网和广域网:局域网通常限定在一个有效的地理区域之内,广域网由许多局域网组成.最 ...

  3. POI导出excel文件样式

    需求: 公司业务和银行挂钩,各种形式的数据之间交互性比较强,这就涉及到了存储形式之间的转换 比如数据库数据与excel文件之间的转换 解决: 我目前使用过的是POI转换数据库和文件之间的数据,下边上代 ...

  4. Java读取Propertity文件

    读取propertity 文件其实很简单,就是每次容易搞错文件路径,今天刚好项目又用到了,顺便记下来,以便以后参考: 目录如下: 代码如下: package com.infs.exam.process ...

  5. springmvc @Valid 接收实体类时出现bean为null的问题

    这是因为传到后端之后,全部以全小写形式处理了 所以前端也需要把name设置为全小写,否则后端不识别,导致接收不到,导致为null

  6. MySQLdb in Python: “Can't connect to MySQL server on 'localhost'”

    因为我使用的是win64,所以在此系统下,需要设置为 127.0.0.1 #coding=utf-8 import MySQLdb if __name__ == '__main__': # 打开数据库 ...

  7. BZOJ3174. [TJOI2013]拯救小矮人(dp)

    题目链接 https://www.lydsy.com/JudgeOnline/problem.php?id=3174 题解 其实此题并不需要那么多YY的部分. 我们考虑若干个小矮人逃出的顺序.若跳出的 ...

  8. abp core版本添加额外应用层

    1.新建类库WebProject.Application.App 2.添加WebProjectApplicationAppModule.cs 3.注册模块 using Abp.Application. ...

  9. python进程-进阶

    进程同步(multiprocessing.Lock(锁机制).multiprocessing.Semaphore(信号量机制).multiprocessing.Event(事件机制)) 在计算机中,有 ...

  10. phpquery 学习笔记

    phpQuery是一个基于PHP的服务端开源项目,它可以让PHP开发人员轻松处理DOM文档内容,比如获取某新闻网站的头条信息.更有意思的是,它采用了jQuery的思想,你可以像使用jQuery一样处理 ...