一、

参考 https://www.cnblogs.com/bovenson/p/5801536.html

[root@node- test]# chown hdfs:hdfs     /root/test/*
[root@node-1 test]# chown hdfs:hdfs /root/test
[root@node-1 test]# cd /var/lib/hadoop-hdfs/
[root@node-1 hadoop-hdfs]# ls
[root@node-1 hadoop-hdfs]# mkdir /var/lib/hadoop-hdfs/test
[root@node-1 hadoop-hdfs]# mv /root/test/* /var/lib/hadoop-hdfs/test
[root@node-1 hadoop-hdfs]# su hdfs
Attempting to create directory /var/lib/hadoop-hdfs/perl5
[hdfs@node-1 ~]$ hadoop fs -put test/*.txt /user/tt/me.txt
$ hadoop fs -ls -R /user/tt
-rw-r--r-- 3 hdfs supergroup 56302 2018-01-30 20:59 /user/tt/me.txt # spark-shell --master spark://node-1:7077 --executor-memory 100M --driver-memory 1000M Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_162)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc (master = spark://node-1:7077, app id = app-20180130211525-0002).
SQL context available as sqlContext. scala> val lines = sc.textFile("hdfs://node-1:8020/user/tt/me.txt")
lines: org.apache.spark.rdd.RDD[String] = hdfs://node-1:8020/user/tt/me.txt MapPartitionsRDD[1] at textFile at <console>:27 scala> lines.count()
res0: Long = 10000 scala> lines.first()
res1: String = 4 scala>

spark 文件行数统计-命令行

二、

备注:

# spark-shell  --help
Usage: ./bin/spark-shell [options] Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
--deploy-mode DEPLOY_MODE Whether to launch the driver program locally ("client") or
on one of the worker machines inside the cluster ("cluster")
(Default: client).
--class CLASS_NAME Your application's main class (for Java / Scala apps).
--name NAME A name of your application.
--jars JARS Comma-separated list of local jars to include on the driver
and executor classpaths.
--packages Comma-separated list of maven coordinates of jars to include
on the driver and executor classpaths. Will search the local
maven repo, then maven central and any additional remote
repositories given by --repositories. The format for the
coordinates should be groupId:artifactId:version.
--exclude-packages Comma-separated list of groupId:artifactId, to exclude while
resolving the dependencies provided in --packages to avoid
dependency conflicts.
--repositories Comma-separated list of additional remote repositories to
search for the maven coordinates given with --packages.
--py-files PY_FILES Comma-separated list of .zip, .egg, or .py files to place
on the PYTHONPATH for Python apps.
--files FILES Comma-separated list of files to be placed in the working
directory of each executor. --conf PROP=VALUE Arbitrary Spark configuration property.
--properties-file FILE Path to a file from which to load extra properties. If not
specified, this will look for conf/spark-defaults.conf. --driver-memory MEM Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
--driver-java-options Extra Java options to pass to the driver.
--driver-library-path Extra library path entries to pass to the driver.
--driver-class-path Extra class path entries to pass to the driver. Note that
jars added with --jars are automatically included in the
classpath. --executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G). --proxy-user NAME User to impersonate when submitting the application. --help, -h Show this help message and exit
--verbose, -v Print additional debug output
--version, Print the version of current Spark Spark standalone with cluster deploy mode only:
--driver-cores NUM Cores for driver (Default: ). Spark standalone or Mesos with cluster deploy mode only:
--supervise If given, restarts the driver on failure.
--kill SUBMISSION_ID If given, kills the driver specified.
--status SUBMISSION_ID If given, requests the status of the driver specified. Spark standalone and Mesos only:
--total-executor-cores NUM Total cores for all executors. Spark standalone and YARN only:
--executor-cores NUM Number of cores per executor. (Default: in YARN mode,
or all available cores on the worker in standalone mode) YARN-only:
--driver-cores NUM Number of cores used by the driver, only in cluster mode
(Default: ).
--queue QUEUE_NAME The YARN queue to submit to (Default: "default").
--num-executors NUM Number of executors to launch (Default: ).
--archives ARCHIVES Comma separated list of archives to be extracted into the
working directory of each executor.
--principal PRINCIPAL Principal to be used to login to KDC, while running on
secure HDFS.
--keytab KEYTAB The full path to the file that contains the keytab for the
principal specified above. This keytab will be copied to
the node running the Application Master via the Secure
Distributed Cache, for renewing the login tickets and the
delegation tokens periodically.

spark-shell --help

CDH spark 命令行测试的更多相关文章

  1. Linux命令行测试网速speedtest.net

    Linux命令行测试网速speedtest.net 当发现上网速度变慢时,人们通常会先首先测试自己的电脑到网络服务提供商(通常被称为"最后一公里")的网络连接速度.在可用于测试宽带 ...

  2. I.MX6 Android CAN 命令行测试

    /********************************************************************* * I.MX6 Android CAN 命令行测试 * 说 ...

  3. [转]使用Linux命令行测试网速

    装speedtest-cli speedtest-cli是一个用Python编写的轻量级Linux命令行工具,在Python2.4至3.4版本下均可运行.它基于Speedtest.net的基础架构来测 ...

  4. 使用Linux命令行测试网速

    安装speedtest speedtest是一个用Python编写的轻量级Linux命令行工具,在Python2.4至3.4版本下均可运行.它基于Speedtest.net的基础架构来测量网络的上/下 ...

  5. 使用Linux命令行测试网速-----speedtest-cli

    https://github.com/sivel/speedtest-cli 当发现上网速度变慢时,人们通常会先首先测试自己的电脑到网络服务提供商(通常被称为“最后一公里”)的网络连接速度.在可用于测 ...

  6. gtest命令行测试案例

    使用gtest编写的测试案例通常本身就是一个可执行文件,因此运行起来非常方便.同时,gtest也为我们提供了一系列的运行参数(环境变量.命令行参数或代码里指定),使得我们可以对案例的执行进行一些有效的 ...

  7. monkey命令行测试

    一. 什么是Monkey monkey是google提供的一个用于稳定性与压力测试的命令行工具.monkey程序由android系统自带,位于/sdcard/system/framework/monk ...

  8. Junit 命令行测试 报错:Could not find class 理解及解决方法

    一.报错 : 『Could not find class』 下面给出三个示例比较,其中只有第一个是正确的. 1. MyComputer:bin marikobayashi$ java -cp .:./ ...

  9. laravel 命令行测试 Uncaught ReflectionException: Class config does not exist

    require __DIR__ . '/vendor/autoload.php'; $app = require_once __DIR__ . '/bootstrap/app.php'; config ...

随机推荐

  1. 在配置tensorflow时踩的无数个坑

    在下午尝试配置tensorflow环境时,遇到了许多天坑,讲真的心态炸了好几次,特此写下这篇记录,希望能给看到朋友一点帮助. 先说一下这抓狂的一天的起因,比赛项目想用SVM进行一下数据分析,除了常规的 ...

  2. python在pycharm中导包一直出错的问题

    之前的net在code的子目录中,怎么调试都无法解决.最后用一个简单粗暴的方式,将net直接拿到项目的根目录中.如此即可

  3. java中int转成String位数不足前面补零

    java中int转成String位数不足前面补零 转载自:http://ych0108.iteye.com/blog/2174134 java中int转String位数不够前面补零 String.fo ...

  4. BZOJ 2946 [Poi2000]公共串 (二分+Hash/二分+后缀数组/后缀自动机)

    求多串的最长公共字串. 法1: 二分长度+hash 传送门 法2: 二分+后缀数组 传送门 法3: 后缀自动机 拿第一个串建自动机,然后用其他串在上面匹配.每次求出SAM上每个节点的最长匹配长度后,再 ...

  5. Vue:选中商品规格改变字体和边框颜色(默认选中第一种规格)

    效果图: CSS: <div class="label"> <p>标签类别</p> <ul> <li v-for=" ...

  6. Web上传超大文件解决方案

    文件上传下载,与传统的方式不同,这里能够上传和下载10G以上的文件.而且支持断点续传. 通常情况下,我们在网站上面下载的时候都是单个文件下载,但是在实际的业务场景中,我们经常会遇到客户需要批量下载的场 ...

  7. SDOI2019R2游记

    Day 0 上午到了济南,住在了山下.下午颓颓颓,zhy在玩炉石,我在...打元气!我的机器人终于不掉HP通关了呢,送的皮肤好好看啊. Day 1 到考场后,打开题面,一看第一题似乎很可做啊,好像可以 ...

  8. Liblinear Visual studio 2013 Error C3057

    使用LibLinear时编译时出现Error C3057的错误: OpenMP报错,查询MSDN: #pragma omp threadprivate(var)//var在编译之前必须是确定值 所以修 ...

  9. Suitable Replacement

    D. Suitable Replacement 这个题统计出 s 和 t 中的各字母个数以及"?"的个数,直接暴力即可,s中不足的字母可用 "?"来替代 这个题 ...

  10. 3.Linux系统文件名字体不同的颜色都代表什么

    在Linux中,文件的颜色都是有含义的.其中, Linux中文件名颜色不同,代表文件类型不一样.如下所示: 白色:表示普通文件浅蓝色:表示链接文件: 灰色:表示其他文件: 绿色:表示可执行文件: 红色 ...