See the previous post in this series for how to deploy the Mesos cluster.

The differences between running on Mesos and Spark standalone mode:

1) Standalone

You start the Spark master yourself.

You start the Spark slaves yourself (i.e. the workers that do the actual work).

2) Running on Mesos

Start the Mesos master.

Start the Mesos slaves.

Start Spark's dispatcher: ./sbin/start-mesos-dispatcher.sh -m mesos://127.0.0.1:5050

Configure the URI of a Spark binary package (what Mesos calls the EXECUTOR) so that Mesos slaves can download and run it.

The execution flow on Mesos:

1) Submit the job to the spark-mesos-dispatcher via spark-submit.

2) The spark-mesos-dispatcher submits the driver to the Mesos master and receives a task ID.

3) The Mesos master assigns the task to a slave, which executes it.

4) The spark-mesos-dispatcher polls the task status by its task ID.
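The flow above boils down to two commands. A sketch only: the dispatcher's submission port 7077 matches the spark-submit invocation used later in this post, and `<main>`, `<jar>`, `<args>` are placeholders. The commands are echoed rather than executed so the sketch runs anywhere:

```shell
# Sketch of the Mesos cluster-mode flow. Host and ports follow this post's
# setup: the Mesos master on 5050, the dispatcher's submission port on 7077.
MESOS_MASTER="mesos://10.230.136.197:5050"   # Mesos master
DISPATCHER="mesos://10.230.136.197:7077"     # MesosClusterDispatcher endpoint

# 1) start the dispatcher and register it with the Mesos master
echo "./sbin/start-mesos-dispatcher.sh -m $MESOS_MASTER"

# 2) submit the driver through the dispatcher (cluster deploy mode); the
#    dispatcher forwards it to the Mesos master, which picks a slave to run it
echo "./bin/spark-submit --master $DISPATCHER --deploy-mode cluster --class <main> <jar> <args>"
```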

Preliminary configuration:

1) spark-env.sh

Configure the Mesos native library and the Spark binary package the executors will run. (To save effort I used the official package here; the URI can be an hdfs or http location.)

# from the Spark installation directory
vim conf/spark-env.sh

# Options read by executors and drivers running inside the cluster
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program
# - SPARK_CLASSPATH, default classpath entries to append
# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data
# - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz

2) spark-defaults.conf

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.
# Example:
# spark.master                     spark://master:7077
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
# spark.serializer                 org.apache.spark.serializer.KryoSerializer
# spark.driver.memory              5g
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"

spark.executor.uri   http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
spark.master         mesos://10.230.136.197:5050

3) Modify the WordCount.java test

For the test program, see the earlier post in this series, 提交任务到spark (submitting jobs to the Spark master).

/**
 * Illustrates a wordcount in Java
 */
package com.oreilly.learningsparkexamples.mini.java;

import java.util.Arrays;
import java.util.List;
import java.lang.Iterable;

import scala.Tuple2;

import org.apache.commons.lang.StringUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;

public class WordCount {
  public static void main(String[] args) throws Exception {
    String inputFile = args[0];
    String outputFile = args[1];
    // Create a Java Spark Context.
    SparkConf conf = new SparkConf()
        .setMaster("mesos://10.230.136.197:5050")
        .setAppName("wordCount")
        .set("spark.executor.uri",
             "http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz");
    JavaSparkContext sc = new JavaSparkContext(conf);
    // Load our input data.
    JavaRDD<String> input = sc.textFile(inputFile);
    // Split up into words.
    JavaRDD<String> words = input.flatMap(
      new FlatMapFunction<String, String>() {
        public Iterable<String> call(String x) {
          return Arrays.asList(x.split(" "));
        }});
    // Transform into word and count.
    JavaPairRDD<String, Integer> counts = words.mapToPair(
      new PairFunction<String, String, Integer>() {
        public Tuple2<String, Integer> call(String x) {
          return new Tuple2<String, Integer>(x, 1);
        }}).reduceByKey(
      new Function2<Integer, Integer, Integer>() {
        public Integer call(Integer x, Integer y) { return x + y; }});
    // Save the word count back out to a text file, causing evaluation.
    counts.saveAsTextFile(outputFile);
  }
}

Go into examples/mini-complete-example and rebuild the jar.
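The rebuild is a single Maven invocation, assuming the mini example keeps the standard Maven layout of the Learning Spark repository. The command is echoed here rather than run, so the sketch works without the project checked out:

```shell
# Hypothetical: rebuild the example jar with Maven.
PROJECT=examples/mini-complete-example
echo "mvn -f $PROJECT/pom.xml clean package"
# the jar then lands in $PROJECT/target/learning-spark-mini-example-0.0.1.jar
```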

4) Start the spark-mesos-dispatcher

./sbin/start-mesos-dispatcher.sh -m mesos://10.230.136.197:5050
Spark Command: /app/otter/jdk1..0_80/bin/java -cp /home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./sbin/../conf/:/home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./lib/spark-assembly-1.5.-hadoop2.6.0.jar:/home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./lib/datanucleus-core-3.2..jar:/home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./lib/datanucleus-rdbms-3.2..jar:/home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./lib/datanucleus-api-jdo-3.2..jar -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.mesos.MesosClusterDispatcher --host vg-log-analysis-prod --port  -m mesos://10.230.136.197:5050
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
// :: INFO MesosClusterDispatcher: Registered signal handlers for [TERM, HUP, INT]
// :: WARN Utils: Your hostname, vg-log-analysis-prod resolves to a loopback address: 127.0.0.1; using 10.230.136.197 instead (on interface eth0)
// :: WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
// :: INFO MesosClusterDispatcher: Recovery mode in Mesos dispatcher set to: NONE
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: INFO SecurityManager: Changing view acls to: qingpingzhang
// :: INFO SecurityManager: Changing modify acls to: qingpingzhang
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(qingpingzhang); users with modify permissions: Set(qingpingzhang)
// :: INFO SecurityManager: Changing view acls to: qingpingzhang
// :: INFO SecurityManager: Changing modify acls to: qingpingzhang
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(qingpingzhang); users with modify permissions: Set(qingpingzhang)
// :: INFO Utils: Successfully started service on port .
// :: INFO MesosClusterUI: Started MesosClusterUI at http://10.230.136.197:8081
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1105 ::37.927594 sched.cpp:]
**************************************************
Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address.
**************************************************
I1105 03:31:37.931098 3408 sched.cpp:164] Version: 0.24.0
I1105 03:31:37.939507 3406 sched.cpp:262] New master detected at master@10.230.136.197:5050
I1105 03:31:37.940353 3406 sched.cpp:272] No credentials provided. Attempting to register without authentication
I1105 03:31:37.943528 3406 sched.cpp:640] Framework registered with 20151105-021937-16777343-5050-32543-0001
15/11/05 11:31:37 INFO MesosClusterScheduler: Registered as framework ID 20151105-021937-16777343-5050-32543-0001

// :: INFO Utils: Successfully started service on port .
// :: INFO MesosRestServer: Started REST server for submitting applications on port

5) Submit the job

./bin/spark-submit  --master mesos://10.230.136.197:7077 --deploy-mode cluster --class com.oreilly.learningsparkexamples.mini.java.WordCount  /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar  /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/README.md /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/wordcounts.txt

Then in the mesos-master output we can see it receive the job, send it to a slave, and finally update the task status:

I1105 05:08:33.312283  7490 master.cpp:2094] Received SUBSCRIBE call for framework 'Spark Cluster' at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
I1105 ::33.312641 master.cpp:] Subscribing framework Spark Cluster with checkpointing enabled and capabilities [ ]
I1105 ::33.313761 hierarchical.hpp:] Added framework -----
I1105 05:08:33.315335 7490 master.cpp:4613] Sending 1 offers to framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392

I1105 ::33.426009 master.cpp:] Processing ACCEPT call for offers: [ -----O0 ] on slave -----S0 at slave()@10.29.23.28: (ip----.ec2.internal) for framework --$
--- (Spark Cluster) at scheduler-fae58488-7d56--a078-938d12871930@10.230.136.197:
I1105 ::33.427104 hierarchical.hpp:] Recovered cpus(*):; mem(*):; disk(*):; ports(*):[-] (total: cpus(*):; mem(*):; disk(*):; ports(*):[-], allocated: ) on slave -----S0 from framework
-----
I1105 ::39.177790 master.cpp:] Sending offers to framework ----- (Spark Cluster) at scheduler-fae58488-7d56--a078-938d12871930@10.230.136.197:
I1105 ::39.181149 master.cpp:] Processing ACCEPT call for offers: [ -----O1 ] on slave -----S0 at slave()@10.29.23.28: (ip----.ec2.internal) for framework --$
--- (Spark Cluster) at scheduler-fae58488-7d56--a078-938d12871930@10.230.136.197:
I1105 ::39.181699 hierarchical.hpp:] Recovered cpus(*):; mem(*):; disk(*):; ports(*):[-] (total: cpus(*):; mem(*):; disk(*):; ports(*):[-], allocated: ) on slave -----S0 from framework
-----
I1105 ::44.183100 master.cpp:] Sending offers to framework ----- (Spark Cluster) at scheduler-fae58488-7d56--a078-938d12871930@10.230.136.197:
I1105 ::44.186468 master.cpp:] Processing ACCEPT call for offers: [ -----O2 ] on slave -----S0 at slave()@10.29.23.28: (ip----.ec2.internal) for framework --$
--- (Spark Cluster) at scheduler-fae58488-7d56--a078-938d12871930@10.230.136.197:
I1105 ::44.187100 hierarchical.hpp:] Recovered cpus(*):; mem(*):; disk(*):; ports(*):[-] (total: cpus(*):; mem(*):; disk(*):; ports(*):[-], allocated: ) on slave -----S0 from framework
----- I1105 ::18.668609 master.cpp:] Status update TASK_FAILED (UUID: 8d30c637-b885-487b-b174-47232cc0e49f) for task driver-- of framework ----- from slave -----S0 at slave()@
.29.23.: (ip----.ec2.internal)
I1105 ::18.668689 master.cpp:] Forwarding status update TASK_FAILED (UUID: 8d30c637-b885-487b-b174-47232cc0e49f) for task driver-- of framework -----
I1105 05:12:18.669001 7489 master.cpp:5576] Updating the latest state of task driver-20151105131213-0002 of framework 20151105-050710-3314083338-5050-7469-0000
to TASK_FAILED
I1105 ::18.669373 hierarchical.hpp:] Recovered cpus(*):; mem(*): (total: cpus(*):; mem(*):; disk(*):; ports(*):[-], allocated: ) on slave -----S0 from framework ----- I1105 ::18.670912 master.cpp:] Removing task driver-- with resources cpus(*):; mem(*): of framework ----- on slave -----S0 at slave()@10.29.23.28: (ip----
.ec2.internal)

The mesos-slave output: it receives the task, fails to download the Spark executables, and the task ends in failure:

1105 05:11:31.363765 17084 slave.cpp:1270] Got assigned task driver-20151105131130-0001 for framework 20151105-050710-3314083338-5050-7469-0000
I1105 05:11:31.365025 17084 slave.cpp:1386] Launching task driver-20151105131130-0001 for framework 20151105-050710-3314083338-5050-7469-0000
I1105 05:11:31.376075 17084 slave.cpp:4852] Launching executor driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/$
0151105-050710-3314083338-5050-7469-0000/executors/driver-20151105131130-0001/runs/461ceb14-9247-4a59-b9d4-4b0a7947e353'
I1105 05:11:31.376448 17085 containerizer.cpp:640] Starting container '461ceb14-9247-4a59-b9d4-4b0a7947e353' for executor 'driver-20151105131130-0001' of framework '20151105-050710-3314083338-5050-7469-0000'
I1105 05:11:31.376878 17084 slave.cpp:1604] Queuing task 'driver-20151105131130-0001' for executor driver-20151105131130-0001 of framework '20151105-050710-3314083338-5050-7469-0000
I1105 ::31.379096 linux_launcher.cpp:] Cloning child process with flags =
I1105 ::31.382968 containerizer.cpp:] Checkpointing executor's forked pid 17098 to '/tmp/mesos/meta/slaves/-----S0/frameworks/-----/executors/driver--/runs/461ceb14--4a$
-b9d4-4b0a7947e353/pids/forked.pid'
E1105 05:11:31.483093 17078 fetcher.cpp:515] Failed to run mesos-fetcher: Failed to fetch all URIs for container '461ceb14-9247-4a59-b9d4-4b0a7947e353' with exit status: 256
E1105 ::31.483355 slave.cpp:] Container '461ceb14-9247-4a59-b9d4-4b0a7947e353' for executor 'driver-20151105131130-0001' of framework '20151105-050710-3314083338-5050-7469-0000' failed to start: Failed to fetch all URIs for container '461ceb14-9247-4a59-b9d$
-4b0a7947e353' with exit status: 256
I1105 ::31.483444 containerizer.cpp:] Destroying container '461ceb14-9247-4a59-b9d4-4b0a7947e353'
I1105 ::31.485548 cgroups.cpp:] Freezing cgroup /sys/fs/cgroup/freezer/mesos/461ceb14--4a59-b9d4-4b0a7947e353
I1105 ::31.487112 cgroups.cpp:] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/461ceb14--4a59-b9d4-4b0a7947e353 after .48992ms
I1105 ::31.488673 cgroups.cpp:] Thawing cgroup /sys/fs/cgroup/freezer/mesos/461ceb14--4a59-b9d4-4b0a7947e353
I1105 ::31.490102 cgroups.cpp:] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/461ceb14--4a59-b9d4-4b0a7947e353 after .363968ms
I1105 ::31.583328 containerizer.cpp:] Executor for container '461ceb14-9247-4a59-b9d4-4b0a7947e353' has exited
I1105 ::31.583977 slave.cpp:] Executor 'driver-20151105131130-0001' of framework ----- exited with status
I1105 ::31.585134 slave.cpp:] Handling status update TASK_FAILED (UUID: 0c429164-a159-4e72--c5fbd2ef9004) for task driver-- of framework ----- from @0.0.0.0:
W1105 ::31.585384 containerizer.cpp:] Ignoring update for unknown container: 461ceb14--4a59-b9d4-4b0a7947e353
I1105 ::31.585605 status_update_manager.cpp:] Received status update TASK_FAILED (UUID: 0c429164-a159-4e72--c5fbd2ef9004) for task driver-- of framework -----
I1105 ::31.585911 status_update_manager.cpp:] Checkpointing UPDATE for status update TASK_FAILED (UUID: 0c429164-a159-4e72--c5fbd2ef9004) for task driver-- of framework -----
I1105 05:11:31.596305 17081 slave.cpp:3016] Forwarding the update TASK_FAILED (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000 to master@10.230.136.197:5050

I1105 ::31.611620 status_update_manager.cpp:] Received status update acknowledgement (UUID: 0c429164-a159-4e72--c5fbd2ef9004) for task driver-- of framework -----
I1105 ::31.611702 status_update_manager.cpp:] Checkpointing ACK for status update TASK_FAILED (UUID: 0c429164-a159-4e72--c5fbd2ef9004) for task driver-- of framework -----
I1105 ::31.616345 slave.cpp:] Cleaning up executor 'driver-20151105131130-0001' of framework -----

So how do we fix this?

Looking at the run logs on the Mesos slave (by default under /tmp/mesos/slaves/), it turns out this works differently from standalone mode. In standalone mode, the Spark master starts an HTTP server and serves the jar to the Spark workers for download and execution. In Mesos mode the submitter does not serve the jar at all: each slave fetches it itself, so the jar has to live at a location the slaves can reach, such as an hdfs:// or http:// URI. A plain local path only works if the file already exists on the slave, which frankly feels a bit unreasonable.
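The rule can be illustrated with a toy function. This is not the real mesos-fetcher code, just an assumption-laden sketch of how the fetcher dispatches on the URI scheme, which explains the stderr below:

```shell
# Illustrative only: roughly how the Mesos fetcher treats task URIs.
classify_uri() {
  case "$1" in
    hdfs://*|http://*|https://*|ftp://*)
      echo "downloaded into the sandbox" ;;
    /*)
      echo "copied with cp on the slave (must already exist locally)" ;;
    *)
      echo "unsupported" ;;
  esac
}

classify_uri "/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar"
classify_uri "http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz"
```

The first call lands in the `cp` branch — which is exactly the `cp: cannot stat` failure in the log that follows.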

/tmp/mesos/slaves/-----S0/frameworks/-----/executors/driver--/runs/bed1c620-e849--b130-c95c47133599$ cat stderr
I1105 ::17.387609 fetcher.cpp:] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151105-045733-3314083338-5050-7152-S0\/qingpingzhang","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"\/home\/qingpingzhang\/dev\/spark-1.5.1-bin-hadoop2.6\/examples\/mini-complete-example\/target\/learning-spark-mini-example-0.0.1.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"http:\/\/d3kbcqa49mib13.cloudfront.net\/spark-1.5.1-bin-hadoop2.6.tgz"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20151105-045733-3314083338-5050-7152-S0\/frameworks\/20151105-070418-3314083338-5050-12075-0000\/executors\/driver-20151105151012-0001\/runs\/bed1c620-e849-4076-b130-c95c47133599","user":"qingpingzhang"}
I1105 07:10:17.390316 17897 fetcher.cpp:369] Fetching URI '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar'
I1105 07:10:17.390344 17897 fetcher.cpp:243] Fetching directly into the sandbox directory
I1105 07:10:17.390384 17897 fetcher.cpp:180] Fetching URI '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar'
I1105 07:10:17.390418 17897 fetcher.cpp:160] Copying resource with command:cp '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar' '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105151012-0001/runs/bed1c620-e849-4076-b130-c95c47133599/learning-spark-mini-example-0.0.1.jar'
cp: cannot stat ‘/home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./examples/mini-complete-example/target/learning-spark-mini-example-0.0..jar’: No such file or directory
Failed to fetch '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar': Failed to copy with command 'cp '/home/qingpingzhang/dev/spark-1.5.-bin-hadoop2./examples/mini-complete-example/target/learning-spark-mini-example-0.0..jar' '/tmp/mesos/slaves/-----S0/frameworks/-----/executors/driver--/runs/bed1c620-e849--b130-c95c47133599/learning-spark-mini-example-0.0..jar'', exit status:
Failed to synchronize with slave (it's probably exited)

Fine — to get the test to pass for now, copy the jar and the README.md file over to the Mesos slave machine.

./bin/spark-submit  --master mesos://10.230.136.197:7077 --deploy-mode cluster --class com.oreilly.learningsparkexamples.mini.java.WordCount  /tmp/learning-spark-mini-example-0.0.1.jar  /tmp/README.md /tmp/wordcounts.txt

Sure enough, the job now succeeds. The mesos-slave output:

I1105 ::26.748515  slave.cpp:] Got assigned task driver-- for framework -----
I1105 ::26.749575 gc.cpp:] Unscheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' from gc
I1105 ::26.749703 gc.cpp:] Unscheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' from gc
I1105 ::26.749825 slave.cpp:] Launching task driver-- for framework -----
I1105 ::26.760673 slave.cpp:] Launching executor driver-- of framework ----- with resources cpus(*):0.1; mem(*): in work directory '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks$
-----/executors/driver--/runs/910a4983--41dd-b014-66827c044c16'
I1105 ::26.760967 containerizer.cpp:] Starting container '910a4983-2732-41dd-b014-66827c044c16' for executor 'driver-20151105154326-0002' of framework '20151105-070418-3314083338-5050-12075-0000'
I1105 ::26.761265 slave.cpp:] Queuing task 'driver-20151105154326-0002' for executor driver-- of framework '20151105-070418-3314083338-5050-12075-0000
I1105 ::26.763134 linux_launcher.cpp:] Cloning child process with flags =
I1105 ::26.766726 containerizer.cpp:] Checkpointing executor's forked pid 18129 to '/tmp/mesos/meta/slaves/-----S0/frameworks/-----/executors/driver--/runs/910a4983--$dd-b014-66827c044c16/pids/forked.pid'
I1105 ::33.153153 slave.cpp:] Got registration for executor 'driver-20151105154326-0002' of framework ----- from executor()@10.29.23.28:
I1105 07:43:33.154284 17246 slave.cpp:1760] Sending queued task 'driver-20151105154326-0002' to executor 'driver-20151105154326-0002' of framework 20151105-070418-3314083338-5050-12075-0000
I1105 07:43:33.160464 17243 slave.cpp:2717] Handling status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 from executor(1)@10.29.23.28:54580
I1105 07:43:33.160643 17242 status_update_manager.cpp:322] Received status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000

I1105 ::33.160940 status_update_manager.cpp:] Checkpointing UPDATE for status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-- of framework -----
I1105 ::33.168092 slave.cpp:] Forwarding the update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-- of framework ----- to master@10.230.136.197:
I1105 ::33.168218 slave.cpp:] Sending acknowledgement for status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-- of framework ----- to executor()@10.29.23.28:
I1105 ::33.171906 status_update_manager.cpp:] Received status update acknowledgement (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-- of framework -----
I1105 ::33.172025 status_update_manager.cpp:] Checkpointing ACK for status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-- of framework -----
I1105 ::36.454344 slave.cpp:] Current disk usage 1.10%. Max allowed age: .223128215149259days
I1105 ::39.174698 slave.cpp:] Got assigned task for framework -----
I1105 ::39.175014 slave.cpp:] Launching task for framework -----
I1105 ::39.185343 slave.cpp:] Launching executor -----S0 of framework ----- with resources cpus(*):; mem(*): in work directory '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-$
/frameworks/-----/executors/-----S0/runs/8b233e11-54a6-41b6-a8ad-f82660d640b5'
I1105 ::39.185643 slave.cpp:] Queuing task '' for executor -----S0 of framework '20151105-070418-3314083338-5050-12075-0001
I1105 ::39.185694 containerizer.cpp:] Starting container '8b233e11-54a6-41b6-a8ad-f82660d640b5' for executor '20151105-045733-3314083338-5050-7152-S0' of framework '20151105-070418-3314083338-5050-12075-0001'
I1105 ::39.185786 slave.cpp:] Got assigned task for framework -----
I1105 ::39.185931 slave.cpp:] Launching task for framework -----
I1105 ::39.185968 slave.cpp:] Queuing task '' for executor -----S0 of framework '20151105-070418-3314083338-5050-12075-0001
I1105 ::39.187925 linux_launcher.cpp:] Cloning child process with flags =
I1105 ::46.388809 slave.cpp:] Got registration for executor '20151105-045733-3314083338-5050-7152-S0' of framework ----- from executor()@10.29.23.28:
I1105 ::46.389571 slave.cpp:] Sending queued task '' to executor '20151105-045733-3314083338-5050-7152-S0' of framework -----
I1105 ::46.389883 slave.cpp:] Sending queued task '' to executor '20151105-045733-3314083338-5050-7152-S0' of framework -----
I1105 ::49.534858 slave.cpp:] Handling status update TASK_RUNNING (UUID: b6d77a8f-ef6f-414b-9bdc-0fa1c2a96841) for task of framework ----- from executor()@10.29.23.28:
I1105 ::49.535087 slave.cpp:] Handling status update TASK_RUNNING (UUID: 6633c11b-c403-45c6-82c5-f495fcc9ae70) for task of framework ----- from executor()@10.29.23.28:
#......更多日志....
I1105 ::53.012852 slave.cpp:] Sending acknowledgement for status update TASK_FINISHED (UUID: 72c9805d-ea9d-49c1-ac2a-5c3940aa77f5) for task of framework ----- to executor()@10.29.23.28:
I1105 ::53.016926 status_update_manager.cpp:] Received status update acknowledgement (UUID: 5f175f09-2a35-46dd-9ac9-fbe648d9780a) for task of framework -----
I1105 ::53.017357 status_update_manager.cpp:] Received status update acknowledgement (UUID: 72c9805d-ea9d-49c1-ac2a-5c3940aa77f5) for task of framework -----
I1105 ::53.146410 slave.cpp:] Asked to shut down framework ----- by master@10.230.136.197:
I1105 ::53.146461 slave.cpp:] Shutting down framework -----
I1105 ::53.146515 slave.cpp:] Shutting down executor '20151105-045733-3314083338-5050-7152-S0' of framework -----
I1105 ::53.637172 slave.cpp:] Handling status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-- of framework ----- from executor()@10.29.23.28:
I1105 ::53.637706 status_update_manager.cpp:] Received status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-- of framework -----
I1105 07:43:53.637755 17246 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
I1105 07:43:53.643517 17242 slave.cpp:3016] Forwarding the update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 to master@10.230.136.197:5050

I1105 ::53.643635 slave.cpp:] Sending acknowledgement for status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-- of framework ----- to executor()@10.29.23.28:
I1105 ::53.647647 status_update_manager.cpp:] Received status update acknowledgement (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-- of framework -----
I1105 ::53.647703 status_update_manager.cpp:] Checkpointing ACK for status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-- of framework -----
I1105 ::54.689678 containerizer.cpp:] Executor for container '910a4983-2732-41dd-b014-66827c044c16' has exited
I1105 ::54.689708 containerizer.cpp:] Destroying container '910a4983-2732-41dd-b014-66827c044c16'
I1105 ::54.691368 cgroups.cpp:] Freezing cgroup /sys/fs/cgroup/freezer/mesos/910a4983--41dd-b014-66827c044c16
I1105 ::54.693023 cgroups.cpp:] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/910a4983--41dd-b014-66827c044c16 after .624064ms
I1105 ::54.694628 cgroups.cpp:] Thawing cgroup /sys/fs/cgroup/freezer/mesos/910a4983--41dd-b014-66827c044c16
I1105 ::54.695976 cgroups.cpp:] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/910a4983--41dd-b014-66827c044c16 after 1312us
I1105 ::54.697335 slave.cpp:] Executor 'driver-20151105154326-0002' of framework ----- exited with status
I1105 ::54.697371 slave.cpp:] Cleaning up executor 'driver-20151105154326-0002' of framework -----
I1105 ::54.697621 gc.cpp:] Scheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002/runs/910a4983-2732-41dd-b014-66827c044c16' for gc .99999192621333days i
n the future
I1105 ::54.697669 gc.cpp:] Scheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002' for gc .99999192552296days in the future
I1105 ::54.697700 gc.cpp:] Scheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002/runs/910a4983-2732-41dd-b014-66827c044c16' for gc 6.99999192509333d
ays in the future
I1105 ::54.697713 slave.cpp:] Cleaning up framework -----
I1105 ::54.697726 gc.cpp:] Scheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002' for gc .99999192474667days in the future
I1105 ::54.697756 status_update_manager.cpp:] Closing status update streams for framework -----
I1105 ::54.697813 gc.cpp:] Scheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' for gc .99999192391407days in the future
I1105 ::54.697856 gc.cpp:] Scheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' for gc .99999192340148days in the future
I1105 ::58.147456 slave.cpp:] Killing executor '20151105-045733-3314083338-5050-7152-S0' of framework -----
I1105 ::58.147546 containerizer.cpp:] Destroying container '8b233e11-54a6-41b6-a8ad-f82660d640b5'
I1105 ::58.149194 cgroups.cpp:] Freezing cgroup /sys/fs/cgroup/freezer/mesos/8b233e11-54a6-41b6-a8ad-f82660d640b5
I1105 ::58.200407 containerizer.cpp:] Executor for container '8b233e11-54a6-41b6-a8ad-f82660d640b5' has exited

And the word counts duly appear under /tmp/wordcounts.txt/:

ll /tmp/wordcounts.txt/
total
drwxr-xr-x qingpingzhang qingpingzhang Nov : ./
drwxrwxrwt root root Nov : ../
-rw-r--r-- qingpingzhang qingpingzhang Nov : part-
-rw-r--r-- qingpingzhang qingpingzhang Nov : .part-.crc
-rw-r--r-- qingpingzhang qingpingzhang Nov : part-
-rw-r--r-- qingpingzhang qingpingzhang Nov : .part-.crc
-rw-r--r-- qingpingzhang qingpingzhang Nov : _SUCCESS
-rw-r--r-- qingpingzhang qingpingzhang Nov : ._SUCCESS.crc
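saveAsTextFile writes one part-NNNNN file per partition, each line being the toString of a (word, count) Tuple2, plus an empty _SUCCESS marker. A self-contained sketch of that on-disk layout, with made-up words and counts:

```shell
# Fabricated sample data mimicking the output layout of saveAsTextFile
# for a JavaPairRDD<String, Integer>.
dir=$(mktemp -d)
printf '(the,21)\n(Spark,14)\n' > "$dir/part-00000"
printf '(of,10)\n(a,8)\n'       > "$dir/part-00001"
: > "$dir/_SUCCESS"             # empty marker file written on success

cat "$dir"/part-*
# prints:
# (the,21)
# (Spark,14)
# (of,10)
# (a,8)
```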

With that, running Spark on the Mesos framework works end to end.

(I have not included screenshots of the Mesos and Spark web UIs here.)

To summarize:

1) Set up the Mesos cluster.

2) Install Spark and start its mesos-dispatcher (connected to the Mesos master).

3) Submit jobs with spark-submit. (When submitting, the jar must be placed somewhere the slaves can download it from, e.g. hdfs or http.)
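For step 3, a cleaner fix than copying files onto each slave is to stage the jar at a URI every slave can fetch. A hedged sketch — the namenode address and HDFS directory are hypothetical, and the commands are echoed rather than executed:

```shell
# Hypothetical HDFS location; adjust the namenode address and path.
JAR=learning-spark-mini-example-0.0.1.jar
HDFS_DIR="hdfs://namenode:8020/spark-jars"

# stage the jar once...
echo "hdfs dfs -put target/$JAR $HDFS_DIR/"
# ...then every submission references a URI the slaves can download themselves
echo "./bin/spark-submit --master mesos://10.230.136.197:7077 --deploy-mode cluster --class com.oreilly.learningsparkexamples.mini.java.WordCount $HDFS_DIR/$JAR <input> <output>"
```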

-----------------

The main benefit of Mesos is that it can launch and allocate tasks dynamically, making the most of the machines' resources.

Since our log analysis runs on just two machines, we plan to stick with Spark's standalone mode.
