grunt> cat t.txt
kw1 2
kw3 1
kw2 4
kw1 5
kw2 2
grunt> cat test.pig
A = LOAD '/user/input/t.txt' as (k:chararray,c:int);
B = group A BY k;
C = foreach B generate group,SUM(A.c);
-- DUMP C;
store C into 'test.output';
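
For reference, the computation that test.pig describes can be mimicked in plain Python. This is only a sketch of the GROUP/SUM semantics on the five sample rows of t.txt, not how Pig executes the script; the names `ROWS` and `group_and_sum` are made up for illustration.

```python
from collections import defaultdict

# The five rows of t.txt as (k, c) pairs, matching the schema (k:chararray, c:int).
ROWS = [("kw1", 2), ("kw3", 1), ("kw2", 4), ("kw1", 5), ("kw2", 2)]


def group_and_sum(rows):
    """Mimic `B = group A BY k; C = foreach B generate group, SUM(A.c);`."""
    totals = defaultdict(int)
    for key, count in rows:
        totals[key] += count
    # Sort by key only to make the output deterministic in this sketch;
    # Pig itself gives no ordering guarantee without an ORDER BY.
    return sorted(totals.items())


if __name__ == "__main__":
    for key, total in group_and_sum(ROWS):
        print(f"({key},{total})")
```

The kw1 total of 7 agrees with the row that `illustrate` reports for relation C further down.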

$ pig -e 'illustrate -script test.pig'

2014-05-03 17:11:25,182 [main] INFO  org.apache.pig.Main - Logging error messages to: /opt/dataset/pig_1399108285179.log

2014-05-03 17:11:25,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000

2014-05-03 17:11:25,514 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001

2014-05-03 17:11:26,103 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000

2014-05-03 17:11:26,104 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001

2014-05-03 17:11:26,291 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,305 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,306 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,315 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,474 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1

2014-05-03 17:11:26,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1

2014-05-03 17:11:26,513 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,520 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,521 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,521 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,522 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,523 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,531 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,597 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,599 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,599 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,600 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,601 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,601 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,608 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,611 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,611 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,639 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,642 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,642 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,643 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,643 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,650 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,652 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,652 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,677 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,679 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,679 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,680 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,680 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,681 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,686 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,688 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,688 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,710 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,712 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,712 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,713 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,714 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,714 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,721 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,724 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,724 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,744 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,746 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,746 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,747 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,747 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,748 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,754 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,757 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,757 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,772 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,774 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,774 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,775 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,775 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,776 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,782 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,784 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,784 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,804 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,806 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,806 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,807 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,807 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,808 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,812 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,821 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,821 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

(kw1,2)

2014-05-03 17:11:26,840 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,842 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,842 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,842 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,843 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,843 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,846 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,849 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,849 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,863 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,863 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,864 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,864 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,865 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,868 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,870 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,870 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,882 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,884 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,884 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,884 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,885 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,885 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,887 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,889 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,890 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,901 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,904 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,904 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,906 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,908 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,908 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,919 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,920 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,921 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,922 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,926 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,926 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,937 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,938 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,938 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,938 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,939 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,939 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,941 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,954 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,955 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,955 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,956 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,956 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,956 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,959 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,961 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,961 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

2014-05-03 17:11:26,973 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false

2014-05-03 17:11:26,974 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1

2014-05-03 17:11:26,974 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1

2014-05-03 17:11:26,974 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.

2014-05-03 17:11:26,975 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job

2014-05-03 17:11:26,975 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-05-03 17:11:26,978 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job

2014-05-03 17:11:26,980 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30

2014-05-03 17:11:26,980 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1

-------------------------------------
| A     | k:chararray    | c:int    |
-------------------------------------
|       | kw1            | 2        |
|       | kw1            | 5        |
-------------------------------------
-----------------------------------------------------------------------------
| B     | group:chararray    | A:bag{:tuple(k:chararray,c:int)}             |
-----------------------------------------------------------------------------
|       | kw1                | {(kw1, 2), (kw1, 5)}                         |
-----------------------------------------------------------------------------
-----------------------------------------
| C     | group:chararray    | :long    |
-----------------------------------------
|       | kw1                | 7        |
-----------------------------------------
-------------------------------------------------
| Store : C     | group:chararray    | :long    |
-------------------------------------------------
|               | kw1                | 7        |
-------------------------------------------------



$ pig -e 'explain -script test.pig'
2014-05-03 17:19:59,359 [main] INFO  org.apache.pig.Main - Logging error messages to: /opt/dataset/pig_1399108799355.log
2014-05-03 17:19:59,497 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:19:59,685 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
C: (Name: LOStore Schema: group#19:chararray,#34:long)
|
|---C: (Name: LOForEach Schema: group#19:chararray,#34:long)
    |   |
    |   (Name: LOGenerate[false,false] Schema: group#19:chararray,#34:long)ColumnPrune:InputUids=[19, 30]ColumnPrune:OutputUids=[34, 19]
    |   |   |
    |   |   group:(Name: Project Type: chararray Uid: 19 Input: 0 Column: (*))
    |   |   |
    |   |   (Name: UserFunc(org.apache.pig.builtin.IntSum) Type: long Uid: 34)
    |   |   |
    |   |   |---(Name: Dereference Type: bag Uid: 33 Column:[1])
    |   |       |
    |   |       |---A:(Name: Project Type: bag Uid: 30 Input: 1 Column: (*))
    |   |
    |   |---(Name: LOInnerLoad[0] Schema: group#19:chararray)
    |   |
    |   |---A: (Name: LOInnerLoad[1] Schema: k#19:chararray,c#20:int)
    |
    |---B: (Name: LOCogroup Schema: group#19:chararray,A#30:bag{#37:tuple(k#19:chararray,c#20:int)})
        |   |
        |   k:(Name: Project Type: chararray Uid: 19 Input: 0 Column: 0)
        |
        |---A: (Name: LOForEach Schema: k#19:chararray,c#20:int)
            |   |
            |   (Name: LOGenerate[false,false] Schema: k#19:chararray,c#20:int)ColumnPrune:InputUids=[19, 20]ColumnPrune:OutputUids=[19, 20]
            |   |   |
            |   |   (Name: Cast Type: chararray Uid: 19)
            |   |   |
            |   |   |---k:(Name: Project Type: bytearray Uid: 19 Input: 0 Column: (*))
            |   |   |
            |   |   (Name: Cast Type: int Uid: 20)
            |   |   |
            |   |   |---c:(Name: Project Type: bytearray Uid: 20 Input: 1 Column: (*))
            |   |
            |   |---(Name: LOInnerLoad[0] Schema: k#19:bytearray)
            |   |
            |   |---(Name: LOInnerLoad[1] Schema: c#20:bytearray)
            |
            |---A: (Name: LOLoad Schema: k#19:bytearray,c#20:bytearray)RequiredFields:null
#-----------------------------------------------
# Physical Plan:
#-----------------------------------------------
C: Store(hdfs://namenode:9000/user/deve_test_user/test.output:org.apache.pig.builtin.PigStorage) - scope-19
|
|---C: New For Each(false,false)[bag] - scope-18
    |   |
    |   Project[chararray][0] - scope-12
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum)[long] - scope-16
    |   |
    |   |---Project[bag][1] - scope-15
    |       |
    |       |---Project[bag][1] - scope-14
    |
    |---B: Package[tuple]{chararray} - scope-9
        |
        |---B: Global Rearrange[tuple] - scope-8
            |
            |---B: Local Rearrange[tuple]{chararray}(false) - scope-10
                |   |
                |   Project[chararray][0] - scope-11
                |
                |---A: New For Each(false,false)[bag] - scope-7
                    |   |
                    |   Cast[chararray] - scope-2
                    |   |
                    |   |---Project[bytearray][0] - scope-1
                    |   |
                    |   Cast[int] - scope-5
                    |   |
                    |   |---Project[bytearray][1] - scope-4
                    |
                    |---A: Load(/user/input/t.txt:org.apache.pig.builtin.PigStorage) - scope-0
2014-05-03 17:20:00,316 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:20:00,326 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2014-05-03 17:20:00,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:20:00,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
#--------------------------------------------------
# Map Reduce Plan                                  
#--------------------------------------------------
MapReduce node scope-20
Map Plan
B: Local Rearrange[tuple]{chararray}(false) - scope-33
|   |
|   Project[chararray][0] - scope-34
|
|---C: New For Each(false,false)[bag] - scope-21
    |   |
    |   Project[chararray][0] - scope-22
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum$Initial)[tuple] - scope-23
    |   |
    |   |---Project[bag][1] - scope-24
    |       |
    |       |---Project[bag][1] - scope-25
    |
    |---Pre Combiner Local Rearrange[tuple]{Unknown} - scope-35
        |
        |---A: New For Each(false,false)[bag] - scope-7
            |   |
            |   Cast[chararray] - scope-2
            |   |
            |   |---Project[bytearray][0] - scope-1
            |   |
            |   Cast[int] - scope-5
            |   |
            |   |---Project[bytearray][1] - scope-4
            |
            |---A: Load(/user/input/t.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
Combine Plan
B: Local Rearrange[tuple]{chararray}(false) - scope-37
|   |
|   Project[chararray][0] - scope-38
|
|---C: New For Each(false,false)[bag] - scope-26
    |   |
    |   Project[chararray][0] - scope-27
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum$Intermediate)[tuple] - scope-28
    |   |
    |   |---Project[bag][1] - scope-29
    |
    |---POCombinerPackage[tuple]{chararray} - scope-31--------
Reduce Plan
C: Store(hdfs://namenode:9000/user/deve_test_user/test.output:org.apache.pig.builtin.PigStorage) - scope-19
|
|---C: New For Each(false,false)[bag] - scope-18
    |   |
    |   Project[chararray][0] - scope-12
    |   |
    |   POUserFunc(org.apache.pig.builtin.IntSum$Final)[long] - scope-16
    |   |
    |   |---Project[bag][1] - scope-30
    |
    |---POCombinerPackage[tuple]{chararray} - scope-39--------
Global sort: false
----------------
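
The Map, Combine and Reduce plans above call IntSum$Initial, IntSum$Intermediate and IntSum$Final respectively, because the optimizer chose to move the algebraic foreach to the combiner. The sketch below is a rough Python simulation of that three-stage split, not Pig's actual implementation: the function names, the plain-integer partial sums, and the split boundary are assumptions made for illustration (Pig passes bags and tuples, and Hadoop may run the combiner zero or more times).

```python
from collections import defaultdict

# Rows from t.txt as loaded by A (k:chararray, c:int).
ROWS = [("kw1", 2), ("kw3", 1), ("kw2", 4), ("kw1", 5), ("kw2", 2)]


def initial(value):
    """Map side (stands in for IntSum$Initial): one value becomes a partial sum."""
    return value


def intermediate(partials):
    """Combiner (stands in for IntSum$Intermediate): merge partial sums."""
    return sum(partials)


def final(partials):
    """Reducer (stands in for IntSum$Final): merge what is left into the result."""
    return sum(partials)


def run(splits):
    """Simulate the three plans over a list of input splits (one per map task)."""
    shuffled = defaultdict(list)
    for split in splits:
        # Map plan: emit (key, initial(value)) for every row in this split.
        local = defaultdict(list)
        for key, value in split:
            local[key].append(initial(value))
        # Combine plan: collapse each key's partials before they leave the task.
        for key, partials in local.items():
            shuffled[key].append(intermediate(partials))
    # Reduce plan: one final() call per key over whatever the combiners sent.
    return {key: final(partials) for key, partials in sorted(shuffled.items())}


if __name__ == "__main__":
    # Hypothetical split boundary; in a real job Hadoop decides the splits.
    print(run([ROWS[:3], ROWS[3:]]))
```

Because addition is associative and commutative, the result does not depend on how the rows are split across map tasks, which is what makes SUM safe to push into the combiner.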
