Thursday, January 22, 2015

[CCDH] Exercise 9 - Testing with LocalJobRunner (P34)

Preface 
Files and Directories Used in this Exercise 
Eclipse project: toolrunner
Test data (local):
~/training_materials/developer/data/shakespeare

Exercise directory: ~/workspace/toolrunner

In this Hands-On Exercise, you will practice running a job locally for debugging and testing purposes. 

In the "Using ToolRunner and Passing Parameters" exercise, you modified the Average Word Length program to use ToolRunner. This makes it simple to set job configuration properties on the command line.

Lab Experiment 
Run the Average Word Length program using LocalJobRunner on the command line 
1. Run the Average Word Length program again. Specify -jt=local to run the job locally instead of submitting it to the cluster, and -fs=file:/// to use the local file system instead of HDFS. Your input and output paths should refer to local files rather than HDFS files.
$ rm -rf localout # Clean previous result
$ hadoop jar toolrunner.jar solution.AvgWordLength -fs=file:/// -jt=local ~/training_materials/developer/data/shakespeare localout

2. Review the job output in the local output folder you specified.
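
One way to review the results in step 2 is sketched below. It assumes the job wrote its results into part files (for example part-r-00000) inside the output folder, which is the usual MapReduce layout; the helper name show_job_output is hypothetical and is not part of the exercise materials.

```shell
# Hypothetical helper (not from the exercise): print every part file
# found in a local job output folder. Defaults to "localout".
show_job_output() {
  out_dir="${1:-localout}"
  if [ ! -d "$out_dir" ]; then
    echo "No such output directory: $out_dir" >&2
    return 1
  fi
  # Assumption: results live in files named part-* (e.g. part-r-00000);
  # marker files such as _SUCCESS are skipped by this pattern.
  for f in "$out_dir"/part-*; do
    if [ -f "$f" ]; then
      cat "$f"
    fi
  done
  return 0
}
```

After the job finishes, calling show_job_output localout from the directory where you ran the job should print the results that the local run produced.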
