Monday, December 15, 2014

[CCDH] Exercise10 - Logging (P38)

Preface 
Files and Directories Used in this Exercise 
Eclipse project: logging
Java Files:
AverageReducer.java (Reducer from ToolRunner)
LetterMapper.java (Mapper from ToolRunner)
AvgWordLength.java (Driver from ToolRunner)

Test data (HDFS):
shakespeare

Exercise directory: ~/workspace/logging

In this Hands-On Exercise, you will practice using log4j with MapReduce. 

Modify the Average Word Length program you built in the Using ToolRunner and Passing Parameters exercise so that the Mapper logs a debug message indicating whether it is comparing with or without case sensitivity. 

Lab Experiment 
Enable Mapper Logging for the Job 
1. Before adding any logging messages of your own, re-run the toolrunner exercise solution with Mapper debug logging enabled by adding -Dmapred.map.child.log.level=DEBUG to the command line. For example:
$ hadoop jar toolrunner.jar solution.AvgWordLength -Dmapred.map.child.log.level=DEBUG shakespeare output

2. Note the Job ID. While the job is running, you can list it from a terminal window with the mapred job command:
$ mapred job -list
1 jobs currently running
JobId                  State  StartTime      UserName  Priority  SchedulingInfo
job_201412110227_0054  4      1418629550194  training  NORMAL    NA

3. When the job is complete, view the logs. In a browser on your VM, visit the Job Tracker UI: http://localhost:50030/jobtracker.jsp. Find the job you just ran in the Completed Jobs list and click its Job ID.


4. In the task summary, click map to view the map tasks. 
 

5. In the list of tasks, click on the map task to view the details of that task: 
 

6. Under Task Logs, click "All". The logs should include both INFO and DEBUG messages.
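
The -D option in step 1 only takes effect because the driver runs through ToolRunner, which hands the command line to Hadoop's generic option parsing before the driver sees its own arguments. The following is a minimal plain-Java sketch of that observable behavior, not Hadoop's actual implementation: the class name GenericOptionSketch and its fields are hypothetical, and it assumes that only leading options in the joined -Dname=value form are consumed, with everything from the first other argument onward left for the driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that mimics how leading -Dname=value options are
// separated from the driver's positional arguments. Not Hadoop code.
class GenericOptionSketch {
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<String> remaining = new ArrayList<>();

    GenericOptionSketch(String[] args) {
        int i = 0;
        // Consume leading -Dname=value options; stop at the first other argument.
        while (i < args.length && args[i].startsWith("-D") && args[i].contains("=")) {
            String pair = args[i].substring(2);
            int eq = pair.indexOf('=');
            properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            i++;
        }
        // Everything else is passed through to the driver untouched.
        for (; i < args.length; i++) {
            remaining.add(args[i]);
        }
    }

    public static void main(String[] args) {
        GenericOptionSketch parsed = new GenericOptionSketch(new String[] {
            "-Dmapred.map.child.log.level=DEBUG", "shakespeare", "output" });
        System.out.println(parsed.properties); // {mapred.map.child.log.level=DEBUG}
        System.out.println(parsed.remaining);  // [shakespeare, output]
    }
}
```

Under this assumption, placing the -D option after "shakespeare output" would leave it among the driver's arguments instead of in the job configuration, which is why the command in step 1 puts it directly after the class name.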

Add Debug Logging Output to the Mapper 
7. Copy the code from the toolrunner project to the stubs package of the logging project.
8. Use log4j to output a debug log message indicating whether the Mapper is doing case-sensitive or case-insensitive mapping:
 
package solution;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.log4j.Logger;

public class LetterMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    /* Set from the "caseSensitive" job parameter in setup(). */
    boolean caseSensitive = false;

    /*
     * Set up a logger for this class.
     */
    private static final Logger LOGGER = Logger.getLogger(LetterMapper.class);

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();

        for (String word : line.split("\\W+")) {
            if (word.length() > 0) {

                /*
                 * Obtain the first letter of the word
                 */
                String letter;

                if (caseSensitive)
                    letter = word.substring(0, 1);
                else
                    letter = word.substring(0, 1).toLowerCase();

                /* Emit (first letter, word length). */
                context.write(new Text(letter), new IntWritable(word.length()));
            }
        }
    }

    @Override
    public void setup(Context context) {

        Configuration conf = context.getConfiguration();
        caseSensitive = conf.getBoolean("caseSensitive", false);

        /*
         * If Debug logging is enabled, log a message indicating
         * the value of caseSensitive.
         */
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Case sensitive: " + caseSensitive);
        }
    }

}
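
To check the effect of the caseSensitive flag without submitting a job, the per-line logic of the Mapper can be tried in plain Java. This is only a sketch: the class LetterLengthSketch and its pairs method are hypothetical names, and the "letter=length" strings stand in for the (Text, IntWritable) pairs that the real Mapper writes to the context.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-Java mirror of the Mapper's per-line logic (no Hadoop types).
class LetterLengthSketch {

    // Returns one "letter=length" entry per word in the line.
    static List<String> pairs(String line, boolean caseSensitive) {
        List<String> out = new ArrayList<>();
        for (String word : line.split("\\W+")) {
            // split() can yield an empty first element when the line
            // starts with a non-word character, so skip empty words.
            if (word.length() > 0) {
                String letter = word.substring(0, 1);
                if (!caseSensitive) {
                    letter = letter.toLowerCase();
                }
                out.add(letter + "=" + word.length());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "To be, or Not to be";
        System.out.println(pairs(line, true));  // [T=2, b=2, o=2, N=3, t=2, b=2]
        System.out.println(pairs(line, false)); // [t=2, b=2, o=2, n=3, t=2, b=2]
    }
}
```

With case sensitivity on, "To" and "to" produce different keys (T and t); with it off, both are counted under t.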
Build and Test Your Code 
9. Following the earlier steps, test your code with Mapper debug logging enabled. View the map task logs in the Job Tracker UI to confirm that your message is included in the log. (Hint: search for LetterMapper in the page to find your message.) 
 

10. Optional: Try running with map logging set to INFO (the default) or WARN instead of DEBUG, and compare the log output.
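
The difference you should see in step 10 follows from how log4j orders its levels: a message is written only when its level is at least as severe as the configured threshold. The sketch below models that rule in plain Java so it runs without log4j on the classpath. The class LogLevelSketch, its Level enum, and its methods are hypothetical and only illustrate the ordering; they are not log4j's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of log4j's level threshold rule (not the log4j API).
class LogLevelSketch {

    // Levels in increasing order of severity, as log4j orders them.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR, FATAL }

    // A message is emitted only if its level is at or above the threshold.
    static boolean isEnabled(Level threshold, Level message) {
        return message.compareTo(threshold) >= 0;
    }

    // Lists the message levels that would appear in the log for a threshold.
    static List<Level> emitted(Level threshold) {
        List<Level> out = new ArrayList<>();
        for (Level level : Level.values()) {
            if (isEnabled(threshold, level)) {
                out.add(level);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Level threshold : new Level[] { Level.DEBUG, Level.INFO, Level.WARN }) {
            System.out.println(threshold + " -> " + emitted(threshold));
        }
    }
}
```

In these terms, the "Case sensitive: ..." message is a DEBUG message, so it should appear only when the map log level is DEBUG and disappear at INFO or WARN.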
