Source From Here
How-To
If your installation uses the TaskTracker (pre-YARN distributions), your lens builds can fail with out-of-memory errors. Tuning the TaskTracker task-slot properties and Java Virtual Machine (JVM) memory settings can fix these errors.
When creating a MapReduce job, Hadoop does not dynamically detect system resources to determine how many map or reduce task slots to allocate. Instead, the MapReduce job uses as many task slots as it is allowed, with as much JVM memory as it is allowed. A Platfora configuration can pass a JVM allocation to a MapReduce job, but it cannot set the allowable tasks; the Hadoop configuration controls those.
The maximum number of simultaneous tasks that can run on a Hadoop TaskTracker node is configured by the following MapReduce configuration properties:
* mapred.tasktracker.map.tasks.maximum
* mapred.tasktracker.reduce.tasks.maximum
Note: Your Hadoop administrator configures these properties on the Hadoop TaskTracker nodes.
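As an illustration, these slot limits might appear in mapred-site.xml on a TaskTracker node as follows (the slot counts shown are example values, not recommendations):

```xml
<!-- mapred-site.xml on a TaskTracker node (example slot counts) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>5</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>3</value>
</property>
```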
The maximum number of tasks able to run on a single TaskTracker node has nothing to do with the number of tasks a job needs. For example, a job may require 10 map tasks. If the maximum map task slots is 5, then the job runs 5 tasks at a time until it completes all 10. Likewise the maximum reduce task slots could be set to 10, but a job may only need to use 2 reduce slots.
Your Hadoop administrator should make sure that the number of task slots is sized according to the amount of memory and CPU available on your Hadoop TaskTracker nodes, and the typical job workload. If the tracker nodes have swap enabled, administrators can reduce these limits to take that into account.
The total JVM size that Hadoop allocates per task slot is set by the mapred.child.java.opts property, which you set in Platfora's local mapred-site.xml file. Platfora needs at least a 1 GB JVM heap for its task slots. If you use a larger JVM size to improve lens build performance, make sure not to over-allocate system memory on your Hadoop TaskTracker nodes.
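For instance, the per-task JVM heap could be set in Platfora's local mapred-site.xml like this (the 3.5 GB value is illustrative, taken from the sizing example below):

```xml
<!-- Platfora's local mapred-site.xml (illustrative heap size) -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx3500M</value>
</property>
```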
In addition to its operating system requirements, a TaskTracker node needs enough RAM to support the TaskTracker process itself, the DataNode JVM, and any other processes the node runs. Think of this as the node's reserved memory. A good rule of thumb for setting mapred.child.java.opts for Platfora is:

per-task JVM heap = (total node RAM − reserved RAM) ÷ (max map task slots + max reduce task slots)
For example, if a TaskTracker node has 32 GB of RAM and 4 GB is reserved, then 28 GB is available for MapReduce tasks. If the maximum map task slots is 5 and the maximum reduce task slots is 3, then no more than 8 tasks can run at one time on the node. 28 GB divided by 8 gives 3.5 GB per task ( -Xmx3500M ).
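The sizing arithmetic above can be sketched as a small helper (the function name is hypothetical):

```python
def child_heap_mb(total_ram_gb, reserved_gb, max_map_slots, max_reduce_slots):
    """Rule-of-thumb per-task JVM heap: divide the RAM left after the
    node's own processes by the total number of task slots."""
    available_gb = total_ram_gb - reserved_gb
    total_slots = max_map_slots + max_reduce_slots
    # Return whole megabytes, suitable for a -XmxNNNNM setting
    return (available_gb * 1000) // total_slots

# Worked example from the text: 32 GB node, 4 GB reserved, 5 map + 3 reduce slots
print(child_heap_mb(32, 4, 5, 3))  # → 3500
```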
Supplement
* Hadoop Configuration Settings – mapred-site.xml
* FAQ - Out of Memory Error in Hadoop