Hadoop Scenario Case Study: Parameter Tuning
Founder
2025-05-29 08:48:14

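Contents

1 Requirements

2 HDFS Parameter Tuning

(1) Edit hadoop-env.sh

(2) Edit hdfs-site.xml

(3) Edit core-site.xml

(4) Distribute the configuration

3 MapReduce Parameter Tuning

(1) Edit mapred-site.xml

(2) Distribute the configuration

4 YARN Parameter Tuning

(1) Set the following parameters in yarn-site.xml

(2) Distribute the configuration

5 Running the Job

(1) Restart the cluster

(2) Run the WordCount program

(3) Check the YARN application page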

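1 Requirements

(1) Requirement: count the occurrences of each word in 1 GB of data. The cluster has 3 servers, each with 4 GB of memory and a 4-core, 4-thread CPU.

(2) Analysis:

1 GB / 128 MB = 8 MapTasks, plus 1 ReduceTask and 1 MRAppMaster, for 10 tasks in total.

On average each node runs 10 tasks / 3 nodes ≈ 3 tasks (a 4 / 3 / 3 split).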
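The arithmetic in the analysis can be checked with a short sketch. The function name `plan_tasks` and its parameters are illustrative, not part of Hadoop, and the even per-node spread is an idealization: where YARN actually places containers depends on the scheduler.

```python
import math

def plan_tasks(input_mb, split_mb, reduce_tasks, nodes):
    """Estimate the task count and an even per-node spread for one job."""
    map_tasks = math.ceil(input_mb / split_mb)   # one MapTask per input split
    total = map_tasks + reduce_tasks + 1         # +1 for the MRAppMaster
    base, extra = divmod(total, nodes)
    # The first `extra` nodes take one more task than the rest.
    per_node = [base + 1 if i < extra else base for i in range(nodes)]
    return map_tasks, total, per_node

# 1 GB of input, 128 MB splits, 1 ReduceTask, 3 nodes
print(plan_tasks(1024, 128, 1, 3))   # (8, 10, [4, 3, 3])
```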

2 HDFS Parameter Tuning

(1) Edit hadoop-env.sh

export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx1024m"

(2) Edit hdfs-site.xml

<property>
    <name>dfs.namenode.handler.count</name>
    <value>21</value>
</property>
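The article does not say where the value 21 comes from. A commonly cited rule of thumb for this setting is 20 × ln(cluster size), and for a 3-node cluster it reproduces the same number. The sketch below assumes that heuristic; the function name is illustrative.

```python
import math

def namenode_handler_count(cluster_size):
    """Rule of thumb (an assumption, not from this article): 20 * ln(n), truncated."""
    return int(20 * math.log(cluster_size))

print(namenode_handler_count(3))   # 20 * ln(3) ≈ 21.97, truncated to 21
```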

(3) Edit core-site.xml

<property>
    <name>fs.trash.interval</name>
    <value>60</value>
</property>

(4) Distribute the configuration

xsync hadoop-env.sh hdfs-site.xml core-site.xml

3 MapReduce Parameter Tuning

(1) Edit mapred-site.xml

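<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>100</value>
</property>

<property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.80</value>
</property>

<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>10</value>
</property>

<property>
    <name>mapreduce.map.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each map task. If this is not specified or is non-positive, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.</description>
</property>

<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
</property>

<property>
    <name>mapreduce.map.maxattempts</name>
    <value>4</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>5</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.70</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
</property>

<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each reduce task. If this is not specified or is non-positive, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.</description>
</property>

<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>2</value>
</property>

<property>
    <name>mapreduce.reduce.maxattempts</name>
    <value>4</value>
</property>

<property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
</property>

<property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
</property>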
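Two of these settings are easier to read as numbers. The sketch below works out the sort-buffer fill level at which a map task starts spilling to disk, and the memory request that results from a value of -1 according to the description quoted in the configuration. Both function names are illustrative, and passing the java.opts-derived value in as a plain number is a simplification of what Hadoop does internally.

```python
def spill_threshold_mb(sort_mb, spill_percent):
    """Buffer fill level (in MB) at which a map task begins spilling to disk."""
    return sort_mb * spill_percent

def requested_memory_mb(memory_mb, inferred_from_java_opts=None, fallback_mb=1024):
    """Follow the quoted description of mapreduce.{map,reduce}.memory.mb.

    A positive value is used as is. Otherwise the value inferred from
    java.opts is used when one is available, and 1024 MB is the last resort.
    """
    if memory_mb > 0:
        return memory_mb
    if inferred_from_java_opts is not None:
        return inferred_from_java_opts
    return fallback_mb

print(spill_threshold_mb(100, 0.80))   # 100 MB sort buffer, spill at 80%
print(requested_memory_mb(-1))         # -1 with no java.opts configured
```

With the values in this section, the spill starts at 80 MB and each map or reduce task falls back to a 1024 MB request, provided no java.opts are set elsewhere.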

(2) Distribute the configuration

xsync mapred-site.xml

4 YARN Parameter Tuning

(1) Set the following parameters in yarn-site.xml

<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<property>
    <description>Number of threads to handle scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>8</value>
</property>

<property>
    <description>Enable auto-detection of node capabilities such as memory and CPU.</description>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>false</value>
</property>

<property>
    <description>Flag to determine if logical processors (such as hyperthreads) should be counted as cores. Only applicable on Linux when yarn.nodemanager.resource.cpu-vcores is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true.</description>
    <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
    <value>false</value>
</property>

<property>
    <description>Multiplier to determine how to convert physical cores to vcores. This value is used if yarn.nodemanager.resource.cpu-vcores is set to -1 (which implies auto-calculate vcores) and yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.</description>
    <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
    <value>1.0</value>
</property>

<property>
    <description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192MB.</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>

<property>
    <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of CPUs used by YARN containers. If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux. In other cases, number of vcores is 8 by default.</description>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
</property>

<property>
    <description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>

<property>
    <description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>

<property>
    <description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>

<property>
    <description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
</property>

<property>
    <description>Whether virtual memory limits will be enforced for containers.</description>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>

<property>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.</description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
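As a rough sanity check on these limits, the sketch below applies the two rules quoted in the allocation descriptions (requests below the minimum are raised to it, requests above the maximum are rejected) and counts how many minimum-size containers one node can hold. The names are illustrative, `InvalidResourceRequest` is only a stand-in for YARN's exception, and the real scheduler performs further normalization that is not modelled here.

```python
class InvalidResourceRequest(ValueError):
    """Stand-in for YARN's InvalidResourceRequestException."""

def normalize_request(requested, minimum, maximum):
    """Raise small requests to the minimum; reject requests above the maximum."""
    if requested > maximum:
        raise InvalidResourceRequest(f"{requested} exceeds maximum {maximum}")
    return max(requested, minimum)

def containers_per_node(node_mb, node_vcores, container_mb, container_vcores):
    """Upper bound on containers per node, limited by memory and by vcores."""
    return min(node_mb // container_mb, node_vcores // container_vcores)

# Values from yarn-site.xml above; 1024 MB / 1 vcore per container is an assumption.
container_mb = normalize_request(1024, minimum=1024, maximum=2048)
per_node = containers_per_node(4096, 4, container_mb, 1)
print(per_node, per_node * 3)   # containers per node, and across 3 nodes
```

Under that assumption each node holds at most 4 containers, so 3 nodes offer 12 slots for the 10 tasks from the requirement analysis. This is only an upper bound: the reduce task is configured with 2 vcores, and the MRAppMaster may request more than the minimum allocation.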

(2) Distribute the configuration

xsync yarn-site.xml

5 Running the Job

(1) Restart the cluster

sbin/stop-yarn.sh
sbin/start-yarn.sh

(2) Run the WordCount program

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

(3) Check the YARN application page

http://hadoop103:8088/cluster/apps
