Hadoop Scenario Case Study: Parameter Tuning
Founder
2025-05-29 08:48:14

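Contents

1 Requirements

2 HDFS Parameter Tuning

(1) Modify hadoop-env.sh

(2) Modify hdfs-site.xml

(3) Modify core-site.xml

(4) Distribute the configuration

3 MapReduce Parameter Tuning

(1) Modify mapred-site.xml

(2) Distribute the configuration

4 YARN Parameter Tuning

(1) Modify yarn-site.xml with the following parameters

(2) Distribute the configuration

5 Running the Program

(1) Restart the cluster

(2) Run the WordCount program

(3) Check the YARN application page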

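1 Requirements

(1) Requirement: count the occurrences of each word in 1 GB of data. The cluster has 3 servers, each with 4 GB of memory and a 4-core, 4-thread CPU.

(2) Requirement analysis:

1 GB / 128 MB = 8 MapTasks; 1 ReduceTask; 1 mrAppMaster

That is 10 tasks in total, so each node runs on average 10 / 3 ≈ 3 tasks (distributed as 4, 3, 3).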
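The arithmetic in the requirement analysis can be reproduced with a short sketch. It assumes the input is split at the 128 MB block size (one MapTask per split) and that tasks are spread as evenly as possible; the function name `plan_tasks` and its parameters are illustrative, not part of Hadoop.

```python
import math

def plan_tasks(input_mb, split_mb=128, reduce_tasks=1, app_masters=1, nodes=3):
    """Return (map_tasks, total_tasks, tasks_per_node) for one job."""
    # One MapTask per input split; assumes split size == block size.
    map_tasks = math.ceil(input_mb / split_mb)
    total = map_tasks + reduce_tasks + app_masters
    # Spread the tasks as evenly as possible across the nodes.
    base, extra = divmod(total, nodes)
    per_node = [base + 1 if i < extra else base for i in range(nodes)]
    return map_tasks, total, per_node

print(plan_tasks(1024))  # (8, 10, [4, 3, 3])
```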

2 HDFS Parameter Tuning

(1) Modify hadoop-env.sh

export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx1024m"

(2) Modify hdfs-site.xml

<property>
    <name>dfs.namenode.handler.count</name>
    <value>21</value>
</property>
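The source does not say how the value 21 for dfs.namenode.handler.count was chosen. One commonly cited rule of thumb, assumed here rather than taken from this article, is 20 × ln(cluster size), truncated to an integer. For a 3-node cluster it gives the same number; the helper name `namenode_handler_count` is hypothetical.

```python
import math

def namenode_handler_count(cluster_size):
    """Rule of thumb (an assumption, not from the article): 20 * ln(cluster size)."""
    return int(20 * math.log(cluster_size))

print(namenode_handler_count(3))  # 21
```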

(3) Modify core-site.xml

<property>
    <name>fs.trash.interval</name>
    <value>60</value>
</property>

(4) Distribute the configuration

xsync hadoop-env.sh hdfs-site.xml core-site.xml

3 MapReduce Parameter Tuning

(1) Modify mapred-site.xml

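<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>100</value>
</property>

<property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.80</value>
</property>

<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>10</value>
</property>

<property>
    <name>mapreduce.map.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each map task. If this is not specified or is non-positive, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.
    </description>
</property>

<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
</property>

<property>
    <name>mapreduce.map.maxattempts</name>
    <value>4</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>5</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.70</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
</property>

<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each reduce task. If this is not specified or is non-positive, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.
    </description>
</property>

<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>2</value>
</property>

<property>
    <name>mapreduce.reduce.maxattempts</name>
    <value>4</value>
</property>

<property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
</property>

<property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
</property>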
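To make two of these values more concrete, the sketch below converts them into the quantities they imply: the amount of sort-buffer data that triggers a spill (mapreduce.task.io.sort.mb × mapreduce.map.sort.spill.percent) and the task timeout in minutes (mapreduce.task.timeout is in milliseconds). The helper names are illustrative only.

```python
def spill_threshold_mb(sort_mb=100, spill_percent=0.80):
    """Buffered map output (in MB) at which a spill to disk begins."""
    return sort_mb * spill_percent

def timeout_minutes(timeout_ms=600000):
    """Convert the task timeout from milliseconds to minutes."""
    return timeout_ms / 60000

print(spill_threshold_mb())  # 80.0
print(timeout_minutes())     # 10.0
```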

(2) Distribute the configuration

xsync mapred-site.xml

4 YARN Parameter Tuning

(1) Modify yarn-site.xml with the following parameters

<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<property>
    <description>Number of threads to handle scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>8</value>
</property>

<property>
    <description>Enable auto-detection of node capabilities such as memory and CPU.
    </description>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>false</value>
</property>

<property>
    <description>Flag to determine if logical processors (such as hyperthreads) should be counted as cores. Only applicable on Linux when yarn.nodemanager.resource.cpu-vcores is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true.
    </description>
    <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
    <value>false</value>
</property>

<property>
    <description>Multiplier to determine how to convert physical cores to vcores. This value is used if yarn.nodemanager.resource.cpu-vcores is set to -1 (which implies auto-calculate vcores) and yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.
    </description>
    <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
    <value>1.0</value>
</property>

<property>
    <description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192MB.
    </description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>

<property>
    <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of CPUs used by YARN containers. If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux. In other cases, number of vcores is 8 by default.
    </description>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
</property>

<property>
    <description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.
    </description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>

<property>
    <description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.
    </description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>

<property>
    <description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.
    </description>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>

<property>
    <description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException.
    </description>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
</property>

<property>
    <description>Whether virtual memory limits will be enforced for containers.
    </description>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>

<property>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.
    </description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
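As a rough sanity check, the sketch below adds up what the 10 containers from the requirement analysis would request and compares it with the cluster totals implied by the settings above. It rests on several assumptions: map and reduce containers fall back to 1024 MB, the MRAppMaster requests 1536 MB and 1 vcore (assumed defaults, not stated in this article), and requests are rounded up to a multiple of the minimum allocation. It only compares aggregate totals and ignores how containers pack onto individual nodes.

```python
import math

# Values from the yarn-site.xml settings above.
NODES = 3
NODE_MEMORY_MB = 4096
NODE_VCORES = 4
MIN_ALLOC_MB = 1024
MAX_ALLOC_MB = 2048

def normalize_mb(request_mb):
    """Round a request up to a multiple of the minimum allocation (assumed behaviour)."""
    if request_mb > MAX_ALLOC_MB:
        raise ValueError(f"{request_mb} MB exceeds the {MAX_ALLOC_MB} MB maximum")
    return max(MIN_ALLOC_MB, math.ceil(request_mb / MIN_ALLOC_MB) * MIN_ALLOC_MB)

# (count, requested memory in MB, vcores) per container type.
containers = [
    (8, 1024, 1),  # MapTasks: 1024 MB fallback, mapreduce.map.cpu.vcores = 1
    (1, 1024, 2),  # ReduceTask: 1024 MB fallback, mapreduce.reduce.cpu.vcores = 2
    (1, 1536, 1),  # MRAppMaster: assumed defaults, not from the article
]

total_mb = sum(n * normalize_mb(mb) for n, mb, _ in containers)
total_vcores = sum(n * vc for n, _, vc in containers)

print(total_mb, NODES * NODE_MEMORY_MB)   # 11264 12288
print(total_vcores, NODES * NODE_VCORES)  # 11 12
```

Under these assumptions the job fits within the cluster's 12288 MB and 12 vcores, with little headroom left over.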

(2) Distribute the configuration

xsync yarn-site.xml

5 Running the Program

(1) Restart the cluster

sbin/stop-yarn.sh
sbin/start-yarn.sh

(2) Run the WordCount program

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

(3) Check the YARN application page

http://hadoop103:8088/cluster/apps
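The same information can also be read programmatically. The sketch below assumes the ResourceManager's REST endpoint /ws/v1/cluster/apps is reachable on the same host and port as the web page; the function names and the sample application ID are hypothetical, and `fetch_apps` is not called here because it needs a running cluster.

```python
import json
from urllib.request import Request, urlopen

def summarize_apps(payload):
    """Extract (id, name, state, finalStatus) tuples from an apps response."""
    # "apps" may be null when no applications have been submitted.
    apps = (payload.get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["state"], a["finalStatus"]) for a in apps]

def fetch_apps(base_url="http://hadoop103:8088"):
    """Query the ResourceManager (requires a running cluster)."""
    req = Request(f"{base_url}/ws/v1/cluster/apps",
                  headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return summarize_apps(json.load(resp))

# Hypothetical sample response, for illustration only.
sample = {"apps": {"app": [{
    "id": "application_0000000000000_0001",
    "name": "word count",
    "state": "FINISHED",
    "finalStatus": "SUCCEEDED",
}]}}
print(summarize_apps(sample))
```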
