Hadoop Scenario Case Study: Parameter Tuning
Founder
2025-05-29 08:48:14

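Contents

1 Requirements

2 HDFS Parameter Tuning

(1) Edit hadoop-env.sh

(2) Edit hdfs-site.xml

(3) Edit core-site.xml

(4) Distribute the configuration

3 MapReduce Parameter Tuning

(1) Edit mapred-site.xml

(2) Distribute the configuration

4 YARN Parameter Tuning

(1) Edit yarn-site.xml with the following parameters

(2) Distribute the configuration

5 Running the Job

(1) Restart the cluster

(2) Run the WordCount program

(3) Check the YARN application page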

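1 Requirements

(1) Requirement: count the occurrences of each word in 1 GB of data. The cluster has 3 servers, each with 4 GB of memory and a 4-core, 4-thread CPU.

(2) Analysis:

1 GB / 128 MB = 8 MapTasks; add 1 ReduceTask and 1 MRAppMaster for 10 tasks in total.

10 tasks / 3 nodes ≈ 3 tasks per node on average (distributed as 4, 3, 3).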
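The arithmetic above can be sketched as a short calculation. This is only an illustration: the function name `plan_tasks` and its parameters are hypothetical, and it assumes the input is split at 128 MB and that tasks are spread as evenly as possible across the nodes.

```python
import math


def plan_tasks(input_mb, split_mb=128, reduce_tasks=1, app_masters=1, nodes=3):
    """Estimate task counts and an even per-node distribution (illustrative only)."""
    # One MapTask per input split, rounding up for a partial last split.
    map_tasks = math.ceil(input_mb / split_mb)
    total = map_tasks + reduce_tasks + app_masters
    # Spread tasks evenly; the first `extra` nodes take one additional task.
    base, extra = divmod(total, nodes)
    per_node = [base + 1 if i < extra else base for i in range(nodes)]
    return map_tasks, total, per_node


map_tasks, total, per_node = plan_tasks(1024)
print(map_tasks, total, per_node)  # 8 10 [4, 3, 3]
```

In practice the scheduler decides where containers actually run, so the 4/3/3 split is a planning estimate rather than a guarantee.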

2 HDFS Parameter Tuning

(1) Edit hadoop-env.sh to cap the NameNode and DataNode heaps at 1024 MB each:

export HDFS_NAMENODE_OPTS="-Dhadoop.security.logger=INFO,RFAS -Xmx1024m"
export HDFS_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS -Xmx1024m"

(2) Edit hdfs-site.xml

<property>
    <name>dfs.namenode.handler.count</name>
    <value>21</value>
</property>
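The article does not say where the value 21 comes from. A commonly cited rule of thumb sizes the NameNode handler pool as 20 × ln(cluster size); the sketch below assumes that rule (it is not stated in the source), and the function name is hypothetical.

```python
import math


def namenode_handler_count(cluster_size):
    """Rule-of-thumb handler count: 20 * ln(cluster size), truncated to an integer.

    Assumption: this heuristic is not given in the article; it is shown only
    because it reproduces the configured value for a 3-node cluster.
    """
    return int(20 * math.log(cluster_size))


print(namenode_handler_count(3))  # 21
```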

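(3) Edit core-site.xml to keep deleted files in the trash for 60 minutes:

<property>
    <name>fs.trash.interval</name>
    <value>60</value>
</property>

(4) Distribute the configuration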

xsync hadoop-env.sh hdfs-site.xml core-site.xml

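3 MapReduce Parameter Tuning

(1) Edit mapred-site.xml

<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>100</value>
</property>

<property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.80</value>
</property>

<property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>10</value>
</property>

<property>
    <name>mapreduce.map.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each map task. If this is not specified or is non-positive, it is inferred from mapreduce.map.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.</description>
</property>

<property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
</property>

<property>
    <name>mapreduce.map.maxattempts</name>
    <value>4</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>5</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.70</value>
</property>

<property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
</property>

<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>-1</value>
    <description>The amount of memory to request from the scheduler for each reduce task. If this is not specified or is non-positive, it is inferred from mapreduce.reduce.java.opts and mapreduce.job.heap.memory-mb.ratio. If java-opts are also not specified, we set it to 1024.</description>
</property>

<property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>2</value>
</property>

<property>
    <name>mapreduce.reduce.maxattempts</name>
    <value>4</value>
</property>

<property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
</property>

<property>
    <name>mapreduce.task.timeout</name>
    <value>600000</value>
</property>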
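To make two of these values more concrete, the following sketch converts them into everyday units: the point at which the map-side sort buffer starts spilling to disk, and the task timeout in minutes. The function names are hypothetical, and the defaults simply mirror the configuration above.

```python
def spill_threshold_mb(sort_mb=100, spill_percent=0.80):
    """Buffer usage (MB) at which a map task begins spilling to disk."""
    return sort_mb * spill_percent


def timeout_minutes(timeout_ms=600000):
    """Convert mapreduce.task.timeout from milliseconds to minutes."""
    return timeout_ms / 60000


print(f"spill starts at about {spill_threshold_mb():.0f} MB of the sort buffer")
print(f"unresponsive tasks are killed after {timeout_minutes():.0f} minutes")
```

With these settings a map task spills once roughly 80 MB of its 100 MB buffer is in use, and a task that reports no progress for 10 minutes is terminated.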

(2) Distribute the configuration

xsync mapred-site.xml

4 YARN Parameter Tuning

(1) Edit yarn-site.xml with the following parameters

<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<property>
    <description>Number of threads to handle scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>8</value>
</property>

<property>
    <description>Enable auto-detection of node capabilities such as memory and CPU.</description>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>false</value>
</property>

<property>
    <description>Flag to determine if logical processors (such as hyperthreads) should be counted as cores. Only applicable on Linux when yarn.nodemanager.resource.cpu-vcores is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true.</description>
    <name>yarn.nodemanager.resource.count-logical-processors-as-cores</name>
    <value>false</value>
</property>

<property>
    <description>Multiplier to determine how to convert physical cores to vcores. This value is used if yarn.nodemanager.resource.cpu-vcores is set to -1 (which implies auto-calculate vcores) and yarn.nodemanager.resource.detect-hardware-capabilities is set to true. The number of vcores will be calculated as number of CPUs * multiplier.</description>
    <name>yarn.nodemanager.resource.pcores-vcores-multiplier</name>
    <value>1.0</value>
</property>

<property>
    <description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192MB.</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>

<property>
    <description>Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of CPUs used by YARN containers. If it is set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically determined from the hardware in case of Windows and Linux. In other cases, number of vcores is 8 by default.</description>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
</property>

<property>
    <description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>

<property>
    <description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>

<property>
    <description>The minimum allocation for every container request at the RM in terms of virtual CPU cores. Requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have fewer virtual cores than this value will be shut down by the resource manager.</description>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>

<property>
    <description>The maximum allocation for every container request at the RM in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException.</description>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
</property>

<property>
    <description>Whether virtual memory limits will be enforced for containers.</description>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>

<property>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.</description>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
</property>
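The interaction between the per-node memory and the container limits can be checked with a small model. This is a simplified sketch of the behaviour described in the property descriptions above, not YARN's actual scheduler code: the class and function names are hypothetical, and a real scheduler may additionally round requests up to an allocation increment.

```python
class InvalidResourceRequest(Exception):
    """Stand-in for YARN's InvalidResourceRequestException (illustrative)."""


def normalize_memory(request_mb, min_mb=1024, max_mb=2048):
    """Apply the minimum/maximum allocation rules to a memory request."""
    if request_mb > max_mb:
        raise InvalidResourceRequest(
            f"{request_mb} MB exceeds the maximum allocation of {max_mb} MB"
        )
    # Requests below the minimum are raised to the minimum.
    return max(request_mb, min_mb)


def containers_per_node(container_mb, node_mb=4096):
    """How many containers of a given size fit in one NodeManager's memory."""
    return node_mb // container_mb


print(normalize_memory(512))       # 1024
print(containers_per_node(1024))   # 4
print(containers_per_node(2048))   # 2
```

Under this model each node can hold at most four minimum-size containers, which is just enough for the busiest node in the 4/3/3 plan, provided every task runs in a 1024 MB container.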

(2) Distribute the configuration

xsync yarn-site.xml

5 Running the Job

(1) Restart the cluster

sbin/stop-yarn.sh
sbin/start-yarn.sh

(2) Run the WordCount program

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /input /output

(3) Check the YARN application page

http://hadoop103:8088/cluster/apps
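The same information is also exposed by the ResourceManager REST API under `/ws/v1/cluster/apps`, which can be convenient for scripted checks. The sketch below is a minimal example under assumptions: it reuses the host and port from the URL above, the helper names `fetch_apps` and `summarize` are hypothetical, and only a few fields of the response are read.

```python
import json
from urllib.request import urlopen


def fetch_apps(rm_url="http://hadoop103:8088"):
    """Download the application list from the ResourceManager REST API."""
    with urlopen(f"{rm_url}/ws/v1/cluster/apps") as resp:
        return json.load(resp)


def summarize(payload):
    """Return (id, name, state, finalStatus) for each application in the payload."""
    # The "apps" entry may be null when no applications have been submitted.
    apps = (payload.get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["state"], a["finalStatus"]) for a in apps]
```

Calling `summarize(fetch_apps())` on a machine that can reach the ResourceManager lists each job with its current state, for example to confirm that the WordCount run has finished.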
