
I: HDFS User Guide

1. Notable features of HDFS

  • Hadoop, including HDFS, is well suited for distributed storage and distributed processing using commodity hardware. It is fault tolerant, scalable, and extremely simple to expand. MapReduce, well known for its simplicity and applicability for a large set of distributed applications, is an integral part of Hadoop. (Distributed storage.)
  • HDFS is highly configurable with a default configuration well suited for many installations. Most of the time, configuration needs to be tuned only for very large clusters. (Sensible defaults.)
  • Hadoop is written in Java and is supported on all major platforms. (Cross-platform.)
  • Hadoop supports shell-like commands to interact with HDFS directly. (Shell-like way of working.)
  • The NameNode and DataNodes have built-in web servers that make it easy to check the current status of the cluster. (Built-in web UI for checking the cluster.)
  • New features and improvements are regularly implemented in HDFS. The following is a subset of useful features in HDFS:
    • File permissions and authentication. (Permission checks on files.)
    • Rack awareness: to take a node's physical location into account while scheduling tasks and allocating storage.
    • Safemode: an administrative mode for maintenance. (Used for operations work.)
    • fsck: a utility to diagnose health of the file system, to find missing files or blocks. (Finds lost files or blocks.)
    • fetchdt: a utility to fetch DelegationToken and store it in a file on the local system.
    • Balancer: tool to balance the cluster when the data is unevenly distributed among DataNodes.
    • Upgrade and rollback: after a software upgrade, it is possible to rollback to HDFS' state before the upgrade in case of unexpected problems.
    • Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of file containing log of HDFS modifications within certain limits at the NameNode.
    • Checkpoint node: performs periodic checkpoints of the namespace and helps minimize the size of the log stored at the NameNode containing changes to the HDFS. Replaces the role previously filled by the Secondary NameNode, though is not yet battle hardened. The NameNode allows multiple Checkpoint nodes simultaneously, as long as there are no Backup nodes registered with the system.
    • Backup node: An extension to the Checkpoint node. In addition to checkpointing it also receives a stream of edits from the NameNode and maintains its own in-memory copy of the namespace, which is always in sync with the active NameNode namespace state. Only one Backup node may be registered with the NameNode at once.
      Source: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
2. Web UI
The NameNode web UI listens on port 50070 by default.
3. Basic HDFS administration commands
bin/hdfs dfsadmin -<option>
  • -report: reports basic statistics of HDFS. Some of this information is also available on the NameNode front page. (Status report.)
  • -safemode: though usually not required, an administrator can manually enter or leave Safemode. (Enter or leave Safemode by hand.)
  • -finalizeUpgrade: removes previous backup of the cluster made during last upgrade. (Deletes the backup taken at the last upgrade.)
  • -refreshNodes: Updates the namenode with the set of datanodes allowed to connect to the namenode. Namenodes re-read datanode hostnames in the files defined by dfs.hosts and dfs.hosts.exclude. Hosts defined in dfs.hosts are the datanodes that are part of the cluster. If there are entries in dfs.hosts, only the hosts in it are allowed to register with the namenode. Entries in dfs.hosts.exclude are datanodes that need to be decommissioned. Datanodes complete decommissioning when all the replicas from them are replicated to other datanodes. Decommissioned nodes are not automatically shut down and are not chosen for writing new replicas.
  • -printTopology: Print the topology of the cluster. Display a tree of racks and datanodes attached to the racks as viewed by the NameNode. (Prints the topology.)
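The options above are easy to script. Here is a minimal Python sketch that only assembles the command lines; the helper names `dfsadmin_argv` and `run_dfsadmin` are made up for illustration, and `bin/hdfs` is the relative path used in these notes, so adjust it to your installation. Actually running a command assumes a configured Hadoop client on the machine.

```python
import subprocess

HDFS_BIN = "bin/hdfs"  # path as written in these notes; an assumption about your layout


def dfsadmin_argv(option, *args):
    """Build the argument list for `bin/hdfs dfsadmin -<option> [args...]`."""
    return [HDFS_BIN, "dfsadmin", f"-{option}", *args]


def run_dfsadmin(option, *args):
    """Run the command and return its stdout (needs a working Hadoop client)."""
    result = subprocess.run(
        dfsadmin_argv(option, *args),
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with a non-zero status
    )
    return result.stdout


# Only print the command lines here; nothing is executed.
print(dfsadmin_argv("report"))
print(dfsadmin_argv("safemode", "get"))
print(dfsadmin_argv("printTopology"))
```

Keeping the argument list separate from the call that executes it makes the wrapper easy to check without a cluster.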
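4. Secondary NameNode
The NameNode stores modifications to the file system as a log appended to a file on its local file system (the edits log). When the NameNode starts, it first reads the HDFS state from the image file (fsimage), then applies the modifications from the edits log to that image, and finally opens a fresh edits log to receive new modifications. Because the NameNode merges the image and the edits log only at startup, the edits log can grow very large, and the next restart then takes a long time because there is so much to merge.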
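The startup sequence described above can be sketched as a toy model. This is only an illustration: the dict-based "image", the operation names (`create`, `delete`, `rename`), and the function names are assumptions made up for the example and have nothing to do with the real fsimage or edits file formats.

```python
def replay_edits(image, edits):
    """Return a new namespace with every logged edit applied to the image, in order."""
    namespace = dict(image)  # copy, so the original image is left untouched
    for op, path, value in edits:
        if op == "create":
            namespace[path] = value
        elif op == "delete":
            namespace.pop(path, None)
        elif op == "rename":
            # for a rename, `value` holds the destination path
            namespace[value] = namespace.pop(path)
        else:
            raise ValueError(f"unknown edit op: {op}")
    return namespace


def start_namenode(image, edits):
    """Merge image + edits into a new image and open an empty edits log."""
    new_image = replay_edits(image, edits)
    new_edits = []  # fresh log that will receive later modifications
    return new_image, new_edits


image = {"/a": 1, "/b": 2}
edits = [("create", "/c", 3), ("delete", "/a", None), ("rename", "/b", "/d")]
merged_image, fresh_log = start_namenode(image, edits)
print(merged_image)  # {'/c': 3, '/d': 2}
```

The cost of `replay_edits` grows with the length of the log, which is the reason a long-running NameNode with no checkpoints is slow to restart.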
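The Secondary NameNode periodically merges the fsimage and the edits log and keeps the edits log size within a limit. It usually runs on a different machine from the primary NameNode, and that machine should have specifications comparable to the NameNode's.
The checkpoint process on the Secondary NameNode is controlled by two configuration parameters: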
  • dfs.namenode.checkpoint.period, set to 1 hour by default, specifies the maximum delay between two consecutive checkpoints, and
  • dfs.namenode.checkpoint.txns, set to 1 million by default, defines the number of uncheckpointed transactions on the NameNode which will force an urgent checkpoint, even if the checkpoint period has not been reached.
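dfs.namenode.checkpoint.period  the maximum interval between two consecutive checkpoints
dfs.namenode.checkpoint.txns    the number of uncheckpointed transactions that triggers a checkpoint even if the interval above has not elapsed; the default is 1 million (for example, if 1 million transactions accumulate within 10 minutes, a checkpoint runs after those 10 minutes rather than after 1 hour)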
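The way the two parameters combine can be written as a small predicate. This is a sketch of the rule as described here, not Hadoop's actual code; the function name and argument names are invented for the example, and the defaults are the ones quoted above (1 hour, 1 million transactions).

```python
def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period_seconds=3600, txn_limit=1_000_000):
    """True when either the time limit or the transaction limit has been reached."""
    time_due = seconds_since_last >= period_seconds
    txns_due = uncheckpointed_txns >= txn_limit
    return time_due or txns_due


# 10 minutes elapsed, but 1 million edits piled up: checkpoint now.
print(should_checkpoint(600, 1_000_000))  # True
# 10 minutes elapsed, only a handful of edits: keep waiting.
print(should_checkpoint(600, 10))         # False
# A full hour elapsed with no edits at all: the period alone triggers it.
print(should_checkpoint(3600, 0))         # True
```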
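5. Checkpoint node
Very similar to the Secondary NameNode. The Checkpoint node downloads the fsimage and the edits log from the active NameNode, merges them locally, and then uploads the merged image back to the running NameNode.
dfs.namenode.backup.address       address of the node
dfs.namenode.backup.http-address  HTTP address (host:port) of the node
dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns control checkpointing here as well.
In practice the Checkpoint node and the Secondary NameNode do the same job; they differ mainly in name.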
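6. Backup node
The Backup node offers the same checkpointing as the Checkpoint node, but it also receives the stream of namespace edits from the NameNode in real time and applies them to its own in-memory copy (note that the NameNode itself does not merge; it merges only after a restart). The Backup node's namespace is therefore always in sync with the active NameNode.
At present a cluster can have only one Backup node; support for several may come later. Once a Backup node is registered, no Checkpoint node can register with the cluster. The Backup node uses the same configuration parameters as the Checkpoint node (dfs.namenode.backup.address and dfs.namenode.backup.http-address) and is started with bin/hdfs namenode -backup.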
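7. Import checkpoint
If the image and edits files are lost, the latest checkpoint can be imported into the NameNode. This takes two settings and a startup option:
dfs.namenode.name.dir        the NameNode metadata directory (create it empty before the import)
dfs.namenode.checkpoint.dir  the directory that holds the checkpoint image
Start the NameNode with the -importCheckpoint option.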
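8. Balancer
HDFS data is not always spread evenly across the cluster. When placing new blocks, the NameNode weighs several competing considerations:
  • Policy to keep one of the replicas of a block on the same node as the node that is writing the block. (One copy stays on the node doing the write.)
  • Need to spread different replicas of a block across the racks so that cluster can survive loss of whole rack. (Spreading replicas over racks lets the cluster lose an entire rack.)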
  • One of the replicas is usually placed on the same rack as the node writing to the file so that cross-rack network I/O is reduced.
  • Spread HDFS data uniformly across the DataNodes in the cluster.
    Source: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
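To make "unevenly distributed" concrete, here is a simplified sketch of how nodes could be sorted into over-utilized, under-utilized, and balanced groups by comparing each node's utilization with the cluster average. The function name, the three-way split, and the sample numbers are assumptions for illustration; the 10 percent threshold matches what I understand to be the balancer's default, and the real tool distinguishes more categories and also plans the block moves.

```python
def classify_datanodes(nodes, threshold_pct=10.0):
    """nodes maps a DataNode name to (used_bytes, capacity_bytes).

    Returns (cluster_utilization_pct, over, under, balanced).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    cluster_util = 100.0 * total_used / total_capacity

    over, under, balanced = [], [], []
    for name, (used, capacity) in sorted(nodes.items()):
        util = 100.0 * used / capacity
        if util > cluster_util + threshold_pct:
            over.append(name)       # candidate source of block moves
        elif util < cluster_util - threshold_pct:
            under.append(name)      # candidate target of block moves
        else:
            balanced.append(name)   # within the threshold of the average
    return cluster_util, over, under, balanced


# Hypothetical three-node cluster with equal capacity.
nodes = {"dn1": (90, 100), "dn2": (50, 100), "dn3": (10, 100)}
cluster_util, over, under, balanced = classify_datanodes(nodes)
print(cluster_util, over, under, balanced)  # 50.0 ['dn1'] ['dn3'] ['dn2']
```

A freshly added empty DataNode lands in the under-utilized group, which is the typical situation where running the balancer helps.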
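9. Rack awareness (omitted)
10. Safemode
When the cluster restarts, the NameNode loads the file system state from the image and the edits log and then waits for the DataNodes to report their blocks, so it does not open the cluster right away. During this time the NameNode is in Safemode and the cluster is read-only. Once the DataNodes have finished reporting their blocks, the NameNode leaves Safemode automatically and the cluster opens. Safemode can also be entered or left manually.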
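The automatic exit from Safemode can be pictured as a threshold check on reported blocks. The sketch below is a simplification with an invented function name; the 0.999 default mirrors what I believe is the default of dfs.namenode.safemode.threshold-pct, and the real NameNode applies further conditions (for instance an extension period) that are left out here.

```python
def in_safemode(reported_blocks, total_blocks, threshold_pct=0.999):
    """True while too few blocks have been reported by the DataNodes."""
    if total_blocks == 0:
        # Assumption for this sketch: an empty namespace has nothing to wait for.
        return False
    return reported_blocks / total_blocks < threshold_pct


print(in_safemode(990, 1000))   # True: only 99% of blocks reported so far
print(in_safemode(1000, 1000))  # False: every block reported, cluster opens
```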
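11. fsck
The fsck command checks files (and their blocks) for inconsistencies. Unlike a traditional fsck, it does not correct the errors it finds, and by default it does not check files that are open. fsck is not a Hadoop shell command, but it can be run as bin/hdfs fsck.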
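12. fetchdt
HDFS provides the fetchdt command to fetch a delegation token and store it in a file on the local file system. The token can later be used by a non-secure client to access a secure server (the NameNode, for example). Details omitted.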
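13. Recovery mode
If the only available copy of the NameNode metadata is damaged, recovery mode can salvage at least part of the data. Start the NameNode with namenode -recover and answer the prompts about which course of action to take. With the -force option the NameNode does not prompt and always selects the first choice.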
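14. Upgrade and rollback
Omitted.
15. File permissions and security
HDFS file permissions are modeled on those of Linux. The user that starts the NameNode is treated as the HDFS superuser.
16. Scalability
HDFS runs on clusters with thousands of nodes. Each cluster has a single NameNode, so the memory available on the NameNode becomes the limit on cluster size.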
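A back-of-the-envelope calculation shows why NameNode memory is the bottleneck. The sketch below assumes roughly 150 bytes of heap per namespace object (file, directory, or block), a commonly quoted rule of thumb rather than a measured figure, and the function name is made up; treat the result as an order-of-magnitude estimate only.

```python
BYTES_PER_OBJECT = 150  # assumed rule of thumb, not a measured value


def estimate_namenode_heap_bytes(num_files, blocks_per_file=1, num_dirs=0):
    """Rough heap estimate: every file, block, and directory is one object."""
    objects = num_files + num_files * blocks_per_file + num_dirs
    return objects * BYTES_PER_OBJECT


# Hypothetical workload: 10 million single-block files.
heap = estimate_namenode_heap_bytes(10_000_000)
print(heap / 1e9, "GB")  # 3.0 GB
```

Under this assumption the estimate scales with the number of files and blocks, not with the number of bytes stored, which is why many small files strain the NameNode more than a few large ones.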

Reposted from: https://www.cnblogs.com/skyrim/p/7455503.html
