Hadoop Survey Report


Hadoop

Preferred Infrastructure

August 25, 2008

( NTT Preferred Infrastructure ( Preferred Infrastructure ) NTT Preferred Infrastructure

NTT

Preferred Infrastructure

NTT

Preferred Infrastructure: E-mail: info@preferred.jp

NTT Resonant: E-mail: pr@nttr.co.jp

Copyright © NTT Resonant Inc. 2008

August 25, 2008

Contents

1
1.1
2 Hadoop

3 GFS HDFS
3.1 GFS
3.1.1
3.1.2
3.1.3 HDFS
3.2
3.3
3.3.1
3.3.2
3.3.3
3.3.4
3.3.5
3.3.6
3.3.7
3.3.8
3.3.9
3.3.10
3.3.11
3.3.12
3.3.13
3.3.14
3.3.15
3.3.16
3.4
3.4.1
3.4.2
3.4.3
3.4.4
3.5
3.5.1
3.6
3.6.1
3.6.2
3.6.3
3.6.4
3.6.5
3.6.6
3.6.7
3.6.8
3.6.9
3.6.10 (Read-Only)
3.7

4 Google MapReduce Hadoop MapReduce
4.1 Google MapReduce
4.1.1
4.1.2
4.1.3 Hadoop MapReduce
4.2
4.3
4.3.1 MapReduce
4.3.2
4.3.3 Shuffle
4.3.4 Map Reduce
4.3.5 Map
4.3.6
4.3.7
4.3.8
4.3.9
4.4
4.4.1
4.4.2
4.4.3
4.5
4.5.1 Combine
4.5.2
4.5.3 Map Shuffle
4.5.4
4.5.5 Map
4.6
4.6.1
4.6.2
4.6.3
4.6.4
4.6.5
4.7

5
5.1
5.2 org.apache.hadoop.util
5.2.1 MergeSort
5.2.2 PriorityQueue
5.2.3 ReflectionUtils
5.2.4 RunJar
5.2.5 Tool
5.3 org.apache.hadoop.io
5.3.1 Writable
5.3.2 SequenceFile
5.3.3 compress
5.4 org.apache.hadoop.ipc
5.4.1 VersionedProtocol
5.4.2 RPC, Server, Client
5.5 org.apache.hadoop.net
5.5.1 DNS
5.5.2 Node, NodeBase
5.5.3 NetworkTopology
5.6 org.apache.hadoop.fs
5.6.1 FileSystem
5.6.2 LocalFileSystem
5.6.3 InMemoryFileSystem
5.6.4 FSOutputSummer, FSInputStream
5.6.5 Path
5.6.6 Trash
5.6.7 FileUtil
5.6.8 FsShell
5.6.9 DU, DF
5.7 org.apache.hadoop.dfs
5.7.1 ClientProtocol
5.7.2 DatanodeProtocol
5.7.3 NamenodeProtocol
5.7.4 DistributedFileSystem
5.7.5 DFSClient
5.7.6 DataNode
5.7.7 NameNode
5.7.8 FSNamesystem
5.7.9 FSImage, FSEditLog
5.7.10 ReplicationTargetChooser
5.7.11 SecondaryNameNode
5.7.12 Balancer
5.7.13 NamenodeFsck
5.8 org.apache.hadoop.mapred
5.8.1 JobConf
5.8.2 InputFormat
5.8.3 OutputFormat
5.8.4 JobClient
5.8.5 JobTracker
5.8.6 TaskTracker
5.8.7 StatusHttpServer
5.9

6
6.1
6.1.1
6.2 HDFS
6.2.1
6.2.2
6.3
6.3.1
6.3.2
6.4
7

Figures

2.1 Google, OSS
5.1
5.2
5.3
5.4
5.5
5.6 JobConf
5.7 JobConf
6.1 bonnie++
6.2 1G*100
6.3 1G*100 ( (MB)/ )
6.4 1G*100
6.5 1G*100 ( (MB)/ )
6.6 100G (randomwriter.conf)
6.7 100G
6.8 100G ( / )
6.9 100G ( (MB)/ )
6.10
6.11 100G ( / )
6.12 100G ( (MB)/ )

Tables

3.1 Google File System Hadoop
4.1 Google MapReduce Hadoop
5.1
6.1

1

1.1

Hadoop[4]

2 Hadoop

3,4 Google Google File System[10] MapReduce[9] Hadoop

5 Hadoop

6 Hadoop

7

Hadoop 0.16.4

2 Hadoop

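Hadoop is open-source software developed mainly by Doug Cutting of Yahoo! Inc. It grew out of the Lucene [8] project, and it implements the designs of Google's Google File System (GFS) and MapReduce.

Hadoop consists of HDFS (Hadoop Distributed File System) and the Hadoop MapReduce Framework, which correspond to Google's GFS and MapReduce respectively (Figure 2.1). The counterpart of BigTable is hBase.

Figure 2.1: Google, OSS

Hadoop is implemented in Java, and MapReduce programs are normally written in Java. With Hadoop Streaming [5], MapReduce programs can also be written in C/C++, Ruby, Python, and other languages.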
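As an illustration of the Streaming model, the sketch below shows a word count whose two stages exchange tab-separated `key<TAB>value` text lines, which is the convention Hadoop Streaming uses between a mapper and a reducer. The function names `map_lines`, `reduce_lines`, and `run_local` are hypothetical, and `run_local` only imitates the framework's sort between the stages on a single machine; it is not part of Hadoop.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper stage: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer stage: sum the counts of consecutive records sharing a key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Local stand-in for the framework: map, sort by key, then reduce."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

In an actual Streaming job each stage would presumably live in its own script that reads `sys.stdin` and prints to standard output, with the framework performing the sort between them.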

3 GFS HDFS

GFS HDFS GFS Hadoop

3.1 GFS

GFS [10]

3.1.1

GFS PC

• TB
• PC

3.1.2

GFS splits each file into 64 MB chunks and distributes the chunks across PCs.

GFS 3


GFS

GFS GFS
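The 64 MB chunking described in this section can be sketched as follows. This is a minimal illustration, not GFS code: the helper names `split_into_chunks` and `place_chunks` are hypothetical, the replica count is a parameter supplied by the caller, and the round-robin placement is an assumption chosen only to keep the example short.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size stated in the text


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs covering a file of file_size bytes."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    chunks = []
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


def place_chunks(chunks, servers, replicas):
    """Assign each chunk to `replicas` distinct servers (illustrative round-robin)."""
    if replicas > len(servers):
        raise ValueError("not enough servers for the requested replica count")
    placement = {}
    for index, _ in enumerate(chunks):
        placement[index] = [
            servers[(index + r) % len(servers)] for r in range(replicas)
        ]
    return placement
```

For example, a 150 MB file yields two full 64 MB chunks and one 22 MB chunk, and each chunk is then mapped to the requested number of distinct machines.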

3.1.3 HDFS

HDFS follows the design of GFS. HDFS consists of a NameNode and DataNodes.

3.2

Table 3.1 compares GFS with Hadoop HDFS.

Table 3.1: Google File System and Hadoop

Hadoop

Hadoop

(Read-Only )

3.3

3.3.1

Hadoop

NameNode NameNode

DFSClient::mkdirs

3.3.2

Hadoop

NameNode NameNode

DFSClient::delete

3.3.3

Hadoop

NameNode NameNode

DFSClient::create

3.3.4

Hadoop

NameNode NameNode

DFSClient::delete

3.3.5

(5.6.6)

Hadoop

delete /trash /trash

NameNode.emptier
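The behavior outlined here, where delete moves a file under /trash and an emptier later purges it, can be modeled with a small in-memory sketch. Everything in it is hypothetical: `ToyNamespace`, its methods, and the `max_age` parameter are illustrative names, and the code does not reproduce the logic of the Trash class or of NameNode.emptier.

```python
import posixpath

TRASH_ROOT = "/trash"  # trash location named in the text


class ToyNamespace:
    """In-memory stand-in for a file namespace with a trash directory."""

    def __init__(self):
        self.files = {}       # path -> file contents
        self.trashed_at = {}  # trash path -> time the file was moved there

    def create(self, path, data):
        self.files[path] = data

    def delete(self, path, now):
        """Move the file under TRASH_ROOT instead of removing it."""
        trash_path = posixpath.join(TRASH_ROOT, path.lstrip("/"))
        self.files[trash_path] = self.files.pop(path)
        self.trashed_at[trash_path] = now
        return trash_path

    def restore(self, trash_path, original_path):
        """Undo a delete by moving the file back out of the trash."""
        self.files[original_path] = self.files.pop(trash_path)
        del self.trashed_at[trash_path]

    def empty_trash(self, now, max_age):
        """Emptier step: permanently drop entries older than max_age."""
        expired = [p for p, t in self.trashed_at.items() if now - t > max_age]
        for path in expired:
            del self.files[path]
            del self.trashed_at[path]
        return expired
```

The point of the two-step design is that a deleted file stays recoverable until the emptier runs; only then is the data actually gone.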

3.3.6

Hadoop

DFSInputStream (5.7.5) NameNode DataNode 1

DFSInputStream::read
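The division of labor on the read path, with the NameNode supplying block locations and a DataNode supplying the bytes, can be sketched with toy classes. This is a simplified model under stated assumptions, not the DFSInputStream implementation: `ToyNameNode`, `ToyDataNode`, and `read_file` are hypothetical names, and the reader simply takes the first listed replica.

```python
class ToyNameNode:
    """Holds metadata only: which blocks form a file and where replicas live."""

    def __init__(self):
        self.block_map = {}  # path -> list of (block_id, [datanode names])

    def get_block_locations(self, path):
        return self.block_map[path]


class ToyDataNode:
    """Holds block contents keyed by block id."""

    def __init__(self):
        self.blocks = {}

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(path, namenode, datanodes):
    """Ask the NameNode for locations, then fetch each block from one DataNode."""
    data = bytearray()
    for block_id, holders in namenode.get_block_locations(path):
        chosen = holders[0]  # assumption: simply take the first listed replica
        data += datanodes[chosen].read_block(block_id)
    return bytes(data)
```

Note that file contents never pass through the toy NameNode; it answers only the metadata query, which is the property the sketch is meant to highlight.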

3.3.7

Hadoop

DFSOutputStream (5.7.5) NameNode DataNode

DFSOutputStream::writeChunk

3.3.8

Hadoop

DFSInputStream (5.7.5) NameNode DataNode NameNode

DFSInputStream::read

3.3.9

Hadoop

3.3.10

Hadoop

DFSClient NameNode NameNode

DFSClient::rename

3.3.11

Hadoop

DFSClient NameNode NameNode

DFSClient::listPaths

3.3.12

Hadoop

whoami bash -c groups ( ) (bin/hadoop dfs)

DFSClient::getFileInfo

http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html

3.3.13

Hadoop

whoami bash -c groups ( ) (bin/hadoop dfs)

DFSClient::getFileInfo

http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html

3.3.14

Hadoop

HADOOP-1700

http://issues.apache.org/jira/browse/HADOOP-1700
