Hadoop
Preferred Infrastructure
August 25, Heisei 20 (2008)
?
( NTT Preferred Infrastructure ( Preferred Infrastructure ) NTT Preferred Infrastructure
NTT
?
Preferred Infrastructure
NTT
?
Preferred Infrastructure: E-mail: info@preferred.jp
NTT: E-mail: pr@nttr.co.jp
Copyright © NTT Resonant Inc. 2008
August 25, 2008
1
1.1
2 Hadoop
3 GFS and HDFS
3.1 GFS
3.1.1
3.1.2
3.1.3 HDFS
3.2
3.3
3.3.1
3.3.2
3.3.3
3.3.4
3.3.5
3.3.6
3.3.7
3.3.8
3.3.9
3.3.10
3.3.11
3.3.12
3.3.13
3.3.14
3.3.15
3.3.16
3.4
3.4.1
3.4.2
3.4.3
3.4.4
3.5
3.5.1
3.6
3.6.1
3.6.2
3.6.3
3.6.4
3.6.5
3.6.6
3.6.7
3.6.8
3.6.9
3.6.10 (Read-Only)
3.7
4 Google MapReduce and Hadoop MapReduce
4.1 Google MapReduce
4.1.1
4.1.2
4.1.3 Hadoop MapReduce
4.2
4.3
4.3.1 MapReduce
4.3.2
4.3.3 Shuffle
4.3.4 Map Reduce
4.3.5 Map
4.3.6
4.3.7
4.3.8
4.3.9
4.4
4.4.1
4.4.2
4.4.3
4.5
4.5.1 Combine
4.5.2
4.5.3 Map Shuffle
4.5.4
4.5.5 Map
4.6
4.6.1
4.6.2
4.6.3
4.6.4
4.6.5
4.7
5
5.1
5.2 org.apache.hadoop.util
5.2.1 MergeSort
5.2.2 PriorityQueue
5.2.3 ReflectionUtils
5.2.4 RunJar
5.2.5 Tool
5.3 org.apache.hadoop.io
5.3.1 Writable
5.3.2 SequenceFile
5.3.3 compress
5.4 org.apache.hadoop.ipc
5.4.1 VersionedProtocol
5.4.2 RPC, Server, Client
5.5 org.apache.hadoop.net
5.5.1 DNS
5.5.2 Node, NodeBase
5.5.3 NetworkTopology
5.6 org.apache.hadoop.fs
5.6.1 FileSystem
5.6.2 LocalFileSystem
5.6.3 InMemoryFileSystem
5.6.4 FSOutputSummer, FSInputStream
5.6.5 Path
5.6.6 Trash
5.6.7 FileUtil
5.6.8 FsShell
5.6.9 DU, DF
5.7 org.apache.hadoop.dfs
5.7.1 ClientProtocol
5.7.2 DatanodeProtocol
5.7.3 NamenodeProtocol
5.7.4 DistributedFileSystem
5.7.5 DFSClient
5.7.6 DataNode
5.7.7 NameNode
5.7.8 FSNamesystem
5.7.9 FSImage, FSEditLog
5.7.10 ReplicationTargetChooser
5.7.11 SecondaryNameNode
5.7.12 Balancer
5.7.13 NamenodeFsck
5.8 org.apache.hadoop.mapred
5.8.1 JobConf
5.8.2 InputFormat
5.8.3 OutputFormat
5.8.4 JobClient
5.8.5 JobTracker
5.8.6 TaskTracker
5.8.7 StatusHttpServer
5.9
6
6.1
6.1.1
6.2 HDFS
6.2.1
6.2.2
6.3
6.3.1
6.3.2
6.4
7
List of Figures
2.1 Google, OSS
5.1
5.2
5.3
5.4
5.5
5.6 JobConf
5.7 JobConf
6.1 bonnie++
6.2 1G*100
6.3 1G*100 ( (MB)/ )
6.4 1G*100
6.5 1G*100 ( (MB)/ )
6.6 100G (randomwriter.conf)
6.7 100G
6.8 100G ( / )
6.9 100G ( (MB)/ )
6.10
6.11 100G ( / )
6.12 100G ( (MB)/ )
List of Tables
3.1 Google File System and Hadoop
4.1 Google MapReduce and Hadoop
5.1
6.1
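1
1.1
This document examines Hadoop [4]. It is organized as follows.
Chapter 2 introduces Hadoop.
Chapters 3 and 4 cover Google's Google File System [10] and MapReduce [9] together with their Hadoop counterparts.
Chapter 5 goes through the Hadoop packages and classes.
Chapter 6 reports measurements taken on Hadoop.
Chapter 7 closes the document.
The version examined is Hadoop 0.16.4.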
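2 Hadoop
Hadoop is open-source software whose development is led by Doug Cutting of Yahoo! Inc. It grew out of the Lucene [8] project, of which it was originally a part. Hadoop is an open-source implementation of Google's Google File System (GFS) and MapReduce.
Hadoop consists of HDFS (Hadoop Distributed File System) and the Hadoop MapReduce Framework, which correspond to Google's GFS and MapReduce respectively (Figure 2.1). The counterpart of Google's BigTable is hBase.
Figure 2.1: Google, OSS
Hadoop is implemented in Java, and MapReduce programs are normally written in Java as well. With Hadoop Streaming [5], MapReduce programs can also be written in languages such as C/C++, Ruby, and Python.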
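As an illustration of the Hadoop Streaming style mentioned above, the following is a minimal word-count sketch in Python. It assumes the usual Streaming convention: the mapper and the reducer exchange text lines of the form key, tab, value, and the framework sorts the mapper output by key before the reducer sees it. The function names map_lines, reduce_lines and run_locally are hypothetical, and run_locally only imitates the framework's sort step on a single machine.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer: sum the counts of consecutive records sharing a key."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Stand-in for the framework (an assumption of this sketch):
    map, sort by key, then reduce, all in one process."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

In a real Streaming job the mapper and reducer would be separate executables that read standard input and write standard output; here they are plain generator functions so that the data flow can be followed in one file.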
3 GFS and HDFS
This chapter covers GFS and HDFS: first GFS, then its counterpart in Hadoop.
3.1 GFS
This section summarizes GFS based on [10].
3.1.1
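GFS is designed for clusters built from a large number of PCs, under the following assumptions:
• Data on the order of terabytes (TB) is handled.
• PC failures are expected.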
3.1.2
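GFS splits files into 64 MB chunks and stores them across many PCs.
GFS consists of three kinds of components:
• master
• chunkserver
• client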
GFS
GFS GFS
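The arithmetic behind the 64 MB chunk size quoted above can be sketched as follows. This is only an illustration: the helper names split_into_chunks and assign_round_robin are hypothetical, and the round-robin placement is an assumption made to keep the example small, not the placement policy of GFS.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size quoted in the text


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (index, offset, length) for each chunk of a file."""
    chunks = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        chunks.append((index, offset, length))
        offset += length
        index += 1
    return chunks


def assign_round_robin(chunks, machines):
    """Hypothetical placement: hand chunks to machines in turn."""
    return {index: machines[index % len(machines)] for index, _, _ in chunks}
```

For example, a 200 MB file yields three full chunks and one final 8 MB chunk, and with three machines the fourth chunk lands on the first machine again.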
3.1.3 HDFS
HDFS follows a design similar to that of GFS. An HDFS cluster consists of a NameNode and DataNodes.
3.2
Table 3.1 compares GFS with HDFS in Hadoop.
Table 3.1: Google File System and Hadoop
[Table 3.1 body; surviving cell text: Hadoop, Hadoop, (Read-Only)]
3.3
3.3.1
Hadoop
The client sends the request to the NameNode, and the NameNode creates the directory.
DFSClient::mkdirs
3.3.2
Hadoop
The client sends the request to the NameNode, and the NameNode deletes the entry.
DFSClient::delete
3.3.3
Hadoop
The client sends the request to the NameNode, and the NameNode creates the file entry.
DFSClient::create
3.3.4
Hadoop
The client sends the request to the NameNode, and the NameNode deletes the entry.
DFSClient::delete
3.3.5
Trash (see 5.6.6)
Hadoop
Instead of being removed by delete, files are moved to /trash, and the contents of /trash are purged later.
NameNode.emptier
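The trash behaviour named above (delete moves a file under /trash, and an emptier purges it) can be modelled with a small toy class. This is a sketch under stated assumptions: the class ToyTrash, its methods and the fixed retention interval are hypothetical and are not taken from the Hadoop source; only the /trash location comes from the text.

```python
class ToyTrash:
    """Toy model: deleted files wait in /trash until an emptier pass."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds  # assumed retention period
        self.files = {}   # path -> contents
        self.trash = {}   # "/trash" + path -> (contents, time moved)

    def delete(self, path, now):
        # Move the file into /trash instead of destroying it.
        contents = self.files.pop(path)
        self.trash["/trash" + path] = (contents, now)

    def empty(self, now):
        # Emptier pass: purge entries that have stayed at least `interval`.
        expired = [p for p, (_, moved) in self.trash.items()
                   if now - moved >= self.interval]
        for p in expired:
            del self.trash[p]
        return sorted(expired)
```

Passing the clock value in as an argument keeps the model deterministic; a real emptier would run periodically in a background thread.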
3.3.6
Hadoop
DFSInputStream (see 5.7.5) obtains the block locations from the NameNode and reads the data from one of the DataNodes.
DFSInputStream::read
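The read path named above involves the NameNode for block locations and a DataNode for the data itself. The sketch below imitates that division of labour with plain dictionaries. Everything in it is hypothetical: read_file, the shape of block_locations, and the fallback to the next DataNode when one does not hold a block are assumptions of the example, not the behaviour of DFSInputStream.

```python
def read_file(path, block_locations, datanodes):
    """Read a file block by block.

    block_locations stands in for NameNode metadata:
        path -> list of (block_id, [datanode names])
    datanodes stands in for the DataNodes:
        name -> {block_id: bytes}
    """
    data = b""
    for block_id, holders in block_locations[path]:
        for name in holders:
            block = datanodes[name].get(block_id)
            if block is not None:
                data += block
                break  # one DataNode is enough for this block
        else:
            raise IOError(f"no DataNode could serve block {block_id}")
    return data
```

The point of the sketch is that the metadata lookup and the data transfer go to different places: only the former touches the NameNode stand-in.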
3.3.7
Hadoop
DFSOutputStream (see 5.7.5) obtains the target DataNodes from the NameNode and writes the data to the DataNodes.
DFSOutputStream::writeChunk
3.3.8
Hadoop
DFSInputStream (5.7.5) NameNode DataNode NameNode
DFSInputStream::read
3.3.9
Hadoop
3.3.10
Hadoop
DFSClient sends the request to the NameNode, and the NameNode renames the entry.
DFSClient::rename
3.3.11
Hadoop
DFSClient sends the request to the NameNode, and the NameNode returns the listing.
DFSClient::listPaths
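The namespace operations listed in this section (mkdirs, create, delete, rename, listPaths) all go through the NameNode. A toy in-memory namespace makes that concrete. The class ToyNameNode is hypothetical, and its semantics (for example, create making missing parent directories, or rename refusing an existing destination) are assumptions of the sketch rather than documented HDFS behaviour.

```python
class ToyNameNode:
    """Holds the whole namespace in memory: path -> 'dir' or 'file'."""

    def __init__(self):
        self.entries = {"/": "dir"}

    def mkdirs(self, path):
        # Create the directory and any missing parents.
        current = ""
        for part in path.strip("/").split("/"):
            current += "/" + part
            if self.entries.setdefault(current, "dir") != "dir":
                return False  # a file is in the way
        return True

    def create(self, path):
        parent = path.rsplit("/", 1)[0] or "/"
        if path in self.entries or not self.mkdirs(parent):
            return False
        self.entries[path] = "file"
        return True

    def delete(self, path):
        # Remove the entry and everything beneath it.
        doomed = [p for p in self.entries
                  if p == path or p.startswith(path + "/")]
        for p in doomed:
            del self.entries[p]
        return bool(doomed)

    def rename(self, src, dst):
        if src not in self.entries or dst in self.entries:
            return False
        moved = {p: kind for p, kind in self.entries.items()
                 if p == src or p.startswith(src + "/")}
        for p, kind in moved.items():
            del self.entries[p]
            self.entries[dst + p[len(src):]] = kind
        return True

    def list_paths(self, path):
        # Direct children only.
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self.entries
                      if p != "/" and p.startswith(prefix)
                      and "/" not in p[len(prefix):])
```

Each method here stands for one request that a client would send to the NameNode; no file data is involved in any of them.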
3.3.12
Hadoop
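User and group identity are taken from the client side (bin/hadoop dfs): the user name is the output of whoami, and the group list is the output of bash -c groups.
DFSClient::getFileInfo
http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html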
3.3.13
Hadoop
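User and group identity are taken from the client side (bin/hadoop dfs): the user name is the output of whoami, and the group list is the output of bash -c groups.
DFSClient::getFileInfo
http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html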
3.3.14
Hadoop
HADOOP-1700
https://issues.apache.org/jira/browse/HADOOP-1700