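BigTable Storage and Request Serving
◦ A table is split into tablets for storage; each tablet corresponds to one tablet file, and tablet files are stored on GFS
◦ BigTable organizes its tablets through metadata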
Tablet 1: <startRowKey1, endRowKey1>, root\bigtable\tablet1, ……
Tablet 2: <startRowKey2, endRowKey2>, root\bigtable\tablet2, ……
Tablet 3: <startRowKey3, endRowKey3>, root\bigtable\tablet3, ……
Tablet 4: <startRowKey4, endRowKey4>, root\bigtable\
◦ … look up the row
◦ Fetch the data in the corresponding columns, parse it, and produce and display the final result
<aaa.asp,0.9027><bbb.asp,0.0088><ccc.asp,0.0885>
Data processing runs periodically; it is not performed in real time in response to queries
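The metadata step on this slide (row key range to tablet path) can be sketched as a binary search over the tablets' start keys. This is only an illustration: the function name `locate_tablet` and the concrete key ranges are assumptions, since the slide gives placeholders (startRowKey1, endRowKey1, ...) rather than real values. Only the path pattern `root\bigtable\tabletN` comes from the slide.

```python
from bisect import bisect_right

# Hypothetical metadata entries: (start_row_key, end_row_key, tablet_path),
# sorted by start key. The key ranges are made up for illustration.
TABLET_METADATA = [
    ("com.a", "com.m", r"root\bigtable\tablet1"),
    ("com.n", "com.s", r"root\bigtable\tablet2"),
    ("com.t", "com.z~", r"root\bigtable\tablet3"),
]


def locate_tablet(row_key, metadata=TABLET_METADATA):
    """Return the path of the tablet whose [start, end] range holds row_key.

    Returns None when no tablet covers the key.
    """
    starts = [start for start, _, _ in metadata]
    # Index of the last tablet whose start key is <= row_key.
    i = bisect_right(starts, row_key) - 1
    if i < 0:
        return None
    _, end, path = metadata[i]
    return path if row_key <= end else None


if __name__ == "__main__":
    print(locate_tablet("com.xxx"))
    print(locate_tablet("com.bbb"))
```

Once the tablet is located, the row is read from that tablet file and the requested columns are parsed, as the bullets above describe. Sorting the metadata by start key is what makes the lookup logarithmic in the number of tablets.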
Overall Workflow of Google Search
◦ Data collection: Spider
◦ Data organization
<com.xxx, <aaa.asp,0.9027><bbb.asp,0.0088><ccc.asp,0.0885>>
<com.yyy, <bbb.asp,0.0435><ccc.asp,0.4348><ddd.asp,0.5217>>
<com.zzz, <aaa.asp,0.0769><bbb.asp,0.0769><ddd.asp,0.0769><ccc.asp,0.7692>>
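As a rough sketch, the organized rows above can be modeled as a mapping from row key to a set of named columns. The in-memory dictionary and the function name `query_row` are assumptions made for illustration and say nothing about how BigTable stores data internally. The values are copied from the rows above.

```python
# Rows from the slide: row key -> {column name: value}.
TABLE = {
    "com.xxx": {"aaa.asp": 0.9027, "bbb.asp": 0.0088, "ccc.asp": 0.0885},
    "com.yyy": {"bbb.asp": 0.0435, "ccc.asp": 0.4348, "ddd.asp": 0.5217},
    "com.zzz": {
        "aaa.asp": 0.0769,
        "bbb.asp": 0.0769,
        "ddd.asp": 0.0769,
        "ccc.asp": 0.7692,
    },
}


def query_row(row_key, table=TABLE):
    """Look up one row and render its columns in the slide's <name,value> form.

    An unknown row key yields an empty string.
    """
    columns = table.get(row_key, {})
    return "".join(f"<{name},{value:.4f}>" for name, value in columns.items())


if __name__ == "__main__":
    print(query_row("com.xxx"))
    # prints <aaa.asp,0.9027><bbb.asp,0.0088><ccc.asp,0.0885>
```

Querying row `com.xxx` reproduces the sample result shown on the storage slide, which is the "fetch the columns, parse, display" step applied to data that was organized ahead of time.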