Google Engineer Explains Cloud Computing (Video) -- 梦飞翔的地方 (梦翔天空)

Google Engineer Explains Cloud Computing (Video)

[ 2011/2/13 13:47:00 | By: 梦翔儿 ]

Video of a Google engineer explaining cloud computing: http://v.youku.com/v_show/id_XMTEwMDEwOTU2.html

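叶平 is a software engineer at Google. He received his Ph.D. from the Department of Physics at National Taiwan University in 1996, majoring in experimental particle physics. From 1998 he was an assistant research fellow at Academia Sinica, and in 2001 he moved to the High Energy Physics Laboratory at National Taiwan University as a visiting and adjunct assistant professor. Since 1993 he has taken part in particle physics experiments or astroparticle observations at Fermilab, NASA, CERN, and KEK (Japan). He joined Google in the summer of 2007 and currently leads Google's Traditional Chinese search quality work in Taiwan, its university program, and its Taiwan cloud computing program.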

Reposted from: http://www.programmer.com.cn/515/

＝＝＝＝＝＝＝＝

Evolution of Computing:

network computing; -- The network is the computer (client/server); Separation of functionalities
cluster computing; -- Tightly coupled computing resources: CPU, Storage, Data (usually within a LAN)
grid computing; -- Resource sharing across domains; Decentralized; Open standards
utility computing. -- Don't buy computers, lease computing power; Ownership model

Next step: Cloud computing

Definition: services and data are put in the cloud, where they are accessible and scalable

Powers of Cloud Computing:

commodity hardware;
infrastructure software:

GFS;
BigTable;
MapReduce

Web Application:

Serving;
Database;
Storage;
Data Processing

Google's Solution:

Google App Engine;
BigTable;
GFS;
MapReduce

Implementation:

Hadoop -- GFS and MapReduce
HBase -- BigTable
HDFS -- GFS
...

GFS:

Files are broken into chunks;
Chunks triplicated across three different servers;
Metadata managed by Master; (chunkId, path, serverId, etc)
Data transfers happen directly between clients and chunk servers
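
The four GFS points above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Google's implementation: the names `ToyMaster`, `ToyChunkServer`, `write_file`, and `read_file` are hypothetical, the 4-byte chunk size is deliberately tiny, and the round-robin replica placement is a stand-in for whatever policy the real system uses.

```python
from itertools import count

CHUNK_SIZE = 4   # bytes; deliberately tiny for illustration (assumption)
REPLICAS = 3     # "chunks triplicated across three different servers"


class ToyMaster:
    """Holds metadata only: path -> chunk ids, chunk id -> server ids."""

    def __init__(self, server_ids):
        if len(server_ids) < REPLICAS:
            raise ValueError("need at least REPLICAS servers")
        self.server_ids = list(server_ids)
        self.file_chunks = {}
        self.chunk_servers = {}
        self._ids = count()

    def create(self, path, n_chunks):
        chunk_ids = []
        for _ in range(n_chunks):
            chunk_id = next(self._ids)
            start = chunk_id % len(self.server_ids)
            # Round-robin placement on REPLICAS distinct servers (assumption).
            self.chunk_servers[chunk_id] = [
                self.server_ids[(start + i) % len(self.server_ids)]
                for i in range(REPLICAS)
            ]
            chunk_ids.append(chunk_id)
        self.file_chunks[path] = chunk_ids
        return chunk_ids

    def lookup(self, path):
        return [(cid, self.chunk_servers[cid]) for cid in self.file_chunks[path]]


class ToyChunkServer:
    def __init__(self):
        self.chunks = {}  # chunk id -> bytes


def write_file(master, servers, path, data):
    # Break the file into fixed-size chunks.
    pieces = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    chunk_ids = master.create(path, len(pieces))
    for chunk_id, piece in zip(chunk_ids, pieces):
        # Data goes from the client straight to the chunk servers;
        # the master only handed out metadata.
        for server_id in master.chunk_servers[chunk_id]:
            servers[server_id].chunks[chunk_id] = piece


def read_file(master, servers, path):
    out = b""
    for chunk_id, server_ids in master.lookup(path):
        out += servers[server_ids[0]].chunks[chunk_id]  # any replica would do
    return out
```

For example, writing `b"hello world"` with four chunk servers yields three chunks, each stored on three distinct servers, and `read_file` reassembles the original bytes.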

GFS used in Google:

200+ clusters;
File system clusters of up to 5000+ machines;
Pools of 10000+ clients;
Reads are faster than writes;

BigTable:

3 dimensions: row, column, timestamp;
Distributed;
Scalable;
Self-Managing -- add/remove servers; load balance;
System structure: -- BigTable cell (i.e., cluster)

BigTable client library
Master; -- metadata operations; load balance
Tablet Server; -- serves data
Cluster scheduling system -- handles failover, monitoring;
GFS -- holds tablet data, logs;
Lock service -- holds metadata, master election

Currently ~500 cells
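
The three-dimensional (row, column, timestamp) data model in the BigTable notes above can be illustrated with a small in-memory sketch. `ToyTable` is a hypothetical name and the class covers only the lookup semantics. Distribution, tablets, and the master are left out, and the "newest version at or before a timestamp" read rule is an assumption made for the example.

```python
from collections import defaultdict


class ToyTable:
    """Sparse map: (row, column) -> {timestamp: value}."""

    def __init__(self):
        self._cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        if not eligible:
            return None
        return versions[max(eligible)]
```

Each cell keeps every version written to it, so a reader can ask for the current value or for the value as of an earlier time.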

Distributed Data Processing:

1. Technical Issues:

File Management: where to store files? Distributed; -- Master
Granularity: Splitting;
Job Allocation: which task goes to which node? Prefer local jobs;
Fault-recovery: what if a node crashes? Redundancy, Crash-detection, Job re-allocation;
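
The "prefer local jobs" and "job re-allocation" ideas above can be sketched as a tiny scheduler. The function names `assign` and `recover` are hypothetical, and the policy (pick the least-loaded node that already holds a replica, fall back to any live node) is an assumption chosen for illustration. The talk does not describe the real scheduler at this level of detail.

```python
def assign(tasks, replicas, live_nodes, plan=None):
    """Assign each task (one per chunk) to a node, preferring nodes holding the chunk.

    tasks:      iterable of chunk ids
    replicas:   chunk id -> set of node ids storing a replica
    live_nodes: node ids currently alive (must be non-empty)
    plan:       optional existing assignment {chunk id: node id} to extend
    """
    plan = dict(plan or {})
    load = {node: 0 for node in live_nodes}
    for node in plan.values():
        load[node] += 1
    for chunk in tasks:
        local = [n for n in replicas.get(chunk, ()) if n in load]
        candidates = local or list(load)   # fall back to a remote node
        node = min(candidates, key=lambda n: (load[n], n))
        plan[chunk] = node
        load[node] += 1
    return plan


def recover(plan, replicas, live_nodes, crashed):
    """Re-allocate the tasks of a crashed node to the surviving nodes."""
    survivors = [n for n in live_nodes if n != crashed]
    kept = {c: n for c, n in plan.items() if n != crashed}
    lost = [c for c, n in plan.items() if n == crashed]
    return assign(lost, replicas, survivors, kept)
```

Because every chunk has several replicas, a task lost in a crash can usually be restarted on another node that still reads its input locally.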

2. MapReduce:

Map: (in_key, in_value) --> ( (key_j, value_j) | j = 1, 2, ...)
Reduce: (key, (value1, value2, ... )) --> (key, f_value)
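
The Map and Reduce signatures above can be made concrete with the usual word-count example. This is a single-process sketch for illustration only: `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical names, and the grouping step stands in for the distributed shuffle a real framework performs.

```python
from collections import defaultdict


def map_fn(in_key, in_value):
    """(document name, text) -> stream of (word, 1) pairs."""
    for word in in_value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """(word, [1, 1, ...]) -> (word, total count)."""
    return key, sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    # Map phase plus "shuffle": group intermediate values by key.
    groups = defaultdict(list)
    for in_key, in_value in inputs:
        for key, value in map_fn(in_key, in_value):
            groups[key].append(value)
    # Reduce phase: one call per distinct key.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))
```

Calling `run_mapreduce([("doc1", "the cloud"), ("doc2", "The grid and the cloud")], map_fn, reduce_fn)` counts "the" three times, "cloud" twice, and "grid" and "and" once each.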

3. Configuration:

200,000 mappers;
500 reducers;
2000 nodes
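
To get a feel for the numbers above, the short calculation below works out the ratios they imply and shows one common way to route an intermediate key to one of the 500 reducers. The hash-modulo `partition` function is an assumption for illustration; the notes do not say which partitioning scheme was used.

```python
import zlib

N_MAPPERS = 200_000
N_REDUCERS = 500
N_NODES = 2_000

# Ratios implied by the configuration.
map_tasks_per_node = N_MAPPERS // N_NODES    # 100 map tasks per node on average
nodes_per_reducer = N_NODES // N_REDUCERS    # one reducer for every 4 nodes


def partition(key, n_reducers=N_REDUCERS):
    """Pick a reducer for a key; crc32 is used because it is stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % n_reducers
```

Having many more map tasks than nodes keeps the tasks small, so work lost to a crashed node is cheap to redo and load spreads evenly.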

Data Playground:

MapReduce + BigTable + GFS

Summary:
Cloud Computing is about scalable web applications and data processing.

Reference:
http://v.youku.com/v_show/id_XMTEwMDEwOTU2.html
