
Google Cloud Computing Technology: MapReduce (Foreign Course Slides).ppt



MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean

    reduce(String output_key, Iterator intermediate_values):
      // output_key: a word
      // intermediate_values: a list of counts
      int result = 0;
      for each v in intermediate_values:
        result += ParseInt(v);
      Emit(AsString(result));

More Examples
  • Distributed grep
  • Map: (key, whole doc / a line) -> (the matched line, key)
  • Reduce: identity function

More Examples
  • Count of URL access frequency
  • Map: logs of web page requests -> (URL, 1)
  • Reduce: (URL, total count)

More Examples
  • Reverse web-link graph
  • Map: (source, target) -> (target, source)
  • Reduce: (target, list(source)) -> (target, list(source))

MapReduce: Execution Overview

Architecture
  • Master data structure
  • Task state: idle, in-progress, or completed
  • Identity of the worker machine (for in-progress tasks)
  • Locations of the intermediate file regions of map tasks: received from map tasks, pushed to reduce tasks

Execution Overview
  • (1) Split input files
  • (2) Master and workers
  • (3) Map task workers
  • (4) Buffering of results
  • (5) Copying and sorting
  • (6) Reduce workers
  • (7) Return to user code

MapReduce: Execution Overview

MapReduce: Example

MapReduce in Parallel: Example

MapReduce: Runtime Environment

Fault Management
  • Fault tolerance in a word: redo.
  • The master pings workers and reschedules failed tasks.
  • Note: completed map tasks are re-executed on failure because their output is stored on the local disk of the failed worker.
  • Master failure: redo.
  • Semantics in the presence of failures: with deterministic map and reduce functions, the output is the same as would have been produced by a non-faulting sequential execution of the entire program.
  • This property relies on atomic commits of map and reduce task outputs.

MapReduce: Fault Tolerance
  • Handled via re-execution of tasks; task completion is committed through the master.
  • What happens if a mapper fails? Re-execute its completed and in-progress map tasks.
  • What happens if a reducer fails? Re-execute its in-progress reduce tasks.
  • What happens if the master fails? Potential trouble!

MapReduce Refinements: Locality Optimization
  • Leverage GFS to schedule a map task on a machine that holds a replica of the corresponding input data.
  • Thousands of machines read input at local disk speed.
  • Without this, rack switches limit the read rate.

MapReduce Refinements: Redundant Execution
  • Slow workers are a source of bottlenecks and may delay job completion.
  • Near the end of a phase, spawn backup tasks; whichever copy finishes first wins.
  • This uses spare computing power effectively and substantially reduces job completion time.

MapReduce Refinements: Skipping Bad Records
  • Map and reduce functions sometimes fail on particular inputs.
  • Fixing the bug might not be possible, for example when it lies in a third-party library.
  • On an error, the worker sends a signal to the master.
  • If the same record causes multiple errors, the record is skipped.

MapReduce Refinements: Miscellaneous
  • Combiner function at the mapper
  • Sorting guarantees within each reduce partition
  • Local execution for debugging and testing
  • User-defined counters

MapReduce: Walk-Through of One More Application

MapReduce: PageRank
  • PageRank models the behavior of a "random surfer".
  • C(t) is the out-degree of t, and (1-d) is a damping factor (random jump).
  • The random surfer keeps clicking on successive links at random, without taking content into consideration.
  • Each page distributes its rank equally among all the pages it links to.
  • The damping factor models the surfer getting bored and typing an arbitrary URL.

Computing PageRank

PageRank: Key Insights
  • The effect of each iteration is local.
  • Iteration i+1 depends only on iteration i.
  • At iteration i, the PageRank of individual nodes can be computed independently.

PageRank Using MapReduce
  • Use a sparse matrix representation (M).
  • Map each row of M to a list of PageRank "credit" to assign to its out-link neighbours.
  • These prestige scores are reduced to a single PageRank value for a page by aggregating over them.

PageRank Using MapReduce
  • Source of image: Lin 2008

Phase 1: Process HTML
  • The map task takes (URL, page-content) pairs and maps them to (URL, (PRinit, list-of-urls)).
  • PRinit is the "seed" PageRank for URL.
  • list-of-urls contains all pages pointed to by URL.
  • The reduce task is just the identity function.

Phase 2: PageRank Distribution
  • The reduce task gets (URL, url_list) and many (URL, val) values.
  • Sum the vals and fix up with d to get the new PageRank.
  • Emit (URL, (new_rank, url_list)).
  • Check for convergence using a non-parallel component.

MapReduce: Some More Apps
  • Distributed grep
  • Count of URL access frequency
  • Clustering (K-means)
  • Graph algorithms
  • Indexing systems

MapReduce Programs in Google Source Tree

MapReduce Jobs Run in August 2004

MapReduce: Extensions and Similar Apps
  • PIG (Yahoo)
  • Hadoop (Apache)
  • DryadLinq (Microsoft)

Large Scale Systems Architecture Using MapReduce

Take Home Messages
  • Although restrictive, MapReduce is a good fit for many problems encountered in the practice of processing large data sets.
  • The functional programming paradigm can be applied to large-scale computation.
  • It is easy to use and hides the messy details of parallelization, fault tolerance, data distribution, and load balancing from programmers.
  • And finally, if it works for Google, it should be handy!
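The word-count reduce pseudocode near the start of the slides can be tried out with a small single-process simulation. This is only a sketch of the programming model, not the distributed runtime: `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical names chosen for illustration, the map function is an assumption (the slides preserve only the reduce side), and the in-memory grouping stands in for the shuffle and sort step.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_fn(doc_name: str, contents: str) -> Iterator[Tuple[str, str]]:
    """Assumed map side of word count: emit (word, "1") for every word."""
    for word in contents.split():
        yield word, "1"


def reduce_fn(output_key: str, intermediate_values: Iterable[str]) -> str:
    """Mirror of the slide's reduce pseudocode: sum the counts for one word."""
    result = 0
    for v in intermediate_values:
        result += int(v)
    return str(result)


def run_mapreduce(
    inputs: Dict[str, str],
    mapper: Callable[[str, str], Iterator[Tuple[str, str]]],
    reducer: Callable[[str, Iterable[str]], str],
) -> Dict[str, str]:
    """Hypothetical driver: map every input, group by key, reduce each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Keys are processed in sorted order, echoing the per-partition ordering
    # guarantee listed under the refinements.
    return {k: reducer(k, groups[k]) for k in sorted(groups)}


docs = {"a.txt": "the quick brown fox", "b.txt": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

Swapping in a different `mapper` and `reducer` pair gives the other examples from the slides; for instance, a mapper that emits `(url, "1")` per log line with the same reducer would count URL access frequency.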
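The PageRank distribution phase described in the slides can be sketched in the same single-process style. Several things here are assumptions rather than content from the slides: the formula on the PageRank slide did not survive extraction, so this sketch uses the common form PR(p) = (1-d)/N + d * sum(PR(t)/C(t)) with d = 0.85; the map side of the phase is inferred from the "credit" description; every page is assumed to have at least one out-link and every link target is assumed to be a key of the graph; and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Graph = Dict[str, Tuple[float, List[str]]]  # URL -> (rank, out-links)


def pagerank_map(url: str, rank: float, links: List[str]) -> Iterator[tuple]:
    """Pass the link list through and hand out equal credit to each out-link."""
    yield url, ("links", links)
    for target in links:
        yield target, ("credit", rank / len(links))


def pagerank_reduce(url: str, values: List[tuple], d: float, n: int):
    """Sum incoming credit and fix up with d to get the new rank."""
    links: List[str] = []
    total = 0.0
    for kind, payload in values:
        if kind == "links":
            links = payload
        else:
            total += payload
    return (1 - d) / n + d * total, links


def iterate(graph: Graph, d: float) -> Graph:
    """One map, group-by-key, reduce round over the whole graph."""
    groups = defaultdict(list)
    for url, (rank, links) in graph.items():
        for key, value in pagerank_map(url, rank, links):
            groups[key].append(value)
    n = len(graph)
    return {url: pagerank_reduce(url, vals, d, n) for url, vals in groups.items()}


def pagerank(graph: Graph, d: float = 0.85, tol: float = 1e-9,
             max_iter: int = 100) -> Graph:
    """Non-parallel driver: repeat rounds until the ranks stop changing."""
    for _ in range(max_iter):
        new_graph = iterate(graph, d)
        delta = sum(abs(new_graph[u][0] - graph[u][0]) for u in graph)
        graph = new_graph
        if delta < tol:
            break
    return graph


# Seed every page with rank 1/N, as a Phase 1 style initialisation might.
seed = 1.0 / 3
result = pagerank({
    "A": (seed, ["B", "C"]),
    "B": (seed, ["C"]),
    "C": (seed, ["A"]),
})
for url, (rank, _) in sorted(result.items()):
    print(url, round(rank, 4))
```

The convergence check sits in the driver loop rather than in a map or reduce function, which matches the slides' remark that convergence is checked by a non-parallel component.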
