A Distributed Computing System Based on MapReduce
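1. Abstract
Nearly two decades have passed since the start of the big data era, which is marked by the three papers Google published beginning in 2003. The MapReduce paper presented only the design and did not release the underlying software implementation. In the years since, frameworks such as Hadoop and Spark have implemented the functionality the paper describes and have improved on it.
This paper builds a browser-based distributed computing system on the MapReduce model. Because Chrome now runs with near-perfect compatibility on every major platform, the system achieves the goal of write once, run anywhere. It also benefits from the spread of personal devices: phones, tablets, and even home game consoles and smart TVs can serve as compute nodes. When more performance is urgently needed, one can ask friends through a WeChat Moments post to open a few tabs in the background on their own devices, which then join as compute nodes and speed up the overall computation.
With BMR, the system presented here, users do not need to learn C++, Java, or the other languages traditionally used for distributed computing; basic JS is enough to develop a distributed computing task, so the development cost is very low. This paper describes the design of BMR and the trade-offs made during its implementation in detail, and may serve as a reference for research on distributed computing platforms.
Keywords: MapReduce, distributed computing, high performance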
MapReduce-based Distributed Computing System
2. Abstract
In 2003, Google began publishing the three papers that mark the start of the big data era, and nearly two decades have passed since. The MapReduce paper presented only the theory and did not release the underlying software implementation. In the years since, however, Hadoop, Spark, and other frameworks have implemented the functions described in the paper and even improved on them.
This paper implements a browser-based distributed system built on MapReduce. Because Chrome is now nearly perfectly compatible with all platforms, the system achieves the goal of write once, run anywhere. It also benefits from the popularity of personal devices: phones, tablets, and even home game consoles and smart TVs can act as computing nodes. If more performance is urgently needed, one can also call on friends to open several tabs in the background on their own devices, which then become computing nodes and speed up the overall computation.
With the BMR system, users do not even need to learn C++, Java, or other languages traditionally used in distributed computing; they only need to know simple JS to develop distributed computing tasks, so the development cost is extremely low. This paper provides a detailed description of the design of the BMR system and the trade-offs made in its implementation, and offers some guidance for the study of distributed computing platforms.
Keywords: MapReduce, distributed computing, high performance
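To give a feel for the claim that simple JS is enough to express a task, the following is a minimal sketch of a word-count job written in the MapReduce style. The BMR API is not shown in this part of the paper, so the names `mapFn`, `reduceFn`, and `runLocally` are hypothetical, and the driver runs everything in a single process rather than distributing chunks across browser tabs.

```javascript
// Hypothetical word-count job; none of these names come from BMR itself.

// Map: emit a [word, 1] pair for every word in one input chunk.
function mapFn(chunk) {
  return chunk
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word.length > 0)
    .map((word) => [word, 1]);
}

// Reduce: sum all counts that were emitted for one key.
function reduceFn(key, values) {
  return values.reduce((sum, n) => sum + n, 0);
}

// Single-process stand-in for a scheduler: map each chunk, group the
// intermediate pairs by key (the shuffle step), then reduce each group.
function runLocally(chunks, map, reduce) {
  const groups = new Map();
  for (const chunk of chunks) {
    for (const [key, value] of map(chunk)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce(key, values);
  }
  return result;
}

const counts = runLocally(
  ["map reduce map", "reduce in the browser"],
  mapFn,
  reduceFn
);
console.log(counts);
```

In a distributed setting, each chunk would presumably be mapped on a different node and the grouped values sent to reducers. The sketch keeps only the programming model, which is the part a task author would write.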



