Hadoop Scales Really Well - Yahoo! Launches World’s Largest Hadoop Production Application

| Comments

Hadoop
Some exciting news today from Eric Baldeschwieler, Senior Director, Grid Computing on the Yahoo Developer Network, Yahoo! Launches World's Largest Hadoop Production Application. I'll note that my company Hyperix is using Hadoop for our vertical search platform.

Here's some of the stats:

Some Webmap size data:

* Number of links between pages in the index: roughly 1 trillion links
* Size of output: over 300 TB, compressed!
* Number of cores used to run a single Map-Reduce job: over 10,000
* Raw disk used in the production cluster: over 5 Petabytes




blog comments powered by Disqus

Recent Blog Entries

The Next Breakthrough Space Technologies for Canada
The Canadian Space Commerce Association (CSCA) will be holding its annual meeting on March 18th in Toronto and my colleagues…
Canada's Housing Bubble, Real or Imaginary?
The Canadian Centre for Policy Alternatives yesterday released a study title "Canada's Housing Bubble - An Accident Waiting to Happen"…
Do China's Actions Signal a Greater Military Role for their Space Program?
Brian Weeden of the Secure World Foundation has a very interesting article on The Space Review titled Dancing in the…
Amazon Unleashes Cluster Compute Instances for High Performance Computing
I have to say I'm fairly excited at the news today that Amazon is making available a new instance type…
Shame on the New York Times for Forcing Apple to Remove Pulse from iTunes
When the New York Times objected officially to Apple about an iPad application called Pulse they shot themselves in the…
What if Microsoft and Apple Merged?
TechCrunch is reporting that Microsoft could be taking over the search on Apple's iPhone with the upcoming release of the…