A Hitchhiker’s Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers


the morning paper

A Hitchhiker’s guide to fast and efficient data reconstruction in erasure-coded data centers – Rashmi et al.

So far this week we’ve looked at a programming languages paper and a systems paper, so for today I thought it would be fun to look at an algorithm-based paper.

HDFS enables horizontally scalable low-cost storage for the masses, so it becomes feasible to collect and store much more data. Enter the Jevons paradox and soon you’ve got so much data that the storage costs become a real issue again. For redundancy and read efficiency, HDFS and the Google File System store three copies of all data by default.

Although disk storage seems inexpensive for small data sizes, this assumption is no longer true when storing multiple copies at the massive scales of operation of today’s data centers. As a result, several large-scale distributed storage systems now deploy
erasure codes, that provide higher…

View original post 1,348 more words


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: