Welcome to Coding Techniques for Cloud and Networked Distributed Storage
Description: This PhD course surveys traditional coding techniques for cloud and networked distributed storage systems, as well as cutting-edge research results and recent developments in the field. Robust and repairable storage originated within a single computer, e.g., in RAID systems designed to withstand the loss of storage disks. Distributing data across multiple storage nodes interconnected over a network, e.g., within one or multiple data centers, is a natural next step, as it provides much higher robustness to the loss of individual disks and even of entire computers with multiple disks. The simplest technique for providing this reliability is replication (keeping multiple copies of the file), but its cost in storage space is massive. The key goal of coding techniques in these systems is to achieve fault tolerance efficiently, repairing losses with controlled network and storage use in order to ensure data durability over time. Given the abundance of cloud storage systems and the massive amount of content generated each year, reducing storage costs and exploiting storage beyond controlled data center settings, e.g., by leveraging mobile devices, have become a must.
The course is structured in two parts. The first part provides the fundamentals of cloud and networked distributed storage, including key concepts, performance metrics, and fundamental trade-offs. It also covers key code constructions and examples of codes specifically designed for repairability, including codes on codes, locally repairable codes, and network codes. We shall cover both static, centralized designs suitable for current cloud storage systems and dynamic, distributed designs at the frontier of current research that exploit wireless mobile devices for storage. The latter setting introduces interesting constraints and higher node unavailability, which make the design of highly structured codes nearly impossible. Random code designs are presented as a way to cope with these new challenges.
The second part of the course focuses on more practical and application-specific designs, looking in particular at the effects of download and access time, data updates, and security. This part will be grounded in real-world systems, implementation details, and measurements.
Prerequisites: Basic probability, linear algebra
Key literature:
F. Oggier, A. Datta, “Coding Techniques for Repairability in Networked Distributed Storage Systems,” Now Publishers Inc., 2013
T. C. Jepsen, “Distributed Storage Networks: Architecture, Protocols and Management,” Wiley, 2007 [Supplementary Reading]
Organizer: Daniel E. Lucani
Lecturers: Daniel E. Lucani, Morten V. Pedersen
ECTS: 1
Time: 1 December, 08:30 to 17:30
Place: Niels Jernes Vej 14, room 3-119.
Zip code: 9220
City: Aalborg Ø
Number of seats: 40
Deadline: 23 November 2016
Important information concerning PhD courses
For some time, we have experienced problems with no-shows for both project and general courses. This has now reached a point where we are forced to take action. The Doctoral School has therefore decided to introduce a no-show fee of DKK 5,000 for each course that a registered student does not attend. Cancellations are accepted no later than 2 weeks before the start of the course. Registered illness is, of course, an acceptable reason for not attending. Furthermore, all courses open for registration approximately three months before they start. We hope this will also give new students a chance to register for courses during the year. We look forward to your registrations.