latest scoop on virtual machine technology

Research

Program Committee Membership for ICDCS 2010

The program for the 30th International Conference on Distributed Computing Systems was recently put up. I had the honor of being a member of the program committee for this prestigious venue. The program itself looks amazing and I'd encourage folks to take a look.

Here's just a couple of papers I think are worth reading:

Chairing a session at HotStorage 2010 in Boston

I am honored to have been asked to chair a session at the HotStorage 2010 workshop in Boston. Take a look at the program. My session includes two very interesting papers:

Funnily enough, Jiri chose the session title to be "All Aboard HMS Beagle". Here's his explanation: "the session name refers to Charles Darwin's ship named Beagle. I chose the name because there isn't really much technical commonality other than the words Adaptive and Evolution (hence the reference)".

If folks are in the area, please consider registering and popping in. USENIX workshops are always very exciting mixers for industry and academia.

Log Structured Filesystems (LFS) versus Sequential Reads

Someone was discussing this topic with me, so I thought I'd blog about it. The issue was sequential-read-after-random-write and how LFS can perform really badly in those cases. Here's a bunch of ideas I suggested to my colleague.

If anyone knows about or finds such workloads, I’d love to learn more about them. Please let me know how common they are.
 
Given that WAFL is essentially LFS and that ZFS has gone all the way to LFS, I’m really curious how well-founded the fear of that particular workload is. The other thing is that most people do read-aheads at the second-level cache, and those read-aheads often end up becoming random anyway (apart from the initial padded read, the prefetches are issued async from the original IO). LFS implementations that want to protect themselves against sequential-read-after-random-write should be able to mitigate the problem by doing file-offset-based read-aheads (as opposed to LBN read-ahead).
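
To make the read-ahead point concrete, here is a toy Python sketch. It is an illustration under simplifying assumptions, not code from WAFL, ZFS, or any real filesystem: the layout is a plain dictionary from file offset to LBN, every overwrite is appended at the log head, and all the function names (`build_lfs_map`, `lbn_readahead`, `offset_readahead`, `sequential_hit_rate`) are made up for this example. It compares how often a sequential reader finds its next block already prefetched under each policy.

```python
import random


def build_lfs_map(num_blocks, overwrites, seed=0):
    """Return a file-offset -> LBN map for a toy log-structured layout."""
    rng = random.Random(seed)
    # The file is first written sequentially: offset i lands at LBN i.
    offset_to_lbn = {i: i for i in range(num_blocks)}
    log_head = num_blocks
    for _ in range(overwrites):
        # Each random overwrite is appended at the head of the log,
        # so the block's new LBN has nothing to do with its file offset.
        offset_to_lbn[rng.randrange(num_blocks)] = log_head
        log_head += 1
    return offset_to_lbn


def lbn_readahead(offset_to_lbn, offset, window):
    """Prefetch the LBNs physically following the block just read."""
    start = offset_to_lbn[offset]
    return {start + k for k in range(1, window + 1)}


def offset_readahead(offset_to_lbn, offset, window):
    """Prefetch the LBNs that hold the next file offsets."""
    upcoming = range(offset + 1, offset + 1 + window)
    return {offset_to_lbn[o] for o in upcoming if o in offset_to_lbn}


def sequential_hit_rate(offset_to_lbn, readahead, window=8):
    """Fraction of sequential reads already covered by the previous prefetch."""
    num_blocks = len(offset_to_lbn)
    hits = sum(
        offset_to_lbn[off + 1] in readahead(offset_to_lbn, off, window)
        for off in range(num_blocks - 1)
    )
    return hits / (num_blocks - 1)


layout = build_lfs_map(num_blocks=1000, overwrites=500)
print("LBN read-ahead hit rate:   ", sequential_hit_rate(layout, lbn_readahead))
print("offset read-ahead hit rate:", sequential_hit_rate(layout, offset_readahead))
```

Note what the sketch does not model: the offset-based prefetch still has to issue scattered IOs to collect those blocks, so it only shows that the prefetch keeps fetching the right data, not that the fetch is as cheap as a contiguous read.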
 
The next issue is extra metadata IO. ZFS has to do that as well, so I’m sure this is doable :)
 
That leaves garbage collection, which really is a problem. This has of course been studied in the literature and I think we can do a pretty good job with some research. For instance, at one extreme where lossiness is allowed, see Peter Desnoyers' USENIX Annual Technical Conference paper. I’ve always wondered: if we tighten that to require full fidelity (but only for the latest version, so older block history isn’t needed), how close can we get to no interference from garbage collection using Peter's ideas? Anyway, garbage collection with uniform block sizes lends itself to many more tricks than the memory-objects version of that problem :)

Anyway, thoughts welcome.

Black-Box Performance Control for High-Volume Non-Interactive Systems

One of the interesting papers presented at USENIX 2009 was "Black-Box Performance Control for High-Volume Non-Interactive Systems" [pdf][html][slides]. Since this is right up my alley, I paid close attention and took some notes. The paper was authored by several IBM Research folks: Chunqiang Tang, Sunjit Tara, Rong N. Chang and Chun Zhang.

First of all, this is interesting and thought-provoking work. However, the paper deals with a very constrained environment: throughput-centric systems with only a single pool of threads. I have reservations about the general applicability of the system to, say, disk scheduling. Nevertheless, their black-box treatment of the system (multiple unknown bottlenecks) is quite interesting and it really made me wonder how else it could be extended. The main problem is that if you have multiple controls in the system (e.g. CPU, memory, disk), the online search they are effectively performing will get really tricky. Still, good food for thought.
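
For readers who want a feel for what a black-box online search over a single control looks like, here is a minimal hill-climbing sketch in Python. To be clear, this is not the paper's controller: it is a generic hill climb, and `tune_concurrency`, `toy_system`, and the capacity and penalty numbers are all assumptions invented for illustration. The tuner only sees measured throughput and never learns where the bottleneck is.

```python
def tune_concurrency(measure_throughput, start=1, step=1,
                     max_threads=64, tolerance=0.01):
    """Raise the thread count while measured throughput keeps improving."""
    threads = start
    best = measure_throughput(threads)
    while threads + step <= max_threads:
        candidate = measure_throughput(threads + step)
        # Stop once an extra thread no longer buys a meaningful gain.
        if candidate <= best * (1 + tolerance):
            break
        threads += step
        best = candidate
    return threads, best


def toy_system(threads):
    """Hypothetical black box: scales up to a hidden bottleneck, then degrades."""
    capacity = 8  # hidden from the tuner
    if threads <= capacity:
        return 100.0 * threads
    return 100.0 * capacity - 20.0 * (threads - capacity)


print(tune_concurrency(toy_system))  # prints (8, 800.0)
```

With one knob, the search is a walk along a line. With several knobs (say threads, memory, and disk queue depth), each probe costs a real measurement interval and the space to explore grows multiplicatively, which is exactly why the multi-control case looks tricky.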
