*** This is a post from 2013 that happened to be sitting as a draft in WordPress, decided to publish it anyway 😉 ***
I’ve decided to spend some time in the HotStorage conference track today. The first session was a panel discussion on software-defined storage, with representatives from Nexenta, EMC, Nimble Storage, VMware and Maginatics. They tried to demystify the definition of SDS while drawing comparisons with the networking world, which has a bit more maturity in the domain. The panel seemed to suggest that object storage will be the foundation of future storage platforms, providing richer semantics to storage through REST-based APIs. The representative from Nimble seemed to suggest that converged and software-defined storage are not delivering on their promises of cost reduction and flexibility. I can only disagree with that statement, as I’m already seeing the benefits with the first iteration of Storage Spaces, and I have even higher hopes for the second version coming with Windows Server 2012 R2.
The second session compared physical and logical backups for virtual machines. As I expected, once you apply deduplication and compression to the backup data, physical backups are just as efficient storage-wise while being faster, thanks to sequential reads during the backup operation. Not the most ground-breaking session!
The third session was about improving VM performance through virtual disk introspection. Surprisingly, they were able to achieve good performance improvements by better understanding the various IO calls used for metadata manipulation.
The fourth session covered how a Chinese cloud provider backs up thousands of VMs, with multiple TB of changed data, on a daily basis in a timely fashion. The researchers came up with a mechanism to back up the VMs in parallel while also deduplicating the backup data. They were able to achieve speeds of close to 9 GB/s across 100 hosts and 2,500 VMs.
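The talk didn’t go deep into implementation, but the core idea, backing up many VM images in parallel while deduplicating chunks by content hash, can be sketched roughly like this (the names, fixed 4 MB chunk size and in-memory store are my own illustration, not the paper’s design):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB fixed-size chunks (illustrative choice)

chunk_store = {}  # content hash -> chunk data; shared dedup index

def backup_vm(disk_image: bytes) -> list:
    """Split a disk image into chunks, store only unseen chunks,
    and return the hash recipe needed to restore the image."""
    recipe = []
    for off in range(0, len(disk_image), CHUNK_SIZE):
        chunk = disk_image[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        # Only keep the chunk if its content hasn't been seen before.
        chunk_store.setdefault(digest, chunk)
        recipe.append(digest)
    return recipe

def backup_fleet(images):
    # Back up many VM images in parallel, the way the paper describes.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(backup_vm, images))
```

Since most VM images on a host share large amounts of identical data (OS files, base images), the shared index means those chunks are written only once.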
The fifth session compared different types of SSDs: PCIe-based SSDs, SAS-controller-based SSDs, and PCIe SSDs with software-based controllers. While having a PCIe interconnect end to end provides the lowest latency, CPU usage is very high: 70-90% of a core to sustain the performance. The SAS-based card kept CPU much lower, below 10%.
The sixth session was from the group that runs Titan, one of the largest supercomputers in the world. Some interesting facts: it costs 1 million dollars per day to run Titan, and 1 second of idle time on a 1M-core job wastes roughly 300 core-hours (one million core-seconds divided by 3,600). The goal of the research was to better redistribute IO across the storage devices.
The seventh session was about an efficient storage mechanism for time-series data. The technique basically revolves around storing the data sequentially while using the timestamps as an index to enable efficient time-based range queries.
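The session was light on specifics, but the general pattern, appending records in time order and binary-searching the timestamps for range queries, might look like this minimal in-memory sketch (my own simplification, not the paper’s actual design):

```python
import bisect

class TimeSeriesStore:
    """Append-only store: records are kept in arrival (time) order,
    so the timestamps double as a sorted index for range queries."""

    def __init__(self):
        self.timestamps = []  # stays sorted because appends arrive in time order
        self.values = []

    def append(self, ts, value):
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("records must arrive in time order")
        self.timestamps.append(ts)
        self.values.append(value)

    def range_query(self, start, end):
        # Binary search the timestamp index instead of scanning everything.
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))
```

The appeal is that writes are purely sequential (friendly to disks and SSDs alike) and range queries cost O(log n) to locate the window plus a sequential read of the matching records.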
I finished the day at the International Conference on Autonomic Computing. The first session there was from a researcher at the University of Toronto who presented a way to manage interactive and batch workloads in geo-distributed datacenters. It went into some detail on a subject I had read a little about from Google, where they move workloads globally to take advantage of lower power costs by reducing cooling requirements. The model took temperature and electricity costs into consideration, amongst other things.
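A toy version of that kind of placement decision, weighing each site’s electricity price against a temperature-driven cooling overhead, could look like this (the linear cooling factor and site fields are my own assumptions, not the authors’ model):

```python
def cooling_overhead(temp_c):
    # Toy assumption: cooling energy grows linearly with outside
    # temperature, acting as a PUE-like multiplier on IT power.
    return 1.1 + 0.02 * max(temp_c, 0)

def cheapest_site(sites, load_kw):
    """Pick the datacenter with the lowest total power cost for a load,
    given each site's electricity price ($/kWh) and outside temperature."""
    def cost(site):
        return load_kw * site["price_per_kwh"] * cooling_overhead(site["temp_c"])
    return min(sites, key=cost)
```

A real model would also fold in latency to users, migration cost and time-varying prices, which is where moving batch versus interactive workloads starts to differ.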
Another session was on throttling CPU power for database workloads by iteratively evaluating the system’s power behavior while queries were running.
The last session of the day, and of the conference for me, was from a Harvard student who inferred wireless device patterns by integrating signal analysis in the receiver. The technique allowed for a much greater reception range, on the order of 4x.