Software predictability and full hardware virtualization

As I mentioned in my previous post, I got into a debate during the I/O virtualization workshop. Basically, I was arguing that the current approach to virtualization is fundamentally flawed. It seems the research community lacks a global view of the stack that delivers software as a service. They look for complicated solutions that treat applications as black boxes, basically trying to figure out on the fly what to do with them, which in my opinion is a very difficult way to solve the problem we are confronted with. Other, more mature industries have faced similar problems. Take the auto industry, for instance: with advances in simulation, they can now predict the drag coefficient of a car before it is ever built. Why is it that in the software industry we have no idea how an application will perform until someone actually runs it through load testing or a similar method?

When you look at the work being done in the area of static code analysis for security and stability, I don't think it would be far-fetched to extract the performance attributes of code through a similar process. Since most code is based on widely available frameworks, those performance attributes wouldn't have to be re-evaluated for the core functions. The only things that would need adjusting are the execution variances caused by different hardware.
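As a rough illustration, here is a minimal sketch of what such an extraction could look like, using Python's ast module. The cost table, its values, and the attribute names (cpu_us, io_bytes) are hypothetical placeholders, not measured data; a real system would benchmark framework functions once and calibrate per hardware platform.

```python
# A minimal sketch of extracting rough performance attributes from source
# code via static analysis. All cost numbers are invented for illustration.
import ast

# Hypothetical per-call cost attributes for well-known framework functions,
# measured once and reused (e.g. microseconds of CPU, bytes of I/O).
FRAMEWORK_COSTS = {
    "json.loads": {"cpu_us": 50, "io_bytes": 0},
    "requests.get": {"cpu_us": 200, "io_bytes": 4096},
}

def analyze(source: str) -> dict:
    """Return a crude cost estimate per top-level function."""
    tree = ast.parse(source)
    report = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            cpu, io, loops = 0, 0, 0
            for child in ast.walk(node):
                if isinstance(child, (ast.For, ast.While)):
                    loops += 1  # loops multiply cost; track them separately
                if isinstance(child, ast.Call):
                    name = ast.unparse(child.func)
                    cost = FRAMEWORK_COSTS.get(name, {"cpu_us": 1, "io_bytes": 0})
                    cpu += cost["cpu_us"]
                    io += cost["io_bytes"]
            report[node.name] = {"cpu_us": cpu, "io_bytes": io, "loops": loops}
    return report

print(analyze("def fetch():\n    import requests\n    return requests.get('http://x')"))
```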

By extracting those attributes, it would be fairly easy to simulate system behavior. The following would then be easily predictable (a numeric sketch follows the list):

1) What happens when usage of function x increases by 10%

2) Accurate CPU, I/O, and storage usage

3) Where to place the application in the infrastructure

4) How to scale the application more accurately

5) Runtime costs in a cloud infrastructure

6) How application behavior changes after a hardware or software change
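To make that concrete, here is a minimal sketch covering items 1, 2, and 5, assuming per-function attributes produced by an analysis step like the one above. All numbers (attribute values, call rates, the per-core-hour price) are made up for illustration.

```python
# A minimal sketch of simulating system behavior from extracted attributes.
ATTRS = {  # hypothetical per-call costs from the static analysis step
    "parse_request": {"cpu_us": 120, "io_bytes": 512},
    "query_db":      {"cpu_us": 300, "io_bytes": 8192},
}
RATES = {"parse_request": 1000.0, "query_db": 400.0}  # calls per second

def totals(rates):
    cpu = sum(ATTRS[f]["cpu_us"] * r for f, r in rates.items())    # us per s
    io  = sum(ATTRS[f]["io_bytes"] * r for f, r in rates.items())  # bytes per s
    return cpu / 1e6, io  # CPU cores needed, I/O bytes per second

base_cpu, base_io = totals(RATES)
# Item 1: what happens when usage of query_db increases by 10%?
bumped = dict(RATES, query_db=RATES["query_db"] * 1.10)
new_cpu, new_io = totals(bumped)
# Item 5: runtime cost, at a hypothetical $0.05 per core-hour.
print(f"CPU: {base_cpu:.3f} -> {new_cpu:.3f} cores")
print(f"I/O: {base_io:,.0f} -> {new_io:,.0f} B/s")
print(f"cost/hour: ${new_cpu * 0.05:.4f}")
```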

If we push the concept a little further, embedding the performance attributes in the application binary or manifest would allow the platform to take the appropriate actions.
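Such a manifest might look like the sketch below: a hypothetical JSON blob shipped with the application, which the platform consults at deploy time to pick a node with enough headroom. The format and field names are assumptions on my part, not an existing standard.

```python
# A minimal sketch of a performance manifest driving placement decisions.
import json

MANIFEST = json.loads("""
{
  "app": "order-service",
  "performance": {"cpu_cores": 0.25, "io_bytes_per_s": 4000000}
}
""")

# Hypothetical free capacity per compute node.
NODES = {"node-a": {"cpu_cores": 0.1, "io_bytes_per_s": 9e6},
         "node-b": {"cpu_cores": 2.0, "io_bytes_per_s": 9e6}}

def place(manifest, nodes):
    """Return the first node whose free capacity covers the declared needs."""
    need = manifest["performance"]
    for name, free in nodes.items():
        if all(free[k] >= need[k] for k in need):
            return name
    return None

print(place(MANIFEST, NODES))  # -> node-b
```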

Once these application statistics are collected, they also facilitate performance comparisons between functions that achieve the same goal.

When you think about it, it’s really the next step in application performance profiling.

I then asked the panel whether full hardware virtualization is still the way to go. I think what people are really looking for is a way to provide an isolated environment for their applications. Another requirement would be to make applications a bit more mobile and highly available. Do we need full hardware virtualization to support such a scenario? I don't think so. I think application virtualization is the way to go here. A few things would still need to happen in that space before it fully meets those major requirements. Right now, application virtualization only provides an isolated environment in which the application executes. I think the next step would be to add live migration functionality, which would allow an application to be moved from compute node to compute node while maintaining its state. I think everyone would agree that moving an application's state rather than a whole VM's state is much more efficient.
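Here is a deliberately over-simplified sketch of that idea, simulated in-process with Python's pickle: checkpoint only the application's state, ship the bytes, and resume elsewhere. A real application-virtualization layer would also have to capture open files, sockets, and threads, which is the hard part this sketch ignores.

```python
# A minimal sketch of migrating application state (not the whole VM).
import pickle

class Counter:
    """A toy 'application' whose entire state is one number."""
    def __init__(self):
        self.count = 0
    def step(self):
        self.count += 1

# Run on node A for a while.
app = Counter()
for _ in range(5):
    app.step()

# Checkpoint: serialize only the application state, a few bytes,
# instead of the gigabytes a full VM image would occupy.
checkpoint = pickle.dumps(app)

# "Transfer" to node B and resume exactly where we left off.
resumed = pickle.loads(checkpoint)
resumed.step()
print(resumed.count)  # -> 6
```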

One interesting use case for that technology would be more efficient data processing. After listening to a lot of presentations on cloud computing today, it becomes clear that one of the major problems with processing large amounts of data is moving that data. Someone in the audience suggested that we should look into moving the processing closer to the data. If you apply the application virtualization principle, you can easily move the application onto the node where the data resides while maintaining isolation from the primary application running on that node. You could even have cases where the application moves from node to node while maintaining its state to produce a final result, similar to a graph traversal. That would provide highly parallel tasks and high throughput, potentially without overtaxing your data center.
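As a toy illustration, the sketch below sends a small "agent" hopping from node to node, carrying its accumulated state instead of pulling each data shard across the network. The node names and their contents are invented.

```python
# A minimal sketch of moving the computation to the data.
NODES = {  # each compute node holds a local shard of the data
    "node-1": [3, 1, 4],
    "node-2": [1, 5, 9],
    "node-3": [2, 6, 5],
}

def traverse(route):
    """Visit each node in turn, folding its local shard into the state."""
    state = 0  # the only thing that ever moves between nodes
    for node in route:
        state += sum(NODES[node])  # processing happens where the data lives
        print(f"visited {node}, carrying state={state}")
    return state

print("result:", traverse(["node-1", "node-2", "node-3"]))  # -> 36
```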

Food for thought!
