Virtualization of Servers: The First Step Into the Cloud

June 16, 2010
By Paul Strassmann

Migration into a cloud environment by means of server virtualization is extremely attractive and delivers instant paybacks. Compared with other software-intensive improvements, consolidating servers to increase utilization from less than 20 percent to over 70 percent is the most attractive option in the current environment, in which cuts in IT budgets for FY12 and beyond are required by the end of this July.

Server virtualization is well understood. The technology is mature, and a number of software vendors can deliver server virtualization rapidly and at a fixed cost. The question is: what potential savings can be proposed as cost reductions?

For comparison, consider the server counts of two commercial computing services. I have chosen Akamai (with IT costs of $636 million/year) and Rackspace (with IT costs of $648 million/year) as benchmarks. Their combined IT costs of $1.3 billion can be compared to the Defense Department operations and maintenance budget for FY10 of $21.7 billion, which is almost 17 times greater. Without growth, that amounts to $108 billion of Defense Department IT spending over five years.

The total number of servers for Akamai and Rackspace is 104,671. Using the dollar share of total operations and maintenance spending, the Defense Department is likely to have about 180,000 servers, of which 100,000 have already been virtualized in the best case. The most complete total cost of ownership model is from Alinean. That model suggests that consolidating the 80,000 eligible small-scale Defense Department servers onto 5,000 mainframe-like computers is feasible.

While such a consolidation would require an up-front net investment of $27 million, the net IT capital cost reductions over five years would be $3.8 billion, and the net IT operating cost reduction over that period would be $63.1 billion, a 58 percent cut. Such cost reductions are in line with results realized so far by leading commercial firms. In addition, there would be a reduction of 36,720 kW in electrical power and data center space savings of 7,118 square feet.
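The headline figures above can be cross-checked directly from the numbers cited in the article. The short script below is a sketch of that arithmetic only; it uses no data beyond what the text states, and the variable names are my own.

```python
# Cross-check of the figures cited above, using only numbers from the text.

akamai_cost = 0.636      # Akamai IT costs, $ billions per year
rackspace_cost = 0.648   # Rackspace IT costs, $ billions per year
dod_om_fy10 = 21.7       # DoD operations and maintenance budget FY10, $ billions

benchmark_total = akamai_cost + rackspace_cost   # ~ $1.3 billion combined
ratio = dod_om_fy10 / benchmark_total            # "almost 17 times greater"
five_year_spend = dod_om_fy10 * 5                # ~ $108 billion, no growth

operating_reduction = 63.1                       # $ billions over five years
cut_pct = operating_reduction / five_year_spend * 100   # "a 58 percent cut"

print(f"ratio: {ratio:.1f}x")
print(f"five-year spend: ${five_year_spend:.1f}B")
print(f"operating cut: {cut_pct:.0f}%")
```

The three printed values match the article's claims of "almost 17 times greater," "$108 billion ... over five years," and "a 58 percent cut."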

The cost reductions from the virtualization of servers should be seen only as the first step on the path toward a cloud environment in which the Defense Department operates its information technologies as a private and secure "platform-as-a-service." And the potential savings from virtualization are so large that a concerted effort to proceed with such migration should not be deferred.

Paul A. Strassmann is a Distinguished Professor at George Mason University and the former Director of Defense Information, Office of the Secretary of Defense.

The views expressed by our guest bloggers are their own and do not necessarily reflect the views of AFCEA International or SIGNAL Magazine.

Share Your Thoughts:

Good article. We at Intel IT agree that virtualization is the foundation for our cloud computing strategy. Late last year, we set out on a CIO-led imperative to virtualize 70-80% of our Intel IT general-purpose and enterprise infrastructure environment by the end of 2011. Here is a link to the Intel IT paper sharing our best practices and key learnings.

Great job showing how virtualization can save money. However, I think your estimates of the number of machines that have already been virtualized are too optimistic. That may be true for the Defense Department, but research shows that only 20-30% of most business servers have been virtualized. This is referred to as Virtualization Stall. And once companies begin to virtualize, they often find the number of virtual servers ballooning beyond the control of their IT departments; this second challenge is often called Virtualization Sprawl. Sprawl causes Stall: when companies cannot keep up with virtualization, they put off virtualizing customer-facing servers. Here's an article on these two challenges to virtualization, and how they may be overcome:

Colleen, I agree with you that virtualization is not for all businesses and applications. At Intel IT, we carefully evaluate the apps and environments before rushing into virtualization; you need to keep your eyes wide open. In fact, about 70% of our 100,000 servers inside Intel IT are used to support the design of our chips. In that environment, we do not virtualize the servers but achieve high utilization and shared resources via a grid computing solution. The other 30% of our servers, which support Office (i.e., general-purpose apps), Enterprise (i.e., e-business, supply chain) and Manufacturing (i.e., factory automation and support), are candidates for virtualization.

What has changed for us recently inside our Office and Enterprise environments is that we are proactively virtualizing. This means that IT and business stakeholders now have to justify "why not virtualize," versus the previous approach of justifying "why virtualize." And the enterprise private cloud computing solution we are building internally gives us better IT business intelligence on VM management than I believe we had in the traditional legacy environment. We see the management aspect of our cloud as a benefit, not a risk.



How do you achieve high server utilization with "grid computing"? Does this imply that you have a monitor that shifts processing from any one of 70,000 over-utilized servers to any one of 70,000 under-utilized servers? If so, is the shifting done in real time? To what extent are the 100,000 Intel servers acquired at commercial prices, or can Intel acquire servers at substantial discounts?


Paul, thanks for the questions. On the utilization question for our grid, we use a job-scheduling process that allows multiple design jobs to run simultaneously on the servers. This paper shows an assessment across a range of server types, with more jobs running on newer servers than on older ones, to give you an idea of how we take full advantage of the better servers to get higher throughput (and therefore utilization).

I do not know whether this scheduling process is dynamic (i.e., automated) across the entire network of servers or whether there is a manual capacity-management component; my understanding is that there is a little of both. I am aware that we use a master/slave arrangement in which 4S Xeon servers consolidate requests for jobs, divide and parse them out to the cluster (predominantly 2S Xeon servers in racks and blades), and then "reassemble" the results for delivery back to the designer. If you are interested in learning more about the details, I will need to consult with others in the Intel IT organization. It is a Linux-based environment.
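The master/slave pattern described above is a classic scatter-gather arrangement: a master node splits a job, farms the pieces out to worker nodes, and reassembles the partial results. The sketch below is a minimal, illustrative version of that pattern only; all names, the toy workload, and the use of threads in place of a real cluster are my own assumptions, not Intel's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def run_chunk(chunk):
    """Toy stand-in for the work a worker node (e.g., a 2S Xeon) would do."""
    return sum(x * x for x in chunk)

def scatter_gather(job, workers=4, chunk_size=3):
    # Master node: divide the job into chunks ("scatter")
    chunks = [job[i:i + chunk_size] for i in range(0, len(job), chunk_size)]
    # Worker pool processes the chunks concurrently (threads model the cluster)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_chunk, chunks))
    # Master node: reassemble the partial results ("gather")
    return sum(partials)

print(scatter_gather(list(range(10))))  # prints 285 (sum of squares 0..9)
```

In a real grid, the "chunks" would be whole design jobs or job fragments queued by a scheduler, and utilization stays high because idle workers immediately pick up the next pending chunk.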

As for the purchasing question, I am not at liberty to discuss details. However, I will say that, based on my experience talking with many IT organizations, Intel IT assesses and purchases servers in a fashion very similar to other large IT organizations; there is no secret sauce here.
