Virtualization Migration
Business, Disaster Recovery, Infrastructure, Virtualization March 15th, 2008, 7:25am
Over a year ago I started planning a vision for the next step in the evolution of the network and systems architecture at Tradewinds. The vision was to migrate from a physical to a virtualized infrastructure. One where servers are no longer tethered to physical machines. One where flexibility and disaster recovery options are abundant.
We worked crazy hours over the past month preparing for the final stages of the migration, and worked through this past weekend rolling out the new landscape and making the necessary systems changes. I had expected a massive issue list when everyone returned Monday, but to my surprise the issues were relatively lightweight. I guess that means we did a thorough job executing the plan…or maybe we just got lucky! At any rate, I did walk away with a few key lessons that may be of value to others heading down the path of virtualizing their servers.
If you are like me, you read the trade journals, and I doubt you will find one that does not mention the emerging virtualization market. I wrote a previous post about the growing market and Microsoft’s share of it.
If I had a large budget, the clear winner in this market today would be VMware: they have the most experience, the biggest client base, and by far the more industrial-strength product, and they were the pioneers in this space. However, being on a tight budget, I always opt for the cheapest qualified candidate, which turns out to be Microsoft in this case.
Yes, I know VMware has released their Virtual Server product for free to compete with Microsoft’s, but I read way too many reports of system instability and of having to reboot the host machine after a week with the VMware solution. I also couldn’t find a host-based backup solution that produced a clean, consistent state for host-based backups or snapshots. This exists in the ESX version of VMware, but it is not free. I had also been running several of the servers in my lab as virtual servers on Microsoft’s product for over a year and had very good luck with it. So I decided it was the right choice at Tradewinds.
The project started off with a lot of diagrams and figuring out what components to virtualize and which to leave alone. My biggest business reasons for pushing the project along were:
- Disaster Recovery = The ability to bring virtual servers up on other physical pieces of hardware in time of disaster is huge. I am no longer tethered to a physical machine and waiting for parts to get a server back online. If a box dies, I simply bring up the virtual machines that lived on that box somewhere else.
- Server Consolidation = Instead of buying several physical machines, I could purchase fewer machines and have them run multiple servers.
- Flexibility = Being able to shuffle machines around is great. If one machine is getting bogged down, I can offload a virtual server to another machine which has more resources…This is a non-event with a virtual machine.
- Testing = I was always hesitant to do anything to a live production machine. Installing software or updates can be dangerous business, so being able to save the state of a virtual machine and apply an update and revert back easily to a previous state if the update goes south offers peace of mind. I can take a copy of a virtual server and do whatever to it and if I crash it, I can just pitch it.
A few noteworthy pointers I came across during the project are as follows:
- Do your homework = There are a number of different options for virtualization. We chose Microsoft, but if your budget permits it, you really should look at VMware. I also recommend reading the whitepapers that are out there; they are useful.
- Backups = Make sure you have a good backup strategy for your virtual machines. On some of my virtual servers, I am running a host-based script which enables a “live” backup to be taken through VSS writers, and for critical machines I am also using a guest-based backup solution in addition to the host-based backup. I like to have options in time of disaster. I like having a backup to the backup plan.
- Virtualize Smartly = Just because everything can be put in a virtual server doesn’t mean it should be. Anything that is very I/O intensive should probably not reside within a virtual server. We opted not to put our DB server in a virtual environment for performance reasons. We did, however, set up Exchange in a virtual environment. We decided that even though it is an I/O hog, being able to bring it up on another physical machine quickly in time of disaster was critical, so we’ll take an extra second or two checking our emails.
- Test = Be sure to virtualize slowly. Doing this allowed me to flush out many issues and make sure that I was comfortable supporting this new paradigm.
- Distinguish = The number of servers I support has now doubled, and it gets confusing when working simultaneously on multiple machines. Be careful you don’t get confused about which machine you are working on…as simple as it sounds, changing the desktop color or doing something else to distinguish a machine is a worthwhile and easy exercise. During the migration, I almost ran a command on the wrong machine that could have been disastrous…of course it could have been because I had been working 24 hours straight, but having indicators that remind you where you are is a good idea.
- Document = I am a huge fan of putting stuff down on paper. It is a good idea to document and outline your physical and virtual environment. When I updated our disaster recovery diagram, having pictures to look at helped me figure out the different scenarios for hosts and guests going down and what services would be unavailable in each. To give an example of where this helped, we leverage two VPN tunnel paths into the office for redundancy. One of them now lives in a virtual machine. I had planned to put that virtual machine on the physical machine which hosts our second VPN tunnel. So what would have happened if the host went down? All of our VPN access would have been down. If I happened to be working remotely, this would have been a big deal. Taking the time to document the environment allowed me to see things that weren’t as clear in my head!
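The VPN example above is the kind of check that can also be run against a written-down inventory. Here is a rough sketch in Python, not something we actually ran during the migration. The server names, host names, and the two dictionaries are made up for illustration; the idea is simply to flag any group of redundant servers that all end up on the same physical host.

```python
# Hypothetical inventory: which physical host each server lives on.
# All names here are illustrative, not our real machine names.
placement = {
    "vpn-a": "host-1",     # virtual machine carrying the first VPN tunnel
    "vpn-b": "host-1",     # second VPN tunnel, hosted on the same physical box
    "exchange": "host-2",
}

# Hypothetical redundancy groups: servers that are supposed to back each other up.
redundancy_groups = {
    "vpn": ["vpn-a", "vpn-b"],
}


def shared_host_risks(placement, redundancy_groups):
    """Return {group: host} for every group whose members all sit on one host."""
    risks = {}
    for group, members in redundancy_groups.items():
        hosts = {placement[member] for member in members}
        if len(members) > 1 and len(hosts) == 1:
            risks[group] = next(iter(hosts))
    return risks


def servers_lost(placement, failed_host):
    """List the servers that go down if the given physical host fails."""
    return sorted(s for s, host in placement.items() if host == failed_host)


if __name__ == "__main__":
    for group, host in shared_host_risks(placement, redundancy_groups).items():
        print(f"'{group}' has no real redundancy: everything is on {host}")
    print("If host-1 dies we lose:", servers_lost(placement, "host-1"))
```

Moving "vpn-a" to a different host in the inventory clears the warning, which is the same fix the diagram led me to.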
We’ve been on this new paradigm for a couple of weeks now and are very pleased. I’ve seen a few hiccups with network connections and some clock drift issues, but nothing that I would consider major or that makes me regret how the infrastructure looks today.