In this Part of “Making IT Projects a Success”, we will discuss challenges in IT Infrastructure Projects and take a look at best practices in Data Center Builds, Migrations, Virtualization and a Move to the Cloud.
Making IT Projects a Success - 3
8) Challenges in IT Infrastructure projects – Following are some of the reasons for failure:
1.The inability to gauge what can cause downtime - Understanding how a new installation, or an upgrade of an existing one, could affect the environment and its associated services as a whole remains a challenge. A thorough analysis of the change and a fallback strategy are essential. This is where ITIL Change Management plays a critical role, and a competent Change Advisory Board (CAB) can help the cause.
2.A lack of foresight in choosing the right platform (hardware and operating system) that would minimize maintenance and licensing costs in the long run.
3.While standardization is the norm of the day in order to benefit from bulk discounts, reduce the need for IT staff with additional skillsets and avoid compatibility issues, among others, it may not always be the best solution to a Business need or problem.
4.The inability to gauge and prepare for potential capacity shortages leads to performance issues on going LIVE. Projects run the risk of not being able to simulate, through stress tests, the amount of load the various components (ex: hardware, software and bandwidth) can take. The use of predictive analysis to address performance issues is still in its infancy.
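The stress testing mentioned in point 4 can be approached incrementally: step up concurrency against a component and watch where latency starts to degrade. A minimal sketch, where `call_component` is a hypothetical stand-in for a real HTTP request or database query:

```python
import concurrent.futures
import statistics
import time

def call_component(payload: int) -> int:
    """Stand-in for a real component call (HTTP request, DB query, etc.)."""
    time.sleep(0.001)  # simulated service time
    return payload * 2

def load_test(workers: int, requests: int) -> dict:
    """Fire `requests` calls across `workers` threads and record latencies."""
    latencies = []

    def timed_call(i: int) -> None:
        start = time.perf_counter()
        call_component(i)
        latencies.append(time.perf_counter() - start)

    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(timed_call, range(requests)))
    ordered = sorted(latencies)
    return {
        "workers": workers,
        "mean_ms": statistics.mean(ordered) * 1000,
        "p95_ms": ordered[int(len(ordered) * 0.95)] * 1000,
    }

# Step up concurrency and watch where latency starts to degrade.
for w in (1, 5, 20):
    print(load_test(w, 100))
```

The point at which the p95 latency climbs sharply is the practical capacity ceiling of the component under test.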
Following are a few selected guidelines for IT Infrastructure related projects; for brevity, I will not cover all points:
Ten Considerations for Data Center Builds:
1.Perform a thorough site suitability analysis, which includes selecting a site outside flood plains and areas prone to earthquakes, tornadoes and hurricanes. Choose single-story, industrial type buildings with high ceilings and a minimum number of windows, large column bays (30 x 50 feet), a level roof without skylights and a loading dock for heavy equipment, among several other considerations.
2. For floor layout, perform capacity planning by understanding the current square footage and then calculating the possible rate of expansion. Use 0.8m to 1m raised floors.
3. Create a hot aisle/cold aisle layout design for the floor. Use perforated tiles for cold aisles.
4. For cooling, the hot and cold aisles should be long rows where racks are placed face to face with a recommended 48-inch aisle. Orient A/C units perpendicular to hot aisles. Distribute high-density racks throughout the layout to mitigate hot spots. Use spot cooling as necessary.
5.The complete design considerations for power distribution include a Main Distribution Frame (MDF), Intermediate Distribution Frame (IDF), UPS and diesel backup power systems. One should strive for multiple utility feeds and provision maintenance bypass and emergency shutdown.
6.For power, implement low and high density zones to support the varying power demands of new technology. Plan for the data center to scale from 50 watts to 100 watts per square foot.
7.High speed internet connections and telecommunication services from different service providers should reach the site from different points for redundancy. For mission critical Data Centers, satellite internet should also be considered for key communication requirements.
8.Cabling architecture is the backbone of your Data Center and key considerations are scalability, flexibility, manageability and availability. Use a structured approach with Main Distribution Areas, Horizontal Distribution Areas and Equipment Distribution Areas.
9.Select a smart detection system whose sensors detect not only smoke but also heat before triggering an alarm. An alarm system that includes a loud noise and flashing lights is ideal. Use a suppressing agent such as HFC-227ea (FM-200) or HFC-125 to help put out the fire.
10.Last but not least, put in place a Data Center Infrastructure Management (DCIM) system that collects and manages information about a Data Center's assets, resource utilization and operational status: one that will send alerts, help you create custom reports and automatically forecast when you are going to run out of capacity.
The goal is to build a cost effective and expandable IT infrastructure, minimize complexity of the environment and opt for a Green IT solution.
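The capacity forecasting that a DCIM system performs (item 10) can be illustrated with a simple linear projection over historical utilization samples. A minimal sketch; the dates and utilization figures are made up, and a real DCIM tool runs this kind of projection continuously:

```python
from datetime import date, timedelta

def forecast_exhaustion(samples, capacity):
    """Fit a straight line through (date, usage) samples and project the
    date on which usage reaches `capacity`. Returns None if usage is flat
    or shrinking."""
    days = [(d - samples[0][0]).days for d, _ in samples]
    usage = [u for _, u in samples]
    n = len(samples)
    mean_x = sum(days) / n
    mean_y = sum(usage) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, usage)) / \
            sum((x - mean_x) ** 2 for x in days)
    if slope <= 0:
        return None  # no growth trend to project
    intercept = mean_y - slope * mean_x
    days_to_full = (capacity - intercept) / slope
    return samples[0][0] + timedelta(days=round(days_to_full))

# Illustrative history: rack power usage in kW, sampled quarterly.
history = [(date(2024, 1, 1), 40.0), (date(2024, 4, 1), 55.0), (date(2024, 7, 1), 70.0)]
print(forecast_exhaustion(history, capacity=100.0))  # projects 2024-12-30
```

The same projection applies to any tracked resource: power, cooling, floor space or network ports.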
Ten Considerations for Data Center Migrations:
1.It is important to understand whether the Data Center move is transitional or transformational. Either way, best practice is to undertake a phased approach.
2.One needs to be familiar with the complexity of the environment, that is, the application dependencies in a heterogeneous environment, the number of interfaces each application has, and IP- and DNS-related links, among others. A detailed inventory of all of these needs to be taken.
3.One should create a baseline through pre-migration testing and have in place staff for end-to-end post migration testing. To realize any cost reduction, know your Data Center’s Total Cost of Ownership (TCO) before moving to the new site.
4. It would be best to have new hardware at the new site if cost permits (especially for a small change window), as the number of hardware failures during transportation may surprise many.
5.Build the new infrastructure in parallel with a high speed connection between the two sites. This will facilitate moving images of virtual machines, migration of data and others.
6.It is best to perform a tape backup of critical databases before the event and ship the tape separately.
7. Agreeing upon a change window and notifying all parties so the Business is not affected is critical.
8. It is a total “NO, NO” to upgrade systems during the critical phase of a migration.
9.Arranging for all vendors (ex: hardware, software, utility etc) to be on standby and co-ordinating logistics for the move is required.
10.It is best to have a mock run of some of the critical activities before the actual cut-over.
The goal is to choose a well-suited facility, make the build versus lease decision and reduce possible downtime during the move.
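The pre-migration baseline in item 3 can extend to the data itself: record checksums at the source before the change window and verify them at the new site before declaring success. A minimal sketch, assuming files on a local path (names and layout are illustrative):

```python
import hashlib
import pathlib

def snapshot(root: str) -> dict:
    """Record a SHA-256 digest for every file under `root` (pre-move baseline)."""
    base = pathlib.Path(root)
    return {
        str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*")) if p.is_file()
    }

def verify(baseline: dict, root: str) -> list:
    """Compare the post-migration tree against the baseline; list discrepancies."""
    current = snapshot(root)
    problems = [f"missing: {name}" for name in baseline if name not in current]
    problems += [f"changed: {name}" for name, digest in baseline.items()
                 if name in current and current[name] != digest]
    return problems
```

Run `snapshot()` before the move, keep the result separately (much like the tape backup in item 6), and run `verify()` at the new site as part of end-to-end post-migration testing.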
Ten Considerations for Virtualization:
1.Virtualization should be thoroughly analysed before any work commences. How many virtualized guests a host can support depends upon the CPU, RAM and disk space of each physical machine, keeping in view sufficient room for expansion to meet Business demands. Decide where data and log storage should be kept; it is best to centralize storage from an efficiency and security perspective.
2.Understand the performance requirements for servers with different roles and cater accordingly. For example, if a server's I/O capacity usage is high, moving it to a host with several VMs could affect its performance as it shares the I/O with them. A dynamic way to balance workloads between VMs is recommended, which typically requires purchasing the respective vendor's management tools.
3.Best practice would be to have 8.2 Virtual Machines (VMs) per server with a separate centralized storage infrastructure and consolidate servers from rack mounted to blade units.
4. Do not treat virtual servers any differently from physical systems. Understand their failover capability and back them up regularly (data and, when possible, the image) to central storage.
5.Have in place a scalability solution with existing virtual servers running at an optimum 70% utilization.
6.Though it is tempting to perform large-scale virtualization to achieve rapid return on investment, go about it in phases. Gartner suggests that deploying a minimum of 50 VMs in a year makes the investment in virtualization software worthwhile.
7.In Virtual Infrastructures, workloads are not physically separated: VMs rely on the same host, and a problem with one can extend to the others. Putting limits on VM resources (ex: CPU, memory access, network I/O and storage) can prevent one VM from starving the others of critical resources.
8.Virtualization has brought about new security risks that did not exist with physical servers. Moving a Virtual Machine (VM) and its data out of your Data Center by compromising your network or Cloud is easier than moving a physical server out of a Data Center. A rogue installation of a new VM can also be created, which can then wreak havoc on your network. The solution is to limit who can create and move VMs while closely monitoring them. Several vendors offer extensive auditing, privilege management and compliance features that help create customizable policies to prevent misuse.
9. Keeping VMs in their desired configuration and preventing multiple admins from changing the initial specification is an extension of the above security measure. Put in place tools that will ensure that the desired configuration is maintained and will continue to meet any necessary compliance requirements.
10. It is advised to use the “undo” and “clone” features in virtualization with great care. With the “undo” feature, it is easy to revert a disk back to a week before, but that could re-expose the system to patched vulnerabilities. Cloning systems, on the other hand, leads to the proliferation of machines and could introduce errors.
The goal of Virtualization is to reduce the data center footprint, go Green by saving energy, reduce hardware maintenance costs and gain virtual machines that can be moved or rebuilt with ease, among other benefits.
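The sizing questions in items 1 and 5 come down to finding which resource on the host is exhausted first while staying at or below the target utilization. A back-of-the-envelope sketch; the 4:1 vCPU overcommit ratio and the hardware figures are illustrative assumptions, not universal rules:

```python
import math
from dataclasses import dataclass

@dataclass
class HostSpec:
    cpu_cores: int
    ram_gb: int
    disk_gb: int

@dataclass
class VmProfile:
    vcpus: float
    ram_gb: float
    disk_gb: float

def vms_per_host(host: HostSpec, vm: VmProfile, target_util: float = 0.70,
                 vcpu_ratio: float = 4.0) -> int:
    """How many copies of `vm` fit on `host` while keeping every resource
    at or below `target_util`. `vcpu_ratio` is an assumed vCPU:pCPU
    overcommit ratio; tune it per workload."""
    by_cpu = host.cpu_cores * vcpu_ratio * target_util / vm.vcpus
    by_ram = host.ram_gb * target_util / vm.ram_gb
    by_disk = host.disk_gb * target_util / vm.disk_gb
    # The tightest constraint wins; round down to whole VMs.
    return math.floor(min(by_cpu, by_ram, by_disk))

host = HostSpec(cpu_cores=16, ram_gb=128, disk_gb=2000)
vm = VmProfile(vcpus=2, ram_gb=8, disk_gb=100)
print(vms_per_host(host, vm))  # RAM is the binding constraint here: 11
```

In this example RAM caps the host at 11 such VMs even though CPU and disk could take more, which is exactly the kind of imbalance the pre-work in item 1 is meant to surface.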
Ten Considerations for your Move to the Cloud:
1.Start by making the right choice between a private, public and hybrid cloud keeping control (ability to customize various components including storage and network) and security in mind, especially for those who need to fulfil regulatory requirements.
2.Understand what works for you best when choosing IaaS, PaaS, SaaS or DaaS.
3.Evaluate your choice between performing a machine replication, a P2V migration or a machine migration based on whether your application is Tier 1, Tier 2 or Tier 3.
4.Cloud Hosting Providers generally cannot reveal detailed internals of their services for security and business reasons; hence, one needs to design for failure at every level. For example, if one opts for Amazon Web Services (AWS), one should use multiple availability zones to prevent site failure issues from affecting your services.
5.Internal and third-party monitoring systems need to be put in place to ensure at all times that your systems are working, that you get notified in case of a failure and that you can respond quickly.
6.A pertinent problem in a multi-tenant cloud environment is that of noisy neighbors. In a Cloud environment, one physical server hosts many VMs that must share resources. While it is easy and effective to partition CPU and RAM between VMs, the disk sub-system is extremely difficult to partition; if one VM on the physical host consumes a very large amount of disk I/O, the other VMs see very poor performance, resulting in what we call the noisy neighbor scenario. The problem also exists at the processor's cache, as the cache is invisible to the virtualization layer, though Intel's E5-2600/1600 v3 processors address this. Expecting cloud customers to constantly collect, monitor and react to metrics like I/O wait that indicate multi-tenant performance problems is not acceptable; a properly designed cloud platform should address this problem.
7.Select a Cloud Service Provider (CSP) that can inform you of the instance count per physical node and clearly define the capacity expansion capability.
8.Negotiate Service Level Agreements (SLAs) for availability (keeping your systems up and running) and for response time to fix critical issues.
9.Negotiate SLAs for durability (protecting your critical data from loss).
10.Select a CSP that can help meet your Security and Compliance requirements.
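The I/O wait signal described in item 6 can be sampled on Linux from the aggregate "cpu" line of /proc/stat, taking two readings a few seconds apart and comparing the deltas. A minimal sketch; the sample strings and the 20% alert threshold are made up for illustration:

```python
def read_cpu_times(stat_text: str) -> tuple:
    """Parse the aggregate 'cpu' line of /proc/stat into (iowait, total) jiffies."""
    fields = stat_text.splitlines()[0].split()
    values = [int(v) for v in fields[1:]]
    return values[4], sum(values)  # columns: user nice system idle iowait ...

def iowait_fraction(before: tuple, after: tuple) -> float:
    """Share of elapsed CPU time spent waiting on I/O between two samples."""
    d_iowait = after[0] - before[0]
    d_total = after[1] - before[1]
    return d_iowait / d_total if d_total else 0.0

# Made-up samples; on a live Linux host, read /proc/stat twice, a few seconds apart.
sample_before = "cpu  100 0 50 800 30 0 0 0 0 0"
sample_after = "cpu  150 0 70 900 80 0 0 0 0 0"
frac = iowait_fraction(read_cpu_times(sample_before), read_cpu_times(sample_after))
if frac > 0.20:  # illustrative alert threshold
    print(f"possible noisy neighbour: iowait at {frac:.0%}")
```

A sustained high I/O wait fraction on an otherwise idle VM is a classic symptom of a noisy neighbour on the same physical host, and worth raising with your CSP.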