Neil Rasmussen 2013-09-21 03:38:04
High-density IT equipment stresses the power density capability of modern data centers. Installation and unmanaged proliferation of this equipment can lead to unexpected problems with power and cooling infrastructure. This is an excerpt of the white paper, which describes the principles for achieving power and cooling capacity management.

Data center physical infrastructure capacity management is defined as the action or process for ensuring power, cooling, and space are provided efficiently, at the right time and in the right amount, to support IT loads and processes. This paper discusses power and cooling capacity management only. Issues related to space management are discussed in White Paper 155, Choosing and Specifying an Appropriate Power Density for a Data Center.

The critical success factors for effective management of power and cooling capacities are:

- Providing accurate capacity forecasts
- Providing appropriate capacity to meet business needs

This forecasting and efficient provisioning of capacity depends on the ability to establish the power and cooling capability at the rack level. Having this capability is rare today. Data center operators typically do not have the information they need to effectively deploy new equipment at the rate required by the business, and are unable to answer simple questions such as:

- Where in my data center should I deploy the next server so I don't impact the availability of existing equipment?
- From a power and cooling availability standpoint, where is the best location to deploy the proposed IT equipment?
- Will I be able to install new equipment without negatively impacting my safety margins such as redundancy and backup runtime? Will I still have power or cooling redundancy under fault or maintenance conditions?
- Can I deploy new hardware technology, such as blade servers, using my existing power and cooling infrastructure? Do I need to spread out my blade servers to get reliable operation?
- When will I reach the limits of my current power and cooling infrastructure and require additional capacity?

The inability to answer these simple questions is common. For data centers which are grossly over-designed or underutilized, the safety margins can allow successful operation with only a primitive understanding of overall system performance. The compromise in availability due to this lack of knowledge may result in a small but tolerable amount of downtime. While not financially or energy efficient, in the short term oversizing provides a safety margin until such time as the capacity utilized equals the available capacity. However, three factors are currently placing stress on data centers and, in turn, exposing the inadequacies of current operating methods:

- Ultra-high-density IT equipment
- The requirement to control total cost of ownership (TCO) and more fully utilize data center capacity
- The rapid pace of change due to virtualization and the refresh cycle of IT equipment

Each of these factors leads to pressure to operate data centers in a more predictable manner.

High-density IT equipment

IT equipment drawing more than 8 kW per rack enclosure can be considered high density. Fully populated racks of servers can draw from 6 kW to 35 kW per rack, yet the vast majority of data centers today are designed for a power density of less than 4 kW per rack. As mentioned earlier, more and more users are installing equipment that exceeds the design density of their data centers, and the resulting stresses on the power and cooling systems can cause downtime from overloads, overheating, and loss of redundancy. Data center operators need better information regarding how and where to reliably deploy this equipment in both existing and new data centers.

Total cost of ownership

Most businesses cannot accept gross over-design or oversizing of data centers. The waste of capital and operating costs is significant.
It is estimated that the typical data center today could hold up to 30% more IT equipment using the same facility power and cooling capacity, if that capacity were properly managed. The typical data center is unable to fully utilize its available power and cooling capacity, which reduces system efficiency and drives up electrical power consumption by 20% or more compared to a system whose capacity is properly managed. Capacity management tools can better utilize power and cooling resources and reduce electrical consumption.

Rapid pace of change

IT equipment in a typical data center is constantly changing. Equipment refresh cycles are typically below three years, and equipment is added or removed on a daily basis. Furthermore, the power and cooling requirements of the IT devices themselves are not constant, but vary minute by minute as a result of virtualization and the power management features implemented by IT equipment vendors. The historic "try it and see if it works" method of deploying IT equipment is no longer viable, with overheating a common result. Capacity management tools must provide real-time planning capabilities to address these challenges, and they must provide this capability in a cost-effective, easy-to-install, easy-to-use, pre-engineered form. To better understand the effects of virtualization and cloud computing on the physical infrastructure and how to manage them, see White Paper 118, Virtualization and Cloud Computing: Optimized Power, Cooling and Management Maximizes Benefits.

To provide simple answers to the basic questions users have about capacity, a systematic approach to capacity management is required. The foundation of capacity management is the ability to quantify the supply and the demand for both power and cooling.
While having power and cooling supply and demand information at the room or facility level helps, it does not provide sufficiently detailed information to answer questions about specific IT equipment deployments. On the other hand, providing power and cooling supply and demand information at the IT device level is unnecessarily detailed and difficult to achieve. An effective and practical level at which to measure and budget power and cooling capacity is the rack level, and this paper takes that approach. The model described in this paper quantifies power and cooling supply and demand at the rack level in four important ways:

- As-configured maximum potential demand
- Current actual demand
- As-configured potential supply
- Current actual supply

This information allows a complete description of the current status of data center power and cooling at the rack level. These descriptions are explained below and illustrated in Figure 1.

As-configured power and cooling maximum POTENTIAL DEMAND

The power management systems in modern servers can cause power draw to vary by 2 to 1 or more during typical operation. The maximum "as configured" power and cooling demand represents the peak values that this variance can produce in the rack. This information can be established at the time of system configuration via trending, reported directly by the IT equipment, or derived by other means. The maximum power and cooling demand is always greater than or equal to the actual power and cooling demand, and is critical information for capacity management.

Current power and cooling ACTUAL DEMAND

This is the power consumed and heat generated at each rack at any given point in time. Ideally, it is established by real-time measurement of electrical power consumption at the rack level. For virtually all IT devices, the power consumed in watts equals the heat generated in watts.
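Where heat output and power consumed diverge (UPSs, PDUs, air conditioners, and similar infrastructure devices), the heat contribution can be derived from device efficiency: the device's own heat is its conversion loss, while the remainder of its input power passes through to the load. Below is a minimal sketch of one such efficiency-based derivation; the fixed-efficiency model, the 95% figure, and the function names are illustrative assumptions, not figures from the paper.

```python
def it_heat_w(power_consumed_w: float) -> float:
    # For virtually all IT devices, heat out equals power in
    return power_consumed_w

def ups_heat_w(input_power_w: float, efficiency: float) -> float:
    """Heat added by the UPS itself: its conversion losses only.
    The rest of the input power passes through to the downstream
    IT load, which turns it into heat at the rack, not at the UPS.
    The fixed-efficiency model is a simplifying assumption."""
    return input_power_w * (1.0 - efficiency)

print(it_heat_w(3200.0))                  # 3200.0
print(round(ups_heat_w(10000.0, 0.95), 1))  # 500.0
```

In practice the efficiency of such devices varies with load, so a real capacity management tool would use a load-dependent efficiency curve rather than a constant.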
For other devices, including uninterruptible power systems (UPS), power distribution units (PDU), air conditioners, and VoIP routers, the heat output in watts is not equal to the power consumed, but it can be mathematically derived. Rack power consumption can be measured by the power distribution system, or it can be measured by the IT equipment itself, with the power reported by the IT devices within a rack summed to obtain the rack power.

As-configured power and cooling maximum POTENTIAL SUPPLY

The as-configured maximum potential supply is defined as the amount of power and cooling that could potentially be delivered at the rack level by the installed infrastructure equipment. The potential power and cooling supply is always greater than or equal to the actual power and cooling supply. If the maximum potential supply for any given load is greater than the actual supply being delivered to that load, the system is in a degraded state. This can be caused by a number of factors, such as:

- Blocked air filters in the cooling system
- A decrease in outdoor heat rejection capability due to extreme environmental conditions
- The loss of a power module in a modular UPS

An important function of a capacity management system is to recognize when the current actual supply differs from the design value, and to diagnose the constraints that are preventing realization of the design supply capacity.

Current power and cooling ACTUAL SUPPLY

The actual power and cooling supply at a rack is determined using information about the power and cooling distribution architecture of the data center, the actual current capacities of the bulk power and cooling sources, and the effects of other loads on the available capacity.
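Taken together, the four rack-level quantities described above can be tracked per rack in a small structure. The sketch below is a minimal illustration, assuming watt-level figures; the class, field, and method names are illustrative, not taken from the white paper.

```python
from dataclasses import dataclass

@dataclass
class RackCapacity:
    """The four rack-level power figures, in watts (cooling can be
    tracked the same way, in watts of heat)."""
    max_potential_demand: float  # peak "as configured" draw
    actual_demand: float         # measured draw at this moment
    max_potential_supply: float  # what the installed infrastructure could deliver
    actual_supply: float         # what is actually deliverable right now

    def planning_headroom(self) -> float:
        # Conservative headroom: current deliverable supply minus the
        # worst-case demand the installed IT gear could present
        return self.actual_supply - self.max_potential_demand

    def is_degraded(self) -> bool:
        # Actual supply below the as-configured value signals a
        # degraded state (e.g., blocked filters, lost UPS module)
        return self.actual_supply < self.max_potential_supply

rack = RackCapacity(max_potential_demand=6000.0, actual_demand=3200.0,
                    max_potential_supply=8000.0, actual_supply=8000.0)
print(rack.planning_headroom())  # 2000.0
print(rack.is_degraded())        # False
```

Budgeting against maximum potential demand rather than actual demand is what keeps a deployment safe when server power management pushes draw toward its configured peak.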
The actual power supply at a given rack is determined by knowing the available branch circuit capacity to the rack, constrained by the availability of unutilized power from upstream sources such as PDUs and the UPS. In some cases, the available capacity is further constrained by the design or configuration of the power system; for example, a modular system might not be fully populated, or the design may call for dual power feeds.

Determining the actual cooling supply at a rack is typically more complex than determining the power supply, and is highly dependent on the air distribution architecture. Unlike the power architecture, where the flow of power is constrained by wires, airflow is typically delivered to an approximate group of racks, where it spreads among the racks based on the draw of the fans in the IT equipment. This makes the computation of available air capacity more complex, and sophisticated computer models are required. In cases where the supply or return air is directly ducted to the racks, the cooling supply at a rack is better defined and can therefore be computed with improved accuracy.

The demand on power and cooling is established at the rack, as shown in Figure 2. The supply, as described in the previous section, must also be understood and quantified at the rack. However, the power and cooling supply system is not established rack-by-rack but is hierarchical, with supply devices such as UPSs, PDUs, and air conditioners supplying groups of racks. Bulk supply devices such as the power service entrance and cooling towers also represent sources of capacity supply that must be sufficient for the demand. Therefore, in addition to quantifying power and cooling supply capacity at the rack, it must also be quantified at the aggregate levels aligned with the supply devices. Supply must always be greater than or equal to demand to prevent the data center from experiencing a failure.
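On the power side, the constraint chain described above reduces to a simple minimum: a rack can receive no more than its branch circuit rating, and no more than what remains unutilized upstream. A minimal sketch, assuming a single upstream unutilized-capacity figure; the function name and the numbers are hypothetical.

```python
def rack_actual_power_supply_w(branch_circuit_capacity_w: float,
                               upstream_unutilized_w: float) -> float:
    """Actual power deliverable at a rack: the branch-circuit rating,
    limited by whatever capacity remains unutilized upstream (PDU, UPS).
    Real systems add further constraints, e.g. dual-feed designs or
    partially populated modular equipment."""
    return min(branch_circuit_capacity_w, upstream_unutilized_w)

# A lightly loaded branch circuit behind a heavily loaded PDU:
print(rack_actual_power_supply_w(7400.0, 3000.0))   # 3000.0

# Here the upstream source is not the bottleneck:
print(rack_actual_power_supply_w(7400.0, 12000.0))  # 7400.0
```

No comparably simple formula exists for the cooling side, which is why the paper notes that computer models are required when airflow is not directly ducted.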
This must be true at each rack, and it must also be true for each supply device supplying a group of racks. Therefore, at any given time there is always excess capacity (as long as overall supply is greater than or equal to overall demand). For the purposes of capacity management, excess capacity comes in four forms:

- Spare capacity
- Idle capacity
- Safety margin capacity
- Stranded capacity

Each of these types of excess capacity is explained in the following sections and illustrated in Figure 3.

Spare capacity

Spare capacity is the current actual excess capacity that can be utilized "right now" for new IT equipment. Carrying spare capacity has significant capital and operating costs related to the purchase and maintenance of the power and cooling equipment. Furthermore, spare capacity always brings down the operating efficiency of a data center and increases its electrical consumption. In an effective capacity management architecture for a growing and changing data center, certain types of spare capacity, such as spare utility connection capacity, are cost effective. However, power and cooling equipment should ideally be installed only when and where needed to meet growing demand. An effective capacity management system must comprehend and quantify growth plans.

For more information, or to read the entire white paper offered by Schneider Electric, please visit AFE at www.afe.org.
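Of the four forms, the excerpt details spare capacity: actual supply beyond actual demand that is usable right now. Below is a minimal sketch of how spare and stranded capacity might be quantified from watt-level readings; the safety-margin handling and the stranded-capacity computation are illustrative assumptions, not the white paper's formal definitions.

```python
def spare_capacity_w(actual_supply_w: float, actual_demand_w: float,
                     safety_margin_w: float = 0.0) -> float:
    """Capacity usable 'right now' for new IT equipment, after
    reserving a safety margin (margin handling is an assumption)."""
    return max(actual_supply_w - actual_demand_w - safety_margin_w, 0.0)

def stranded_capacity_w(bulk_unutilized_w: float,
                        deliverable_to_racks_w: float) -> float:
    """Bulk capacity that exists upstream but cannot reach any rack,
    e.g. a lightly loaded UPS behind fully subscribed branch circuits."""
    return max(bulk_unutilized_w - deliverable_to_racks_w, 0.0)

print(spare_capacity_w(8000.0, 5000.0, safety_margin_w=1000.0))  # 2000.0
print(stranded_capacity_w(20000.0, 14000.0))                     # 6000.0
```

Distinguishing these forms matters because only spare capacity can absorb new IT loads; stranded capacity, however large, cannot.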
Published by Facilities Engineering Journal.