The Department of Energy is asking for industry input on where it should set future research and development priorities for reducing data center energy consumption. It has much to learn, and much to teach.

For the immediate future, the DOE should de-emphasize improvements to facility technology as "the" solution. We already know a lot, and users aren't paying much attention. The driver of data center power consumption is the growing number of servers and storage devices, so working to make facility components marginally more efficient misses the root cause of the growth.

The DOE should focus on why users are not implementing existing energy efficiency ideas. The current arguments for data center energy efficiency appear to be economically compelling, yet little has happened. Why? This is a sociological issue, and perhaps it is outside normal research parameters. But something is systemically wrong when users are either unaware of the opportunity for significant financial savings (hard to believe) or are ignoring it.

Research into why CFOs and CIOs are not demanding increased energy efficiency is imperative. Is there a countervailing economic or productivity argument that outweighs energy efficiency? Alternatively, if users are unaware, how can the arguments for data center energy efficiency be reformulated to break through the clutter of what appears to be unimportant or conflicting information?

We need to think outside the box. And we need to focus on the IT technology instead of the facility technology. It is growth in the number of IT units that is driving power consumption. Moreover, IT technology rolls over every three to five years, so any IT efficiency improvement quickly works its way through to reduced power consumption. Facility technology, by contrast, has a lifespan of 20 years or more.

Moore's Law postulated that the number of transistors on a chip could double every 24 months, but the actual growth rate has significantly exceeded Moore's 1965 prediction. Overall compute performance (a composite of chips and other factors like disk, memory, etc.) has tripled every two years. Energy efficiency, however, has improved more slowly, doubling every two years. Only if energy efficiency improves faster than compute performance grows will absolute data center energy consumption drop. The long-term trend over the last 30 years has been rising total consumption.

In the early days, when there were only 4,096 transistors on a chip and only small quantities were made, the fact that power consumption went up didn't matter much. Today, when chips contain hundreds of millions of transistors and we make hundreds of millions of chips, the cumulative numbers matter. What appears to be a small difference isn't that small when it grows exponentially. With compute performance tripling and energy efficiency only doubling every two years, power consumption grows by a factor of 1.5 every two years. In only four years, it grows by a factor of 2.25, in eight years by a factor of about 5 and in 12 years by a factor of about 11. I believe these increasing factors are the root cause of increasing data center power consumption. If not addressed, unlimited, ever-increasing compute performance will ultimately consume all the energy on the planet.
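
The compounding is easy to check with a short calculation. The sketch below assumes the rates cited above (compute performance tripling and energy efficiency doubling every two years) hold steadily over the whole period; the function name and its parameters are illustrative only and do not come from any DOE model.

```python
def consumption_factor(years, compute_growth=3.0, efficiency_growth=2.0,
                       period_years=2.0):
    """Factor by which power consumption grows over `years`.

    Assumption: each period, compute grows by `compute_growth` while
    energy efficiency improves by `efficiency_growth`, so consumption
    is multiplied by their ratio once per period.
    """
    periods = years / period_years
    return (compute_growth / efficiency_growth) ** periods


for years in (4, 8, 12):
    print(f"{years:>2} years: x{consumption_factor(years):.2f}")
# Prints 2.25, 5.06 and 11.39 for 4, 8 and 12 years respectively.
```

Swapping the two rates, so that efficiency improves faster than compute grows, gives a factor below 1, which is the only case in which absolute consumption falls.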

The DOE's research should focus on underlying power consumption growth, or at least come up with a better method for predicting where the embedded power consumption in chips and storage (wherever employed, e.g. cars, homes, data centers, etc.) is headed. Alternatively, more energy-efficient compute technologies should be identified and encouraged. Optical computing is one idea.

Finally, much of the original scientific research on electronic component reliability was done for the space program in the early 1960s. The people who did the work on temperature and humidity are long gone, and the science needs to be updated for today's materials and fabrication techniques. This is work that benefits society as a whole and is not something an individual company should or can do. Even if an individual company did the basic science, its competitors would likely not buy in.

The DOE's national laboratories need to re-examine the relationship between component reliability and environmental factors like temperature, dew point and particulate size. This research also needs to look at departures from steady-state conditions (i.e., the rate of change) for temperature and dew point. Particulate size and contamination are important because the features etched on wafers and printed circuits are now so small that a contaminating particle is, in relative terms, the size of a 747 airplane compared with the distance between adjacent etches.

A related question is how storage media are affected by temperature, dew point and contamination. Dew point appears to be more important than relative humidity. But back in 1960, instruments that directly measured dew point were non-existent, so relative humidity was the focus of measurement. Shifting to dew point control would have significant benefits for data center energy efficiency.
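
To illustrate why the two measures differ, the sketch below converts between relative humidity and dew point using the Magnus approximation. The coefficients are one commonly used set for vapor over water and are an assumption here, as are the function names; none of this comes from the original reliability research. The point is only that air with a fixed moisture content (a fixed dew point) shows a different relative humidity at every temperature.

```python
import math

# Magnus approximation coefficients (assumed; one commonly used set
# for saturation vapor pressure over liquid water).
A = 17.62
B = 243.12  # degrees Celsius


def dew_point_c(temp_c, rh_percent):
    """Approximate dew point (deg C) from air temperature and RH (%)."""
    gamma = math.log(rh_percent / 100.0) + A * temp_c / (B + temp_c)
    return B * gamma / (A - gamma)


def relative_humidity(temp_c, dew_c):
    """Approximate RH (%) of air at temp_c whose dew point is dew_c."""
    return 100.0 * math.exp(A * dew_c / (B + dew_c) - A * temp_c / (B + temp_c))


# Same moisture content (10 deg C dew point) at three room temperatures:
for temp_c in (18.0, 22.0, 27.0):
    print(f"{temp_c:.0f} C -> {relative_humidity(temp_c, 10.0):.0f}% RH")
```

Relative humidity falls as the room warms even though the amount of water in the air has not changed, which is why a control scheme built around relative humidity can end up humidifying or dehumidifying air unnecessarily.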

The point of this scientific research is to determine whether we can significantly relax the current environmental window we maintain in computer rooms and still assure component reliability. If we could use outside air to directly cool computer rooms, this change would be a major CapEx and OpEx benefit, but this is where contamination comes into the picture.

Even if we could expand the control envelope, what role do particulates play in reliability? We currently have problems with zinc whiskers, which are thinner than a human hair. These whiskers grow naturally out of improperly treated sheet metal and are not self-supporting beyond a certain length. When disturbed, they break off and become airborne, and they often cause random and mysterious short circuits in power supplies. The cause is hard to identify because the whisker itself vanishes in the arc.

Another option that should be considered is environmentally hardening the electronics instead of providing a highly conditioned computer room. On a total-cost-of-ownership basis, it may be cheaper to harden all the components than to condition the spaces they go into.

As we look at future computing technologies, we should be raising awareness that energy efficiency needs to improve faster than compute performance grows in order for total energy consumption to go down. As a society, we need to find new technologies that reduce absolute energy consumption.

Posted by CEOinIRVINE