Every decent supercomputer installation needs a reliable and battery-backed power supply. You need to choose your uninterruptible power supply (UPS) system based on the following requirements:
- Power rating of the equipment that will be connected to the UPS system. In many cases, it is beneficial to provide backup power for your cooling equipment, too: in smaller machine rooms, if the cooling system goes down while computing equipment continues to operate, the air temperature will rise rapidly, exposing the computing equipment to thermal shock.
- Backup time. There must be enough time to automatically save work (or create restore points), and then cleanly shut down applications and operating systems on compute nodes. Alternatively, if there is a backup diesel generator, the UPS backup time must be long enough for the generator to start and reach steady-state operation.
- Physical constraints, such as size and weight of UPS equipment. Sometimes you only have a limited number of empty units in a rack, or space for new racks on the floor.
- Budgetary constraints.
We present a tool that can choose a UPS system automatically. At the heart of the tool is the same algorithm, based on graph representations of configurations, that is used in the design of computer clusters.
In the first step, the tool builds a complete list of possible configurations of a UPS system and imposes user-specified constraints. In the second step, the best configuration is determined and presented to the user, or used as a building block for a complex solution.
Let’s take a closer look at what the tool can do, and how it works. Firstly, you will need to supply a graph database (in an XML file) that describes possible configurations of your UPS subsystem.
Our example database describes configurations of a Liebert APM 45 kW model, made by Emerson Network Power. It can have three configurations, with power ratings of 15, 30 and 45 kW. The UPS occupies the same space in all three configurations: one rack.
Other characteristics differ, however. Most notably, the number of batteries is the same for all three configurations, so the backup time at full load is longest for the 15 kW configuration (49 minutes), medium for the 30 kW configuration (21 minutes), and shortest for the 45 kW configuration (only 12 minutes).
The UPS can have from one to three 15 kW blocks. Every block adds 34 kilograms to the weight and $6,000 to the cost. We estimate that the “bare” system, without any blocks installed, costs $29,000 and weighs 383 kilograms. You can therefore easily calculate those characteristics for all three configurations. Of course, this is a toy example, and a production database will likely describe tens of configurations of many models in the product line of a single vendor, or even of many vendors.
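As a minimal sketch of that calculation, the snippet below derives cost and weight for the three configurations from the figures quoted above. The function name and dictionary keys are illustrative assumptions, not the format of the tool's XML database.

```python
# Figures taken from the text; all names here are hypothetical.
BASE_COST_USD = 29_000      # "bare" system without blocks
BASE_WEIGHT_KG = 383
BLOCK_COST_USD = 6_000      # each 15 kW block
BLOCK_WEIGHT_KG = 34
BLOCK_POWER_KW = 15
BACKUP_MIN_AT_FULL_LOAD = {1: 49, 2: 21, 3: 12}  # keyed by number of blocks


def build_configurations():
    """Return one dict per configuration (one, two or three blocks)."""
    configs = []
    for blocks in (1, 2, 3):
        configs.append({
            "blocks": blocks,
            "power_kw": blocks * BLOCK_POWER_KW,
            "cost_usd": BASE_COST_USD + blocks * BLOCK_COST_USD,
            "weight_kg": BASE_WEIGHT_KG + blocks * BLOCK_WEIGHT_KG,
            "backup_min": BACKUP_MIN_AT_FULL_LOAD[blocks],
        })
    return configs


for cfg in build_configurations():
    print(cfg)
```

With the numbers above, this gives $35,000, $41,000 and $47,000, and 417, 451 and 485 kilograms, for the 15, 30 and 45 kW configurations respectively.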
Now that we have a set of configurations, and have determined their characteristics, we can impose constraints to filter out unsuitable configurations. For the web service presented on this page, the implemented constraints are “power rating” (required) and “battery backup time” (optional). (Of course, in principle, you can impose any constraints that can be formally described in the graph database.)
Let us discuss the constraint on backup time, as it is the trickier of the two. The longest backup time at full load that our UPS supports is 49 minutes, and this is available only with the 15 kW configuration. No matter how hard you try to combine blocks, you can’t get more than 49 minutes of backup time.
If your required backup time is more than 21 and up to 49 minutes, you should use the 15 kW configuration, because it is the only one able to satisfy your constraint. If your required backup time is more than 12 and up to 21 minutes, you can use either the 15 kW or the 30 kW configuration. Finally, if your required backup time is 12 minutes or less, you can use any of the three configurations: 15 kW, 30 kW or 45 kW. Which one to choose depends on the other constraint, “power_rating”: you should choose a configuration whose power rating is greater than or equal to the one specified by the user.
The tool, therefore, applies both constraints to the list of configurations, and in the resulting list chooses the configuration with the lowest cost (“ups_cost” in the tool’s output). If this attempt is successful, the tool prints its output and terminates.
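The single-UPS selection described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the function name `select_single` and the `backup_time_min` key are hypothetical, and backup time is given in minutes here to match the figures in the text.

```python
# Toy data from the text; "backup_time_min" is a hypothetical key name.
CONFIGS = [
    {"power_rating": 15_000, "backup_time_min": 49, "ups_cost": 35_000},
    {"power_rating": 30_000, "backup_time_min": 21, "ups_cost": 41_000},
    {"power_rating": 45_000, "backup_time_min": 12, "ups_cost": 47_000},
]


def select_single(configs, power_rating, backup_time_min=None):
    """Apply both constraints, then return the cheapest match (or None)."""
    matching = [
        c for c in configs
        if c["power_rating"] >= power_rating
        and (backup_time_min is None or c["backup_time_min"] >= backup_time_min)
    ]
    # min() with default=None signals "no single configuration fits".
    return min(matching, key=lambda c: c["ups_cost"], default=None)


print(select_single(CONFIGS, power_rating=20_000, backup_time_min=10))
print(select_single(CONFIGS, power_rating=200_000, backup_time_min=15))
```

The first call returns the 30 kW configuration, which is the cheapest one rated for at least 20 kW. The second returns `None`, because no single configuration reaches 200 kW.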
However, there may be no configuration that matches both constraints. For example, suppose you specified a reasonable backup time of 15 minutes, and your required power rating is 200 kW. There are two configurations that match the first constraint: namely, the 15 kW and 30 kW configurations. But neither of them can provide 200 kW. In this case, you will need several separate UPS systems to fulfil your requirements.
The tool will query the list of configurations again, this time imposing only the “backup time” constraint. All resulting configurations can be used to build a complex solution that satisfies the constraint on power rating. But which combination of configurations should be used? The tool employs a greedy algorithm to decide.
It sorts configurations by the value of the “ups_cost_per_kw” characteristic: the lower the value, the better the configuration. By this metric, the 45 kW configuration (about $1,044 per kW) is more than twice as good as its 15 kW peer (about $2,333 per kW). Then the tool takes as many units of the best configuration as fit within the required power rating (that’s the “greedy” step). Next, the deficit, if any, is filled with one unit of the cheapest configuration that can cover it. It only remains to print which configurations we need to buy to meet our needs.
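The greedy procedure can be sketched in a few lines. This is a simplified reconstruction from the description, not the tool's source: the function name `partition` and the `backup_time_min` key are hypothetical, backup time is in minutes for readability, and the returned string only imitates the value of “ups_partitioning”.

```python
# Toy data from the text; "backup_time_min" is a hypothetical key name.
CONFIGS = [
    {"power_rating": 15_000, "backup_time_min": 49, "ups_cost": 35_000},
    {"power_rating": 30_000, "backup_time_min": 21, "ups_cost": 41_000},
    {"power_rating": 45_000, "backup_time_min": 12, "ups_cost": 47_000},
]


def partition(configs, power_rating, backup_time_min=None):
    """Greedy multi-UPS solution, returned as e.g. '4*45000+1*30000'."""
    # Impose only the backup time constraint.
    candidates = [
        c for c in configs
        if backup_time_min is None or c["backup_time_min"] >= backup_time_min
    ]
    if not candidates:
        return None
    # Best configuration: lowest cost per unit of power.
    best = min(candidates, key=lambda c: c["ups_cost"] / c["power_rating"])
    # Greedy step: as many units of the best configuration as fit.
    count = power_rating // best["power_rating"]
    parts = {}
    if count:
        parts[best["power_rating"]] = count
    deficit = power_rating - count * best["power_rating"]
    if deficit > 0:
        # Fill the deficit with the cheapest configuration that covers it.
        # "best" itself always qualifies, so this list is never empty.
        covering = [c for c in candidates if c["power_rating"] >= deficit]
        filler = min(covering, key=lambda c: c["ups_cost"])
        parts[filler["power_rating"]] = parts.get(filler["power_rating"], 0) + 1
    return "+".join(f"{n}*{p}" for p, n in sorted(parts.items(), reverse=True))


print(partition(CONFIGS, 200_000, 15))
print(partition(CONFIGS, 200_000))
```

Note that the greedy choice is a heuristic: it yields a valid combination quickly, but it is not guaranteed to be the cheapest possible one for every catalogue of configurations.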
This is best illustrated with examples.
Example 1. Power rating: 200 kW, backup time: 15 minutes. The 15 kW and 30 kW configurations can both provide 15 minutes of backup time, but neither of them gives 200 kW of power. The 30 kW configuration has the lower cost per kW. Therefore, the tool will use six 30 kW configurations in the “greedy” step. That will provide 180 kW of power. The remaining deficit is 20 kW. This is covered with one more 30 kW configuration. The result is seven 30 kW configurations: “ups_partitioning=7*30000”
Example 2. Power rating: 200 kW, backup time: not specified. None of the configurations (45 kW, 30 kW or 15 kW) can provide 200 kW of power. The per-kW cost is lowest for the 45 kW configuration. The tool will choose four 45 kW configurations. This will provide 180 kW. The deficit is 20 kW. The cheapest configuration that covers it is a 30 kW one. The result is: “ups_partitioning=4*45000+1*30000”
Example 3. Power rating: 200 kW, backup time: 45 minutes. Only 15 kW configurations can provide the battery backup time of 45 minutes. The tool will use 14 of them: 13 in the “greedy” step (195 kW), plus one more to cover the remaining 5 kW. Result: “ups_partitioning=14*15000”
Example 4. Power rating: 100 kW, backup time: 10 minutes. Any configuration can provide 10 minutes of backup time, therefore the 45 kW one will be used, as it has the lowest cost per kW. Two of them will be required, which will provide 90 kW. The deficit — 10 kW — is covered with a 15 kW configuration, which is the cheapest. Result: “ups_partitioning=2*45000+1*15000”
As you can see, all of this is quite simple once it has been explained. You will find the link to the web service below. Beware: the web service expects battery backup time in seconds, not minutes.
To download the tool (as Python source code), proceed to the “Downloads” section. Got questions or comments? Simply leave your feedback below!