I believe this is a slightly controversial topic, at least from what I have gathered so far. Some say its best to leave the server on to spare the life time of the spinning rust. Other seem to prefer to save power and boot the server off each night. So wanted to chip in and hear what folks here do and why do what you do.

Bonus question; Do you guys have a UPS? Is it a must have for a homelab, or does it just depend on the usecase?

  • computergeek125@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    5 months ago

    On/off:
    I have 5 main chassis excluding desktops. Prod cluster is all flash, standalone host has one flash array, one spinning rust array, NAS is all spinning rust. I have a big enough server disk array that spinning it up is actually a power sink and the Dell firmware takes a looong time to get all the drives up on reboot.

    TLDR: Not off as a matter of day/night, off as a matter of summer/winter for heat.

    Winter: all on

    Summer:

    • prod cluster on (3x vSAN - it gets really angry if it doesn’t have cluster consistency)
    • NAS on
    • standalone server off, except to test ESXi patches and when vCenter reboots cause it to be WoL’d (vpxd sends a wake to all stand by hosts on program init)
    • main desktop on
    • alt desktops off

    VMs are a different story. Normally I just turn them on and off as needed regardless of season, though I will typically turn off more of my “optional” VMs to reduce summer workload in addition to powering off the one server. Rough goal is to reduce thermal load as to not kill my AC as quickly which is probably running above its duty cycle to keep up. Physical wise, these servers are virtualized so this on/off load doesn’t cycle the array.

    Because all four of my main servers are the same hypervisor (for now, VMware ESXi), VMs can move among the prod cluster to balance load autonomously, and I can move VMs on or off the standalone host by drag-and-drop. When the standalone host is off, I usually move turn it’s VMs off and move them onto the prod cluster so I don’t get daily “backup failure” emails from the NAS.

    UPS: Power in my area is pretty stable, but has a few phase hiccups in the summer. (I know it’s a phase hiccup because I mapped out which wall plus are on which phase, confirmed with a multimeter than I’m on two legs of a 3-phase grid hand-off, and watched which devices blip off during an event) For something like a light that will just flicker or a laptop/phone charger that has a high capacitance, such blips are a non issue. Smaller ones can even be eaten by the massive power supplies my Dell servers have. But, my Cisco switches are a bit sensitive to it and tend to sing me the song of their people when the power flickers - aka fan speed 100% boot up whining. Larger blips will also boop the Dell servers, but I don’t usually see breaks more than 3-5m.

    Current UPS setup is:

    • rack split into A/B power feeds, with servers plugged into both and every other one flipped A or B as it’s primary
    • single plug devices (like NAS) plugged into just one
    • “common purpose” devices on the same power feed (ex: my primary firewall, primary switches, and my NAS for backups are on feed A, but my backup disks and my secondary switches are on feed B)
    • one 1500VA UPS per feed (two total) - aggregate usage is 600-800w
    • one 1500VA desktop UPS handling my main tower, one monitor, and my PS5 (which gets unreasonably upset about losing power, so it gets the battery backup)

    With all that setup, the gauges in the front of the 3 UPSes all show roughly 15-20m run time in summer, and 20-25m in winter. I know one may be lower than displayed because it’s battery is older, but even if it fails and dumps it’s redundant load onto the main newer UPS I’ll still have 7-10m of battery at worst case and that’s all I really need to weather most power related issues at my location.