@computergeek125 - Lemmy

computergeek125@lemmy.world · 4 months ago

A static PNG tile database for world.osm is even larger. Without a solid vector tile solution, this is the most efficient data format for disk space.

Also, there’s a post render CDN cache in front of the rendering layer to offset load, plus there’s I think some internal caching in renderd. It’s a pretty complex machine, but databases of the world are in fact huge.
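
For a rough sense of why a pre-rendered tile set gets so large, here is a minimal sketch of the tile count. It assumes the common slippy-map scheme, where zoom level z is a 2^z by 2^z grid, and it assumes zoom 19 as the deepest level; neither number comes from the comment above. The function names are made up for illustration.

```python
def tiles_at_zoom(zoom: int) -> int:
    """Tiles in one zoom level, assuming a 2^zoom by 2^zoom grid."""
    return 4 ** zoom


def tiles_through_zoom(max_zoom: int) -> int:
    """Total tiles for every zoom level from 0 to max_zoom inclusive."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


if __name__ == "__main__":
    # Each extra zoom level quadruples the tile count of the previous one.
    for z in (0, 5, 10, 15, 19):
        print(f"zoom {z:>2}: {tiles_at_zoom(z):,} tiles")
    print(f"zoom 0-19 total: {tiles_through_zoom(19):,} tiles")
```

Multiply the total by even a small per-tile PNG size and the disk footprint dwarfs the source database, which is the trade-off described above.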

computergeek125@lemmy.world · 4 months ago

Posterity?

https://www.merriam-webster.com/dictionary/posterity

computergeek125@lemmy.world · edit-2 4 months ago

OSM’s core tile servers have dozens of cores, hundreds of GB of RAM each, and the rendering and lookup databases are a few TB. That’s not trivial to self host, especially since one self hosted tile server cannot always keep up with a user flick scrolling.

Edit: car GPS maps and the old TomTom and Garmin devices have significantly less metadata embedded than a modern map.

computergeek125@lemmy.world · 4 months ago

Oh for sure

computergeek125@lemmy.world · 4 months ago

I saw a meme somewhere along the line that Excel is the third best tool for every job.

computergeek125@lemmy.world · edit-2 4 months ago

A paywall?
WSJ the paywall??

For your consideration, I present an anti-paywall-inator!!! TO THE ARCHIVES! https://archive.is/5VPB5

computergeek125@lemmy.world · edit-2 4 months ago

Virtual servers (as opposed to hardware workstations or servers) will usually have their “KVM” (Keyboard Video Mouse) built in to the hypervisor control plane. ESXi, Proxmox (KVM - Kernel Virtual Machine), XCP-ng/Citrix XenServer (Xen), Nutanix (KVM-like), and many others all provide access to this. It all comes down to what’s configured on the hypervisor OS.

VMs are easy because the video and control feeds are software constructs, so you can just hook into what’s already there. Hardware (especially workstations) is harder because you don’t always have a chip on the motherboard that can tap that data. Servers usually have a dedicated co-computer soldered onto the motherboard to do this, but if there’s nothing nailed down to do it, your remote access is limited to what you can plug in. PiKVM is one such plug-in option.

computergeek125@lemmy.world · edit-2 4 months ago

Any system without network unlock usually requires a TPM PIN/PW every reboot. Your instructions (when read a certain way) imply that the command also bypasses the encryption without fetching a recovery key from the TPM or DC.

My home network (ISC DHCPD) behaves this way - either I type the TPM key or I type the 25-char key.

computergeek125@lemmy.world · 4 months ago

deleted by creator

computergeek125@lemmy.world · 4 months ago

Getting production servers back online with a low level fix is pretty straightforward if you have your backup system taking regular snapshots of pet VMs. Just roll back a few hours. For properly managed cattle, just redeploy the OS and reconnect to data. For physical servers of either type, you can restore a backup (potentially with the IPMI integration so it happens automatically), but you might end up taking hours to restore all data, limited by the bandwidth of your giant spinning rust NAS that is cost cut to only sustain a few parallel recoveries. Or you could spend a few hours with your server techs IPMI booting into safe mode, or write a script that sends reboot commands to the IPMI until the host OS pings back.
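
The "reboot over IPMI until the host pings back" idea could look roughly like the sketch below. Treat it as an illustration under assumptions, not a tested recovery tool: the function names, the attempt limit, and the boot wait are all hypothetical. It assumes `ipmitool` is installed, the BMC speaks IPMI-over-LAN (`lanplus`), the BMC password is in the `IPMI_PASSWORD` environment variable, and `ping` takes the Linux `-c`/`-W` flags.

```python
import subprocess
import time
from typing import Callable


def host_pings(host: str) -> bool:
    """True if a single ICMP echo gets a reply (Linux ping flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def ipmi_power_cycle(bmc: str, user: str) -> None:
    """Ask the BMC to power cycle the chassis; -E reads IPMI_PASSWORD."""
    subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", bmc, "-U", user, "-E",
         "chassis", "power", "cycle"],
        check=True,
    )


def reboot_until_up(
    is_up: Callable[[], bool],
    power_cycle: Callable[[], None],
    max_attempts: int = 10,          # hypothetical limit
    boot_wait: float = 300.0,        # hypothetical seconds to allow for boot
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Power cycle until is_up() returns True; return the cycles sent."""
    cycles = 0
    while not is_up():
        if cycles >= max_attempts:
            raise RuntimeError(f"host still down after {cycles} power cycles")
        power_cycle()
        cycles += 1
        sleep(boot_wait)  # give the OS time to boot before checking again
    return cycles


# Example wiring (hostnames are placeholders):
#   reboot_until_up(
#       is_up=lambda: host_pings("server01.example.com"),
#       power_cycle=lambda: ipmi_power_cycle("server01-bmc.example.com", "admin"),
#   )
```

Passing the check and the power action in as callables keeps the retry loop separate from the vendor-specific parts, so the same loop could drive a different BMC interface.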

All that stuff can be added to your DR plan, and many companies now are probably planning for such an event. It’s like how the US CDC posted a plan about preparing for the zombie apocalypse to help people think about it: this was a fire drill for a widespread ransomware attack. And we as a world weren’t ready. There are options, but they often require humans to be helping it along when it’s so widespread.

The stinger of this event is how many workstations were affected in parallel. First, good tools do not exist for a remote access solution at the firmware level that can execute power controls over the internet. You have options in an office building for workstations onsite: there are a handful of systems that can do this over existing networks, but most are highly hardware vendor dependent.

But do you really want to leave PXE enabled on a workstation that will be brought home and rebooted outside of your physical/electronic perimeter? The last few years have shown us that WFH isn’t going away, and those endpoints that exist to roam the world need to be configured in a way that does not leave them easily vulnerable to a low level OS replacement the other 99.99% of the time you aren’t getting crypto’d or receiving a bad kernel update.

Even if you place trust in your users and don’t use a firmware password, do you want an untrained user to be walked blindly over the phone to open the firmware settings, plug into their router’s Ethernet port, and add https://winfix.companyname.com as a custom network boot option without accidentally deleting the Windows bootloader? Plus, any system that does that type of check automatically at startup makes itself potentially vulnerable to a network-based attack by a threat actor on a low security network (such as the network of an untrusted employee or a device that falls into the wrong hands). I’m not saying such a system is impossible - but it’s a super huge target for a threat actor to go after and it needs to be ironclad.

Given all of that, a lot of companies may instead opt that their workstations are cattle, and would simply be re-imaged if they were crypto’d. If all of your data is on the SMB server/OneDrive/Google/Nextcloud/Dropbox/SaaS whatever, and your users are following the rules, you can fix the problem by swapping a user’s laptop - just like the data problem from paragraph one. You just have a team scale issue that your IT team doesn’t have enough members to handle every user having issues at once.

The reality is there are still going to be applications and use cases that may be critical that don’t support that methodology (as we collectively as IT slowly try to deprecate their use), and that is going to throw a Windows-sized monkey wrench into your DR plan. Do you force your users to use a VDI solution? Those are pretty dang powerful, but as a Parsec user that has operated their computer from several hundred miles away, you can feel when a responsive application isn’t responding quite fast enough. That VDI system could be recovered via paragraph 1 and just use Chromebooks (or equivalent) that can self-reimage if needed as the thin clients. But would you rather have annoyed users with a slightly less performant system 99.99% of the time or plan for a widespread issue affecting all systems the other 0.01%? You’re probably already spending your energy upgrading from legacy apps to make your workstations more like cattle.

All I’m trying to get at here with this long winded counterpoint - this isn’t an easy problem to solve. I’d love to see the day that IT shops are valued enough to get the budget they need informed by the local experts, and I won’t deny that “C-suite went to x and came back with a bad idea” exists. In the meantime, I think we’re all going to instead be working on ensuring our update policies have better controls on them.

As a closing thought - if you audited a vendor that has a product that could get a system back online into low level recovery after this, would you make a budget request for that product? Or does that create the next CrowdStruckOut event? Do you dual-OS your laptops? How far do you go down the rabbit hole of preparing for the low probability? This is what you have to think about - you have to solve enough problems to get your job done, and not everyone is in an industry regulated to have every problem required to be solved. So you solve what you can by order of probability.

computergeek125@lemmy.world · 4 months ago

Okay that’s fair. Their pricing is awful in general, and that’s especially egregious for something that used to be free

computergeek125@lemmy.world · 4 months ago

There’s probably specific ticket queues and wiki/doc spaces for each support team.

Problem with an app? Send it to the internal dev/support team. Then if needed it gets routed.

computergeek125@lemmy.world · 4 months ago

How was Parsec before the acquisition?

I only really have experience after, and it’s the only Unity product I’ve actually found that I like. My only major complaint is that it’s not compatible with the base configuration of Palo Alto, but that’s really more of a Palo Alto problem than a Parsec problem.

computergeek125@lemmy.world · 4 months ago

I work on an open source project in my free time. Officially we support Linux, Windows, and macOS.

I had to change ~2 lines of code to port the Linux/Mac code path to FreeBSD. Windows has a completely different code path for that critical segment because it’s so different compared to the three Unix/Unix-like.

This is a very specific example from server-side code that leaves out a lot of details. One being that we wrote our project with the intent that it would be multi platform by design. Game software is wildly complicated compared to what we do. The point here is that it should be easier to port Unix to Unix-like compared to Unix to Windows.

computergeek125@lemmy.world · 5 months ago

A well managed server won’t init an arbitrary drive and has a lock screen with a password so that the most a rubber ducky would be able to do is reboot it. Which is something you’d already be able to do if you had access to the front panel with the power button.

computergeek125@lemmy.world · 5 months ago

You may not, but your phone will fail over to data if it loses its lease, and stuff like background update tasks will cease to function (like Windows Update or dnf cron)

computergeek125@lemmy.world · 5 months ago

On/off:
I have 5 main chassis excluding desktops. Prod cluster is all flash, standalone host has one flash array, one spinning rust array, NAS is all spinning rust. I have a big enough server disk array that spinning it up is actually a power sink and the Dell firmware takes a looong time to get all the drives up on reboot.

TLDR: Not off as a matter of day/night, off as a matter of summer/winter for heat.

Winter: all on

Summer:

prod cluster on (3x vSAN - it gets really angry if it doesn’t have cluster consistency)
NAS on
standalone server off, except to test ESXi patches and when vCenter reboots cause it to be WoL’d (vpxd sends a wake to all stand by hosts on program init)
main desktop on
alt desktops off

VMs are a different story. Normally I just turn them on and off as needed regardless of season, though I will typically turn off more of my “optional” VMs to reduce summer workload in addition to powering off the one server. Rough goal is to reduce thermal load so as not to kill my AC as quickly, which is probably running above its duty cycle to keep up. Physical wise, these servers are virtualized so this on/off load doesn’t cycle the array.

Because all four of my main servers are the same hypervisor (for now, VMware ESXi), VMs can move among the prod cluster to balance load autonomously, and I can move VMs on or off the standalone host by drag-and-drop. When the standalone host is off, I usually turn its VMs off and move them onto the prod cluster so I don’t get daily “backup failure” emails from the NAS.

UPS: Power in my area is pretty stable, but has a few phase hiccups in the summer. (I know it’s a phase hiccup because I mapped out which wall plugs are on which phase, confirmed with a multimeter that I’m on two legs of a 3-phase grid hand-off, and watched which devices blip off during an event.) For something like a light that will just flicker or a laptop/phone charger that has a high capacitance, such blips are a non issue. Smaller ones can even be eaten by the massive power supplies my Dell servers have. But my Cisco switches are a bit sensitive to it and tend to sing me the song of their people when the power flickers - aka fan speed 100% boot up whining. Larger blips will also boop the Dell servers, but I don’t usually see breaks more than 3-5m.

Current UPS setup is:

rack split into A/B power feeds, with servers plugged into both and every other one flipped A or B as its primary
single plug devices (like NAS) plugged into just one
“common purpose” devices on the same power feed (ex: my primary firewall, primary switches, and my NAS for backups are on feed A, but my backup disks and my secondary switches are on feed B)
one 1500VA UPS per feed (two total) - aggregate usage is 600-800w
one 1500VA desktop UPS handling my main tower, one monitor, and my PS5 (which gets unreasonably upset about losing power, so it gets the battery backup)

With all that setup, the gauges in the front of the 3 UPSes all show roughly 15-20m run time in summer, and 20-25m in winter. I know one may be lower than displayed because its battery is older, but even if it fails and dumps its redundant load onto the main newer UPS I’ll still have 7-10m of battery at worst case and that’s all I really need to weather most power related issues at my location.

computergeek125@lemmy.world · 5 months ago

/j hey some of us only have 10GbE

Jokes aside, I get the classification. I’m pretty solidly in the category of 2c - more tech than some medium business but without the SLA to go with it.

computergeek125@lemmy.world · 6 months ago

Apologies for being late, I wanted to be as correct as I could be.

So, straight to the point: Nextcloud by default uses plain files if you don’t configure the primary storage to be an S3/object store. As far as I can tell, this is not automatic and is an intentional change at system creation by the original admin. There is a third-party migration script, but there does not appear to be a first-party method of converting between the two. That’s very good news for you! (I think/hope)

My instance was set up as a standalone, so I cannot speak for the all-in-one image. Poking around the root data directory (datadirectory in the config.php), I was able to locate my user account by internal username - which if you do not use LDAP will be the shortened login name. On default LDAP configs, this internal username may be a GUID, but that can be changed during the LDAP enablement process by overriding the Internal Username field in the Expert LDAP settings.

Once in the user’s home folder in the root data directory, my subdirectory options are cache, files, files_trashbin, files_versions, uploads.

files contains the “live” structure of how I perceive my Nextcloud home folder in the Web UI and the Nextcloud Desktop sync engine
files_trashbin is an unstructured data folder containing every file that was deleted by this user and kept per the trash folder’s retention policy (this can be configured at the site level). Files retain their original name, but have a suffix added which takes the form .d######... where the numbers appear to be a Unix timestamp, likely the deletion date. A quick scan of these with the file command in Linux showed that each one had an expected file header based on its extension (i.e., a .png showed as a PNG image with an expected resolution). In the Web UI, there is metadata about which folder the file originally resided in, but I was not able to quickly identify this in the file structure. I believe this info is coming in from the SQL database.
files_versions is how Nextcloud stores its file version history (if enabled). Old versions are cleaned up per a set of default behaviors to keep more copies of more recent changes, up to a maximum age deletion threshold set at the site level. This folder is stored in approximately the same structure as the main files live structure, however each copy of each version has a suffix appended in the form .v######... where the number appears to be the Unix timestamp the version was taken (I have not verified that this exactly matches what the UI shows, nor have I read the source code that generates this). I’ve spot checked via the Linux file command and sha256 that the files in this versions structure appear to be real data - tested one Excel doc and one plain text doc.
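
If the trailing digits really are Unix timestamps (an observation from the folders above, not something confirmed against the Nextcloud source), splitting the suffix off a trash bin or versions filename could be sketched like this. The function name and the sample filenames are made up for illustration.

```python
import re
from datetime import datetime, timezone
from typing import Optional, Tuple

# <original name>.d<digits> in the trash bin, <original name>.v<digits> for versions.
_SUFFIX = re.compile(r"^(?P<name>.+)\.(?P<kind>[dv])(?P<ts>\d+)$")


def split_suffix(filename: str) -> Optional[Tuple[str, str, datetime]]:
    """Return (original name, 'd' or 'v', UTC time), or None if no suffix.

    Assumes the digits are seconds since the Unix epoch.
    """
    match = _SUFFIX.match(filename)
    if match is None:
        return None
    when = datetime.fromtimestamp(int(match["ts"]), tz=timezone.utc)
    return match["name"], match["kind"], when


if __name__ == "__main__":
    for sample in ("photo.png.d1700000000", "budget.xlsx.v1700000000", "notes.txt"):
        print(sample, "->", split_suffix(sample))
```

This only recovers the original filename and a time; the original folder of a trashed file is not in the name, so it would still have to come from wherever Nextcloud keeps that metadata.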

I think that should get a fairly rough answer to your original question, but if I left something out you’re curious about, let me know.

Finally, I wanted to thank you for making me actually take a look at how I had decided to configure and back up my Nextcloud instance and ngl it was kind of a mess. The trash bin and versions can both get out of hand if you have frequently changing or deleting/recreating files (I have network synchronization glued onto some of my games that do not have good remote save support). Retention policy on trash and versions cleaned up extraneous data a lot, as only one of those was partially configured.

I can see a lot of room for improvements… just gotta rip the band-aid off and make intelligent decisions rather than just slapping an rsync job that connects to the Nextcloud instance and replicates down the files and backend database. Not terrible, but not great.

In the backend I’m already using ZFS for my files and Redis database, but my core SQL database was located on the server’s root partition (which is XFS - I’d rather not mess with a DKMS module from a boot CD if something happens and upstream borks the compile, which is precisely what happened when I upgraded to OpenZFS 2.1.15).

I do not have automatic ZFS snapshots configured at this time, but based on the above, I’m reasonably confident that I could get data back from a ZFS snapshot if any of the normal guardrails within Nextcloud failed or did not work as intended (trash bin and internal version history). Plus, the data in that cursed rsync backup should be at least 90% functional.

computergeek125@lemmy.world · 6 months ago

I forget where I originally found this and Google on my phone was unhelpful.

My favorite annoying trick is x -=- 1. It looks like it shouldn’t work because that’s not precisely a valid operator, but it functions as an increment equivalent to x += 1

It works because -= still functions as “subtract and assign”, but the second minus applies to the 1, making it -1.
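
A quick way to see this, using Python as an assumed language (the trick isn’t tied to one in the comment above): the standard `ast` module shows the statement parsing as a subtract-and-assign whose right-hand side is a negated 1. The helper name `increment` is made up for the example.

```python
import ast


def increment(x: int) -> int:
    x -=- 1  # parsed as: x -= (-1)
    return x


# Inspect how the parser splits the statement.
node = ast.parse("x -=- 1").body[0]

# The statement is an augmented assignment using subtraction...
is_subtract_assign = isinstance(node, ast.AugAssign) and isinstance(node.op, ast.Sub)

# ...and its right-hand side is a unary minus applied to the literal 1.
is_negated_one = (
    isinstance(node.value, ast.UnaryOp)
    and isinstance(node.value.op, ast.USub)
    and isinstance(node.value.operand, ast.Constant)
    and node.value.operand.value == 1
)

if __name__ == "__main__":
    print(increment(5), is_subtract_assign, is_negated_one)
```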