This is totally going to turn into another JBIG2 lossy compression clusterfuck, isn’t it…
For those who are unfamiliar, JBIG2 is an image compression standard with a dubious reputation: its lossy mode can silently swap one character for another in scanned documents (a 6 becoming an 8, for example), which can cause serious problems when scanning things like medical and legal documents, construction blueprints, etc.
Say, if you compress some data using these LLMs, how hard is it to decompress the data again without access to the LLM used to perform the compression? Will the compression “algorithm” used by the LLM be the same for all runs (which would mean you could probably reverse engineer it to create a decompressor program), or will it be different every time it compresses new data?
I mean, having to download a huge LLM to decompress some data, which probably also requires a GPU with lots of VRAM, seems a bit much.
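My rough understanding (could be off) is that these schemes use the model’s next-token predictions to drive an entropy coder, so the decompressor has to re-run the exact same model, deterministically, to reproduce those predictions; the model basically is the codebook. Here’s a toy sketch of the idea, using simple “rank coding” instead of a real arithmetic coder, with a hash function standing in for the LLM so it runs without any weights (nothing below is from the actual paper, it’s just to show why you can’t skip the model):

```python
# Toy "rank coding" sketch: a simplified cousin of the arithmetic coding
# these LLM compressors reportedly use. The "model" here is a hypothetical
# stand-in (a hash of the context), not a real LLM.

import hashlib

VOCAB = [chr(c) for c in range(32, 127)]  # printable ASCII as a toy vocabulary


def predict_ranking(context: str) -> list[str]:
    """Deterministically rank the vocabulary given the context.

    A real compressor would sort tokens by an LLM's next-token
    probabilities; here a hash gives a stable ordering instead.
    """
    def score(tok: str) -> str:
        return hashlib.sha256((context + tok).encode()).hexdigest()
    return sorted(VOCAB, key=score)


def compress(text: str) -> list[int]:
    """Replace each character by its rank in the model's prediction."""
    ranks, context = [], ""
    for ch in text:
        ranking = predict_ranking(context)
        ranks.append(ranking.index(ch))
        context += ch
    return ranks  # a real coder would then entropy-code these ranks


def decompress(ranks: list[int]) -> str:
    """Undo compress(); only works with bit-identical model outputs."""
    context = ""
    for r in ranks:
        ranking = predict_ranking(context)
        context += ranking[r]
    return context


assert decompress(compress("hello world")) == "hello world"
```

If the ranking function changed between compress and decompress (different model, different quantization, non-deterministic GPU kernels), the ranks would point at the wrong characters and you’d get garbage out, which is presumably why you’re stuck downloading the same huge model to decode.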
So a Pied Piper company is actually going to start
It’ll be interesting to see if this gets used in places where bandwidth is scarce enough to outweigh the cost of dedicated hardware. Video calls to Antarctica, shipping vessels, airplanes, space, etc. At least that’s what comes to mind. Could also see a next iteration of CDNs using it, if the numbers check out.