• 0 Posts
  • 18 Comments
Joined 1 year ago
cake
Cake day: August 28th, 2023

help-circle

  • The metric standard is to measure information in bits.

    Bytes are a non-metric unit. Not a power-of-ten multiple of the metric base unit for information, the bit.

    If you’re writing “1 million bytes” and not “8 million bits” then you’re not using metric.

    If you aren’t using metric then the metric prefix definitions don’t apply.

    There is plenty of precedent for the prefixes used in metric to refer to something other than an exact power of 1000 when not combined with a metric base unit. A microcomputer is not one one-thousandth of a computer. One thousand microscopes do not add up to one scope. Megastructures are not exactly one million times the size of ordinary structures. Etc.

    Finally: This isn’t primarily about bit shifting, it’s about computers being based on binary representation and the fact that memory addresses are stored and communicated using whole numbers of bits, which naturally leads to memory sizes (for entire memory devices or smaller structures) which are powers of two. Though the fact that no one is going to do something as idiotic as introducing an expensive and completely unnecessary division by a power of ten for every memory access just so you can have 1000-byte MMU pages rather than 4096 also plays a part.


  • If it averages several instances, with enough signal you could decompose a linear combination (e.g. average) of different patterns back out into its constituent parts.

    A smarter system won’t just take the mean of the votes from different instances but rather discard outliers as invalid input (flagging repeat offenders to be ignored in the future) and use the median or mode of the remainder. The results should also be quantitized to avoid leaking details about sources or internal algorithms; only the larger trends need to be reported.

    Of course you could always just keep the collected data private and only provide it to customers willing to pay $$$ for access, which handily limits instance operators’ ability to reverse-engineer the source of the data. And nothing prevents you from using separate instances for public and private data sets.




  • In what sense do you think this isn’t following the email standard? The plus sign is a valid character in the local part, and the standard doesn’t say how it should be interpreted (it could be a significant part of the name; it’s not proper to strip it out) or preclude multiple addresses from delivering to the same mailbox.

    Unfortunately the feature is too well-known, and the mapping from the tagged address to the plain address is too transparent. Spammers will just remove the label. You need either a custom domain so you can use a different separator (‘+’ is the default but you can generally choose something else for your own server) or a way to generate random, opaque temporary addresses.

    If you want to talk about non-compliant address handing, aside from not accepting valid addresses, the one that always bothers me is sites that capitalize or lowercase the local part of the address. Domain names are not case-sensitive, but the local part is. Changing the case could result in non-delivery or delivery to the wrong mailbox. Most servers are case-insensitive but senders shouldn’t assume that is always true.









  • So you’re not remapping the source ports to be unique? There’s no mechanism to avoid collisions when multiple clients use the same source port? Full Cone NAT implies that you have to remember the mapping (potentially indefinitely—if you ever reassign a given external IP:port combination to a different internal IP or port after it’s been used you’re not implementing Full Cone NAT), but not that the internal and external ports need to be identical. It would generally only be used when you have a large enough pool of external IP addresses available to assign a unique external IP:port for every internal IP:port. Which usually implies a unique external IP for each internal IP, as you can’t restrict the number of unique ports used by each client. This is why most routers only implement Symmetric NAT.

    (If you do have sufficient external IPs the Linux kernel can do Full Cone NAT by translating only the IP addresses and not the ports, via SNAT/DNAT prefix mapping. The part it lacks, for very practical reasons, is support for attempting to create permanent unique mappings from a larger number of unconstrained internal IP:port combinations to a smaller number of external ones.)


  • What “increased risks as far as csam”? You’re not hosting any yourself, encrypted or otherwise. You have no access to any data being routed through your node, as it’s encrypted end-to-end and your node is not one of the endpoints. If someone did use I2P or Tor to access CSAM and your node was randomly selected as one of the intermediate onion routers there is no reason for you to have any greater liability for it than any of the ISPs who are also carrying the same traffic without being able to inspect the contents. (Which would be equally true for CSAM shared over HTTPS—I2P & Tor grant anonymity but any standard password-protected web server with TLS would obscure the content itself from prying eyes.)



  • No, that’s not how I2P works.

    First, let’s start with the basics. An exit node is a node which interfaces between the encrypted network (I2P or Tor) and the regular Internet. A user attempting to access a regular Internet site over I2P or Tor would route their traffic through the encrypted network to an exit node, which then sends the request over the Internet without the I2P/Tor encryption. Responses follow the reverse path back to the user. Nodes which only establish encrypted connections to other I2P or Tor nodes, including ones used for internal (onion) routing, are not exit nodes.

    Both I2P and Tor support the creation of services hosted directly through the encrypted network. In Tor these are referred to as onion services and are accessed through *.onion hostnames. In I2P these internal services (*.i2p or *.b32) are the only kind of service the protocol directly supports—though you can configure a specific I2P service linked to a HTTP/HTTPS proxy to handle non-I2P URLs in the client configuration. There are only a few such proxy services as this is not how I2P is primarily intended to be used.

    Tor, by contrast, has built-in support for exit nodes. Routing traffic anonymously from Tor users to the Internet is the original model for the Tor network; onion services were added later. There is no need to choose an exit node in Tor—the system maintains a list and picks one automatically. Becoming a Tor exit node is a simple matter of enabling an option in the settings, whereas in I2P you would need to manually configure a proxy server, inform others about it, and have them adjust their proxy configuration to use it.

    If you set up an I2P node and do not go out of your way to expose a HTTP/HTTPS proxy as an I2P service then no traffic from the I2P network can be routed to non-I2P destinations via your node. This is equivalent to running a Tor internal, non-exit node, possibly hosting one or more onion services.


  • It is not true that every node is an exit node in I2P. The I2P protocol does not officially have exit nodes—all I2P communication terminates at some node within the I2P network, encrypted end-to-end. It is possible to run a local proxy server and make it accessible to other users as an I2P service, creating an “exit node” of sorts, but this is something that must be set up deliberately; it’s not the default or recommended configuration. Users would need to select a specific I2P proxy service (exit node) to forward non-I2P traffic through and configure their browser (or other network-based programs) to use it.