Do y’all know about Textise? I didn’t see any mention of it in a quick search. https://www.textise.net/

It can be used with the DuckDuckGo bang !textise
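
For anyone who wants to script the bang, here is a minimal sketch that builds a DuckDuckGo search URL with the bang in front of the page address. The function name `textise_bang_url` is made up for illustration, and I am assuming the bang takes the page URL as its search term; nothing here contacts the network.

```python
from urllib.parse import urlencode


def textise_bang_url(page_url: str) -> str:
    """Build a DuckDuckGo URL that applies the !textise bang to page_url.

    Assumption: the bang accepts the target page's URL as its query.
    """
    query = f"!textise {page_url}"
    # urlencode percent-encodes the "!" and the URL, and turns the space into "+"
    return "https://duckduckgo.com/?" + urlencode({"q": query})


if __name__ == "__main__":
    print(textise_bang_url("https://example.com/"))
```

Typing `!textise https://example.com/` into the DuckDuckGo search box should be equivalent.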

It also works over Tor, where I can use it as a proxy to avoid Cloudflare checkpoints.
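
If you want to do the same from a script instead of a browser, a rough sketch might look like this. It assumes a local Tor daemon listening on its default SOCKS port 9050 (Tor Browser uses 9150 instead) and that the third-party `requests` package is installed with SOCKS support (`pip install "requests[socks]"`). The helper names are hypothetical.

```python
# Assumption: a local Tor daemon exposes a SOCKS5 proxy on 127.0.0.1:9050.
# The "socks5h" scheme makes the proxy resolve hostnames, so DNS lookups
# also go through Tor instead of leaking locally.
TOR_SOCKS = "socks5h://127.0.0.1:9050"


def tor_proxies(socks_url: str = TOR_SOCKS) -> dict:
    """Return a proxies mapping in the shape requests expects."""
    return {"http": socks_url, "https": socks_url}


def fetch_over_tor(url: str, timeout: float = 60.0) -> str:
    """Fetch url through Tor and return the response body as text."""
    import requests  # third-party; needs the "socks" extra for socks5h://

    resp = requests.get(url, proxies=tor_proxies(), timeout=timeout)
    resp.raise_for_status()
    return resp.text
```

`fetch_over_tor("https://www.textise.net/")` would then fetch the Textise front page over Tor, provided the daemon is running.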

I don’t think it is open source, but I’m not completely sure.

Copy from the site intro:

Textise is a new way of looking at the Web. It’s an internet tool that removes everything from a web page except for its text. In practice, this means that images, forms, scripts, adverts, they all go, leaving plain text. Find out more here… (https://textise.wordpress.com/about-textise/)

How to use this page

  1. Type or paste the URL of a web page into the box below and click “Textise”. A text only version of the web page will be displayed.
  2. Type a search term into the box, select a search engine from the drop-down list, and click “Search”. You will be taken to a text only version of the search results.
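
To give a feel for what "removes everything except for its text" means, here is a small sketch using only Python's standard library. This is not how Textise itself works (I have no idea what it runs internally); it is just one simple way to drop tags, scripts and styles and keep the visible text. The names `TextOnly` and `html_to_text` are made up for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, ignoring the contents of non-visible elements."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside an element listed in SKIP

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text outside skipped elements, collapsing whitespace
        if self.skip_depth == 0 and data.strip():
            self.parts.append(" ".join(data.split()))


def html_to_text(html: str) -> str:
    """Return the text fragments of an HTML document, one per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{color:red}</style>"
        "<script>alert(1)</script></head>"
        '<body><h1>Title</h1><img src="a.png"><p>Hello <b>world</b></p></body></html>'
    )
    print(html_to_text(sample))
```

Images and forms vanish simply because they carry no text nodes; a real service would also need to handle links, encodings and broken markup.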
    • syzizeky@lemmy.ml (OP)
      1 year ago

      My use case is to access text and link content on a web page anonymously over Tor without getting blocked by Cloudflare.

    • Crul@lemm.ee
      1 year ago

      I prefer the Tranquility Reader add-on (no need for a third-party service). Firefox’s native Reader Mode is not compatible with add-ons, such as translation ones. Tranquility Reader is a bit more configurable too, but that’s just an extra.

      • Kissaki@feddit.de
        1 year ago

        Does Textise support what Reader Mode doesn’t? If Reader Mode can’t determine the central content, does Textise have more logic to do so?

        Given the wording, I also want to point out that a website doesn’t have to actively or explicitly support Reader Mode. It only has to follow HTML standards for marking up its content, which is a general accessibility practice too.

        • deleted@lemmy.world
          1 year ago

          Technically, you’re correct.

          However, many websites don’t follow the appropriate HTML standards and just abuse h1 and p.

          I just tried it with Google.com and it seems to remove all HTML markup, leaving only the text.

          It’s useful in some cases, such as WordPress one-page websites that put their story, mission, products, etc… on a single page.

      • miss_brainfart@lemmy.ml
        1 year ago

        Good point. Though I have to say, I’m not a fan of what uBlock Origin shows me when visiting Textise