@Kissaki - Lemmy

Kissaki@lemmy.dbzer0.com · edited 3 days ago

Depending on what you want to scrape, that’s a lot of overkill and overcomplication. Full website testing frameworks may not be necessary to scrape. Python with its tooling and package management may not be necessary.

I’ve recently extracted and downloaded stuff via Nushell.

Requirement: knowledge of CSS selectors

Inspect the website DOM in the web browser's developer tools
  1. Identify the structure
  2. Identify adequate selectors; testable in the browser dev tools console via document.querySelectorAll()
Get and query the data

My command-line shell and scripting language of choice is Nushell:

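# Fetch the page once and reuse the response
let $html = http get 'https://example.org/'
# $in.0.0 and $in.1.0 pick the first text of the first and second match
let $meta = $html | query web --query '#infobox .title, #infobox .tags' | { title: $in.0.0 tags: $in.1.0 }
# Collect the data-src attribute of every image inside <main>
let $content = $html | query web --query 'main img' --attribute data-src
$meta | save meta.json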

or

1..30 | each {|x| http get $'https://example.org/img/($x).jpg' | save $'($x).jpg'; sleep 100ms }

Depending on the tools you use, it’ll be quite similar or very different.
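
For comparison, here is a rough sketch of the `main img` / `data-src` extraction using only Python's standard library. This is not what I ran; the class name, function name, and sample markup are made up for illustration, and a real page would have to be fetched first. The standard library has no CSS selector support, so the sketch tracks nesting inside `<main>` by hand.

```python
from html.parser import HTMLParser


class MainImageSources(HTMLParser):
    """Collect the data-src attribute of every <img> nested inside <main>."""

    def __init__(self):
        super().__init__()
        self.main_depth = 0  # > 0 while the parser is inside a <main> element
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_depth += 1
        elif tag == "img" and self.main_depth > 0:
            # attrs is a list of (name, value) pairs
            src = dict(attrs).get("data-src")
            if src is not None:
                self.sources.append(src)

    def handle_endtag(self, tag):
        if tag == "main" and self.main_depth > 0:
            self.main_depth -= 1


def extract_main_image_sources(html):
    """Return the data-src values of all images found inside <main>."""
    parser = MainImageSources()
    parser.feed(html)
    parser.close()
    return parser.sources


# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <img data-src="/img/header.jpg">
  <main>
    <p>Gallery</p>
    <img data-src="/img/1.jpg">
    <div><img data-src="/img/2.jpg" alt="second"></div>
    <img src="/img/no-data-src.jpg">
  </main>
</body></html>
"""

if __name__ == "__main__":
    print(extract_main_image_sources(SAMPLE_HTML))
```

Images outside `<main>` and images without a `data-src` attribute are skipped. It's noticeably more code than the `query web` one-liner, which is the point about tool choice.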

Selenium is an entire web browser driver, meaning it does a lot more and has a more extensive interface because of it. You can talk to it through different interfaces and languages.

Kissaki@lemmy.dbzer0.com · 3 days ago

You don’t even need a VPN to use a different DNS server.

Kissaki@lemmy.dbzer0.com · 3 days ago

Injecting a malicious undisclosed firmware/software update. Very private and secure. /s

Kissaki@lemmy.dbzer0.com · 3 days ago

That’s bullshit. There’s no reason to limit or target a specific or non-maximum CPU core usage.

That would only make sense to evade hardware faults or cooling issues. Never as a general guideline.

Kissaki@lemmy.dbzer0.com · edited 5 days ago

YouTube channels can be terminated for both repeated copyright infringement and community guideline violations. In these cases, revenues are often withheld as well. It’s possible, however, that linked AdSense accounts are treated differently.

AdSense policies can be confusing, but based on additional information provided by Google’s AI, YouTube copyright bans are most likely to result in AdSense terminations too.

This is the first time I’ve read of an AI being a source for an article.

Kissaki@lemmy.dbzer0.com · 9 days ago

I don’t see any free leech information in their announcement forums, and the news page is empty (looks broken).

Kissaki@lemmy.dbzer0.com · 9 days ago

MusicBee has Tools -> Manage Duplicates

Screenshot

Kissaki@lemmy.dbzer0.com · 4 months ago

I cannot subscribe

details needed

Kissaki@lemmy.dbzer0.com · edited 5 months ago

There are website services where you both stay online and transfer directly.

There could be direct peer to peer transfer tools that are more robust.

If you want to go through a file transfer service/hoster:

no limit https://gofile.io/
300 GB limit https://1fichier.com/

There are some more; those are the top two in my bookmarks.

You’d do well to encrypt the files (for example as a password-protected 7z archive) if you don’t want others to see the content, so you don’t have to trust the hoster.

Kissaki@lemmy.dbzer0.com · 6 months ago

In 2021, YouTube announced that it had invested “hundreds of millions of dollars” to create content management tools, of which Content ID quickly emerged as the platform’s go-to solution to detect and remove copyrighted materials.

Content ID was introduced in 2021? Only 3 years ago? I thought it was significantly older.

Wikipedia says 2007.

Dunno if they meant something different or typoed the year.

Kissaki@lemmy.dbzer0.com · edited 6 months ago

Bandcamp

Kissaki@lemmy.dbzer0.com · 6 months ago

I clicked here like you suggested, but it didn’t play on YouTube /s

Kissaki@lemmy.dbzer0.com · edited 6 months ago

I as a Muslim, do not believe in patenting ideas already, because you can’t deny copying ideas. But most Muslims don’t know about this, unless they actively ask this question.

I’m confused. What does that have to do with being a Muslim?

[neither nor] is following the Patent law when it comes to things when it really matters

What do you mean?

Kissaki@lemmy.dbzer0.com · 8 months ago

will never be a reliable way to truly archive something

I think they’re doing a damn fine job archiving something, and in reliable ways too