Anon 03/12/2024 (Tue) 12:31 No.9820
Since the text below in this post concerns itself with a similar phenomenon, I will explain the situation yet again.
>such interesting things [sarcasm] as how his scraper found that one source had a URL in lowercase and another source had the same URL with capital letters.
He is referring to the time I explained how some of the data in is invalid. Well, basically all of the data in is invalid because the website owner modified newlines and deleted certain whitespace, but that's a different story (and I guess all of the source data/archives that is based on is available). The malformed data that I pointed out was this: a likely small number of paste IDs in are invalid due to case sensitivity or insensitivity; I suspect that this has to do with a mechanism in >>9613. Two considerations here:
(1.) Caring about web data being a faithful copy of the source and pointing out when it isn't - this is related to "archive-quality" data purity and the dialectic of WARCs vs. raws.
(2.) Caring about what some random anon thinks is "interesting" or "annoying".

Obviously, I am going to value the 1st point over the 2nd one. So, on with what I was going to point out. Looks like has a Windows-friendly-filenames copy of some/all torrents. Examples:
>Ponibooru-All-Unrated/7177 - pinkie_pie %22original%22_character brundle rainbow_dash.jpg
>Ponibooru-All-Unrated/7177 - pinkie_pie "original"_character brundle rainbow_dash.jpg
The torrent only has the non-Windows-friendly filename.
>Ponibooru-All-Unrated/7643 - %22original%22_character ms_paint funny_to_me skywind weanus alpine_horn comic MetaSue.PNG
>Ponibooru-All-Unrated/7643 - "original"_character ms_paint funny_to_me skywind weanus alpine_horn comic MetaSue.PNG
>Ponibooru-All-Unrated/7641 - artist:wasd999 original_character.jpg
>Ponibooru-All-Unrated/7641 - artist_wasd999 original_character.jpg
related to >>9754
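
For anyone who wants to check a file listing for this kind of duplication, here is a rough sketch. The replacement rule is only a guess inferred from the three examples above (`"` becomes `%22`, `:` becomes `_`); I don't know the actual renaming rule that was used, and the function names are made up for illustration.

```python
# Printable characters that Windows forbids in file names.
# "/" is left out because it is treated as the directory separator below.
WINDOWS_RESERVED = set('<>:"|?*\\')

# Hypothetical mapping inferred only from the examples in this post;
# the real renaming rule is not known.
OBSERVED_REPLACEMENTS = {'"': "%22", ":": "_"}


def is_windows_friendly(filename: str) -> bool:
    """Return True if the file name (no directory part) has no reserved characters."""
    return not any(ch in WINDOWS_RESERVED for ch in filename)


def guess_windows_friendly(filename: str) -> str:
    """Apply the observed replacements to a file name."""
    for bad, good in OBSERVED_REPLACEMENTS.items():
        filename = filename.replace(bad, good)
    return filename


def find_renamed_pairs(paths):
    """Pair each original path with its guessed Windows-friendly twin,
    but only when both names are present in the listing."""
    listing = set(paths)
    pairs = []
    for path in sorted(listing):
        directory, sep, name = path.rpartition("/")
        if is_windows_friendly(name):
            continue  # nothing to rename, so it cannot be the original of a pair
        twin = directory + sep + guess_windows_friendly(name)
        if twin in listing:
            pairs.append((path, twin))
    return pairs
```

Feeding it a listing returns (original, renamed) tuples for every file that exists under both names. Files that exist only under the non-Windows-friendly name, like in the torrent, are not reported.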
