Anon
01/13/2023 (Fri) 08:53
No.8544
>>8541
>>8542
UPDATE
Been testing httrack and wget in varying configurations on a few wikis, mainly small ones like this one (https://something-sweet-to-bite.fandom.com/), though I also did a full archival run on the fanlabor wiki:
https://mlpfanart.fandom.com/
Results have been mixed. I finally sort of got wget to work with restricted domains (I don't know how or why, but directly defining domains in wget would break images, even when I included static.wikia.nocookie.net, where they host their images). Even so, my most successful attempt produced a site that was still strangely broken. The weirdest part was -k/--convert-links resulting in broken image files over certain characters. This behavior was consistent across several sweeps, so I assume it's something off on wikia's end.
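For reference, this is roughly the shape of invocation I've been converging on (just a sketch, not a verified recipe; all flags are standard GNU wget, and --restrict-file-names=windows is my guess at a workaround for the weird-character breakage, since it sanitizes query strings and odd characters out of saved filenames):

```shell
# Sketch: mirror a small Fandom wiki with wget.
# Key detail: --domains only matters once --span-hosts is set, because
# the pages live on *.fandom.com but the images live on
# static.wikia.nocookie.net, a different host.
wget --mirror --page-requisites --convert-links --adjust-extension \
     --span-hosts \
     --domains=something-sweet-to-bite.fandom.com,static.wikia.nocookie.net \
     --restrict-file-names=windows \
     --wait=1 --random-wait \
     https://something-sweet-to-bite.fandom.com/
```

Without --span-hosts, wget silently skips everything off the start host, which would explain images breaking even with the nocookie domain listed.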
httrack did a little better at reconstructing sites, but the results were still very broken. The largest run was the fanlabor wiki, which was bigger and took longer than I anticipated (16 hours to download, 15 GB in size). Image links were often broken and 404ing during the download, though it seems a lot of the ones doing that were old remnants of files that had already been deleted. Still, I am not confident I got all of them. My internet can be janky and I am not sure httrack is as flexible with interruptions as wget. Also, if javascript is turned on, all the images disappear from the pages. I don't know why that is.
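For what it's worth, the httrack runs looked roughly like this (a sketch, not the exact command I used; the "+" patterns are httrack scan filters, and the nocookie one is what lets image fetches cross hosts):

```shell
# Sketch: httrack mirror of the fanlabor wiki into ./mlpfanart-mirror.
# Filters: "+" patterns whitelist URLs; without the nocookie.net one,
# cross-host image downloads get skipped.
httrack "https://mlpfanart.fandom.com/" \
    -O ./mlpfanart-mirror \
    "+*.fandom.com/*" \
    "+static.wikia.nocookie.net/*" \
    --sockets=4

# httrack keeps a cache of the run, and -i/--continue is supposed to
# resume an interrupted mirror from it, which might help with a
# janky connection:
httrack --continue -O ./mlpfanart-mirror
```

No idea yet if the resume behaves as well as wget's -c over a long 15 GB run, but it's there.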
Eh, not the most successful run, but I will see if I can figure out wikia/fandom and hold on to what I do have in the meantime.