Subject: A handy tool/kludge.
Author:
Posted on: 2021-02-01 09:35:47 UTC

Because I'm pulling webpages from my own sites, it's really easy for me to get a list of what pages exist. Actually saving them, though, means right-clicking and saving each file by hand, which for something like FanficWorld (190 pages) is a bit extreme.

So I've kludged together a tool to do it for me. The Archive Dumper is an Excel file which reads a set of URLs and the filenames you want for them, and then - look away, techies, you're going to hate this - manually opens every page in an invisible Internet Explorer window and saves them all. ^_^
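For anyone who'd rather skip Excel and IE entirely, the same "list of URLs plus filenames" loop can be sketched in a few lines of Python. This is not the Archive Dumper's actual code, just a minimal illustration of the idea; the names dump_archive and fetch_page are made up for the example, and it assumes the pages can be fetched directly without a browser.

```python
import os
import urllib.request


def fetch_page(url):
    """Download one URL and return its raw bytes (no browser involved)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def dump_archive(pages, archive_dir, fetch=fetch_page):
    """Save each (url, filename) pair into archive_dir and return the saved paths.

    'fetch' is a hypothetical hook so the downloader can be swapped out.
    """
    os.makedirs(archive_dir, exist_ok=True)
    saved = []
    for url, filename in pages:
        target = os.path.join(archive_dir, filename)
        # Write raw bytes, so the file type (html, css, ...) doesn't matter here.
        with open(target, "wb") as out:
            out.write(fetch(url))
        saved.append(target)
    return saved
```

You'd call it with a list of (URL, filename) pairs and the folder to save into. Because it writes whatever bytes come back, it shouldn't care whether a given file is HTML or CSS.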

It runs incredibly slowly - I used the same code for a full archive dump of All PPC Stories Ever and it took hours - but it gets it done. The biggest flaw is that, for most of my pages, the IE 'Are you sure you want to leave this page?' prompt has to be clicked every single time (they stack, so you can leave it for 30 seconds and then just hammer them). Oh, it also seems to break when faced with anything that isn't an htm/html file - CSS files just open in TXT and crash the archiver. But it's fine.

This version is set up for FfW; it automatically creates the filename column from the URLs, but that code won't work for other sites. Oh, and make sure the path to your archive ends with a \, or you'll wind up prefixing the last folder's name onto every filename. In other news, all FanficLand pages now have very similar names... ^^
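To show why the trailing \ matters, here's a small Python sketch of the two ways to build the save path. It's an illustration, not the spreadsheet's code: the folder C:\Archive\FfW, the example URL and all three function names are invented, and filename_from_url is only a guess at what "make the filename from the URL" might look like (falling back to index.html is my assumption too).

```python
import ntpath  # Windows-style path rules, usable on any OS
from urllib.parse import urlsplit


def naive_target(archive_path, filename):
    """Plain string gluing: only right if archive_path ends with a backslash."""
    return archive_path + filename


def safe_target(archive_path, filename):
    """Let the path library add the separator when it's missing."""
    return ntpath.join(archive_path, filename)


def filename_from_url(url):
    """Take the last piece of the URL path as the filename (hypothetical rule)."""
    name = urlsplit(url).path.rsplit("/", 1)[-1]
    return name or "index.html"  # assumed fallback for URLs ending in '/'


print(naive_target("C:\\Archive\\FfW", "page.html"))    # C:\Archive\FfWpage.html
print(safe_target("C:\\Archive\\FfW", "page.html"))     # C:\Archive\FfW\page.html
print(safe_target("C:\\Archive\\FfW\\", "page.html"))   # C:\Archive\FfW\page.html
print(filename_from_url("http://example.invalid/ffw/story1.html"))  # story1.html
```

The first line of output is exactly the "FfWpage.html" effect: without the separator, the last folder's name gets glued onto the front of every filename.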

hS
