Retrieving Geocities data for local storage

Remember Geocities? Many years ago I had a website with Geocities, and that is how I first learnt to code HTML. I remember the days of trying to get the grey background just right. The white background used to be the default colour, but all the cool pages had grey backgrounds, which just shows how much has changed.

Anyway, Yahoo is now finally shutting down Geocities. There’s about a week left before they pull the plug, so if you haven’t downloaded your site yet then hurry up. Unfortunately they turned off FTP access to the server a long, long time ago, and the recommended (and painful) way is to visit each page, do a view source and save the file.

The much easier way is to download wget (if you don’t already have it) and then retrieve your files by running wget with a variety of switches.

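The two commands I’d recommend are

wget -p -r --include-directories=/Hollywood/1880 -k <your-site-URL>

or the same command without the -k option. In both cases, replace <your-site-URL> with the address of your own page and /Hollywood/1880 with your own directory.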

The -p option retrieves all the files required to display a page (images and so on), and -r retrieves the specified page and then follows the links in it recursively. The --include-directories option limits the download to files within the /Hollywood/1880 directory, so that you don’t start downloading all the files from other sites that you linked to (and the whole internet).

The -k option rewrites the links in the downloaded pages so that they point to the copies on your local drive rather than to pages on the web.

I would recommend downloading the site twice into different subdirectories, once with -k and once without, so you have a record of your original code as well as a working copy locally.
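
If it helps, here is a rough sketch of that two-copy approach as a small shell function. The URL below is only my assumption about what the site address looked like, and the function name mirror_site is made up for illustration, so swap in your own values. The -P option (short for --directory-prefix) is a standard wget switch that sets the folder the files are saved into.

```shell
# Assumed values: replace with your own site address and directory.
SITE_URL="http://www.geocities.com/Hollywood/1880/"
SITE_DIR="/Hollywood/1880"

# mirror_site DEST [EXTRA_WGET_OPTIONS...]
# Downloads the site into the folder DEST, passing any extra options to wget.
# Set WGET=echo first to preview the arguments without touching the network.
mirror_site() {
    dest="$1"
    shift
    "${WGET:-wget}" -p -r --include-directories="$SITE_DIR" -P "$dest" "$@" "$SITE_URL"
}
```

Running mirror_site original and then mirror_site browsable -k should leave you with an untouched copy of your code in a folder called original and a locally browsable copy in a folder called browsable (both folder names are just examples).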

And yes, the Hollywood/1880 site is mine, from about 1994-1997.

Also, don’t forget that DreamHost are still offering two years of free hosting to former Geocities customers, so you can move your site to them for free, but you have less than a week to get this done.
