The Wayback Machine is a superbly useful tool for finding out what websites looked like in the past. You can use it to dig up the tragically high number of websites that have succumbed to link rot, or to see what still-live websites used to look like.

The latter use is particularly important for journalists and us everyday citizens in this modern era when, for instance, the current US administration, one of the most censorious, anti-free-speech, scared-of-words governments over there in recent times, is forcing its agencies to quietly remove or edit sections of their websites, including removing, potentially forever, many important datasets on subjects like health and climate change.

It’s one of the few sites that, like Wikipedia, I think is a wonderful example of the internet used for unalloyed good, and like Wikipedia, worth donating to now and then if you have spare money.
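If you'd rather look up old snapshots from a script than through the website, the Wayback Machine has a small public "availability" endpoint at archive.org/wayback/available that returns the closest archived copy of a URL as JSON. Below is a minimal Python sketch of how that lookup might go. The function names are my own invention for illustration, and the response fields it reads (archived_snapshots, closest, available, url) are assumed from the API's documented shape, so check the current docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
API = "https://archive.org/wayback/available"


def availability_url(page, timestamp=None):
    """Build the query URL for a page (hypothetical helper).

    timestamp is an optional string such as "20060101" (YYYYMMDD);
    the API then returns the snapshot closest to that date.
    """
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(response):
    """Pull the snapshot URL out of a decoded JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page, timestamp=None):
    """Ask the live API for the closest snapshot (needs network access)."""
    with urlopen(availability_url(page, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling lookup("example.com", "20060101") should hand back a web.archive.org address for the snapshot nearest that date, or None if nothing has been archived.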

But how do the website snapshots get to the Machine in the first place? Its operators do run their own crawler, but that’s not the only way. They enlist the help of several third-party organisations to get at content they otherwise might not have found and catalogued in time.

One of these external organisations is the Archive Team.

The Archive Team focuses on grabbing content that’s hosted by services that are or were at risk of closure or some other kind of deletion. In the past this has included GeoCities, Yahoo Video, Friendster and others.

In their official words:

Archive Team is a loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage. Since 2009 this variant force of nature has caught wind of shutdowns, shutoffs, mergers, and plain old deletions - and done our best to save the history before it’s lost forever.

Per co-founder Jason Scott:

Archive Team was started out of anger and a feeling of powerlessness, this feeling that we were letting companies decide for us what was going to survive and what was going to die

And recently I discovered it was extremely easy to join in with the “rogue archivists” in this important project. You should consider it too.

The easiest way is to download their “warrior”. This is a program that runs inside a virtual machine, using virtualisation software you first install on your computer (e.g. VirtualBox). Once that’s up and running and you’ve chosen the project you want to work on, it automatically downloads the at-risk items that need archiving and then uploads them to Archive Team in the format they need to eventually end up on the Wayback Machine.

All the software is free. And don’t let the need for a virtual machine put you off. It was a very simple process for me, something you could complete in a handful of minutes, and it’s fully documented here.

The main caveat is that you need to have a “clean” internet connection. What that means is detailed in the “Can I use whatever internet access for the Warrior?” section on this page. It basically means no VPNs, DNS accelerators, ISP connections that inject adverts, proxies, content-filtering firewalls, being in a country that heavily censors the internet, and so on. Your computer needs to be able to access the webpage it’s archiving in its pure, unadulterated form.

But if you’re good with that, why not join the effort to preserve that wealth of content out there that’s at risk of forever vanishing?

A few of their current projects:

Meta Ad Library: Database for advertisements for Facebook and other products by Meta.

US Government: Archiving the US government. IRC Channel #UncleSamsArchiv

Radio Free Asia: Non-profit media organization owned by USAGM.

Radio Free Europe/Radio Liberty: Non-profit media organization owned by USAGM.

Voice of America: An internationally-broadcasting state media network at risk of closure.

Telegram: Archiving public messages in various newsworthy and/or otherwise notable Telegram channels

There are lots of other ways you can get involved if you have technical ability or computational resources. But this is something you can run very easily on your average everyday computer whilst you’re using it for whatever you turned it on to do.