Data hoarders race to preserve data from rapidly disappearing U.S. federal websites

U.S. President Donald Trump has issued an executive order that has resulted in many government agencies taking down webpages and sites to comply. In response, data hoarders across the internet are racing to preserve those pages before they’re taken offline. MuckRock reports that the End of Term Archive, a collaboration that includes the Internet Archive, Stanford University, the Common Crawl Foundation, the University of North Texas, and Webrecorder, has already saved more than 500 terabytes from .gov domains.

More than 8,000 government pages have reportedly been taken down, including the Department of Justice database detailing the criminal charges and convictions of January 6 rioters, LGBTQ+ rights and HIV-related information from the Centers for Disease Control and Prevention, and the Climate and Economic Justice Screening Tool released by the Council on Environmental Quality, among others.

Because of this, the r/DataHoarder subreddit is rallying its more than 832,000 members to help save the data in danger of being taken offline and deleted. u/didyousayboop shared on the subreddit that the Archive Team, a group of volunteer digital archivists led by Jason Scott (the Free Range Archivist and Software Curator at the Internet Archive), is asking for help with its US Government project. This effort is focused on archiving all government content, especially data that is at risk of being removed because of the current administration’s efforts.

We’ve also seen several threads on r/DataHoarder asking for help backing up specific pages and websites. These include NOAA, USAID, the National Center for Education Statistics, the National HIV Curriculum, CDC Immunization Publications, and more. One user even asked for help downloading the videos on USAID’s YouTube channels, fearing that they would be next after the USAID website went down.

Aside from requests to back up data and volunteers acting on them, we’re also seeing others offering to host the archived site data for free on their own domains.

This is one of the biggest archiving efforts we’ve seen, with a huge community of storage geeks doing its best to download and preserve historical online data. If you want to join them and help save the information hosted on government servers, you can check out the instructions u/didyousayboop left on r/DataHoarder.
