The Internet’s Most Powerful Archiving Tool Is in Peril – WIRED

By Kate Knibbs, Apr 13, 2026 7:00 AM

Business

As major news outlets cut off the Wayback Machine, journalists and advocacy groups are rallying to protect the Internet Archive’s vast collection of web pages.

A staff member wears a Universal Access to All Knowledge shirt during a 20th anniversary celebration of the Internet...

Photograph: Carlos Avila Gonzalez/Getty Images

This month, USA Today published an excellent report that revealed how US Immigration and Customs Enforcement delayed disclosing key information about the impacts of its detainment policies. The authors used the Internet Archive’s Wayback Machine to compile and analyze detention statistics from ICE and track how the agency had changed under the Trump administration. The story is one of countless examples of how the Wayback Machine, which crawls and preserves web pages, has helped preserve information for the public good. It was also, Wayback Machine director Mark Graham says, “a little ironic.”

USA Today Co., the publishing conglomerate formerly known as Gannett that runs both its namesake paper and over 200 additional media outlets, bars the Wayback Machine from archiving its work. “They’re able to pull together their story research because the Wayback Machine exists. At the same time, they’re blocking access,” Graham says.

A number of other major journalism organizations have also recently moved to restrict the Wayback Machine from archiving their stories, including The New York Times, Nieman Lab reported earlier this year. According to analysis by the artificial-intelligence-detection startup Originality AI, 23 major news sites are currently blocking ia_archiverbot, the web crawler commonly used by the Internet Archive for the Wayback project. The social platform Reddit is too.

Other outlets are limiting the project in different ways: The Guardian does not block the crawler, but it excludes its content from the Internet Archive API and filters out articles from the Wayback Machine interface, which makes it harder for regular people to access archived versions of its articles.

USA Today Co. spokesperson Lark-Marie Anton emphasized that “this effort is not about specifically blocking the Internet Archive” but is instead part of the company’s broader efforts to block all scraping bots. Robert Hahn, the Guardian’s director of business affairs and licensing, says that the newspaper has been in conversation with the Archive over “concerns over potential misuse by AI companies of content sets crawled for preservation purposes.”

Now, individual reporters are pushing back on this trend. This week, advocacy organizations including the Electronic Frontier Foundation and Fight for the Future rallied journalists around the Wayback Machine’s cause. The coalition collected more than 100 signatures from working journalists who recognize the tool’s value and presented a letter of support to the Internet Archive.

Signatories range from television mainstay Rachel Maddow to independent reporters like Spitfire News’ Kat Tenbarge and User Mag’s Taylor Lorenz. “In previous generations, journalists would turn to the physical archives of a local newspaper or of a local public library to access historical reporting and follow the threads of the present back into history,” the letter reads. “With many newspapers closed, and no clear path for local public libraries to preserve digital-only reporting, the work of safeguarding journalism’s record increasingly falls to the Internet Archive.”

THANKS to Library Link of the Day for the link and article.
http://www.tk421.net/librarylink/  (archive, rss, subscribe options)

Continue/Read Original Article: The Internet’s Most Powerful Archiving Tool Is in Peril | WIRED

