The internet is “disappearing”. But there are those who fight to preserve digital history

The numbers are clear: 38% of the web pages that existed in 2013 have already disappeared. The same has happened to 25% of those published in the last decade. The race to preserve this digital history is underway, but it is not easy.

It is often said that everything on the internet moves at an overwhelming speed. It is also often said that once something is on the internet, it is there forever. It turns out that this is not the case: pages that many people may have visited again and again are disappearing, leaving barely a digital footprint behind.

As a result, digital archiving platforms are in a race against time to save the vast amount of information that disappears from the internet each year. A recent study reveals that 25% of web pages published between 2013 and 2023 have already disappeared.

Source: Pew Research Center, “When Online Content Disappears”. Analysis of a random sample of URLs collected by the Common Crawl web repository (n=999,989), verified using page response codes and DNS. Web pages were defined as inaccessible if they returned a status code of 204, 400, 404, 410, 500, 501, 502, 503 or 523, or did not return a valid status code.
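For readers who want to see that definition in concrete terms, here is a minimal Python sketch of the rule described in the source note. The list of status codes comes from the note itself; the function names, and the choice to represent "no valid status code" as `None`, are assumptions made for illustration and are not taken from the Pew analysis.

```python
from typing import Iterable, Optional

# Status codes treated as "inaccessible" in the source note above.
INACCESSIBLE_STATUS_CODES = frozenset(
    {204, 400, 404, 410, 500, 501, 502, 503, 523}
)


def is_inaccessible(status_code: Optional[int]) -> bool:
    """Return True if a page counts as inaccessible under the rule above.

    Assumption of this sketch: None stands for "no valid status code"
    (for example, a DNS failure or a request that never got a response).
    """
    if status_code is None:
        return True
    return status_code in INACCESSIBLE_STATUS_CODES


def inaccessible_share(status_codes: Iterable[Optional[int]]) -> float:
    """Fraction of pages in a sample that count as inaccessible."""
    codes = list(status_codes)
    if not codes:
        raise ValueError("the sample is empty")
    return sum(is_inaccessible(code) for code in codes) / len(codes)


# Hypothetical sample: one live page, one "not found", one with no
# response at all, and one redirect.
sample = [200, 404, None, 301]
print(inaccessible_share(sample))  # 0.5
```

Note that under this rule a redirect (301) still counts as accessible, since only the listed codes and missing responses mark a page as gone.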

The Internet Archive, a non-profit organization, has played a vital role in this effort, aiming to preserve a wide range of digital content. Founded in 1996, it has archived more than 866 billion web pages, as well as books, videos and other cultural materials.

But the truth is that the digital history of humanity is at risk because of the fragility of the internet. Unlike physical artifacts such as ancient manuscripts or historical books, online content is far more ephemeral.

Companies close, websites are discontinued, and important information disappears in a constant cycle. The problem becomes more visible over time: 38% of the web pages that existed in 2013 are no longer accessible.

Problem extends to governments

Even government institutions are not immune to this phenomenon. The study shows that one in five government websites contains broken links, which compromises the integrity of information that should be preserved for future generations.

More than half of Wikipedia articles also face this problem, which means that the sources they cite may no longer be available for consultation.

However, the archival effort faces significant challenges. Organizations that fight to preserve this digital history are subject to financial problems, cyber attacks, and legal battles, especially with companies that do not want their intellectual creations to be stored for free.

The Internet Archive, for example, uses its Wayback Machine project to capture multiple versions of a page over time, allowing the public to access previous versions of websites, often free of charge.

In addition to the Internet Archive, other organizations, such as the US Library of Congress, have their own archiving projects, preserving government websites and news. There are also specific efforts in certain regions, such as the United Kingdom, where the UK Web Archive captures a snapshot of the British internet annually.

With the rapid advancement of technology, most of our social, cultural and intellectual life has migrated to the virtual space.

That is why keeping a digital record of our time is essential for future generations to understand the era in which we live, just as we study historical documents today.

Source: pplware.sapo.pt