SF Technotes

An Open-Source Library for the Millennium

By Michael Castelluccio
November 21, 2019

The announcement was made at GitHub’s Universe conference in San Francisco on November 13, 2019: “Piql supports GitHub with perpetual data storage.” Piql, pronounced pickle, as in “yes, we can pickle and preserve your data and source code,” is a Norwegian digital storage company that can store data archives in its Arctic World Archive (AWA), 100 meters deep inside the heart of a mountain, in a storage mode designed to last for more than 1,000 years.

 

GitHub, the other principal, is an American company that offers free cloud hosting for software projects. It describes itself as “the world’s largest community of developers to discover, share, and build better software. From open-source projects to team repositories, [they say] we’re your all-in-one platform for collaboration (www.github.com).”

 

Purchased by Microsoft in 2018 for $7.5 billion, GitHub is built on Git, the version-control system created by a legend in open-source software, Linus Torvalds, who also created the Linux operating system. Among the 40 million users of the GitHub platform are SAP, IBM, Microsoft, Google, Airbnb, PayPal, Spotify, Facebook, the Vatican, and NASA. The worldwide profile of those on the platform is impressive. Piql’s arctic archive will expand significantly with the addition of the GitHub legacy data and open-source programs and libraries.

 

Image: Creative Commons/Stefano De Sabbata/Oxford Internet Institute

 

AN INSURANCE POLICY FOR HUMANITY

 

The Piql vault is located on an island in Svalbard (Old Norse for “cold coast”), the Norwegian archipelago that’s home to the northernmost year-round settlements on earth, just 650 miles south of the North Pole.

 

 

Built within the depths of a closed coal mine, the depository is high enough up the mountain to stay above any rise in sea level caused by global warming. The vault is also insulated from nuclear attacks and EMPs (electromagnetic pulses). An international treaty signed after World War I granted Norway sovereignty over the Arctic archipelago, and Article 9 of that treaty outlaws “the establishment of any naval base…[or] any fortification in the said territories, which may never be used for warlike purposes.” The treaty also bans Norway from bringing Svalbard into war as part of self-defense.

 

Hackers, who might see the installation as an ultimate hall of the mountain king, full of the century’s digital riches, will be kept out because the entire trove is offline. Inside the Arctic Circle, and deep within the dry cold of the permafrost, the temperature holds steady between -5 and -10 degrees Celsius.

 

Image: Piql

 

Also on the island is the more famous Svalbard Global Seed Vault, which holds more than 1,108,526 samples of 6,007 species of seeds from almost every country in the world. The seeds are there for some of the same reasons GitHub will be: a permafrost-refrigerated, low-humidity environment that’s stable even without electricity. The seeds are there as insurance against natural and human catastrophes, and they’re intended for future use when needed. Some of the seeds have already been retrieved and planted, and then new stocks were returned to the depository.

 

The Global Seed Vault (seedvault.no) opened on February 26, 2008, and the AWA (arcticworldarchive.org) on March 27, 2017.

 

STORAGE MEDIUM

 

AWA faced two problems: choosing a storage medium, and choosing the machines that would write and read the information once saved. All traditional digital media degrade over time, and the hardware, no matter the design, faces constant obsolescence and replacement.

 

Piql decided on a special type of optical film for its storage medium. The founder and managing director of Piql AS, Rune Bjerkestrand, described the material to The Verge: “Film is an optical medium, so what we do is, we take files or any kind of data—documents, PDFs, JPGs, TIFFs—and we convert that into big high-density QR codes. Our QR codes are massive, and very high resolution; we use greyscale to get more data into every code. And in this way we convert a visual storage medium, film, into a digital one.” The specially treated proprietary film is perfectly suited to the extreme, long-lasting cold and is expected to last 500 years before it requires duplication to its next generation.
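To see why greyscale lets each code carry more data, here is a minimal Python sketch. It is an illustration only, not Piql's actual format: the function names and the choice of four grey levels are assumptions made for the example. A black-and-white cell can hold one bit, while a cell that can take one of four grey levels holds two, so the same payload needs half as many cells.

```python
def bytes_to_cells(data: bytes, bits_per_cell: int) -> list[int]:
    """Split bytes into cell values; each cell holds bits_per_cell bits.

    Hypothetical helper for illustration, not Piql's encoding.
    """
    if bits_per_cell not in (1, 2, 4, 8):
        raise ValueError("bits_per_cell must divide 8 evenly")
    mask = (1 << bits_per_cell) - 1
    cells = []
    for byte in data:
        # Walk from the most significant bits down to the least significant.
        for shift in range(8 - bits_per_cell, -1, -bits_per_cell):
            cells.append((byte >> shift) & mask)
    return cells


def cells_to_bytes(cells: list[int], bits_per_cell: int) -> bytes:
    """Reverse bytes_to_cells: pack cell values back into bytes."""
    per_byte = 8 // bits_per_cell
    out = bytearray()
    for i in range(0, len(cells), per_byte):
        value = 0
        for cell in cells[i:i + per_byte]:
            value = (value << bits_per_cell) | cell
        out.append(value)
    return bytes(out)


payload = b"open source"                 # 11 bytes = 88 bits
mono = bytes_to_cells(payload, 1)        # black/white: 1 bit per cell
grey = bytes_to_cells(payload, 2)        # 4 grey levels: 2 bits per cell (assumed)
print(len(mono), len(grey))              # 88 44
print(cells_to_bytes(grey, 2) == payload)  # True
```

A real system would also need error correction and calibration marks so a reader can tell neighboring grey levels apart after centuries of storage; the sketch leaves those out.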

 

 

Piql Writer alongside Piql Reader. Image: Piql

 

The readers for the QR codes need to be as permanent as the film. To ensure that readers will always be available, detailed instructions on how to build more of them are included on the archived film, in plain, readable text, not code.

 

Bjerkestrand commended GitHub for significantly shaping world memory with its first software deposit. He said, “GitHub is leading the way with a future-focused approach to its assets and an inspiring value on history.”

 

That first deposit includes “6,000 of its most significant repositories in AWA for perpetuity, capturing the evolution of technology and software.” Included is the source code for the Linux and Android operating systems; the programming languages Python, Ruby, and Rust; web platforms Node, V8, React, and Angular; cryptocurrencies Bitcoin and Ethereum; AI tools TensorFlow and FastAI; and more. GitHub has announced it will also archive a snapshot of all active public repositories, to be taken on February 2, 2020.

 

The November 2019 press announcement concludes with the bigger picture of GitHub’s place in the AWA frozen vault. “GitHub’s repository will sit alongside digitally preserved national archives from around the world, (digitally recreated) masterpiece artworks, contemporary sound and visual art, scientific breakthroughs, historical manuscripts, archeological finds, and many more.”

 

The Arctic World Archive might not be what we conventionally think of as a library or museum, but it’s definitely a primary-source archive of the history of modern computing. It stands as an insurance policy for humanity regarding that information, and as a historic monument to the importance of the open-source software movement.

 



Michael Castelluccio has been the Technology Editor for Strategic Finance for 24 years. His SF TECHNOTES blog is in its 21st year. You can contact Mike at mcastelluccio@imanet.org.

