Uncovering the Digital Gems of a Pioneering Institution
Preserving the digital heritage that shaped today’s technological landscape matters to anyone who works in IT. The French National Institute for Research in Digital Science and Technology (Inria) has been at the forefront of software innovation since its founding in 1967. As time passes, however, the significance of its historic software artifacts becomes harder to see, and the artifacts themselves become vulnerable to technological obsolescence.
To safeguard Inria’s software legacy, Software Heritage and the Inria Alumni Network have launched a collaborative, crowd-sourced initiative to identify, document, and preserve the software created over the course of Inria’s history.
Defining Legacy: A Nuanced Approach
Defining what constitutes “legacy” software is a complex endeavor, as the very concept of legacy can be subjective and multifaceted. Recognizing that the field of computer science is still relatively young, we have adopted an inclusive approach that allows respondents to freely submit any software they deem significant, regardless of its age.
The survey responses revealed a diverse range of criteria used to justify the legacy status of Inria’s software. Some highlighted the software’s adoption rate or proof of success, such as its “big industrial and economic impact” or its widespread usage as a “geometric library used worldwide.” Others emphasized the software’s historical significance, be it as a “Skype ancestor” or a “predecessor of Centaur,” one of Inria’s flagship programs in the 1990s.
Interestingly, only a small subset of respondents explicitly referenced the age or vintage of the software, with one entry describing a piece “written in Algol W, developed and run on an IBM card-programmed machine in a closet in Building 8 of Rocquencourt, connected via a 150 baud link to the CP-CMS of the IBM 360-90 at IMAG.” This diversity of perspectives highlights the challenge of managing relatively recent legacy software, where the passage of time has not yet provided the necessary historical context to clearly define what should be preserved.
To address this, we advocate for an inclusive approach that extends preservation efforts to any software deemed significant by the community, as setting an arbitrary cut-off date would be ill-advised in this dynamic and rapidly evolving field.
Tracing the Lineage: From Esope to Coq
A closer look at the survey responses reveals a rich lineage: software projects that reference and build upon one another, tracing decades of technological evolution.
One notable example is LP10070, an assembly language developed during the Esope project, which sought to build a time-sharing operating system for the CII 10070 computer between 1968 and 1972. Although its source code has unfortunately been lost, it represents a crucial milestone in Inria’s early software development.
Other pioneering projects followed: the text editor of the Esope project (1972); the STST protocol of the Cyclades project (1972), which explored innovative solutions for networking computers and helped lay the foundations for the internet; and UNIF (1974), a unification algorithm in lambda calculus using Church’s type theory.
The lineage continues with the MODULEF finite element library (1974), the interactive automated theorem prover Mini Logic for Computable Functions (1974), and the structure-oriented program editor Mentor (1975), which later served as a stepping stone for the development of the Centaur interactive programming environment (1985).
Centaur, in turn, was used to create CtCoq (1992), an interactive interface for the groundbreaking Coq theorem prover (1989), which received the ACM Software System Award in 2013. The lineage also extends to Le ML (1980), the first implementation of the ML language at Inria, which led to the development of the renowned CAML (1985) and the award-winning OCaml programming language (1990).
This intricate web of software interconnections not only highlights the profound impact of Inria’s work but also underscores the urgent need to preserve these historical artifacts, ensuring that the connections between them are maintained and their collective significance is safeguarded for future generations.
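One way to keep such connections explicit is to record them as a small directed graph alongside the archived code. The sketch below is a minimal illustration in Python, not a data model used by Inria or Software Heritage: the `DERIVED_FROM` mapping and the `ancestors` helper are hypothetical names, and the edges cover only the relationships stated above (with OCaml assumed to descend from Le ML through CAML).

```python
from collections import deque

# Each key maps a project to the projects it built on (or was built for),
# as described in the text above. Names and structure are illustrative
# assumptions, not an official catalogue.
DERIVED_FROM = {
    "Centaur": ["Mentor"],
    "CtCoq": ["Centaur", "Coq"],
    "CAML": ["Le ML"],
    "OCaml": ["CAML"],
}


def ancestors(name):
    """Return every project reachable from `name` by following
    derived-from links, in breadth-first order and without duplicates."""
    seen = []
    queue = deque(DERIVED_FROM.get(name, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.append(current)
        queue.extend(DERIVED_FROM.get(current, []))
    return seen


print(ancestors("CtCoq"))  # ['Centaur', 'Coq', 'Mentor']
```

Walking the graph from any entry lists the earlier projects whose loss would break its historical context, which can help volunteers decide what to archive together.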
Preserving the Digital Tapestry
The survey also exposes a sobering reality: the older the software, the more likely its source code has been partially or completely lost. Although a significant portion of the submitted software has fully preserved source code, the survey results indicate that 62% of the referenced source code is either lost or at risk of being lost unless it is actively archived.
The challenge of preserving legacy software is further compounded by the varying formats in which these digital artifacts are stored. From paper listings to floppy disks and outdated media, the diversity of storage methods presents a formidable obstacle to comprehensive preservation.
To address this challenge, Software Heritage, in collaboration with the University of Pisa, has developed the Software Heritage Acquisition Process (SWHAP), a dedicated framework designed to curate and archive legacy code. This process, however, is labor-intensive, underscoring the need for a collective effort to scale up the preservation of Inria’s software heritage.
Recognizing the urgency of this task, fifteen survey respondents have said they are willing to dedicate time and effort to preserving the identified legacy source code. By building on the existing SWHAP framework and supporting this community of volunteers, we can pave the way for a comprehensive archiving initiative that safeguards Inria’s digital legacy.
Lessons Learned and a Call to Action
The insights gained from this initial survey underscore several key lessons that can inform similar preservation efforts at other large institutions or communities:
- Institutional Backing is Crucial: The survey’s success in eliciting responses was greatly enhanced by the involvement of the Inria Alumni Network and the official message from Inria’s CEO, underscoring the importance of institutional support for such initiatives.
- Firsthand Knowledge is Invaluable: The responses confirmed that the involvement of individuals with direct experience in the development and use of these software artifacts is essential for providing context and ensuring comprehensive preservation.
- Early Software Faces Higher Risks: The survey findings reinforce the urgent need to proactively preserve and archive the digital artifacts from Inria’s earliest years, as the risk of loss increases with time.
- Crowdsourcing Builds a Community: The crowd-sourced approach has helped to build a dedicated community around preservation efforts, facilitating knowledge sharing, resource pooling, and raising awareness of software’s cultural significance.
As we move forward, we call upon the broader software preservation community to join forces and share their knowledge and best practices. By establishing a dedicated community of volunteers committed to preserving Inria’s legacy software, we can ensure that these digital gems are safeguarded for the benefit of future generations.
The lessons learned from this survey serve as a clarion call to action, inspiring us to redouble our efforts and expand the reach of this initiative. Together, we can create a lasting legacy that honors the pioneering work of Inria and serves as a model for software preservation initiatives worldwide.
Conclusion: Preserving the Digital Tapestry
Inria’s software heritage is a testament to the achievements of a pioneering institution: a tapestry of interconnected projects that helped shape the foundations of computer science and technology. Preserving these digital artifacts ensures that the lessons of the past continue to inform and inspire the innovations of the future.
Through the crowd-sourced approach adopted in this initiative, we have uncovered a wealth of insights and a dedicated community of volunteers, all driven by a shared passion for safeguarding Inria’s software legacy. By leveraging the SWHAP framework and empowering this collective effort, we can scale up the preservation of these invaluable digital gems, paving the way for a lasting legacy that honors the visionary work of Inria.
With the help of the broader software preservation community, the digital tapestry woven by Inria’s pioneering efforts can remain vibrant, accessible, and celebrated for generations to come.