Imagine being able to wander the secret Vatican Archive as easily as you surf the internet, or to study ancient texts as easily as you would a webpage. Imagine having access to the texts of the ancient past through a direct link built from today’s technologies. This is the aim of In Codice Ratio, a research project born of a collaboration between the computer engineering and paleography departments of Roma Tre University and the Vatican Archive.

Conceived and directed by 52-year-old professor Paolo Merialdo of the computer engineering department at Roma Tre University, the project aims to develop solutions for extracting data from the manuscripts held on the 85 kilometers of bookshelves within the Vatican Archive, which contains centuries of correspondence between the Vatican and its ambassadors, as well as between Popes and European kings. “My research focuses on the extraction of information from these historical texts. It’s a technical study that can extract knowledge from the immense wealth of information and upload it to an online network that may then be useful for various academic purposes. Initially, the paleographers were against my idea of working on the Vatican Archive. They later warmed up to the idea, however, thanks to Marco Maiorino, a paleographer-archivist of the Archive, and to the Prefect of the Archive himself, who was very interested in the idea.”

Professor Paolo Merialdo

Merialdo’s aim is not the digitization of the archive, but the transcription of its documents. “Digitizing a text means making a scan, a digital photo. Transcription produces text from which one can retrieve data by searching for keywords or names. With a scan alone, this is currently not possible.” Conventional text recognition works on printed texts but not on medieval manuscripts, where the handwriting is heterogeneous and full of symbols and abbreviations. “We had to build a system based on artificial intelligence that was able to recognize a large number of characters from the photographs. To achieve this, we must train the artificial intelligence system, that is, support machine learning, so that the machine recognizes distinct images. In this specific case, we need to give the machine many examples of how these images are displayed in the original texts so the machine understands how different words and individual characters appear in the ancient texts. This of course requires a lot of work, and a lot of time.” For this reason, the project enlisted over 1200 high schools to help train the system.
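To make the idea of “training by example” concrete, here is a minimal sketch in Python. It is not the system built by In Codice Ratio, whose internals the article does not describe. It assumes a toy setting in which each character is a hypothetical 3x3 black-and-white image, and it swaps in one of the simplest possible learning methods, a nearest-centroid classifier: the labeled examples of each character are averaged, and an unseen image is assigned to the character whose average it most resembles. All names and data below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical training data: tiny 3x3 binary "glyph images", flattened row
# by row, each labeled with the character it shows. Real manuscript images
# are far larger and far more varied.
TRAINING_EXAMPLES = [
    ("i", (0, 1, 0, 0, 1, 0, 0, 1, 0)),
    ("i", (0, 1, 0, 0, 1, 0, 0, 1, 1)),
    ("o", (1, 1, 1, 1, 0, 1, 1, 1, 1)),
    ("o", (1, 1, 1, 1, 0, 1, 1, 1, 0)),
]


def train(examples):
    """Average the labeled examples of each character into one centroid."""
    grouped = defaultdict(list)
    for label, pixels in examples:
        grouped[label].append(pixels)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def recognize(centroids, pixels):
    """Return the label whose centroid is closest to the given image."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, pixels))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = train(TRAINING_EXAMPLES)

# An image the system has never seen: an "i" with a slightly different foot.
unseen_glyph = (0, 1, 0, 0, 1, 0, 1, 1, 0)
print(recognize(centroids, unseen_glyph))
```

The more labeled examples the system receives, the better its averages reflect the variety of real handwriting, which is why collecting examples takes so much work and time.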

Merialdo is currently working on documents from the pontificate of Honorius III (1216-1227). “No one to date has transcribed any part of these documents, which together consist of over 2100 pages. The same type of writing was used for over 100 years in the subsequent pontificates.”

Thanks to In Codice Ratio and the work of machines, historians will finally have access to these invaluable documents, already transcribed and ready for their studies.