Panel: DATASETS, WORKFLOWS, SOFTWARE AND AIS TO STUDY RELIGIONS: WHAT IS NEW AND WHAT IS AHEAD?



398_2.1 - A WORKFLOW FOR DIGITIZING AND SHARING ETHIOPIC MANUSCRIPTS

AUTHORS:
Ferrandino G. (University of Naples "L'Orientale" ~ Naples ~ Italy), Ranieri T. (University of Naples "L'Orientale" ~ Naples ~ Italy), D'Andrea A. (University of Naples "L'Orientale" ~ Naples ~ Italy)
Text:
The introduction of generative AI (GenAI) and machine learning techniques into philological studies is progressively transforming approaches to the digitization of sources. What was formerly a practice aimed primarily at preserving often unique paper-based documents has become an opportunity for more rapid publication of ancient texts and for the creation of a virtual, interactive digital ecosystem. Systems for the recognition and interpretation of historical documents are achieving increasingly high levels of accuracy, even for lesser-documented languages, so that some of the most labor-intensive tasks traditionally undertaken by researchers, such as the reading and transcription of ancient texts, can be delegated to computational tools. The digital resources thus produced must then be encoded in accordance with international standards to ensure their publication, retrieval, preservation, and dissemination, which contributes to a redefinition of the concept of documentary archiving. In this context, metadata creation plays a central role: it makes it possible to define the entire workflow, both philological and technical, and contributes to the production of scholarly objects that are accessible even to users without advanced technical expertise. This contribution examines the processes of acquisition, transcription, and online publication of a collection of religious manuscripts written in Geʿez. The documents were transcribed using the Transkribus platform and subsequently encoded with the Oxygen software according to the TEI standard. To ensure interoperability, the metadata were mapped to the Dublin Core schema, a general-purpose descriptive model that nonetheless effectively supports data integration and reusability. Finally, to support scholarly study and analysis, a text annotation system was designed, whose notes can be converted into Dublin Core metadata and associated with the original historical document.
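
As an illustration of the TEI-to-Dublin Core mapping step described above, the following minimal Python sketch extracts a few fields from a TEI header and emits them as Dublin Core elements. It is not the project's actual mapping: the sample record, the choice of fields, the `TEI_TO_DC` table, the `tei_to_dublin_core` function, and the `metadata` wrapper element are all assumptions made for the example. Only the TEI and Dublin Core namespace URIs are standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for TEI P5 and Dublin Core (element set 1.1).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual "dc:" prefix

# Hypothetical, minimal TEI header; not taken from the project's files.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample manuscript</title></titleStmt>
      <publicationStmt><publisher>Sample institution</publisher></publicationStmt>
      <sourceDesc>
        <msDesc>
          <msIdentifier><idno>MS-0001</idno></msIdentifier>
        </msDesc>
      </sourceDesc>
    </fileDesc>
    <profileDesc>
      <langUsage><language ident="gez">Geʿez</language></langUsage>
    </profileDesc>
  </teiHeader>
</TEI>"""

# Assumed crosswalk: (Dublin Core element, TEI path, attribute to read or None).
TEI_TO_DC = [
    ("title", ".//tei:titleStmt/tei:title", None),
    ("publisher", ".//tei:publicationStmt/tei:publisher", None),
    ("identifier", ".//tei:msIdentifier/tei:idno", None),
    ("language", ".//tei:langUsage/tei:language", "ident"),
]


def tei_to_dublin_core(tei_xml: str) -> ET.Element:
    """Build a flat Dublin Core record from the TEI header fields listed above."""
    root = ET.fromstring(tei_xml)
    record = ET.Element("metadata")  # hypothetical wrapper element
    for dc_name, path, attr in TEI_TO_DC:
        for node in root.findall(path, TEI_NS):
            # Read either the named attribute or the element's text content.
            value = node.get(attr) if attr else (node.text or "").strip()
            if value:
                ET.SubElement(record, f"{{{DC_NS}}}{dc_name}").text = value
    return record


if __name__ == "__main__":
    print(ET.tostring(tei_to_dublin_core(SAMPLE_TEI), encoding="unicode"))
```

Keeping the crosswalk in a single table is one possible design: philologists can review or extend the mapping without touching the conversion logic, and the same table could in principle be reused to convert annotation notes into Dublin Core fields.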