Artificial Intelligence in the Digital Humanities
Abstract
Research in the Digital Humanities (DH) has so far been limited to digitization techniques and database organization. In recent years, however, many researchers have pointed out the need for an interdisciplinary approach that ranges from chemical and physical analyses to microbiology and Artificial Intelligence (AI). In this paper, we summarize the work done in this direction by our “MAGIC” project, which has been active in DH for three years, focusing on the application of AI to manuscripts and books from the period 1300-1600. This short paper addresses two problems: i) the digital restoration of deteriorated pages of books and manuscripts, and ii) the transcription of books and manuscripts via optical character recognition (OCR) and handwritten text recognition (HTR). In both cases, we applied AI techniques to improve the automated process. Only an interdisciplinary approach can give a full picture of these books and manuscripts; the full project and the other work in progress are presented on the website www.magic.unina.it.
Article Details
Copyright (c) 2025 Russo G, et al.

This work is licensed under a Creative Commons Attribution 4.0 International License.