Supervisors:
– Beatrice Caseau, UMR 8167 Orient et Méditerranée, Sorbonne University (Professor of History)
– Victoria Eyharabide, STIH Laboratory, Sorbonne University (Associate Professor in Computer Science)
Location:
– Byzantine Library, Collège de France – 52 rue du Cardinal-Lemoine, 75005 Paris
– Maison de la Recherche, Sorbonne University – 28 rue Serpente, 75006 Paris
Duration: 3 years (desired starting date: January 2026)
Funding: Doctoral contract from the Digital Humanities Initiative, Sorbonne University
Gross monthly salary: €2,300
Keywords: Deep Learning, Character Recognition, Instance Segmentation, Knowledge Representation, Byzantine Sigillography
This doctoral contract involves multidisciplinary research in artificial intelligence and history. The aim of this digital humanities project is to design a machine learning system, trained on photographs of Byzantine seals, that improves the reading of letters, the segmentation of words, and the reconstruction of the main formulaic expressions appearing on the seals. The ultimate goal is to facilitate the reading of seals whose letters are damaged or whose legends are incomplete, making it possible to recover seals that have so far been excluded from publication because their transcription was uncertain. If successful, this work will improve existing readings and give access to previously unreadable seals, making the information they contain available to the scholarly community.
The approach combines deep learning methods [2] with natural language processing (NLP) tools [1] to recover the complete text of Byzantine seals despite their abbreviated form and the frequent damage or loss of characters. In earlier research [4, 7], we obtained transcriptions of the reverse side of seals using a two-step neural approach: first localizing the characters with a deep object detector, then recognizing them with a convolutional network. We plan to extend this research by dividing the task into several stages [9].
We will first develop a Bayesian approach to predict the boundaries of words, whether written in full or abbreviated [5, 6]. Next, the resulting word hypotheses will be expanded using text normalization and machine-translation-based transformation approaches. For training, we will rely on Greek corpora and dictionaries [8].
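To give a concrete idea of the word-boundary task, here is a minimal sketch in Python. It is not the Bayesian model of [5, 6] and not the project's method: it substitutes a much simpler technique, maximum-probability segmentation under a unigram word model, solved by dynamic programming. The lexicon, its probabilities, the abbreviated forms, and the names `LEXICON`, `UNKNOWN_LOGP`, and `segment` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy lexicon: word -> prior probability (illustrative values only).
LEXICON = {
    "ΚΥΡΙΕ": 0.20,
    "ΒΟΗΘΕΙ": 0.20,
    "ΤΩ": 0.15,
    "ΣΩ": 0.15,
    "ΔΟΥΛΩ": 0.20,
    "ΚΕ": 0.05,  # assumed abbreviated form of ΚΥΡΙΕ
    "ΒΘ": 0.05,  # assumed abbreviated form of ΒΟΗΘΕΙ
}

# Assumed log-probability for a single letter that matches no lexicon entry.
UNKNOWN_LOGP = math.log(1e-6)


def segment(letters: str, max_len: int = 8) -> list[str]:
    """Split an unspaced letter sequence into the most probable word sequence.

    Words are scored independently (unigram model); the best split is found
    by dynamic programming over all possible end positions.
    """
    n = len(letters)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of letters[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word in that split
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = letters[start:end]
            if word in LEXICON:
                score = math.log(LEXICON[word])
            elif len(word) == 1:
                score = UNKNOWN_LOGP  # fall back to an isolated unknown letter
            else:
                continue
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start

    # Follow the back-pointers from the end of the string to recover the words.
    words = []
    pos = n
    while pos > 0:
        words.append(letters[back[pos]:pos])
        pos = back[pos]
    return words[::-1]


if __name__ == "__main__":
    print(segment("ΚΥΡΙΕΒΟΗΘΕΙΤΩΣΩΔΟΥΛΩ"))
```

A Bayesian treatment would differ mainly in that the lexicon and its probabilities are inferred from unsegmented data rather than fixed in advance, as they are in this sketch.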
Byzantine seals are crucial objects for understanding the Byzantine administration and aristocracy. The number of Byzantine seals unearthed in excavations is estimated at around 80,000, and this figure is steadily increasing, with approximately 1,000 to 1,500 new pieces discovered each year. The seals contain inscriptions that make it possible to reconstruct not only the careers of the seal owners (sigillants) but also the full range of administrative functions and provincial offices. They are by far the most important source for establishing the prosopography of the Byzantine aristocracy, since all dignitaries and the principal officials of the Byzantine Empire (which lasted from the 4th to the 15th century) possessed a seal for authenticating their correspondence.
The preserved seals are predominantly made of lead and measure between 20 and 30 mm. From the 6th century onwards, lead seals are systematically double-sided, most often bearing iconography as well as inscriptions in Greek (seals with other alphabets are excluded here). When the iconography features figures of saints, facial recognition methods could be applied. Finally, some seals display monograms, in which a name or title is represented through interwoven letters; it would therefore be highly useful to develop a program capable of suggesting possible readings of these monograms.
Military officials and members of the high clergy also possessed their own seals. Women rarely had seals, but empresses and members of the highest aristocracy occasionally had a personal bulla.
Figure 1. An example of a Byzantine seal (Tatianos hypatos, Cheynet 2019 [3], 5.57, p. 225)
Applicants should meet the following criteria:
Interested candidates are invited to send a complete application by e-mail to both supervisors:
Beatrice Caseau — beatrice.caseau@sorbonne-universite.fr
Victoria Eyharabide — maria-victoria.eyharabide@sorbonne-universite.fr
The application should include:
Important Dates
Application deadline: October 20, 2025
Desired PhD starting date: January 2026
Funding is already secured: the PhD candidate may begin immediately, if possible.
References
[1] Badr AlKhamissi, Millicent Li, Asli Celikyilmaz, Mona Diab, and Marjan Ghazvininejad. A review on language models as knowledge bases, 2022.
[2] Yannis Assael, Thea Sommerschield, Brendan Shillingford, Mahyar Bordbar, John Pavlopoulos, Marita Chatzipanagiotou, Ion Androutsopoulos, Jonathan Prag, and Nando de Freitas. Restoring and attributing ancient texts using deep neural networks. Nature, 603(7900):280–283, 2022.
[3] Jean-Claude Cheynet. Les sceaux byzantins de la collection Yavuz Tatis. Privately published, Izmir, 2019.
[4] Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, Attilio Fiandrotti, Beatrice Caseau, and Isabelle Bloch. Study of historical Byzantine seal images: the BHAI project for computer-based sigillography. In ICDAR 2023, 7th International Workshop on Historical Document Imaging and Processing (HIP '23), pages 49–54. Association for Computing Machinery (ACM), New York, NY, USA, 2023. https://doi.org/10.1145/3604951.3605523
[5] Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson. A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112(1):21–54, 2009.
[6] Shu Okabe, Laurent Besacier, and Francois Yvon. Weakly supervised word segmentation for computational language documentation. In Annual meeting of the Association for Computational Linguistics, 2022.
[7] Théophile Rageau, Laurence Likforman-Sulem, Attilio Fiandrotti, Victoria Eyharabide, Béatrice Caseau, and Jean-Claude Cheynet. Character recognition in Byzantine seals with deep neural networks. Journal of Digital Applications in Archaeology and Cultural Heritage (Elsevier), 37:e00403, 2025. https://doi.org/10.1016/j.daach.2025.e00403
[8] Francois Yvon. Rewriting the orthography of SMS messages. Natural Language Engineering, 16(2):133–159, 2010.
[9] Victoria Eyharabide. Artificial Intelligence Applied to Byzantine Sigillography: Current Research, Challenges, and Future Perspectives. Digital Medievalist (DM), special issue “Digital Approaches to Medieval Sigillography”, 17(1):1–16, 2024. https://doi.org/10.16995/dm.15119