1 Information about the internship

· Project: ANR BHAI https://anr.fr/Project-ANR-21-CE38-0001

· Project’s members:

– Victoria Eyharabide, STIH Laboratory, Sorbonne Université (Project coordinator)

– Laurence Likforman-Sulem, Departement IDS, Telecom Paris

– Isabelle Bloch, LIP6 Laboratory, Sorbonne Université

– Beatrice Caseau, UMR 8167 Orient et Méditerranée, Sorbonne Université

· Location: Maison de la recherche, Sorbonne Université - 28 rue Serpente, 75006 Paris.

· Duration: 12 months

· Keywords: Deep Nets, Character recognition, NLP, Instance segmentation, Fuzzy Logic, Knowledge representation and reasoning, Byzantine sigillography.

2 Context

This research will be developed within the framework of the ANR BHAI project. The general aim of the project is to combine computer vision, NLP, knowledge engineering, and mathematical modeling of spatial relationships to help with the interpretation of Byzantine seals. This research aims to (i) fully recover the text on seals, (ii) work on the recognition of objects to analyze iconographic scenes, (iii) estimate the inception date of Byzantine seals, and (iv) propose solutions based on hybrid AI techniques to interpret damaged areas based on existing insights.

Byzantine seals are small circular objects (10-50 mm) used to identify the sender of letters. They enclose essential knowledge about the Byzantine administration, aristocracy, and religion. Their importance derives from the scarcity of surviving Byzantine documents and the large number of extant seals. Since Byzantine seals are mostly made of lead, they suffer from corrosion and are often damaged. The historians’ interpretation work is challenging because some seals have been crushed or shattered, making their inscriptions difficult or impossible to read. However, considering their intrinsic properties, such as coherence between an icon and its associated text, as well as similarities between different seals (e.g., belonging to the same person over their career), historians can make hypotheses on the missing parts.

Seals have two sides: an obverse side, which most often includes iconography (people, crosses, objects, etc.), and a reverse side, which includes text such as the sender’s name, their social position, and elements of prayers. Two teams work in parallel in the project. The first team focuses on combining deep learning approaches [2] and natural language processing (NLP) [1] tools to fully recover the text on seals. The second team works on combining knowledge graph embeddings and mathematical modeling of spatial relationships for image understanding. The applicant can choose which team to join, focusing on either (a) the text on the reverse side or (b) the visual elements on the obverse side.

3 State of the art

3.1 Recovering hidden text in Byzantine seals

Seals have been altered over time, so characters may be damaged or erased. In addition, since the surface of a seal is small, engravers saved space by removing word spaces, omitting or fusing characters, and even omitting entire words. Consequently, the text is often abbreviated. A possible direction of this research consists of combining deep learning approaches [2] and natural language processing (NLP) [1] tools to fully recover the text on seals despite its abbreviated form and damaged characters. In previous research [7, 4], we obtained transcripts of seal reverse sides using a two-step neural approach: characters are first localized by a deep object detector, then recognized by a convolutional network. We plan to continue this research by splitting the task into several steps. We will first develop a Bayesian approach that predicts the boundaries of words, whether complete or abbreviated [6, 5]. Then, the word hypotheses may be expanded using text normalization and transformer-based machine translation approaches. To train the systems, we will rely on Greek corpora and dictionaries [8].
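As a rough illustration of the word-boundary step, the sketch below segments an unspaced inscription using a unigram lexicon and dynamic programming. It is a much simpler stand-in for the Bayesian models cited in [5, 6], not the project's method. The lexicon (the words of an invocation formula frequently found on seals, "Lord, help your servant"), the uniform probabilities, the unknown-word penalty, and all function names are assumptions made for this example.

```python
import math

# Toy lexicon: words of a frequent invocation formula, written without
# diacritics. The uniform probabilities are made up for illustration and
# are not estimated from any corpus.
FORMULA = ["ΚΥΡΙΕ", "ΒΟΗΘΕΙ", "ΤΩ", "ΣΩ", "ΔΟΥΛΩ"]
LEXICON = {word: 1.0 / len(FORMULA) for word in FORMULA}

# Assumed penalty (log-probability per character) for out-of-lexicon strings.
UNKNOWN_CHAR_LOGP = math.log(1e-6)


def word_logp(word):
    """Log-probability of a candidate word under the toy unigram model."""
    if word in LEXICON:
        return math.log(LEXICON[word])
    return UNKNOWN_CHAR_LOGP * len(word)


def segment(text, max_len=10):
    """Return the most probable split of `text` into words (Viterbi search)."""
    n = len(text)
    best = [float("-inf")] * (n + 1)  # best[i]: best score of text[:i]
    back = [0] * (n + 1)              # back[i]: start of the last word in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + word_logp(text[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


if __name__ == "__main__":
    unspaced = "".join(FORMULA)  # the inscription as engraved, without spaces
    print(segment(unspaced))
```

A realistic system would have to learn the lexicon and its probabilities from data and handle abbreviated or damaged words, which this sketch only penalizes as unknown strings.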

3.2 Knowledge graph embeddings and spatial reasoning for image understanding

Knowledge about space, in particular about spatial relations, is vital for image understanding [9]. Indeed, humans use spatial relations intensively to describe, detect, and recognize objects. Ambiguities between objects of similar shape or appearance can be resolved based on their spatial arrangement, since it is often more stable than other object characteristics. Knowledge graphs have played a significant role in preserving cultural heritage and modeling human expert knowledge. They provide rich semantic context about the images’ content that is useful for extracting class-informative embeddings. Knowledge graph embeddings [10] are mappings of different parts of the knowledge graph into a vector space that satisfies specific properties and maintains the information that exists in the graph. Neural networks can use the resulting vectors to improve their results by exploiting the information and structure inherent in the graph [11].

In this internship, we will start by using the images and textual descriptions of seals to train deep neural networks for detecting objects (such as staffs, swords, and scepters) and then identifying figures on Byzantine seals (Theotokos, St. Nicholas, St. Michael). The spatial organization of these objects and their arrangement in relation to each other could guide identification algorithms. By modeling the intrinsic spatial relations within seals, we should be able to contribute to their interpretation and estimate their inception date. We will create knowledge graphs from descriptions of already published seals and from the knowledge of experts in the field to represent the objects and relations present in seals. We will study not only the figures and scenes but also the relations between objects within a seal (such as the Virgin Mary holding the Child, or a patriarchal cross mounted on three steps). The relations between different seals (such as belonging to the same owner or containing the same figure or inscription) will also be analyzed.

The challenge will be the mathematical modeling of these relations (building on existing fuzzy models, adapting them to this specific field of application, and proposing new ones for relations not modeled before). The use of fuzzy logic [12, 13] to reason from expressions such as "in the center", "next to", or "between" has given excellent results in medical imaging. Unlike in medical images, where an organ may be deformed or slightly displaced by a tumor, the objects in Byzantine seals do not necessarily have a predefined place.
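To make the idea of fuzzy spatial expressions concrete, the sketch below assigns a degree in [0, 1] to "in the center" and "next to" from object centroids, and combines them with the minimum as a fuzzy conjunction. It is a deliberately simplified illustration, not the models of [12, 13]: the piecewise-linear membership functions, their thresholds, the use of centroids, and all names are assumptions chosen for the example.

```python
import math


def ramp_down(x, full, zero):
    """Membership equal to 1 for x <= full, 0 for x >= zero, linear in between."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def in_the_center(point, seal_center, seal_radius):
    """Degree to which `point` lies in the center of a circular seal.

    The distance to the seal center is normalized by the seal radius so the
    assumed thresholds (0.1 and 0.5) do not depend on the seal's size.
    """
    d = math.dist(point, seal_center) / seal_radius
    return ramp_down(d, full=0.1, zero=0.5)


def next_to(point_a, point_b, seal_radius):
    """Degree to which two object centroids are next to each other."""
    d = math.dist(point_a, point_b) / seal_radius
    return ramp_down(d, full=0.2, zero=0.8)


def fuzzy_and(*degrees):
    """Fuzzy conjunction using the minimum t-norm."""
    return min(degrees)


if __name__ == "__main__":
    # Hypothetical centroids in millimetres on a seal of radius 12.5 mm.
    seal_center, seal_radius = (0.0, 0.0), 12.5
    cross, figure = (1.0, 0.5), (4.0, 0.5)
    degree = fuzzy_and(
        in_the_center(cross, seal_center, seal_radius),
        next_to(cross, figure, seal_radius),
    )
    print(f"'cross in the center and next to the figure': {degree:.2f}")
```

Graded degrees of this kind, rather than yes/no answers, are what allow reasoning to tolerate objects that have no fixed position on the seal.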

4 Profile of applicant

The expected applicant profile is as follows:

· A PhD in Computer Science.

· Advanced skills in Python programming are mandatory.

· A strong background in Machine Learning & Deep Learning on images and/or text using related libraries (scikit-learn, TensorFlow, PyTorch, etc.).

· Fluency in written and spoken English is essential.

· Communication skills in French are a plus but not required.

· A good publication record will be a plus.

The position is open immediately. Review of applications will begin as soon as applications are received and continue until the position is filled.

5 Application

Applicants should send an email to the project’s members Victoria Eyharabide (maria-victoria.eyharabide@sorbonne-universite.fr), Laurence Likforman (laurence.likforman@telecom-paris.fr), and Isabelle Bloch (isabelle.bloch@sorbonne-universite.fr) with:

· A full curriculum vitae including a complete list of publications

· A transcript of higher education records

· A one-page research statement discussing how the candidate’s background fits the proposed topic

· Two letters of support from people who have worked with the candidate.

6 References

[1] Badr AlKhamissi, Millicent Li, Asli Celikyilmaz, Mona Diab, and Marjan Ghazvininejad. A review on language models as knowledge bases, 2022.

[2] Yannis Assael, Thea Sommerschield, Brendan Shillingford, Mahyar Bordbar, John Pavlopoulos, Marita Chatzipanagiotou, Ion Androutsopoulos, Jonathan Prag, and Nando de Freitas. Restoring and attributing ancient texts using deep neural networks. Nature, 603(7900):280–283, 2022.

[3] Jean-Claude Cheynet. Les sceaux byzantins de la collection Yavuz Tatis. Izmir. Privately published, Izmir, 2019.

[4] Victoria Eyharabide, Laurence Likforman-Sulem, Lucia Maria Orlandi, Alexandre Binoux, Theophile Rageau, Qijia Huang, Attilio Fiandrotti, Beatrice Caseau, and Isabelle Bloch. Study of historical Byzantine seal images: the BHAI project for computer-based sigillography. In ICDAR 2023 International Workshop on Historical Document Imaging and Processing (7th edition) (HIP’23), San Jose, California, United States, August 2023.

[5] Sharon Goldwater, Thomas L Griffiths, and Mark Johnson. A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112(1):21–54, 2009.

[6] Shu Okabe, Laurent Besacier, and Francois Yvon. Weakly supervised word segmentation for computational language documentation. In Annual meeting of the Association for Computational Linguistics, 2022.

[7] Theophile Rageau. Deep learning approach for character recognition in Byzantine seal images. Master's internship report (MVA), Telecom Paris, 2022.

[8] Francois Yvon. Rewriting the orthography of SMS messages. Natural Language Engineering, 16(2):133–159, 2010.

[9] Isabelle Bloch. Fuzzy spatial relationships for image processing and interpretation: a review. Image and Vision Computing, 23(2):89–110, 2005.

[10] Victoria Eyharabide, Imad Eddine Bekkouch, and Nicolae Dragos Constantin. Knowledge graph embedding-based domain adaptation for musical instrument recognition. Computers, 10(8):94, 2021.

[11] Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29(12):2724–2743, 2017.

[12] Celine Hudelot, Jamal Atif, and Isabelle Bloch. Fuzzy spatial relation ontology for image interpretation. Fuzzy Sets and Systems, 159(15):1929–1951, 2008.

[13] Isabelle Bloch. Spatial reasoning under imprecision using fuzzy set theory, formal logics and mathematical morphology. International Journal of Approximate Reasoning, 41(2):77–95, 2006.