Thesis topic as part of the Popcorn project (collaborative project with two companies) supervised by Benjamin Lecouteux, Gilles Sérasset and Didier Schwab (Laboratoire d'Informatique de Grenoble, Groupe d'Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole)

Title: Operational population of knowledge bases and neural networks

The project addresses the problem of semi-automated enrichment of a knowledge base through automatic text analysis. In order to achieve a breakthrough innovation in the field of Natural Language Processing (NLP) for security and defense customers, the project focuses on the processing of French (although the chosen approaches will subsequently be generalized to other languages). The thesis work will address different aspects:

● Automatic annotation of textual documents by detecting mentions of entities present in the knowledge base and their semantic disambiguation (polysemy, homonymy);

● The discovery of new entities (people, organizations, equipment, events, places), their attributes (age of a person, reference number of a piece of equipment, etc.), and relationships between entities (a person works for an organization, people are involved in an event, etc.). Particular attention will be paid to adapting flexibly to changes in the ontology, taking into account the role of the user and the analyst in validating and capitalizing on the extractions carried out.

The project focuses on the following three lines of research:

● Generation of textual synthetic data from reference texts;

● Recognition of entities of interest, associated attributes and relationships between entities;

● Semantic disambiguation of entities (for example in cases of homonymy).

Profile sought: - Solid experience in programming and machine learning for Natural Language Processing (NLP), including deep learning

- Master's degree in Machine Learning or Computer Science; a background in NLP or computational linguistics will be a plus

- Good knowledge of French

Practical details: - Start of the thesis on January 1, 2022

- Full-time doctoral contract at the LIG (Getalp team) for 3 years (minimum salary: €1768 gross per month)

Scientific environment: The thesis will be conducted within the Getalp team of the LIG laboratory (https://lig-getalp.imag.fr/).

The person recruited will join a team that offers a stimulating, multinational and pleasant working environment.

The resources needed to carry out the doctorate will be provided, both for travel in France and abroad and for equipment (personal computer, access to the GPU servers of the LIG and to the Jean Zay supercomputer of the CNRS).

How do I apply? Applicants must hold a Master's degree in Computer Science, Machine Learning or Natural Language Processing (obtained before the start of the doctoral contract).

They should have a good knowledge of machine learning methods and ideally experience in corpus collection and management.

They must also have a good knowledge of the French language.

Applications must contain a CV, a cover letter/message, master's transcripts and letter(s) of recommendation, and be addressed to Benjamin Lecouteux ([email protected]), Gilles Sérasset (gilles.serasset@univ-grenoble-alpes.fr) and Didier Schwab ([email protected]).

Practical details: - Applications before 31/10/2021

Hello, please find attached six detailed internship offers at CEA LIST (DIASI/SIALV/LVA) in the field of computer vision and machine learning for scene analysis:

LVA-22-S1: Frugal AI for object re-identification via unsupervised domain adaptation

LVA-22-S2: Incremental learning for scene analysis

LVA-22-S3: Contribution of spatio-temporal attention to action recognition in video sequences

LVA-22-S4: 3D point cloud perception with transformer models

LVA-22-S5: Adapting visual recognition methods to different viewpoints

LVA-22-S6: Self-supervised interactive segmentation

To apply, please send a CV and a cover letter to [email protected].

Thank you for circulating this widely.

Location: Orléans / Grenoble, France

Contacts: Emmanuel Schang ( [email protected] ),

Benjamin Lecouteux ( [email protected] )

We are looking for a candidate for a thesis in Language Sciences on the subject of automatic speech processing.

The thesis will be carried out within the Laboratoire Ligérien de Linguistique (LLL, UMR 7270), with the possibility of being hosted at LIG-GETALP (Grenoble).

Funding will be provided within the framework of the ANR CREAM project (machine-assisted documentation of Creole languages, https://sites.google.com/view/creamproject/home ).

Key terms: Creole languages, automatic speech processing, keyword spotting, bilingual alignment.

Goals: The CREAM project aims to offer linguists working on Creole languages innovative tools for collecting and processing oral data in low-resource languages.

In the context of diglossia that often characterizes Creole-speaking regions, corpus transcription is frequently seen as a hurdle by field linguists. One consequence is the lack of available corpora.

The objective of this project is to pave the way for innovative methods in linguistic documentation and resource creation on Creole languages.

Using state-of-the-art machine learning technologies, automatic linguistic documentation will be implemented for both the construction of linguistic resources and the processing of spoken corpora.

Emphasis will be placed on two tasks in particular: - Query-by-example: finding similar segments in a corpus in a Creole language,

- Automated bilingual alignment between speech segments in a Creole language and a related language (French, English or Portuguese, depending on the Creole).

Depending on progress, the research may extend to other NLP tasks:

- automatic speech recognition

- study of transfer learning between lexifier languages and Creole languages

- machine translation, etc.

Required profile: Candidates will hold a master's degree in linguistics or computer science and will show a clear interest in automatic speech processing and so-called "rare" languages.

The ability to code independently in Python is essential, as is a grounding in machine learning.

Application: candidates will send a cover letter and a detailed CV.

Additional documents may be requested if the candidate is selected for an interview.

Supervision: Emmanuel SCHANG (Doctor in Language Sciences, HDR)

Benjamin LECOUTEUX (Doctor in Computer Science)

Applications should be sent to Emmanuel Schang ( [email protected] ),

Benjamin Lecouteux ( [email protected] ).

Calendar: Deadline for submitting applications: November 1, 2021

Interview dates will be communicated to the candidates shortlisted on the basis of their application.

How well can deep learning algorithms generalize over unseen data:

A case study in multiword expression identification

Master internship proposal, 2021-2022

- Domain: natural language processing

- Location: Université Paris-Saclay, Gif-sur-Yvette, France (LISN https://www.lisn.upsaclay.fr/ )

- Research teams: ILES (https://www.limsi.fr/en/research/iles, Written and Sign Language Processing) of the LISN; TALEP (https://talep.lis-lab.fr/, Written and Spoken Language Processing) of the LIS

- Supervisors: - Agata Savary (LISN) http://www.info.univ-tours.fr/~savary/

- Carlos Ramisch (LIS) http://pageperso.lis-lab.fr/carlos.ramisch/

- Funding: Université Paris-Saclay

- Duration: 3-6 months

- Remuneration: around 606€/month

Motivation and context

The aim of this internship is to boost applications in Natural Language Processing (NLP), by focusing on one of their major challenges: multiword expressions (MWEs).

MWEs are groups of words which exhibit unpredicted properties (Baldwin & Kim, 2010).

Most prominently, their meaning does not straightforwardly derive from the meanings of their components.

For instance, faire ‘make/do’ and valoir ‘be worth sth’ are verbs, while their combination yields a noun: faire-valoir ‘a stooge, a person who is used by somebody to do things that are unpleasant or dishonest’.

Similarly, the meaning of casser sa pipe ‘to die’ (literally to break one’s pipe) cannot be straightforwardly deduced from the meanings of the individual components.

Due to these properties, MWEs are very challenging in applications like machine translation, information retrieval, opinion mining, etc.

A major task related to MWEs is to automatically identify their occurrences in running text (so as to provide more accurate representations to downstream applications).

The PARSEME (https://gitlab.com/parseme/corpora/-/wikis/home) network has been addressing this task via a series of shared tasks on automatic identification of verbal MWEs (https://gitlab.com/parseme/corpora/-/wikis/home#shared-tasks).

Edition 1.1 of the PARSEME shared task (in 2018) showed how critically hard it is to identify MWEs which have not been previously seen in the training corpus.

Edition 1.2 saw the advent of transformer-based language models (BERT), which brought substantial progress to MWE identification performance. Still, only modest progress was achieved in generalization over unseen data.

Objectives The aim of this internship is to better understand the potential of transformer-based models in generalising over unseen data in MWE identification. More precisely we wish to:

- analyze the results of edition 1.2 (https://gitlab.com/parseme/sharedtask-data/-/tree/master/1.2/system-results) of the PARSEME shared task, and in particular those related to unseen data

- propose an error analysis methodology for MWEs which are and are not correctly identified, and try to understand the reasons behind this state of affairs

- put forward recommendations for future enhancements of the state-of-the-art MWE identifiers

- (depending on the candidate's profile and the length of the internship) implement a prototype based on these recommendations

Candidate's profile - 2nd-year master's student in computational linguistics, computer science or a related field; excellent 1st-year master's or 3rd-year bachelor's students will also be considered

- Interests in linguistics and familiarity with language technology

- Good programming skills, preferably in Python

Important dates

- Application deadline: 20 November 2021 (or until filled)

- Notification: 30 November 2021

- Position starts: late January 2022 (at earliest)

- Position ends: around late July 2022 (or later)

How to apply

Send your CV and a transcript of your bachelor's and master's grades to Agata Savary [email protected] and Carlos Ramisch [email protected] .

IHU Strasbourg is looking for a team leader to set up and develop an artificial intelligence service platform in the medical field.

More information here.

https://www.ihu-strasbourg.eu/wp-content/uploads/IHU_recrutement_IA_DIAMS.pdf

Nicolas Padoy Professor of Computer Science, University of Strasbourg Director of Computer Science and AI Research, IHU Strasbourg

Head of Research Group CAMMA University of Strasbourg / ICube IHU Strasbourg, 1 place de l'Hôpital 67000 Strasbourg, France

Web: http://camma.u-strasbg.fr Phone: +33 (0) 3 904 13530

Looking for Research Scientists, Post-doctoral fellows and Interns for research topics in AI for Healthcare

We have open positions for 1 research scientist, several research engineers, 1 post-doc and multiple interns at the University of Strasbourg & IHU Strasbourg Hospital

- We are looking for a research scientist to conduct novel research in AI for healthcare in the areas of medical image analysis and computer aided surgery. The successful candidates will contribute to developing new areas of research at IHU Strasbourg and mentor a growing team of PhD students and engineers. Salary is competitive and permanent contracts will be offered to strong candidates with experience.

- We are looking for one post-doctoral fellow for an exciting new project aiming at developing novel AI methods for monitoring critical safety steps in endoscopic surgery. This project is sponsored by a national Chair in Artificial Intelligence.

- We are looking for experienced research engineers in the areas of medical image analysis, computer aided surgery and federated learning to support our research team.

- We also have multiple internship positions in computer vision and machine learning for healthcare.

The successful candidates will be hosted within the AI team at IHU Strasbourg/University of Strasbourg, an institute offering an international environment with state-of-the-art computing resources and unique clinical facilities for both patients and medical research. More information about the positions is available here.

Nicolas Padoy Professor of Computer Science, University of Strasbourg Director of Computer Science and AI Research, IHU Strasbourg Head of Research Group CAMMA

University of Strasbourg / ICube IHU Strasbourg, 1 place de l'Hôpital 67000 Strasbourg, France

Web: http://camma.u-strasbg.fr Phone: +33 (0) 3 904 13530
