qatent is an NLP startup working on generation of patent applications from handcrafted claims using Artificial Intelligence (e.g. Transformers, BERT).

We are an INRIA-backed company of 5 people.

You will be working in a highly inspiring and stimulating startup atmosphere at AGORANOV, in the 6th arrondissement of Paris.

qatent aims to add *Chinese* as a working language of its patent editing tools and is searching for an NLP engineer to work on *Chinese patent text generation*.

The candidate shall have excellent command of Transformer models as well as very good knowledge of Chinese and Chinese-specific NLP challenges.

We propose a one-year contract that can be extended. As we work on state-of-the-art language generation tools, scientific publications about the work are encouraged and the work can be a preparatory year before starting a PhD (pre-doc).

*Required*: fluent Chinese, computer science education

*Plus*: creativity, problem-solving skills

*Keywords*: NLP, NLG, patents, transformers, Chinese, Python, spaCy, BERT, ...

*Salary*: depending on experience

Please send your resume to [email protected]

Link: https://profilsdemplois.cnrs.fr/index_public_referens?destination=CE2022 (competition no. 103, 2nd position)

*Assignment:* Support for Research and the Dissemination of Knowledge, Villejuif

*Function group:* Group 3

*Assignment:* The software engineer will be based within the UAR ARDIS and will create and implement software and tools for the scientific projects of SEDYL and LLACAN, located on the CNRS campus in Villejuif. In particular, he/she will handle the processing of linguistic and multilingual corpora, both written and oral (audio/video), for under-described or rare languages, as well as the creation and improvement of digital tools for the scientific exploitation of these corpora.

*Activities:* Develop specific IT tools to process written and oral linguistic and multilingual corpora (e.g. concordancers, automatic dictionaries, data search tools, formal representations of languages) and thus equip the teams with truly innovative IT tools.

- Analyze user needs and translate them into technical specifications.
- Design and develop applications and tools for the SEDYL and LLACAN projects.
- Provide evolutionary and corrective maintenance of the software developed.
- Write technical and functional documentation.
- Assist users (training and project monitoring).
- Maintain a technology watch and follow the evolution of norms and standards in the field (corpus linguistics, annotation of linguistic resources, heterogeneous data mining).
- Develop formats that ensure resource interoperability.
- Manage research data and archives and promote their reuse.

*Skills:*

Knowledge:
- Mastery of analysis, modeling and development methods (UML, ...)
- Mastery of development and scripting languages (PHP, Python, Java, ...)
- Knowledge of the issues involved in representing linguistic data and in recording and manipulating corpora (XML, XSLT, ...)

Know-how:
- Master relational databases
- Master the formalisms for encoding, annotating and processing information in XML document pipelines (XML, XML Schema, XSLT, XPath)
- Master the methods and techniques of web programming (HTML/CSS, JavaScript, PHP, etc.)
- Knowledge of a linguistic processing tool (e.g. ELAN) appreciated
- Knowledge of good-practice standards (CMMI, ITIL) and of security, accessibility and interoperability frameworks (RGS, RGAA, RGI, GDPR)
- Knowledge of written technical English in the field, and ability to communicate orally (written level B2, oral level B1)

Interpersonal skills:
- Ability to work in a team, especially within collaborative projects
- Ability to interact with researchers and adapt to research practices
- Initiative in making proposals
- Curiosity, adaptability, good general knowledge of linguistics

*Context:* The position will be based within the UAR2259 ARDIS, in the IT department. The position is shared between SEDYL and LLACAN, two field linguistics research units located on the CNRS campus in Villejuif. The person recruited will report administratively and hierarchically to the ARDIS unit and will take part in the life of that unit.

He/she will be in charge, as a priority, of SEDYL and LLACAN projects. For each of these projects, which stem from the units' corpus-based research activities (description of languages, theorization, description of language contact), the recruited person will be integrated into the corresponding research teams.

He/she will develop specific IT tools to process these corpora (e.g. concordancers, automatic dictionaries, data mining tools, formal representations of languages) and thus offer the teams and the scientific community innovative IT tools.

*Analysis, design, formatting and distribution of speech and multimodal corpora, LIG and LIDILEM*

*Position to be filled*: engineer, fixed-term contract (CDD)

*Duration*: 1 year (possibility of extension)

*Start*: from September 1, 2022

*Application deadline*: June 30, 2022

*Location*: Grenoble Computer Science Laboratory (GETALP team)

*Field*: Automatic Language and Speech Processing

*Profile*: Master 2 in computer science or doctorate in computing/linguistics

*Context* The position to be filled is supported by "Artificial Intelligence & Language" at the MIAI Grenoble Alpes Institute. MIAI is a center of excellence in artificial intelligence which aims to conduct research at the highest level, to offer attractive courses for students and professionals of all levels, to support innovation in large companies, SMEs and startups, and finally to inform and interact with citizens on all aspects of AI.

The recruited person will be hosted by the GETALP team of the Grenoble Computer Science Laboratory (LIG), which offers a dynamic, international and stimulating setting for high-level multidisciplinary research. The GETALP team is housed in a modern building (IMAG) located on a landscaped campus of 175 hectares, which was ranked the eighth most beautiful campus in Europe by Times Higher Education magazine in 2018.

*Missions entrusted*
- Organize corpora containing multimodal data (audio, text, video).
- Process and transform the data into a usable format that facilitates processing and reproducibility.
- Develop scripts for data transformation, formatting and testing (Python, Bash, Java).
- Supervise data annotation campaigns (Elan, doccano, Brat).
- Disseminate these corpora on open platforms (ORTOLANG, Zenodo, ELRA) and facilitate their use.
- Participate in the writing of scientific and technical documents.
- Assist with the implementation and management of various software pipelines to support data analysis and text mining.
- Help other team members to perform experiments on the data.
- Document the data lifecycle and update the data management plan.

You will work closely with PhD students, trainees and researchers from the Grenoble area of the MIAI institute.

You will also benefit from the skills and research environment of two research units: the LIG (https://www.liglab.fr) and LIDILEM (https://lidilem.univ-grenoble-alpes.fr/).

*Skills*
- Master's degree in data science, digital humanities or computational social science;
- Fluency in technical and scientific English;
- Excellent interpersonal skills;
- Ability to work in a multidisciplinary team;
- Ability to adapt to the project context;
- Autonomy in personal organization and reporting;
- Good written and oral communication in French;
- Mastery of scripting languages (Python, Bash, Perl, PHP);
- Knowledge of annotation tools (Elan, Praat);
- Experience with corpus linguistics tools, corpus-based research, and quantitative and qualitative data analysis;
- Experience in natural language processing, speech processing or computational linguistics is considered a plus.

*Application Instructions* Applications are accepted until June 30, 2022.

Please send your CV, a cover letter or message, the transcripts of grades from your previous studies, and the names of one or more referees who could provide letters of recommendation to: [email protected]

Topic 1 - Knowledge-aware few shot learning

Topic 2 - Information retrieval models for structured and verbose queries

Starting date: Fall 2022

Research team: IRIS at IRIT: Lynda Tamine / José G Moreno / Taoufiq Dkaki

Synergie: Christophe Thovex

Profile:
- Master's level or engineering school in Computer Science, with skills in Information Extraction/Retrieval and Text Mining
- Good English skills (written and oral)
- Good skills in advanced programming (Python, PyTorch, sklearn, ...)
- Good knowledge of Machine Learning; deep learning is a plus

Funding: Total gross salary for 3 years: 105 401,50 €. Note that the final monthly salary depends on the candidate's personal situation in France.

Application instructions: All applications must include the following to be considered: detailed CV, cover letter, transcripts (with rankings), contacts for recommendation. Please use “PhD application - Synergie” as subject of the email and select a preferred topic (1 or 2) if any.

Applications should be sent by email to Lynda Tamine, José G Moreno, and Taoufiq Dkaki ([email protected], [email protected], [email protected]).

Applications will be processed as they arrive until the positions are filled.

Location: Institut de Recherche en Informatique de Toulouse (IRIT) University of Toulouse 3 Paul Sabatier (UT3) 118 Route de Narbonne, F-31062 TOULOUSE CEDEX 9, France

Comment: Nantes (France) is an alternative location if requested by the applicant, as Synergie also has offices in Nantes.

We have a thesis proposal on the topic "Distributed decision architecture for multi-robot systems and Interactions" at ONERA Toulouse within the autonomous robotics laboratory.

Details of the subject are available here: https://w3.onera.fr/formationparlarecherche/sites/w3.onera.fr.formationparlarecherche/files/tis-dtis-2022-27.pdf

This position must be filled very quickly to secure the funding. If you (or your students) are interested, do not hesitate to contact me as soon as possible to arrange an interview.

1 Membership Inference

One of the wonders of machine learning is that it turns any kind of data into mathematical equations. Once you train a machine learning model on training examples, whether images, audio, raw text, or tabular data, what you get is a set of numerical parameters.

In most cases, the model no longer needs the training dataset and uses the tuned parameters to map new and unseen examples to categories or value predictions.

You can then discard the training data and publish the model on GitHub or run it on your own servers without worrying about storing or distributing sensitive information contained in the training dataset.

Nevertheless, a type of privacy-leak oriented attack against ML systems, namely membership inference, makes it possible to detect whether a given data instance was used to train a machine learning model.

In many cases, the attackers can stage membership inference attacks without having access to the machine learning model’s parameters. They just query the model and observe its output (soft decision scores or hard predicted labels).
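As an illustration of such a black-box attack, here is a minimal sketch of the simplest variant, confidence thresholding. It is not one of the attacks from the cited literature: `ToyTargetModel`, its memorized sample ids, and the threshold of 0.9 are all hypothetical, and the toy model is deliberately built to be overconfident on its training samples so that the effect is visible.

```python
import math
import random

NUM_CLASSES = 3


def softmax(logits):
    """Turn raw scores into a probability vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyTargetModel:
    """Hypothetical stand-in for a deployed classifier (an assumption).

    It imitates an overfitted model: samples seen during training get a
    much larger logit on their true class than unseen samples.
    """

    def __init__(self, training_ids):
        self._training_ids = set(training_ids)

    def predict_scores(self, sample_id):
        """Black-box interface: only soft decision scores are returned."""
        rng = random.Random(sample_id)  # deterministic per sample
        if sample_id in self._training_ids:
            true_logit = 4.0 + 2.0 * rng.random()
        else:
            true_logit = 0.5 + 1.5 * rng.random()
        logits = [0.0] * NUM_CLASSES
        logits[sample_id % NUM_CLASSES] = true_logit
        return softmax(logits)


def infer_membership(model, sample_id, threshold=0.9):
    """Guess 'member' when the model's top score exceeds the threshold."""
    return max(model.predict_scores(sample_id)) >= threshold


# Samples 0..49 are training members, 50..99 are not.
target = ToyTargetModel(range(50))
guesses = {i: infer_membership(target, i) for i in range(100)}
accuracy = sum(guesses[i] == (i < 50) for i in range(100)) / 100
print(f"attack accuracy on the toy model: {accuracy:.2f}")
```

The attacker never reads the parameters; it only calls `predict_scores`. On a real model the confidence gap between members and non-members is far smaller, which is why the impact of overfitting has to be measured carefully.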

Membership inference can cause severe security and privacy concerns in cases where the target model has been trained with sensitive information. For example, identifying that a certain patient's clinical record was used to train an automatic diagnosis model reveals the patient's identity and relevant personal information.

Moreover, such privacy risk might lead commercial companies who wish to leverage machine learning-as-a-service to violate privacy regulations. [VBE18] argues that membership inference attacks on machine learning models greatly increase the exposure of machine learning service providers to privacy leaks. They may face further legal issues related to breaches of private information in their business practices due to the GDPR (General Data Protection Regulation).

In this thesis, our plan is to first implement and benchmark typical membership inference attacks proposed in the literature [LZ21, SDS+19, SSSS17, CCTCP21, CCN+22].

We need to carefully outline the impact of crucial parameters such as the hardness of the classification task (dimension of the inputs, number of classes), the size (depth, number of parameters), the training procedure (data augmentation), and the potential overfitting of the target model.

This also includes the working assumptions about the attacker's knowledge of the training data and computational power. Indeed, some attacks rely on unrealistic assumptions.

Designing more tractable attacks is key in order to clearly define when membership attacks are a real threat in practice.

In differential privacy [ACG+16, NST+21], a common defense is to randomize the procedure by adding noise either to the inputs (the training data set), to the training procedure of the model, or to the outputs (the trained model's parameters). This idea has several implementations in modern machine learning, such as randomness in label smoothing, data augmentation, or penalization.
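Two of these randomization points can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the noise level `noise_std`, and the smoothing factor `epsilon` are hypothetical, and the Gaussian noise here is not calibrated to any sensitivity bound, so the sketch by itself gives no formal differential privacy guarantee.

```python
import random


def perturb_parameters(params, noise_std, rng):
    """Output perturbation: add Gaussian noise to trained parameters.

    Assumption: noise_std is chosen by hand. A real differentially private
    mechanism would calibrate it to the sensitivity and the privacy budget.
    """
    return [w + rng.gauss(0.0, noise_std) for w in params]


def smooth_labels(label, num_classes, epsilon=0.1):
    """Label smoothing: move a fraction epsilon of the probability mass
    from the true class to a uniform distribution over all classes."""
    off_value = epsilon / num_classes
    return [
        1.0 - epsilon + off_value if k == label else off_value
        for k in range(num_classes)
    ]


rng = random.Random(0)                 # fixed seed for reproducibility
trained_params = [0.8, -1.2, 0.05]     # hypothetical trained weights
released_params = perturb_parameters(trained_params, noise_std=0.1, rng=rng)
soft_target = smooth_labels(label=1, num_classes=4, epsilon=0.1)
print(released_params)
print(soft_target)
```

Both functions trade accuracy for privacy: larger `noise_std` or `epsilon` blurs what the model reveals about individual training samples, at the cost of classification performance.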

The study focuses on evaluating the trade-offs between the loss of classification performance, the prevention of overfitting, and the gain in robustness against membership inference attacks, but also against adversarial attacks [SSM19].

Beyond inferring the membership of a given instance, we will also study the feasibility of attribute inference attacks, which aim to reconstruct the attributes of the training data and are an extension of membership inference.

The candidate for this thesis is expected to have completed courses on Machine Learning and/or to have experience implementing Machine Learning algorithms in Python for practical data mining problems.

In particular, expertise in PyTorch will be required for the project.

Theoretical developments are also expected, based on statistics and the theory of machine learning and approximation.

The thesis takes place within INRIA Rennes, campus universitaire de Beaulieu, Rennes, France.

Contact information: Teddy Furon [email protected] and Yufei Han [email protected]

A position of Research Engineer in Robotics and Computer Science on the ANR Learn2Grasp project is to be filled at the Institute of Intelligent Systems and Robotics (CNRS, Sorbonne University) in Paris.

Contract start date: September 2022

Contract duration: 18 months

Context : As part of the ANR-BMBF Learn2Grasp project on learning how to grasp objects by a robotic arm, ISIR's AMAC team is recruiting a research engineer to participate in the development and realization of experiments on real and simulated robotic systems.

Missions: The development and experimental implementation of real and simulated robotic systems for object grasping, including the implementation of state-of-the-art algorithms for perception, learning, decision and control, as well as the integration of components developed by team members and partners of the ANR-BMBF Learn2Grasp project.

Profile sought: Engineer or doctor in robotics with strong computer skills, or engineer or doctor in computer science and machine learning with experience in robotics.

Skills sought:
- Software engineering and Python and C++ development
- Machine learning (TensorFlow and/or PyTorch platforms)
- Robotic arm control and manipulation
- ROS platform
- Visual perception (2D, 3D)
- Simulation for robotics (in particular Bullet/pyBullet)
- Conducting experiments and demonstrations on a real robot
- Carrying out state-of-the-art bibliographic research
- Presenting written and oral results
- Contributing to the writing of scientific articles and technical reports
- Working in a team
- Good command of English

Link to website: https://www.isir.upmc.fr/nous-rejoindre/oppotunites/

Job description: https://www.isir.upmc.fr/wp-content/uploads/2022/06/Ingenieur-de-recherche-robotique-informatique.pdf

PhD position in Computer Science - Nantes Université

Surgical Process Modelling with Graphical Event Models and Ontologies

Supervisors:
* Philippe Leray, LS2N, Nantes Université
* Thomas Guyet, INRIA, Centre de Lyon
* Pierre Jannin, LTSI, INSERM, Université Rennes 1

More details : https://uncloud.univ-nantes.fr/index.php/s/yffCR7p4G49T94s

Keywords : Artificial Intelligence, Probabilistic Graphical Event Model, Ontology, Machine Learning, Surgical Process Modelling.

Context: The DUKe (Data User Knowledge) research group at LS2N, UMR CNRS 6004, is one of the laboratory's main teams in the "Data and Decision Science" field, with skills in data manipulation, data mining and interaction. Within this framework, the research group has, among other things, developed numerous algorithms for learning and manipulating probabilistic graphical models (Bayesian networks, dynamic Bayesian networks, relational Bayesian networks, graphical event models), gathered within the PILGRIM C++ software library. This PhD thesis is part of the SPARS project (Sequential Pattern Analysis in Robotic Surgery: Understanding Surgery), funded by Labex CominLabs, in collaboration with LTSI/INSERM/Université Rennes 1 and INRIA.

The objective of this project is to propose data analysis methods to better understand complex technical human activities, such as surgery. Surgery is a complex activity that depends on many factors, including patient and surgeon characteristics. Such complexity and variability explain why there is almost no detailed study of surgical practice yet. Until now, the surgical procedure performed in the operating room has been considered as a whole, as a black box, and has been technically described in few words. Analysis has usually consisted of comparing the impact of different surgical approaches, or of different pre-operative clinical parameters of patients, on post-operative outcomes. In the SPARS project, we will rely on a combination of data-driven and model-driven approaches to analyze and compare the kinematics of whole surgical procedures acquired during robot-assisted hysterectomies.

Funding: The PhD fellowship is funded for 3 years from September-October 2022.

Profile of the candidate: The candidate should have a master's degree in computer science or equivalent, as well as knowledge of machine learning, probabilistic graphical models and knowledge representation. Good skills in machine learning are mandatory. Some background in knowledge representation will be a plus.

The programming environment associated with this project also requires some knowledge of C++ programming language. The personal qualities expected are mainly autonomy and a taste for interdisciplinary work, rigour and abstraction, as well as writing skills (in French and English).

Application instructions: The application file should contain the following documents:
* a curriculum vitæ (CV);
* the official academic transcripts of all the candidate's higher education degrees (BSc, License, MSc, Master's degree, Engineer degree, etc.). If the candidate is currently finishing a Master's degree, s/he must send the transcript of the grades obtained so far, with the rank among her/his peers, and the list of classes taken during the last year;
* some recommendation letters (quality is more important than quantity);
* and a motivation letter written specifically for this position.

Send all of these documents by email to [email protected], [email protected] and [email protected]

Postdoctoral Fellowship: Psychophysical and Computational Studies of the Human Visual System

York University Centre for Vision Research, Toronto, Canada

A postdoctoral position is available in the Human and Computer Vision Laboratory of James Elder at York University, Toronto, Canada. The candidate should have a research background in visual psychophysics and/or computational modeling of biological vision systems.

Specific topics of current interest include:
- Contour processing
- Perceptual organization
- Feedback / recurrent processing in visual cortex
- Shape perception
- Single-view 3D perception
- Natural scene statistics
- Probabilistic and deep network models

The salary for this position is competitive and the starting date is flexible. The application deadline is August 1, 2022. International applications are encouraged.

The Elder Laboratory is part of the York Centre for Vision Research (CVR), the York Vision: Science to Applications (VISTA) Program and the Intelligent Systems for Sustainable Urban Mobility (ISSUM) project.

Please direct your application to Ms. Anna Kajor at [email protected], with subject line Application: Human Vision Postdoctoral Fellowship. Your application should include your c.v. and the names of 3 referees.

The Industrial Engineering Center (CGI) IMT Mines Albi offers, in collaboration with the Connected Health Lab of the School of Engineering in Computer Science and Information Systems for Health (ISIS) of Castres, a thesis in industrial and computer engineering co-funded by the Occitanie region.

The topic is "Knowledge management to assist in the classification of adverse drug events for reliable medication management".

The details of the offer are attached.

Applications must be sent to [email protected] by 15 August 2022. The application procedures are detailed in the attached offer.

Thesis subject: Physics-based deep learning for modeling complex dynamics. Climate applications

Candidate profile: Holder of a master's degree in computer science or applied mathematics, or an engineering school diploma. Training and experience in machine learning, and good technical skills in programming.

Context: Deep learning is beginning to be developed for scientific computing in fields traditionally dominated by physical models, such as earth sciences, climate sciences and biological sciences. It is especially promising for problems involving processes that are not completely understood or are too complex to be modeled analytically. Researchers from different communities have begun to explore (i) how to integrate physical knowledge and data for modeling complex phenomena, and (ii) how to push the boundaries of current machine learning methods and theory for these modeling problems, two stimulating directions. Here we consider deep learning approaches for modeling complex dynamic systems characterizing natural phenomena, a recent and growing research topic (Willard et al. 2020, Thuerey et al. 2021). Motivating problems and applications will come from climate science (de Bezenac et al. 2018, Ayed et al. 2020).

Scientific objectives: The overall objective of the thesis is the development of new models exploiting observation or simulation data for the modeling of complex spatio-temporal dynamics characterizing physical phenomena, such as those underlying observations in earth and climate sciences. The classic tools for modeling these dynamics in physics and applied mathematics are based on partial differential equations (PDEs). Despite their successes in different fields, current learning approaches are clearly insufficient for such problems. Using learning for physics raises new issues that require rethinking the ideas underlying learning.

Lines of research:

Hybrid Systems - Integrating Physics and Deep Learning

Often there is prior physical knowledge, described by PDEs, that characterizes the underlying phenomenon. A key question is then how to combine this knowledge with the information extracted from the data. Learning can complement numerical models and allow us to take into account information not present in the model or to integrate observation data. It can also be used as a rapid prototyping model. Initial attempts to address similar issues exist in recent works such as (de Bezenac et al. 2018, Harlim et al. 2020, Yin et al. 2021, Dona et al. 2022). This will be developed for the thesis project with the objective of analyzing and developing different frameworks of hybrid systems.
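The idea of complementing a numerical model with a learned component can be sketched on a toy scalar system. This is a hypothetical illustration, not the method of the works cited above: `physics_prior` plays the role of an incomplete physical model, `true_dynamics` is an invented ground truth used only to generate observations, and a closed-form linear regressor stands in for the neural network that would learn the residual.

```python
def physics_prior(x, k=0.5):
    """Known but incomplete physical model: linear decay dx/dt = -k * x."""
    return -k * x


def true_dynamics(x):
    """Hypothetical ground truth, used only to generate observations."""
    return -0.6 * x + 0.3


def fit_linear_residual(xs, residuals):
    """Least-squares fit of residual ~ intercept + slope * x.

    Stands in for the learned component (an assumption: a real hybrid
    system would typically train a neural network here).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(residuals) / n
    cov = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, residuals))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return mean_r - slope * mean_x, slope


# Observations of the derivative at sampled states.
xs = [0.1 * i for i in range(21)]
observed = [true_dynamics(x) for x in xs]

# The learned part only has to explain what the physics misses.
residuals = [obs - physics_prior(x) for x, obs in zip(xs, observed)]
intercept, slope = fit_linear_residual(xs, residuals)


def hybrid_model(x):
    """Physical prior plus learned correction."""
    return physics_prior(x) + intercept + slope * x


def simulate(model, x0, dt=0.01, steps=500):
    """Explicit Euler integration of dx/dt = model(x)."""
    x = x0
    for _ in range(steps):
        x += dt * model(x)
    return x


print("physics only:", simulate(physics_prior, 2.0))
print("hybrid      :", simulate(hybrid_model, 2.0))
print("ground truth:", simulate(true_dynamics, 2.0))
```

Fitting only the residual keeps the physical prior intact and leaves the data-driven part with a smaller, easier problem, which is one motivation for hybrid designs.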

Domain generalization for learning dynamics

Explicit physical models come with guarantees and can be used in any context (also called a domain or environment) where the model is valid. This is not the case with neural networks, which offer no guarantee of generalization to new physical environments. We propose to tackle this problem by taking inspiration from recent learning frameworks developed to address the new research topic of domain generalization, such as (Yin et al. 2021b, Wang et al. 2021).

Learning at multiple scales

Modeling dynamic physical processes often requires taking into account several spatio-temporal scales. For example, in the field of climate, global phenomena are influenced by dynamics operating on a smaller scale. Similar problems arise, for example, in fluid dynamics. Learning at different scales is an open question. Most current deployments of neural networks for learning dynamics use a fixed spatio-temporal discretization. Recent advances (Sitzman 2020, Lindel et al. 2021, Li 2021) relying on implicit representations make it possible to learn a space of functions instead of discrete flows, and open the possibility of generalizing to different spatio-temporal resolutions. This will be used as a starting point for multi-scale learning with neural networks.

Profile sought: Master's degree in computer science or applied mathematics, or engineering school. Training and experience in machine learning. Good technical programming skills.

Working environment: The thesis contract is for three years starting in October/November 2022. It does not include a teaching obligation, but teaching is possible if desired. The doctoral student will work at Sorbonne University (S.U.), Pierre et Marie Curie Campus, in the center of Paris. He/she will join the Machine Learning and Deep Learning for Information Access team of S.U. at the ISIR laboratory (Institute of Intelligent Systems and Robotics). On the climate side, the candidate will be co-supervised by M. Levy and S. Thiria from the LOCEAN laboratory, https://www.locean-ipsl.upmc.fr/.

Link to the offer on the ISIR website: https://www.isir.upmc.fr/nous-rejoindre/oppotunites/

We are looking for a talented post-doc researcher in Computer science/Knowledge Engineering/Logic/Artificial Intelligence/Unmanned Aerial Vehicle (UAV) to join our research group and contribute to a research project in collaboration with the CIAD laboratory (Connaissance et Intelligence Artificielle Distribuées - Université de Technologie de Belfort-Montbéliard - EA 7533 - France).

Title: "An environmenTal knowlEdge-based approaCh to real-time navigaTiOn of uNmanned aerIal vehiCles beyond GNSS"

Mission: see enclosed file "FDP-2_2022-2023-DDR-Post_doc-INFOR_english_version.pdf"

Duration : 1 year

Salary: around 2 150 €/month (net)

Application deadline: July 8th 2022.

Expected starting date: around Sept. 2022.

Research Lab: Institut de recherche de l'école navale (IRENav - EA 3634), Ecole navale, France.

We are looking for a highly motivated candidate for a two-year postdoc position on the security and privacy of Machine Learning architectures, especially in the context of computer vision and video surveillance.

The research will be conducted within a collaborative and highly stimulating environment. The candidate will be working with Ihsen Alouani at the IEMN-CNRS Lab Polytechnic University Hauts-de-France (https://www.uphf.fr/DOAE/) in Valenciennes, France and Ioan Marius Bilasco at University of Lille - CRIStAL lab (https://www.cristal.univ-lille.fr/), France.

The candidate will be recruited by UPHF and will be based mainly in Valenciennes, but in practice he/she will work across the two institutions.

Polytechnic University Hauts-de-France (UPHF) in Valenciennes, and more specifically the IEMN Lab (Institut d'Electronique, Micro-electronique et Nanotechnologie, https://www.iemn.fr/), are located on the Mont-Houy campus, in an international and friendly environment: https://www.youtube.com/watch?v=kVG_AcGBxvk&ab_channel=UPHFOfficiel

== REQUIREMENTS and Expected Qualifications:
- PhD in Computer Science, Statistics, or Applied Mathematics, preferably with a background in machine learning, deep learning, computer vision, or similar topics
- A background in cybersecurity is a plus
- Ability to work in a collaborative environment
- Fluency in English, both written and spoken

== APPLICATION: Please send an email with the subject "[MLSecCV] application" to [email protected], [email protected], enclosing: a) your CV; b) a list of publications; c) at least two reference letters.

The position is expected to start beginning of September 2022.

Contact : Patrick Gallinari, [email protected]

Location: Sorbonne Université, Pierre et Marie Curie Campus, 4 Place Jussieu, Paris, France

Candidate profile: Master in computer science or applied mathematics, Engineering school. Background and experience in machine learning. Good technical skills in programming.

How to apply: please send a CV, a motivation letter, your master's grades, and recommendation letters where possible to [email protected]

Start date: October/November 2022

Note: The research topic is open and depending on the candidate profile could be oriented more on the theory or on the application side.

Detailed description at: Physics Based Deep Learning for Modeling Complex Dynamics. Applications to Climate.

https://drive.google.com/file/d/1J6XkV1_3Y0DuqpXoxDjQbvz4rZ44VLd0/view?usp=sharing

NLP Engineer for patent text generation in Chinese, AGORANOV (Paris)

Engineer position in TAL software engineering (IE CNRS), Villejuif

Data Science and Corpus Engineer – Computer Science Laboratory from Grenoble

PhD positions at IRIT (Toulouse, France)

PhD position proposal Distributed decision architecture for multi-robot systems and interactions

Ph.D. thesis on Membership Inference Attack in Machine Learning

Anna Kajor_Postdoctoral Fellowship Position is Open

PhD position in Data Science and Risk Management

PhD position offer in deep learning is to be filled at ISIR - Institute of Intelligent Systems and Robotics of Sorbonne University/CNRS.

post-doc researcher in Computer science/Knowledge Engineering/Logic/Artificial Intelligence/Unmanned Aerial Vehicle (UAV)

Postdoctoral Researcher position in ML Security - applications to Computer Vision

PhD position in Engineering and Computer Science, Sorbonne Université, Paris, Physics Based Deep Learning for Modeling Complex Dynamics. Applications to Climate.