Unsupervised attention-guided atom-mapping
- Speaker: Philippe Schwaller, IBM Research
- Date & Time: Monday 31 August 2020, 17:00 - 17:30
- Venue: virtual ZOOM meeting ID: 263 591 6003, https://zoom.us/j/2635916003
Abstract
Language models called transformers have recently revolutionized natural language processing and show great potential when applied to text-based representations of chemical reactions. The models learn the patterns in chemical reactions by predicting masked parts of reaction SMILES. The pretrained models can then be specialized on tasks such as reaction classification [1] and yield prediction [2], where they reach unprecedented accuracies. Specific outputs of the transformer models can serve as fingerprints to map the chemical reaction space without knowing the reaction center or distinguishing between reactants and reagents, and they can also be used to recover the rearrangement between reactant and product atoms [3]. By opening the black box using detailed visual analysis, we discovered that the transformer models learned atom-mapping without supervision. Atom-mapping is necessary for making chemical reaction data more machine-accessible and is crucial for graph- and template-based reaction prediction and synthesis planning approaches. Here, we present an attention-guided reaction mapper that shows remarkable performance in terms of speed and accuracy, even for strongly imbalanced reactions as typically found in patents. This work is the first demonstration of knowledge extraction from a self-supervised language model with a direct practical application in the chemical reaction domain.
References:
[1] Mapping the Space of Chemical Reactions using Attention-Based Neural Networks. P Schwaller, D Probst, AC Vaucher, VH Nair, D Kreutter, T Laino, JL Reymond. http://dx.doi.org/10.26434/chemrxiv.9897365
[2] Prediction of Chemical Reaction Yields using Deep Learning. P Schwaller, AC Vaucher, T Laino, JL Reymond. http://dx.doi.org/10.26434/chemrxiv.12758474
[3] RXN Mapper: Unsupervised Attention-Guided Atom-Mapping. P Schwaller, B Hoover, JL Reymond, H Strobelt, T Laino. http://dx.doi.org/10.26434/chemrxiv.9897365
Series
This talk is part of the Machine learning in Physics, Chemistry and Materials discussion group (MLDG) series.