BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Can machines understand the scientific literature? - Dr. Peter Mu
 rray-Rust (Department of Chemistry\, University of Cambridge)
DTSTART:20201111T140000Z
DTEND:20201111T150000Z
UID:TALK151048@talks.cam.ac.uk
CONTACT:Samantha Noel
DESCRIPTION:Perhaps half of the 3 million annual articles\, preprints\, th
 eses\, and gray literature (up to 5000/day) are directly relevant to bio
 medic
 ine (including chemistry\, materials\, IT\, engineering\, etc.)\, and many
  of the rest (psychology\, politics\, law\, philosophy) are needed to tack
 le global challenges. We need to index this literature for scientific co
 mputation (indexing\, searching\, data abstraction\, and ultimately "Arti
 ficial Intelligence"). But the raw material (usually PDF) is very poorly
  suited to automatic ingestion\, and the major search engines are not we
 ll suited to science. We will present prototypes of Open tools (software
 \, dictionaries) to extract science in computable (semantic) form. Since
  science is a global ende
 avor\, the tools must be equitable and inclusive\, and we have included c
 ollaborators using several languages (EN\, HI\, TA\, UR\, ES\, IND).\n\n
 The cen
 tral ontology is based on multilingual Wikidata (ca. 100 million Ite
 ms)\, whi
 ch is increasingly subsuming the major biomedical and chemical ontologies 
 and some reference data. The scholarly literature is also formally indexed
  there (Scholia). Where possible\, all our entities and many of their rel
 ati
 onships are based on Wikidata Items (Q) and Properties (P). Our primary ap
 proach is supervised text-mining through faceted dictionaries created from
  Wikidata SPARQL queries. Current dictionaries include countries\, disease
 s\, drugs\, chemicals\, species\, and organizations\, and can be extended
  to m
 any other areas (e.g. through Wikipedia categories). Besides text\, many d
 ocuments contain tables and diagrams\, and it is also possible to extrac
 t data from these\, such as phylogenetic trees\, forest plots\, and grap
 hs.\n\nWe shall give examples of several tools that can be run from Jupy
 ter Notebooks and are designed to be generic and extensible.
LOCATION:Please get in touch at compbiomphil@maths.cam.ac.uk for joining i
 nformation.
END:VEVENT
END:VCALENDAR
