Translation between Molecules and Natural Language

Heng Ji (UIUC, Amazon Scholar)
Based on the wonderful work done by Hongwei Wang, Carl Edwards, Tuan Lai and Zixuan Zhang (UIUC)
Collaborations with Martin Burke (UIUC) and Kyunghyun Cho (NYU)
hengji@illinois.edu
University of Illinois Urbana-Champaign

Problem: Too Many Papers

More than 500K papers are published on PubMed every year, and more than 1.2 million new papers were published in 2016 alone, bringing the total number of papers to over 26 million (Van Noorden, 2014)
As of June 13, 2020, there are at least 140K papers about coronavirus
Quality: Given the rapid publication of preprints without peer review, many research results are redundant, complementary or even conflicting with each other
Human reading ability stays almost the same across years: US scientists estimated that they read, on average, only 264 papers per year (1 out of 5000 available papers, the same across years)

How Modern Chemists Design Their Experiments

Most current scientific experiments are still based on manual design and candidate ranking
There are 500K+ possible reactions
E.g., top 20 candidates manually selected for Suzuki coupling
No literature search engines support cross-media retrieval
BioNLP shared tasks mainly cover biomedical papers, but very few papers are about chemistry

How Modern Doctors Predict Cancer Today

The classification features are extremely coarse-grained, generic and fragile
Changing the number of biopsies from 1 to 2 will change the cancer risk level from 17% to 37%, regardless of the positive/negative results of the biopsies
Precision medicine is only affordable for a tiny population
Development cost is about $2.6 billion

Scientific Literature

Hierarchical Spherical Embedding
Ontology Enriched Text Embedding
Cross-media Structured Semantic Representation
Generative Adversarial Networks for Data Augmentation and Distant Supervision
Multimedia Search and Summarization
Graph neural networks
Joint entity/relation/event extraction and o