DeepPath: A Reinforcement Learning Method for Knowledge Graph
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning Wenhan Xiong, Thien Hoang, William Wang Department of Computer Science UC Santa Barbara 1 Knowledge Graphs English Sp ok en I personLanguages es ag Neal
McDonough nationality-1 United States countryOfOrigin castActor Band of Brothers Actor gu co un try an nL rso
Caesars Entertain serviceLocation-1 pe n -1 serviceLanguage profession Tom Hanks awardWorkWinner 2 Reasoning on Knowledge Graph n -1 Sp
ok en I personLanguages es ag Neal McDonough nationality-1 United States countryOfOrigin castActor Band of Brothers Actor
gu co un try an nL rso Caesars Entertain serviceLocation-1 pe Query Node: Band of Brothers English serviceLanguage Query Relation: tvProgramLanguage Query: tvProgramLanguage(Band of Brothers, ?) profession
Tom Hanks awardWorkWinner 3 Related Work Path-based methods Path-Ranking Algorithm (PRA), Lao et al. 2011 Subgraph Feature Extraction, Gardner et al, 2015 RNN + PRA, Neelakantan et al, 2015 Chains of Reasoning, Das et al, 2017 4 Related Work Embedding-based method
TransE, Bordes et al, 2013 Neural Tensor Network, Socher et al, 2013 TransR/CTransR, Lin et al, 2015 Complex Embeddings, Trouillon et al, 2016 5 Our Approach Learning the paths instead of using random walks Model the path finding as a MDP Train a RL agent to find paths Represent the KG with pretrained KG embeddings Use the learned paths as horn clauses 6 Reinforcement Learning Agent
: Environment Agent Multi-layer neural nets (st ) Environment KG modeled as a MDP 1. Richard Sutton, Reinforcement Learning An Introduction 7 Components of MDP Markov decision process continuous states represented with KG embeddings : action space transition probability reward received for each taken action
With pretrained KG embeddings , all relations in the KG 8 Framework English oke nIn -1 ryS p cou nt United States Neal McDonough Actor
Reward ReLU profession Reason Step nationality-1 castActor Tom Hanks countryOfOrigin Band of Brothers State ReLU es
ag gu serviceLocation-1 personLanguages Next State Lan on Caesars Entertain rs pe serviceLanguage awardWorkWinner Softmax
(a| s) 9 Reward Functions Global Accuracy Path Efficiency Path Diversity 10 Training with Policy Gradient Monte-Carlo Policy Gradient (REINFORCE, William, 1992) 11 Challenge Typical RL problems Atari games (Mnih et al., 2015): 4~18 valid actions
AlphaGo (Silver et al. 2016): ~250 valid actions Knowledge graph reasoning: >= 400 actions Issue: large action (search) space -> poor convergence properties 12 Supervised (Imitation) Policy Learning Use randomized BFS to retrieve a few paths Do imitation learning using the retrieved paths All the paths are assigned with +1 reward 13 Datasets and Preprocessing Dataset # of Entities # of Relations
# of Triples # of Tasks FB15k-2371 14,505 237 310,116 20 NELL-995 75,492 200 154,213 12
FB15k-237: Constructed from FB15k (Bordes et al., 2013), redundant relations removed NELL-995: Constructed from the 995th iteration of NELL system (Carlson et al., 2010b) Dataset processing Remove useless relations: haswikipediaurl, generalizations, etc Add inverse relation links to the knowledge graph Remove the triples with task relations 1 . Toutanova et al. Representing text for joint embedding of text and knowledge bases 14 Effect of Supervised Policy Learning x-axis: number of training episodes y-axis: success rate calculated on a held-out test set -> Re-train the agent using reward functions 15
Inference Using Learned Paths Path as logical formula (horn-clauses) FilmCountry: actorFilm-1 -> personNationality PersonNationality: placeOfBirth -> locationContains-1 etc Bi-directional path-constrained search Check whether the formulas hold for entity pairs Uni-directional search bi-directional search 16 Link Prediction Result Tasks
Conclusion and Future Work Conclusions Propose a RL framework for KG reasoning Controllable paths finder (walker) in KG Combine path-based and embedding-based methods Future Directions Adversarial learning to give rewards Joint reasoning with KG triples and text 20 Thanks! Code: https://github.com/xwhan/DeepPath Dataset: http://cs.ucsb.edu/~xwhan/datasets/NELL-995.zip 21 Bi-directional search Relation link belongs to the reasoning path Think about supernode which has neighbors linked by relation But only the path with can reach the tail entity
If we go from tail to head, we will encounter first, by relation link , is the only neighbor 22 Feature Engineering in PRA vs DRL Reward learning is different from feature engineering DRL enable learning on the symbolic space We can try to learn better reward functions in the future, which can further boost the performance 23
Locum jobs Review your needs & be clear with surgeries Honesty & commitment Be organised Contract or not Now Folder & Stationery CV GMC Covering letter P60 Invoices Certificates Performers list - CCT, Indemnity, Copy for folder Speak to current...
Lab 6A Warm-Up What are plasmids? What gene(s) are on the plasmid being used in this lab? How can we know if cells have been successfully transformed in this lab? Bacteria Chapter 27.2 What you need to know: Mechanisms that...
Outline Security Concerns Kerberos X.509 Authentication Service Recommended reading and Web Sites Security Concerns key concerns are confidentiality and timeliness to provide confidentiality must encrypt identification and session key info which requires the use of previously shared private or public...
Definition-the author explains the meaning of the word right in the sentence or selection Synonym-the author uses a similar word Antonym-the author uses a opposite word Example-the author provides an example General-the author provides several words or statements that give...
RTP/RTCP, RTSP, and RSVP Multimedia protocols for the Internet Jim Chou and Thinh Nguyen Outline of the presentation 1- the context 2- the RTP/RTCP protocols 3- the RTSP protocol 4- the RSVP protocol PART 1: The context A general view...
International Business by Daniels and Radebaugh Chapter 3 Political and Legal Environments Facing Business Objectives To discuss the different functions that political systems perform To compare democratic and totalitarian political regimes and discuss how they can influence managerial decisions To...
Ready to download the document? Go ahead and hit continue!