Building a Semantic Parser Overnight

Overnight framework

Which country has the highest CO2 emissions? Which had the highest increase since last year? What fraction is from the five countries with highest GDP?

Training data
The data problem: the main database is 600 samples (GEO880).
To compare: labeled photos number in the millions.
Not only quantity: the data can lack critical functionality.

The process
Domain -> Seed lexicon -> Logical forms and canonical utterances -> Paraphrases -> Semantic parser

The database
Triples $(e_1, p, e_2)$, where $e_1$ and $e_2$ are entities (e.g., article1, 2015) and $p$ is a property (e.g., publicationDate).

Seed lexicon
For every property, a lexical entry of the form $\langle t \to s[p] \rangle$, where $t$ is a natural language phrase and $s$ is a syntactic category, e.g. $\langle \text{publication date} \to \text{RELNP}[\text{publicationDate}] \rangle$.

Seed lexicon
In addition, L contains two typical entities for each semantic type in the database.

Unaries:
  TYPENP
  ENTITYNP
  VP: verb phrases (e.g., has a private bath)
Binaries:
  RELNP: functional properties (e.g., publication date)
  VP/NP: transitive verbs (e.g., cites, is the president of)
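As a rough illustration of the two inputs described above, the database triples and the seed lexicon can be held in plain Python records. This is only a sketch: the class names `Triple` and `LexEntry`, the helper `properties_missing_from_lexicon`, and the second triple (article1 cites article2) are assumptions made for the example, not part of the original framework.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One database fact (e1, p, e2)."""
    e1: str
    p: str
    e2: str


class LexEntry(NamedTuple):
    """One seed-lexicon entry <t -> s[p]>."""
    phrase: str    # t: natural language phrase
    category: str  # s: syntactic category
    prop: str      # p: database property


# Toy database; the second triple is made up for illustration.
DATABASE = [
    Triple("article1", "publicationDate", "2015"),
    Triple("article1", "cites", "article2"),
]

SEED_LEXICON = [
    LexEntry("publication date", "RELNP", "publicationDate"),
    LexEntry("cites", "VP/NP", "cites"),
]


def properties_missing_from_lexicon(database, lexicon):
    """Return database properties that have no lexical entry yet."""
    covered = {entry.prop for entry in lexicon}
    return sorted({triple.p for triple in database} - covered)
```

A check like this is one way to confirm that every property in the database has a lexical entry before any utterances are generated.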

Grammar
Rules of the form $\langle \alpha_1 \dots \alpha_n \to s[z] \rangle$, where $\alpha_1 \dots \alpha_n$ are tokens or categories, $s$ is a syntactic category, and $z$ is the logical form constructed.

Grammar
z: R(publicationDate).article1
c: publication date of article 1

Crowdsourcing
x: when was article 1 published?
$D = \{(x, c, z)\}$ for each $(z, c) \in \mathrm{GEN}(G \cup L)$ and $x \in P(c)$
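The path from a grammar rule to a training example can be sketched in a few lines of Python. The single rule below (a relational noun phrase, the word "of", and an entity) is a simplified, hypothetical stand-in for the real grammar, and the names `apply_relnp_of_entity` and `build_dataset` are invented for this example. It pairs a logical form z with a canonical utterance c, then attaches each paraphrase x from P(c).

```python
def apply_relnp_of_entity(relnp, entity):
    """Hypothetical rule <RELNP 'of' ENTITYNP -> NP[R(p).e]>.

    relnp = (phrase, property); entity = (phrase, entity id).
    Returns the pair (z, c): logical form and canonical utterance.
    """
    rel_phrase, prop = relnp
    ent_phrase, ent = entity
    z = f"R({prop}).{ent}"
    c = f"{rel_phrase} of {ent_phrase}"
    return z, c


def build_dataset(generated, paraphrases):
    """D = {(x, c, z)} for each generated (z, c) and each x in P(c)."""
    return [(x, c, z) for z, c in generated for x in paraphrases.get(c, [])]


generated = [
    apply_relnp_of_entity(("publication date", "publicationDate"),
                          ("article 1", "article1"))
]
# P(c): crowdsourced paraphrases, keyed by canonical utterance.
paraphrases = {
    "publication date of article 1": ["when was article 1 published?"],
}
dataset = build_dataset(generated, paraphrases)
```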

Training
Log-linear distribution $p_\theta(z, c \mid x, w)$

Under the hood

Lambda DCS
Entity: singleton set $\{e\}$
Property: set of pairs $(e_1, e_2)$

Lambda DCS
Join of a binary $b$ and a unary $u$: $\llbracket b.u \rrbracket_w = \{e_1 : \exists e_2 \in \llbracket u \rrbracket_w,\ (e_1, e_2) \in \llbracket b \rrbracket_w\}$

Lambda DCS
$u$

Lambda DCS
Reversal $R(b)$: $(e_1, e_2) \in \llbracket b \rrbracket_w \Rightarrow (e_2, e_1) \in \llbracket R(b) \rrbracket_w$

Lambda DCS
count(u)
sum(u)
average(u, b)
argmax(u, b)

Lambda DCS
$\lambda x.u$ is a set of $(e_1, e_2)$: $e_1 \in \llbracket u[x/e_2] \rrbracket_w$
$R(\lambda x.\mathrm{count}(R(\mathrm{cites}).x))$ denotes $(e_1, e_2)$, where $e_2$ is the number of entities that $e_1$ cites.
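The set-based reading of these operators can be tried out directly with Python sets. The sketch below is an assumption-laden toy, not the authors' executor: the world (`CITES`, `PUBLICATION_DATE`, and the entities article2 and article3) is invented, and the lambda abstraction is evaluated by brute-force enumeration over a finite entity list.

```python
def join(b, u):
    """b.u = {e1 : there is an e2 in u with (e1, e2) in b}."""
    return {e1 for (e1, e2) in b if e2 in u}


def reverse(b):
    """R(b): swap the arguments of every pair in b."""
    return {(e2, e1) for (e1, e2) in b}


def count(u):
    """count(u): number of entities in the unary u."""
    return len(u)


# Toy world; article2 and article3 are made up for illustration.
CITES = {("article1", "article2"), ("article1", "article3"),
         ("article2", "article3")}
PUBLICATION_DATE = {("article1", 2015)}
ENTITIES = ["article1", "article2", "article3"]

# R(publicationDate).article1: the publication date of article1.
date_of_article1 = join(reverse(PUBLICATION_DATE), {"article1"})

# cites.article3: everything that cites article3.
citing_article3 = join(CITES, {"article3"})


def citation_counts(cites, entities):
    """R(lambda x. count(R(cites).x)) as pairs (e1, number e1 cites)."""
    lam = {(count(join(reverse(cites), {x})), x) for x in entities}
    return reverse(lam)
```

Enumerating `ENTITIES` inside `citation_counts` is the design shortcut here: it only works because the toy world is finite and small.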

Seed lexicon for the SOCIAL domain
article
publication date
cites
won an award

Grammar
Assumption 1 (Canonical compositionality): Using a small grammar, all logical forms expressible in natural language can be realized compositionally based on the logical form.

Grammar
Functionality-driven: generate superlatives, comparatives, negation, and coordination.

Grammar
From seed: types, entities, and properties
noun phrases (NP)
verb phrases (VP)
complementizer phrases (CP), e.g.:
  that cites Building a Semantic Parser Overnight
  that cites more than three article

Paraphrasing
meeting whose attendee is alice -> meeting with alice
author of article 1 -> who wrote article 1
player whose number of points is 15 -> player who scored 15 points

Paraphrasing
article that has the largest publication date -> newest article
housing unit whose housing type is apartment -> apartment
university of student alice whose field of study is music -> At which university did Alice study music?, Which university did Alice attend?

Sublexical compositionality
parent of alice whose gender is female -> mother of alice
person that is author of paper whose author is X -> co-author of X
person whose birthdate is birthdate of X -> person born on the same day as X
meeting whose start time is 3pm and whose end time is 5pm -> meetings between 3pm and 5pm
that allows cats and that allows dogs -> that allows pets
author of article that article whose author is X cites -> who does X cite

Crowdsourcing in numbers
Each turker paraphrased 4 utterances
28 seconds on average per paraphrase
38,360 responses
26,098 examples remained
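The crowdsourcing figures can be turned into a retention rate with a few lines of arithmetic. The total-time estimate in this sketch assumes the 28-second average applies to every one of the 38,360 responses, which the slides do not state explicitly.

```python
responses = 38_360          # paraphrases collected
kept = 26_098               # examples remaining after filtering
seconds_per_paraphrase = 28

removed = responses - kept
retention = kept / responses
# Assumption: the 28 s average covers all collected responses.
total_hours = responses * seconds_per_paraphrase / 3600

print(f"removed: {removed}")
print(f"retention: {retention:.1%}")
print(f"approx. annotation time: {total_hours:.0f} hours")
```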

Paraphrasing
Noise in the data: 17%
(player that has the least number of team -> player with the lowest jersey number)
(restaurant whose star rating is 3 stars -> hotel which has a 3 star rating)

Model and Learning

Numbers, dates, and database entities are detected first.

Model and Learning
$(z, c) \in \mathrm{GEN}(G \cup L_x)$
$p_\theta(z, c \mid x, w) \propto \exp(\phi(c, z, x, w)^\top \theta)$
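A log-linear distribution over candidate (z, c) pairs amounts to a softmax over feature scores. The following is a minimal sketch assuming each candidate has already been mapped to a dense feature vector; the function name `log_linear_probs` and the example vectors are hypothetical, and the real feature function is not specified here.

```python
import math


def log_linear_probs(feature_vectors, theta):
    """Normalize exp(phi . theta) over all candidates.

    feature_vectors: one list of floats per candidate (z, c).
    theta: parameter vector of the same length.
    """
    scores = [sum(f * t for f, t in zip(phi, theta))
              for phi in feature_vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two made-up candidates with two features each.
candidates = [[1.0, 0.0], [0.0, 1.0]]
uniform = log_linear_probs(candidates, [0.0, 0.0])
skewed = log_linear_probs(candidates, [2.0, 0.0])
```

With all-zero parameters every candidate is equally likely; raising the weight of a feature shifts probability toward the candidates that fire it.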

Floating parser

Model and Learning
Features

Model and Learning
$O(\theta) = \sum_{(x, c, z) \in D} \log p_\theta(z, c \mid x, w) - \lambda \lVert \theta \rVert_1$
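The slides name AdaGrad as the optimizer for the training objective. A bare-bones version of its per-coordinate update, written for maximization, might look like the sketch below. The function name `adagrad_step`, the step size, and the toy objective are assumptions for illustration, and the L1 penalty is left out to keep the example short.

```python
import math


def adagrad_step(theta, grad, sum_sq, lr=0.5, eps=1e-8):
    """One AdaGrad ascent step; returns the new (theta, sum_sq).

    sum_sq accumulates squared gradients per coordinate, so
    coordinates with a large gradient history take smaller steps.
    """
    new_sum_sq = [s + g * g for s, g in zip(sum_sq, grad)]
    new_theta = [t + lr * g / (math.sqrt(s) + eps)
                 for t, g, s in zip(theta, grad, new_sum_sq)]
    return new_theta, new_sum_sq


# Toy concave objective f(theta) = -(theta - 3)^2, maximized at 3.
theta, sum_sq = [0.0], [0.0]
for _ in range(500):
    grad = [-2.0 * (theta[0] - 3.0)]
    theta, sum_sq = adagrad_step(theta, grad, sum_sq)
```

In a real run the gradient would come from the log-likelihood of the training set rather than from a hand-written quadratic.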

AdaGrad (Duchi et al., 2010)

Experimental Evaluation
