Notes about semantic parsing

Source: Lambda


Just notes, not a detailed theory or literature review.

We'll start with examples of sentences whose meaning we want to encode and then continue to different meaning encoding styles, from simpler to more complex.


Kinds of sentences we want to encode

Let us start from very simple sentences and continue to slightly more complex sentences.


Immediate goal: very simple sentences

We want to encode simple statements of three kinds, plus questions about them, each typically associated with a confidence:

  • Generic rules about kinds/categories like

"Cats are animals", "Plants are not animals", "Animals have a head", "Poison kills when eaten", "Iron conducts electricity", "Glass does not conduct electricity"

  • Statements about specific objects like

"Percy is a cat", "John ingested poison", "John was born in 1990".

  • Statements about specific situations like

"Three cats sit on the couch".

  • Questions about either generic statements or specific objects or situations:

Generic, like "Are cats animals?"; specific, like "Does Percy have a head?", "Is John alive?", "Which of these is an animal: Percy, John?", "Are there four cats on the couch?"


Mid-level goal: sentences in a "relatively simple" context

Like

"Steve thinks John ingested poison", "Donald Trump said John was born in 1990", "In the book 'Hobbit' there are dragons", "John ate two hamburgers yesterday", "Three cats sat on the couch", "Three cats sat on the couch at John's place".

The "relatively simple" context should allow us to express (to some degree, not perfectly)

  • Time relations like "earlier", "after", "at the same time", "in 2000"
  • Location relations like "In Tallinn", "At John's place", "To the north of", "Near the town hall".
  • Knowledge relations like "John knows that", "John believes that", "John said that", "It is said in the book 'Hobbit'"

Some Q&A benchmark examples

Observe that these (mostly) do not contain the rules and facts necessary for answering: building these up is a crucial part of the challenge.

However, the sentences in the tasks are not very complicated.

    1. Priit's example for OpenBookQA (https://allenai.org/data/open-book-qa):

https://gitlab.cs.ttu.ee/Priit.Jarv1/openbook-qa/-/blob/master/examples/syntax_improved1.pl

% > Poison causes harm to which of the following?
% A.) a Tree
% B.) a robot
% C.) a house
% D.) a car

    2. Winograd Schema Challenge (https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html)

The city councilmen refused the demonstrators a permit because they [feared/advocated] violence. Who [feared/advocated] violence? Answers: The city councilmen/the demonstrators.

The trophy doesn't fit into the brown suitcase because it's too [small/large]. What is too [small/large]? Answers: The suitcase/the trophy.

Joan made sure to thank Susan for all the help she had [given/received]. Who had [given/received] help? Answers: Susan/Joan.

    3. GeoQuery (http://www.cs.utexas.edu/users/ml/nldata/geoquery.html) examples:

how many states border texas?
what states border texas and have a major river?
what is the total population of the states that border texas?
what states border states that border states that border states that border texas

    4. FB bAbI (https://research.facebook.com/researchers/1543934539189348) example:

John picked up the apple. John went to the office. John went to the kitchen. John dropped the apple. Question: Where was the apple before the kitchen?

    5. WebQuestions (http://www-nlp.stanford.edu/software/sempre/)

what is the name of justin bieber brother?
what character did natalie portman play in star wars?
where donald trump went to college?
what countries around the world speak french?



Basics of how to encode meaning

Again, let us start from a simpler, less expressive style and continue to a somewhat more complex, more expressive one.

Context and meta-info are ignored in this section: see the next section.


Mainstream background

Some of the major semantic styles being advocated and used (they could be combined, I guess):

  • Discourse representation theory
 Covered well in part 2 of
 Bos & Blackburn small book "Representation and Inference for Natural Language"
 part 1: http://www.coli.uni-saarland.de/publikationen/softcopies/Blackburn:1997:RIN.pdf
 part 2: https://ling.sprachwiss.uni-konstanz.de/pages/home/butt/main/material/bb-drt.pdf
 NB! Bos had a real system using that, where he employed provers to eliminate
 inconsistent interpretations: lots of examples in TPTP
  • Frame Semantics and Framenet
 A frame is a typical set of well-known relations like Being_born and Locative_relation
 which have a number of parameters and known rules about these
 https://framenet.icsi.berkeley.edu/fndrupal/
 https://en.wikipedia.org/wiki/FrameNet 
  • Neo-Davidsonian semantics / event-representations
 Events and such are objects which can be quantified over
 and related to various properties
 See a good intro:
   http://www.coli.uni-saarland.de/courses/incsem-12/neodavidsonian.pdf 

See also

  • Jurafsky & Martin 2nd edition section IV: semantics and pragmatics
 https://github.com/rain1024/slp2-pdf/blob/master/complete-book-pdf/slp2.pdf


Minimal triple-style

An attempt to express relations using mostly (but not only)

  • triples: t3(object,property,value)
  • quads: t4(object,property,value,context_where_holds)

Observe that "property" here is a constant which typically has meta-rules like

All X,Y,Z . t3(X,is_a,Y) & t3(Y,is_a,Z) => t3(X,is_a,Z)
All X,Y . t3(X,is_not_a,Y) => -t3(X,is_a,Y)

"Cats are animals":

   t3(cat,is_a,animal)

"Plants are not animals":

   t3(plant,is_not_a,animal)

"Animals have a head":

   t3(animal,has,head)

"Poison kills when eaten":

   All X, P: (t3(X,eats,P) & t3(P,is_a,poison)) => t3(P,kill_effect,X)
   or, possibly with a further derivation, which is hard to do: 
     All X, P: (t3(X,eats,P) & t3(P,is_a,poison)) => t3(X,has_property,dead)

"Iron conducts electricity":

   t3(iron,conducts,electricity)

"Glass does not conduct electricity":

   -t3(glass,conducts,electricity)


"Percy is a cat":

   t3(percy,is_a,cat)

"John ingested poison",

   t3(john,ingested,poison) 

"John was born in 1990".

   t3(john,born_in,1990)

"Three cats sit on the couch":

  E X: t3(X,is_a,set) & (All Y . t3(Y,member_of,X) => t3(Y,is_a,cat)) & t3(X,has_count_of_set,3) & 
    (All Y . t3(Y,member_of,X) => t3(Y,sit_on,couch))

"Are cats animals?":

   To be proved: t3(cat,is_a,animal)

"Does Percy have a head?":

   To be proved: t3(percy,has,head)

"Is John alive?":

   To be proved: t3(john,has_property,alive)

"Which of these is an animal: Percy, John":

   to be proved: Exists X. t3(X,is_a,animal) & (X=percy | X=john)

"Are there four cats on the couch?": to be proved:

   E X: t3(X,is_a,set) & (All Y . t3(Y,member_of,X) => t3(Y,is_a,cat)) & t3(X,has_count_of_set,4) & 
     (All Y . t3(Y,member_of,X) => t3(Y,sit_on,couch))
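The triple encodings above, together with the meta-rule on is_a, are already enough for a tiny amount of question answering. A minimal sketch in Python (the set-of-tuples fact store and the saturate helper are illustrative assumptions, not part of these notes):

```python
# Facts in the t3(object, property, value) style, stored as Python tuples.
facts = {
    ("percy", "is_a", "cat"),
    ("cat", "is_a", "animal"),
    ("plant", "is_not_a", "animal"),
}

def saturate(facts):
    """Naive forward chaining for the meta-rule
    t3(X,is_a,Y) & t3(Y,is_a,Z) => t3(X,is_a,Z), run to fixpoint."""
    facts = set(facts)
    while True:
        derived = {
            (x, "is_a", z)
            for (x, p1, y1) in facts if p1 == "is_a"
            for (y2, p2, z) in facts if p2 == "is_a" and y2 == y1
        }
        if derived <= facts:
            return facts
        facts |= derived

closed = saturate(facts)
# "Is Percy an animal?" amounts to asking whether the triple is derivable.
print(("percy", "is_a", "animal") in closed)  # True
```

Answering "Are there four cats on the couch?" would additionally need the set/count machinery, which a plain triple store like this does not give for free.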


Simple quantified style

A simple, yet usable way to go is to express categories/types as predicates and concrete objects as individual constants like in the following.

Observe that

  • no context is encoded
  • we assume that we can use words like "John" to point to specific individuals
  • categories/types/relations themselves cannot have properties: instead, they have a lot of associated rules describing their relations

"Cats are animals":

   All X . cat(X) => animal(X)

"Plants are not animals":

   All X. plant(X) => -animal(X).

"Animals have a head":

   All X. animal(X) => has(X,head)

"Poison kills when eaten":

   All X, P: (eats(X,P) & poison(P)) => kills(P,X)  
   or, possibly with a further derivation, which is hard to do: 
     All X, P: (eats(X,P) & poison(P)) => dead(X)  

"Iron conducts electricity":

   All X . iron(X) => conducts_electricity(X)

"Glass does not conduct electricity":

   All X . glass(X) => -conducts_electricity(X)


"Percy is a cat":

   cat(percy)

"John ingested poison",

   Exists X. ingested(john,X) & poison(X)

"John was born in 1990".

   born_in(john,1990)

"Three cats sit on the couch":

   E X: set(X) & (All Y . member(Y,X) => cat(Y)) & count_of_set(X,3) & 
     (All Y . member(Y,X) => sit_on(Y,couch))

"Are cats animals?":

   To be proved: All X . cat(X) => animal(X).

"Does Percy have a head?":

   To be proved: Exists X. has(percy,X) & head(X) 

"Is John alive?":

   To be proved: alive(john)

"Which of these is an animal: Percy, John":

   to be proved: Exists X. animal(X) & (X=percy | X=john)

"Are there four cats on the couch?":

   to be proved: E X: set(X) & (All Y . member(Y,X) => cat(Y)) & count_of_set(X,4) &
     (All Y . member(Y,X) => sit_on(Y,couch))
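In the same spirit, the simple quantified rules above can be approximated over a finite set of named individuals. A rough Python sketch (the Horn-clause tuples and the infer function are assumptions made for illustration; existential rules like "Animals have a head" are out of scope here):

```python
# Generic rules as Horn clauses: a set of body predicates implies a head predicate.
rules = [
    ({"cat"}, "animal"),                 # All X . cat(X) => animal(X)
    ({"iron"}, "conducts_electricity"),  # All X . iron(X) => conducts_electricity(X)
]
# Ground facts about specific objects, e.g. cat(percy).
facts = {("cat", "percy"), ("iron", "nail_1")}

def infer(facts, rules):
    """Forward-chain the rules over all known individuals until fixpoint."""
    facts = set(facts)
    individuals = {obj for _, obj in facts}
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for obj in individuals:
                if all((p, obj) in facts for p in body) and (head, obj) not in facts:
                    facts.add((head, obj))
                    changed = True
    return facts

# "Is Percy an animal?"
print(("animal", "percy") in infer(facts, rules))  # True
```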

Kind of neo-davidsonian combination

The idea here is to allow quantification over properties, like in the simple-triple version earlier, but in a more flexible manner.

We will have a large, but manageable set of "meta" predicates like has_type, has_property, owns, has_part, causes, etc etc.

"Cats are animals":

   All X . has_type(X,cat) => has_type(X,animal)

"Plants are not animals":

   All X. has_type(X,plant) => -has_type(X,animal)

"Animals have a head":

   either All X. has_type(X,animal) => owns(X,head)
   or, if we know more, All X. has_type(X,animal) => has_part(X,head)

"Poison kills when eaten":

   All E, X, P: (
    (has_type(E,action) & has_type(E,eating) & has_role(E,eater,X) &
     has_role(E,eaten,P) & has_type(P,poison)) =>
    (Exists R . has_type(R,action) & has_type(R,killing) & has_role(R,killed,X)))

"Iron conducts electricity":

   All X . has_type(X,iron) => has_type(X,conducts_electricity)

"Glass does not conduct electricity":

   All X . has_type(X,glass) => -has_type(X,conducts_electricity)


"Percy is a cat":

   generally Exists X . has_name(X,Percy) & has_type(X,cat)
   and only if we can detect a concrete object can we replace X with, say,
   has_type(percy_123,cat)

"John ingested poison",

   Exists E, P, X. has_type(E,event) & has_type(E,ingesting) & has_role(E,ingested,P) &
     has_type(P,poison) & has_role(E,ingester,X) & has_name(X,John) & has_time(E,past).              

"John was born in 1990".

   Exists E, X. has_type(E,event) & has_type(E,being_born) & has_year(E,1990) & 
     has_role(E,born,X) & has_name(X,John)

"Three cats sit on the couch":

   Exists X . set(X) & (All Y . member(Y,X) => has_type(Y,cat)) & 
     count_of_set(X,3) & (All Y . member(Y,X) => 
       (Exists E . has_type(E,event) & has_type(E,sitting) & has_role(E,sitter,Y) &
        has_role(E,sitting_on,couch) & has_time(E,now)))
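To make the event encoding concrete, here is a small Python sketch where events are dicts mirroring the has_type / has_role / has_time atoms (the dict layout, the ids like john_1, and the query helper are illustrative assumptions only):

```python
# Each event carries its types, its role fillers and a coarse time tag,
# mirroring has_type(E,...), has_role(E,...), has_time(E,...) atoms.
events = [
    {"id": "e1",
     "types": {"event", "ingesting"},
     "roles": {"ingester": "john_1", "ingested": "poison_1"},
     "time": "past"},
]
# Types of the individual objects, mirroring atoms like has_type(P,poison).
object_types = {("poison_1", "poison"), ("john_1", "person")}

def who_ingested_poison(events, object_types):
    """Return the ingester of every ingesting event whose ingested object is a poison."""
    return [
        ev["roles"]["ingester"]
        for ev in events
        if "ingesting" in ev["types"]
        and (ev["roles"].get("ingested"), "poison") in object_types
    ]

print(who_ingested_poison(events, object_types))  # ['john_1']
```

Because the event is itself an object, adding further properties (place, instrument, certainty) is just another key, which is the main selling point of the neo-davidsonian style.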


Context and meta-info

Location, time and agent are basic, crucial pieces of information sentences often convey.

For time and space you may look at Ernest Davis's work, but it seems a bit too complex: Jurafsky & Martin propose some simpler ideas.

Time, space and agent should probably be encoded in the neo-davidsonian style as properties of events, directly in the logical language, like

"Tallinn is north of Tartu":

 here we assume NER produces ID-s for Tallinn and Tartu
 
 Exists X, Y.
 location(X,tallinn_123) & location(Y,tartu_123) &
 geo_relation(north_of,X,Y)
 
 or, since we know the location of cities does NOT change
 
 geo_relation(
    north_of,
    stable_location_of(tallinn_123),
    stable_location_of(tartu_123))
 
 

"John was in Tallinn yesterday":

   here we assume NER produces an ID for Tallinn,
   but John is not fixed to an object
   
   Exists X, L, T.
     has_property(X,name,John) &
     location_of_at_time(X,L,T) &
     location_inside(L,tallinn_123) &
     relative_time(T,yesterday)
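Queries of this shape can be evaluated over plain fact tables. A Python sketch (the tuple layout and the ids john_x, loc_7, t_42 are hypothetical, invented for the example; tallinn_123 follows the NER-id convention above):

```python
# Location facts as tuples, mirroring the predicates in the formula above.
location_of_at_time = {("john_x", "loc_7", "t_42")}  # John was at loc_7 at time t_42
location_inside = {("loc_7", "tallinn_123")}         # loc_7 is inside Tallinn
relative_time = {("t_42", "yesterday")}              # t_42 is yesterday

def was_in_yesterday(person, city):
    """Check Exists L, T satisfying the three conjuncts of the formula."""
    return any(
        (l, city) in location_inside and (t, "yesterday") in relative_time
        for (p, l, t) in location_of_at_time
        if p == person
    )

print(was_in_yesterday("john_x", "tallinn_123"))  # True
```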


Confidence and context of the information should probably be attached to sentences/formulas at the meta-level like

Scraping for world knowledge we detect that normally "Birds fly":

   Let S stand for "Birds fly", then we add
   on the meta-level 
   
   confidence(S,0.99) 
   

While parsing a book "Travels of John" we detect that it is probable that "John was in Tallinn in 2020":

   Let S stand for "John was in Tallinn in 2020", then we add
   on the meta-level
   
   confidence_for_tag(S,"Travels of John",0.9)
   

While parsing a book "Thoughts of Bert" we detect that almost certainly Bert in the book thought it probable that "John was in Tallinn in 2020":

   Let S stand for "John was in Tallinn in 2020", then we add
   on the meta-level 
   S_bert: confidence_for_agent(S,Bert,0.9)
   and finally
   confidence_for_tag(S_bert,"Thoughts of Bert",0.99)
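The nested meta-level confidences above can be kept in ordinary lookup tables keyed by statement id. A sketch in Python; note that combining the inner (agent) and outer (tag) confidences by multiplication is my assumption here, not something these notes commit to:

```python
# Meta-level confidence tables, keyed by statement id.
confidence_for_agent = {}  # (statement_id, agent) -> confidence
confidence_for_tag = {}    # (statement_id, tag) -> confidence

S = "john_in_tallinn_2020"               # stands for "John was in Tallinn in 2020"
confidence_for_agent[(S, "Bert")] = 0.9  # Bert thought it probable
S_bert = (S, "Bert")                     # the meta-statement S_bert gets its own id
confidence_for_tag[(S_bert, "Thoughts of Bert")] = 0.99  # the book almost certainly says so

def combined_confidence(statement_id, agent, tag):
    """Combine nested confidences by multiplication (an assumed policy)."""
    inner = confidence_for_agent[(statement_id, agent)]
    outer = confidence_for_tag[((statement_id, agent), tag)]
    return inner * outer

print(round(combined_confidence(S, "Bert", "Thoughts of Bert"), 3))  # 0.891
```

Keeping the tables separate from the formulas means the object-level reasoner never sees confidences; only a meta-level component would weigh them.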