Notes about semantic parsing
Just notes, not a detailed theory or literature review.
We'll start with examples of sentences whose meaning we want to encode and then continue to different meaning encoding styles, from simpler to more complex.
Kinds of sentences we want to encode
Let us start from very simple sentences and continue to slightly more complex sentences.
Immediate goal: very simple sentences
We want to encode simple sentences of three kinds, each typically associated with a confidence, plus questions about them:
- Generic rules about kinds/categories like
"Cats are animals", "Plants are not animals", "Animals have a head", "Poison kills when eaten", "Iron conducts electricity", "Glass does not conduct electricity"
- Statements about specific objects like
"Percy is a cat", "John ingested poison", "John was born in 1990".
- Statements about specific situations like
"Three cats sit on the couch".
- Questions about either generic statements or specific objects or situations:
Generic, like "Are cats animals?"; specific, like "Does Percy have a head?", "Is John alive?", "Which of these is an animal: Percy, John?", "Are there four cats on the couch?"
Mid-level goal: sentences in a "relatively simple" context
Like
"Steve thinks John ingested poison", "Donald Trump said John was born in 1990", "In the book 'Hobbit' there are dragons", "John ate two hamburgers yesterday", "Three cats sat on the couch", "Three cats sat on the couch at John's place".
The "relatively simple" context should allow to express (to some degree, not perfectly)
- Time relations like "earlier", "after", "at the same time", "in 2000"
- Location relations like "In Tallinn", "At John's place", "To the north of", "Near the town hall".
- Knowledge relations like "John knows that", "John believes that", "John said that", "It is said in the book 'Hobbit'"
Some Q&A benchmark examples
Observe that these (mostly) do not contain the rules and facts necessary for answering: building these up is a crucial part of the challenge.
However, the sentences in the tasks are not very complicated.
- Priit example for OpenBook Q&A (https://allenai.org/data/open-book-qa):
https://gitlab.cs.ttu.ee/Priit.Jarv1/openbook-qa/-/blob/master/examples/syntax_improved1.pl
% > Poison causes harm to which of the following?
% A.) a Tree
% B.) a robot
% C.) a house
% D.) a car
- Winograd Schema Challenge (https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html)
The city councilmen refused the demonstrators a permit because they [feared/advocated] violence. Who [feared/advocated] violence? Answers: The city councilmen/the demonstrators.
The trophy doesn't fit into the brown suitcase because it's too [small/large]. What is too [small/large]? Answers: The suitcase/the trophy.
Joan made sure to thank Susan for all the help she had [given/received]. Who had [given/received] help? Answers: Susan/Joan.
- GeoQuery (http://www.cs.utexas.edu/users/ml/nldata/geoquery.html) examples:
how many states border texas? what states border texas and have a major river? what is the total population of the states that border texas? what states border states that border states that border states that border texas
- FB bAbI (https://research.facebook.com/researchers/1543934539189348) example:
John picked up the apple. John went to the office. John went to the kitchen. John dropped the apple. Question: Where was the apple before the kitchen?
- Web questions (http://www-nlp.stanford.edu/software/sempre/)
what is the name of justin bieber brother? what character did natalie portman play in star wars? where donald trump went to college? what countries around the world speak french?
Basics of how to encode meaning
Again, let us start from a simpler and less expressive style and continue to a somewhat more complex and more expressive one.
Context and meta-info are ignored in this section: see the next section.
Mainstream background
Some of the major semantic styles being advocated and used (they could be combined, I guess):
- Discourse representation theory
Covered well in part 2 of the small Bos & Blackburn book "Representation and Inference for Natural Language":
part 1: http://www.coli.uni-saarland.de/publikationen/softcopies/Blackburn:1997:RIN.pdf
part 2: https://ling.sprachwiss.uni-konstanz.de/pages/home/butt/main/material/bb-drt.pdf
NB! Bos had a real system using this, where he employed provers to eliminate inconsistent interpretations: lots of examples in TPTP.
- Frame Semantics and Framenet
A frame is a typical set of well-known relations, like Being_born and Locative_relation, which have a number of parameters and known rules about these.
https://framenet.icsi.berkeley.edu/fndrupal/
https://en.wikipedia.org/wiki/FrameNet
- Neo-Davidsonian semantics / event-representations
Events and such are objects which can be quantified over and related to various properties.
See a good intro: http://www.coli.uni-saarland.de/courses/incsem-12/neodavidsonian.pdf
See also Jurafsky & Martin 2nd edition section IV: semantics and pragmatics.
See also
- Jurafsky & Martin 2nd edition section IV: semantics and pragmatics
https://github.com/rain1024/slp2-pdf/blob/master/complete-book-pdf/slp2.pdf
- Situations semantics https://plato.stanford.edu/entries/situations-semantics/
- Dynamic semantics https://plato.stanford.edu/entries/dynamic-semantics/
Minimal triple-style
An attempt to express relations using mostly (but not only)
- triples: t3(object,property,value)
- quads: t4(object,property,value,context_where_holds)
Observe that "property" here is a constant which typically has meta-rules like
All X,Y,Z . t3(X,is_a,Y) & t3(Y,is_a,Z) => t3(X,is_a,Z)
All X,Y . t3(X,is_not_a,Y) => -t3(X,is_a,Y)
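For instance, a minimal runnable sketch of the positive meta-rule in SWI-Prolog (tabling keeps the recursive rule from looping; the negative rule needs classical negation, which is beyond plain Prolog, so it is omitted):
:- table t3/3.
% Facts in the triple style.
t3(percy, is_a, cat).
t3(cat, is_a, animal).
% The transitivity meta-rule, written directly over t3/3.
t3(X, is_a, Z) :- t3(X, is_a, Y), t3(Y, is_a, Z).
% ?- t3(percy, is_a, animal).
% true.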
"Cats are animals":
t3(cat,is_a,animal)
"Plants are not animals":
t3(plant,is_not_a,animal)
"Animals have a head":
t3(animal,has,head)
"Poison kills when eaten":
All X, P: (t3(X,eats,P) & t3(P,is_a,poison)) => t3(P,kill_effect,X)
or, possibly with a further derivation, which is hard to do:
All X, P: (t3(X,eats,P) & t3(P,is_a,poison)) => t3(X,has_property,dead)
"Iron conducts electricity":
t3(iron,conducts,electricity)
"Glass does not conduct electricity":
-t3(glass,conducts,electricity)
"Percy is a cat":
t3(percy,is_a,cat)
"John ingested poison",
t3(john,ingested,poison)
"John was born in 1990".
t3(john,born_in,1990)
"Three cats sit on the couch":
E X: t3(X,is_a,set) & (All Y . t3(Y,member_of,X) => t3(Y,is_a,cat)) & t3(X,has_count_of_set,3) & (All Y . t3(Y,member_of,X) => t3(Y,sit_on,couch))
"Are cats animals?":
To be proved: t3(cat,is_a,animal)
"Does Percy have a head?":
To be proved: t3(percy,has,head)
"Is John alive?":
To be proved: t3(john,has_property,alive)
"Which of these is an animal: Percy, John":
to be proved: Exists X. t3(X,is_a,animal) & (X=percy | X=john)
"Are there four cats on the couch?": to be proved:
E X: t3(X,is_a,set) & (All Y . t3(Y,member_of,X) => t3(Y,is_a,cat)) & t3(X,has_count_of_set,4) & (All Y . t3(Y,member_of,X) => t3(Y,sit_on,couch))
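Putting these encodings together, a small question-answering sketch in SWI-Prolog (the rule inheriting "has" properties down the is_a chain is a hypothetical addition, not one of the meta-rules stated above):
:- table t3/3.
t3(percy, is_a, cat).
t3(cat, is_a, animal).
t3(animal, has, head).
% Meta-rules: transitive is_a, plus a hypothetical rule inheriting
% "has" properties down the is_a hierarchy.
t3(X, is_a, Z) :- t3(X, is_a, Y), t3(Y, is_a, Z).
t3(X, has, P) :- t3(X, is_a, Y), t3(Y, has, P).
% "Does Percy have a head?"
% ?- t3(percy, has, head).
% true.
% "Which of these is an animal: Percy, John?"
% ?- member(X, [percy, john]), t3(X, is_a, animal).
% X = percy.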
Simple quantified style
A simple, yet usable way to go is to express categories/types as predicates and concrete objects as individual constants like in the following.
Observe that
- no context is encoded
- we assume that we can use words like "John" to point to specific individuals
- categories/types/relations themselves cannot have properties: instead, they have a lot of associated rules describing their relations
"Cats are animals":
All X . cat(X) => animal(X)
"Plants are not animals":
All X. plant(X) => -animal(X).
"Animals have a head":
All X. animal(X) => has(X,head)
"Poison kills when eaten":
All X, P: (eats(X,P) & poison(P)) => kills(P,X)
or, possibly with a further derivation, which is hard to do:
All X, P: (eats(X,P) & poison(P)) => dead(X)
"Iron conducts electricity":
All X . iron(X) => conducts_electricity(X)
"Glass does not conduct electricity":
All X . glass(X) => -conducts_electricity(X)
"Percy is a cat":
cat(percy)
"John ingested poison",
Exists X. ingested(john,X) & poison(X)
"John was born in 1990".
born_in(john,1990)
"Three cats sit on the couch":
E X: set(X) & (All Y . member(Y,X) => cat(Y)) & count_of_set(X,3) & (All Y . member(Y,X) => sit_on(Y,couch))
"Are cats animals?":
To be proved: All X . cat(X) => animal(X).
"Does Percy have a head?":
To be proved: Exists X. has(percy,X) & head(X)
"Is John alive?":
To be proved: alive(john)
"Which of these is an animal: Percy, John":
to be proved: Exists X. animal(X) & (X=percy | X=john)
"Are there four cats on the couch?":
to be proved: E X: set(X) & (All Y . member(Y,X) => cat(Y)) & count_of_set(X,4) & (All Y . member(Y,X) => sit_on(Y,couch))
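A runnable sketch of the Horn-clause fragment of this style in SWI-Prolog (poison_1 is a hypothetical Skolem constant witnessing the existential in "John ingested poison"; negative rules like "plants are not animals" would need a full FOL prover):
% Categories as unary predicates, concrete objects as constants.
cat(percy).
animal(X) :- cat(X).
has(X, head) :- animal(X).
% "John ingested poison", with the existential witnessed by poison_1.
poison(poison_1).
ingested(john, poison_1).
% "Does Percy have a head?"
% ?- has(percy, head).
% true.
% "Did John ingest poison?"
% ?- ingested(john, P), poison(P).
% P = poison_1.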
A kind of neo-Davidsonian combination
The idea here is to allow quantification over properties, like in the simple-triple version earlier, but in a more flexible manner.
We will have a large but manageable set of "meta" predicates like has_type, has_property, owns, has_part, causes, etc.
"Cats are animals":
All X . has_type(X,cat) => has_type(X,animal)
"Plants are not animals":
All X. has_type(X,plant) => -has_type(X,animal)
"Animals have a head":
either All X. has_type(X,animal) => owns(X,head)
or, if we know more, All X. has_type(X,animal) => has_part(X,head)
"Poison kills when eaten":
All E, X, P: (has_type(E,action) & has_type(E,eating) & has_role(E,eater,X) & has_role(E,eaten,P) & has_type(P,poison)) => (Exists R . has_type(R,action) & has_type(R,killing) & has_role(R,killed,X))
"Iron conducts electricity":
All X . has_type(X,iron) => has_type(X,conducts_electricity)
"Glass does not conduct electricity":
All X . has_type(X,glass) => -has_type(X,conducts_electricity)
"Percy is a cat":
Generally: Exists X . has_name(X,Percy) & has_type(X,cat)
Only if we can detect a concrete object can we replace X and write, say, has_type(percy_123,cat)
"John ingested poison",
Exists E, P, X. has_type(E,event) & has_type(E,ingesting) & has_role(E,ingester,X) & has_role(E,ingested,P) & has_type(P,poison) & has_name(X,John) & has_time(E,past).
"John was born in 1990".
Exists E, X. has_type(E,event) & has_type(E,being_born) & has_year(E,1990) & has_role(E,born,X) & has_name(X,John)
"Three cats sit on the couch":
Exists X, E: set(X) & (All Y . member(Y,X) => has_type(Y,cat)) & count_of_set(X,3) & (All Y . member(Y,X) => (has_type(E,event) & has_type(E,sitting) & has_role(E,sitter,Y) & has_role(E,sitting_on,couch) & has_time(E,now)))
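A minimal runnable sketch of the event style in SWI-Prolog (the ids ev_1, john_1, poison_1 are hypothetical Skolem constants standing for the existentially quantified variables above):
:- discontiguous has_type/2.
has_name(john_1, john).
has_type(ev_1, event).
has_type(ev_1, ingesting).
has_type(poison_1, poison).
has_role(ev_1, ingester, john_1).
has_role(ev_1, ingested, poison_1).
has_time(ev_1, past).
% "Cats are animals" becomes an ordinary rule over has_type/2.
has_type(X, animal) :- has_type(X, cat).
% "Did John ingest poison?"
% ?- has_type(E, ingesting), has_role(E, ingester, X), has_name(X, john),
%    has_role(E, ingested, P), has_type(P, poison).
% E = ev_1, X = john_1, P = poison_1.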
Context and meta-info
Location, time and agent are basic, crucial pieces of information that sentences often convey.
For time and space you may look at Ernest Davis:
- Old book https://cs.nyu.edu/faculty/davise/rck/rck.html
- Or new version https://www.jair.org/index.php/jair/article/view/11076/26258
but it seems a bit too complex: Jurafsky & Martin propose some simpler ideas.
Time, space and agent should probably be encoded in the neo-Davidsonian style as properties of events, directly in the logical language, like
"Tallinn is north of Tartu":
Here we assume NER produces ID-s for Tallinn and Tartu:
Exists X, Y. location(X,tallinn_123) & location(Y,tartu_123) & geo_relation(north_of,X,Y)
or, since we know the location of cities does NOT change,
geo_relation(north_of, stable_location_of(tallinn_123), stable_location_of(tartu_123))
"John was in Tallinn yesterday":
Here we assume NER produces an ID for Tallinn, but John is not fixed to a concrete object:
Exists X, L, T. has_property(X,name,John) & location_of_at_time(X,L,T) & location_inside(L,tallinn_123) & relative_time(T,yesterday)
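A small runnable sketch of these context facts in SWI-Prolog (john_7, loc_42 and t_1 are hypothetical witness ids; tallinn_123 and tartu_123 follow the NER-id convention used above):
% "Tallinn is north of Tartu", using the stable-location variant.
geo_relation(north_of, stable_location_of(tallinn_123), stable_location_of(tartu_123)).
% "John was in Tallinn yesterday", with hypothetical witness ids.
has_property(john_7, name, john).
location_of_at_time(john_7, loc_42, t_1).
location_inside(loc_42, tallinn_123).
relative_time(t_1, yesterday).
% "Was John in Tallinn yesterday?"
% ?- has_property(X, name, john), location_of_at_time(X, L, T),
%    location_inside(L, tallinn_123), relative_time(T, yesterday).
% X = john_7, L = loc_42, T = t_1.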
Confidence and context of the information should probably be attached to sentences/formulas at the meta-level, like
Scraping for world knowledge we detect that normally "Birds fly":
Let S stand for "Birds fly", then we add on the meta-level confidence(S,0.99)
While parsing a book "Travels of John" we detect that it is probable that "John was in Tallinn in 2020":
Let S stand for "John was in Tallinn in 2020", then we add on the meta-level confidence_for_tag(S,"Travels of John",0.9)
While parsing a book "Thoughts of Bert" we detect that almost certainly Bert in the book thought it probable that "John was in Tallinn in 2020":
Let S stand for "John was in Tallinn in 2020"; then on the meta-level we form S_bert standing for confidence_for_agent(S,Bert,0.9) and finally add confidence_for_tag(S_bert,"Thoughts of Bert",0.99)
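A rough runnable sketch of such meta-level annotations in SWI-Prolog (the sentence ids s1, s2, s_bert and the abbreviated contents are hypothetical; a real system would reify the full formulas):
% Sentences get ids; their content is abbreviated to atoms here.
sentence(s1, birds_fly).
confidence(s1, 0.99).
sentence(s2, john_in_tallinn_2020).
confidence_for_tag(s2, 'Travels of John', 0.9).
% The nested case: s_bert reifies "Bert thought s2 probable".
sentence(s_bert, confidence_for_agent(s2, bert, 0.9)).
confidence_for_tag(s_bert, 'Thoughts of Bert', 0.99).
% ?- confidence_for_tag(s2, Book, C).
% Book = 'Travels of John', C = 0.9.
% ?- sentence(S, confidence_for_agent(s2, bert, _)), confidence_for_tag(S, Book, C).
% S = s_bert, Book = 'Thoughts of Bert', C = 0.99.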