Nlpsolver knowledge representation principles: draft
Source: Lambda
--------
taxonomy
-------
The schema predicate is "isa", used like this:
["isa",class,object_in_class]
Note: no context is currently added to the "isa".
Examples:
"John is a man.": ["isa","man","c1_John"]
"Bears are animals.": [["isa","bear","?:S2"], "=>", ["isa","animal","?:S2"]] ehk ["or", ["-isa","bear","?:S2"], ["isa","animal","?:S2"]]
Note: variables are anything starting with ?:, but I use a readability-enhancing convention:
?:S subject
?:O object
?:A action
?:Tense tense (past/present)
?:Fv situation number for a given tense
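A minimal Python sketch of the two conventions above: variables are recognized by the "?:" prefix, and a single-atom implication is rewritten into the "or" form by putting "-" in front of the predicate of the premise. The helper names (is_var, negate, implication_to_clause) are my own for illustration and are not part of Nlpsolver.

```python
def is_var(term):
    """A term is a variable if it is a string starting with '?:'."""
    return isinstance(term, str) and term.startswith("?:")


def negate(atom):
    """Negate an atom by toggling the '-' prefix on its predicate name."""
    pred = atom[0]
    new_pred = pred[1:] if pred.startswith("-") else "-" + pred
    return [new_pred] + atom[1:]


def implication_to_clause(formula):
    """Rewrite [A, "=>", B] as ["or", not-A, B]; A and B must be single atoms."""
    premise, arrow, conclusion = formula
    if arrow != "=>":
        raise ValueError("expected an implication of the form [A, '=>', B]")
    return ["or", negate(premise), conclusion]


rule = [["isa", "bear", "?:S2"], "=>", ["isa", "animal", "?:S2"]]
clause = implication_to_clause(rule)
print(clause)
```

The sketch only covers the single-atom case shown in the "Bears are animals." example; premises built with "and" or quantifiers would need a full clausifier.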
--------
"property" permanent (big) and temporary (angry)
------
The schema predicate is "prop", with these arguments:
["prop",actual_property,object,strength of property (1 small / $generic not indicated / 3 strong), class of property (a la small bear): $generic if missing, context]
About context:
* In case the statement holds always, use a variable for the context, like ["prop","nice","c1_John","$generic","$generic","?:Ctxt"]
* The context structure will be extended in the future with more parameters
* The current context structure is ["$ctxt",past_pres_or_future (either "Past","Pres"),concrete_situation_number in past/present/future: separate enumerations]
"John is nice": ["prop","nice","c1_John","$generic","$generic",["$ctxt","Pres",1]]
"John is somewhat nice" ["prop","nice","c1_John",1,"$generic",["$ctxt","Pres",1]]
"John is very nice" ["prop","nice","c1_John",3,"$generic",["$ctxt","Pres",1]]
"John is a big mouse": ["prop","big","c1_John","$generic","mouse",["$ctxt","Pres",1]]
"John is a nice mouse": ["prop","nice","c1_John","$generic","$generic",["$ctxt","Pres",1]]
Notice that in the last example, "nice mouse", the word "nice" is not considered to be class-related. Only a fixed list
of property words such as "big" and "small" is considered to be class-related.
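The "prop" layout can be sketched as a small Python builder. This is only an illustration: the function names are hypothetical, and the CLASS_RELATED set is an assumption standing in for the fixed list of class-related property words, which these notes do not give in full.

```python
GENERIC = "$generic"

# Assumption: illustrative stand-in for the parser's fixed list of
# class-related property words; the real list is not given in these notes.
CLASS_RELATED = {"big", "small"}


def ctxt(tense="Pres", situation=1):
    """Build the current context structure ["$ctxt", tense, situation_number]."""
    return ["$ctxt", tense, situation]


def prop(word, obj, strength=GENERIC, noun_class=GENERIC, context=None):
    """Build a "prop" atom; the class is kept only for class-related words."""
    if word not in CLASS_RELATED:
        noun_class = GENERIC
    if context is None:
        context = ctxt()
    return ["prop", word, obj, strength, noun_class, context]


big_mouse = prop("big", "c1_John", noun_class="mouse")
nice_mouse = prop("nice", "c1_John", noun_class="mouse")
very_nice = prop("nice", "c1_John", strength=3)
print(big_mouse)
print(nice_mouse)
```

For a statement that always holds, a variable such as "?:Ctxt" can be passed as the context argument instead of a "$ctxt" structure.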
---------
"hasa" possessions and body parts
---------
The schema predicate is "rel2" with the first argument "have", for undetermined number of things
["rel2","have",object_having,what_does_it_have,context]
plus functions for single values, counted sets and measures:
["$theof1",class_of_object_haved,object_who_has,context]
and
["$count",["$setof",logic_expression,object_who_has]]
where logic_expression may contain pseudo-lambda-parameters $arg1,$arg2 etc
and
["$count",["$measure1",type_of_measure,object_measured,unit_of_measurement,context]
where "unit_of_measurement" may be "$generic" if not relevant, and the type words are limited,
currently: heavy,light, long,short, tall,short, wide,narrow, deep,shallow, warm,hot,cold, cost,cheap
Example for undetermined/unmeasured number of things:
"John has a car.":
First we make a formula:
[exists,[?:O3],[and,[isa,car,?:O3],[rel2,have,c1_John,?:O3,[$ctxt,Pres,1]]]]
and normalize it to
["isa","car","cs2"]
["rel2","have","c1_John","cs2",["$ctxt","Pres",1]]
Another example:
"Elephants have a trunk."
[forall,[?:S2],[[isa,elephant,?:S2],=>,[exists,[?:O1],[and,[isa,trunk,?:O1],[rel2,have,?:S2,?:O1,[$ctxt,Pres,1]]]]]]
and normalize it (observe skolemizing the "exists ?:O1" to ["cs1","?:S2"]):
["or", ["isa","trunk",["cs1","?:S2"]], ["-isa","elephant","?:S2"]]
["or",
["rel2","have","?:S2",["cs1","?:S2"],["$ctxt","Pres",1]],
["-isa","elephant","?:S2"],
["$block",["$","elephant",1],["$not",["rel2","have","?:S2",["cs1","?:S2"],["$ctxt","Pres",1]]]]],
A full example which illustrates that for questions and conditions we cannot pre-skolemize, but have to
use the formula with quantifiers before the final normalization:
"John has a red car. John has a car?":
[and
[prop,red,cs2,$generic,$generic,[$ctxt,Pres,1]]
[isa,car,cs2]
[rel2,have,c1_John,cs2,[$ctxt,Pres,1]]]
[[$def0],<=>,[exists,[?:O3],[and,[isa,car,?:O3],[rel2,have,c1_John,?:O3,[$ctxt,Pres,?:Fv5]]]]]
{ @question: [$def0] }
clausified
[
{"@logic": ["prop","red","cs2","$generic","$generic",["$ctxt","Pres",1]]},
{"@logic": ["isa","car","cs2"]},
{"@logic": ["rel2","have","c1_John","cs2",["$ctxt","Pres",1]]},
{"@logic": ["or", ["isa","car","cs3"], ["-$def0"]]},
{"@logic": ["or", ["rel2","have","c1_John","cs3",["$ctxt","Pres","?:Fv5"]], ["-$def0"]]},
{"@logic": ["or",
["$def0"],
["-isa","car","?:O3"],
["-rel2","have","c1_John","?:O3",["$ctxt","Pres","?:Fv5"]]]},
{"@question": ["$def0"]}
]
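To see why the question above should get a positive answer, here is a small Python sketch that matches the two negative literals of the last "$def0" clause against the ground facts. It is a plain one-way matcher written for illustration (match and satisfy are hypothetical helper names), not the resolution procedure a real prover would use.

```python
def is_var(term):
    """Variables are strings starting with '?:'."""
    return isinstance(term, str) and term.startswith("?:")


def match(pattern, fact, bindings=None):
    """One-way match: bind variables in pattern so that it equals fact."""
    bindings = dict(bindings or {})
    if is_var(pattern):
        if pattern in bindings:
            return bindings if bindings[pattern] == fact else None
        bindings[pattern] = fact
        return bindings
    if isinstance(pattern, list) and isinstance(fact, list):
        if len(pattern) != len(fact):
            return None
        for p, f in zip(pattern, fact):
            bindings = match(p, f, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == fact else None


def satisfy(goals, facts, bindings=None):
    """Find one set of bindings under which every goal matches some fact."""
    bindings = {} if bindings is None else bindings
    if not goals:
        return bindings
    for fact in facts:
        extended = match(goals[0], fact, bindings)
        if extended is not None:
            result = satisfy(goals[1:], facts, extended)
            if result is not None:
                return result
    return None


facts = [
    ["prop", "red", "cs2", "$generic", "$generic", ["$ctxt", "Pres", 1]],
    ["isa", "car", "cs2"],
    ["rel2", "have", "c1_John", "cs2", ["$ctxt", "Pres", 1]],
]
# The positive forms of the two negated literals in the last $def0 clause.
goals = [
    ["isa", "car", "?:O3"],
    ["rel2", "have", "c1_John", "?:O3", ["$ctxt", "Pres", "?:Fv5"]],
]
answer = satisfy(goals, facts)
print(answer)
```

Because ?:O3 stays a variable in the question clause, it can be bound to the constant cs2 introduced by the first sentence; a pre-skolemized question would instead ask about a fresh constant.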
Next the functional having:
[and
[isa,elephant,the_c1_elephant]
[rel2,have,the_c1_elephant,[$theof1,trunk,the_c1_elephant,[$ctxt,Pres,1]],[$ctxt,Pres,1]]
[isa,trunk,[$theof1,trunk,the_c1_elephant,[$ctxt,Pres,1]]]
[prop,heavy,[$theof1,trunk,the_c1_elephant,[$ctxt,Pres,1]],$generic,trunk,[$ctxt,Pres,1]]]
Next the countable having:
"John has three red cars":
[exists,[?:O1],
[and,[=,3,[$count,[$setof,[and,[isa,car,$arg1],
[prop,red,$arg1,$generic,$generic,[$ctxt,Pres,1]]],c1_John]]],
[=,3,[$count,?:O1],[$conf,1,False]],[prop,red,?:O1,$generic,$generic,[$ctxt,Pres,1]],
[isa,car,?:O1],
[rel2,have,c1_John,?:O1,[$ctxt,Pres,1]]]]
with the main statement there normalized as
["=",3,
["$count",["$setof",
["and",["isa","car","$arg1"],["prop","red","$arg1","$generic","$generic",["$ctxt","Pres",1]]],
"c1_John"]]]
Next the measurable having:
"Nile has the length 10 kilometers" or
"The length of Nile is 10 kilometers" etc
[and
[rel2,have,c1_Nile,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]],[$ctxt,Pres,1]]
[isa,length,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]]]
[=,10,[$count,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]]]]
[isa,kilometer,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]]]]
with the main statement there normalized as
["=",10,["$count",["$measure1","length","c1_Nile","kilometer",["$ctxt","Pres",1]]]]
----------
"capability" what can it do (verbs?)
------
The schema predicates are can1 and can2:
[can1, verb_which_can_do, subject_who_can, action_or_capability_id, context]
[can2, verb_which_can_do, subject_who_can, object_of_action, action_or_capability_id, context]
and closely related predicates act1 and act2 with the same arguments, for actually doing something:
[act1, verb_which_can_do, subject_who_can, action_or_capability_id, context]
[act2, verb_which_can_do, subject_who_can, object_of_action, action_or_capability_id, context]
NB! The action verb (eat), doer/subject, context and optionally object are present as arguments,
but location, helpers, qualities of action etc are indicated separately as properties
of the action/capability id.
Example for can1:
"John can fly":
[exists,[?:A3],[can1,fly,c1_John,?:A3,[$ctxt,?:Tense4,1]]]
which is normalized to
["can1","fly","c1_John","cs2",["$ctxt","?:Tense2",1]]
Another example for can1:
"Birds can fly."
[forall,[?:S1],[[isa,bird,?:S1],=>,[exists,[?:A2],[can1,fly,?:S1,?:A2,[$ctxt,?:Tense3,1]]]]]
which is normalized to
["or",
["-isa","bird","?:S1"],
["can1","fly","?:S1",["cs1","?:S1"],["$ctxt","?:Tense3",1]],
["$block",["$","bird",1],["$not",["can1","fly","?:S1",["cs1","?:S1"],["$ctxt","?:Tense3",1]]]]]},
Yet another example for can1:
"Penguins cannot fly."
[forall,[?:S1],[[isa,penguin,?:S1],=>,[not,[exists,[?:A2],[can1,fly,?:S1,?:A2,[$ctxt,?:Tense3,1]]]]]]
which is normalized to
["or",
["-isa","penguin","?:S1"],
["-can1","fly","?:S1","?:A2",["$ctxt","?:Tense3",1]],
["$block",["$","penguin",1],["can1","fly","?:S1","?:A2",["$ctxt","?:Tense3",1]]]]},
Example for can2:
"John can drive a car."
[and
[isa,car,the_c2_car,[$conf,1,False]]
[exists,[?:A2],[can2,drive,c1_John,the_c2_car,?:A2,[$ctxt,?:Tense3,1]]]]
which is normalized to
["isa","car","the_c2_car"]
["can2","drive","c1_John","the_c2_car","cs3",["$ctxt","?:Tense3",1]],
Another example for can2:
"Bears can eat honey."
[forall,[?:S2],[[isa,bear,?:S2],=>,[exists,[?:O1],[and,[isa,honey,?:O1],[exists,[?:A3],[can2,eat,?:S2,?:O1,?:A3,[$ctxt,?:Tense4,1]]]]]]]
which is normalized to
["or", ["isa","honey",["cs1","?:S2"]], ["-isa","bear","?:S2"]],
["or",
["can2","eat","?:S2",["cs1","?:S2"],["cs2","?:S2"],["$ctxt","?:Tense4",1]],
["-isa","bear","?:S2"],
["$block",["$","bear",1],["$not",["can2","eat","?:S2",["cs1","?:S2"],["cs2","?:S2"],["$ctxt","?:Tense4",1]]]]],
Full example illustrating properties of the action/verb:
"John can fly fast. John can fly?"
[
{"@logic": ["prop","fast","cs2","$generic","$generic",["$ctxt","?:Tense2",1]]},
{"@logic": ["can1","fly","c1_John","cs2",["$ctxt","?:Tense3",1]]},
{"@logic": ["or", ["-$def0"], ["can1","fly","c1_John","cs3",["$ctxt","cs4","?:Fv6"]]]},
{"@logic": ["or", ["$def0"], ["-can1","fly","c1_John","?:A4",["$ctxt","?:Tense5","?:Fv6"]]]},
{"@question": ["$def0"]}
]
NB! There is also actually _doing_ something:
"John drove the red car":
[and
[prop,red,the_c2_car,$generic,$generic,[$ctxt,Past,1]]
[isa,car,the_c2_car]
[exists,[?:A2],[act2,drive,c1_John,the_c2_car,?:A2,[$ctxt,Past,1]]]]
------------
"comparative" arity 3 subject bigger subject2
------------
The schema predicate is rel2_than for non-measurable and "=", "$less", "$lesseq", "$greater", "$greatereq" for measurable:
[rel2_than,property_compared,more_object,less_object,action_id,context]
["=", counted_measure1, counted_measure2]
where the "counter_measure" has the same structure/meaning as above for the "having" relation.
NB!! We should probably modify rel2_than to contain the somewhat/much distinction,
or add the distinction to the action id, or drop the action id.
Example for non-measurable comparison:
"John is nicer than Eve."
[exists,[?:A1],[rel2_than,nice,c2_John,c1_Eve,?:A1,[$ctxt,?:Tense2,1]]]
which is normalized to
["rel2_than","nice","c2_John","c1_Eve","cs3",["$ctxt","?:Tense2",1]]
Example for measurable:
"The length of Nile is equal to the length of Amazon."
which is normalized to
["rel2","have","c1_Nile",["$measure1","length","c1_Nile","$generic",["$ctxt","Pres",1]],["$ctxt","Pres",1]]
["isa","length",["$measure1","length","c1_Nile","$generic",["$ctxt","Pres",1]]]
["rel2","have","c2_Amazon",["$measure1","length","c2_Amazon","$generic",["$ctxt","Pres",1]],["$ctxt","Pres",1]]
["isa","length",["$measure1","length","c2_Amazon","$generic",["$ctxt","Pres",1]]]},
["=",["$count",["$measure1","length","c1_Nile","$generic",["$ctxt","Pres",1]]],["$count",["$measure1","length","c2_Amazon","$generic",["$ctxt","Pres",1]]]]
where probably only the last one is actually needed and the rest can be skipped.
---------------
"partof" membership
---------------
The schema predicate is rel2_of in combination with "part" or "rel2" in combination with "in"
["rel2_of","part",what_is_the_part,who_has_the_part,action_relation_id,ctxt]
["rel2","in",wha_is_in,in_what,context],
NB! Maybe the action_relation_id should be dropped, or maybe some sensible use can be found?
NB! Also, maybe a special relation should be created?
Example for "rel2_of"+"part":
"Trunks are a part of an elephant."
[forall,[?:S2],[[isa,trunk,?:S2],=>,[and,[isa,elephant,the_c1_elephant],
[exists,[?:A3],[rel2_of,part,?:S2,the_c1_elephant,?:A3,[$ctxt,?:Tense4,1]]]]]]
normalized to
["or",
["rel2_of","part","?:S2","the_c1_elephant",["cs2","?:S2"],["$ctxt","?:Tense4",1]],
["-isa","trunk","?:S2"],
["$block",["$","trunk",1],["$not",["rel2_of","part","?:S2","the_c1_elephant",["cs2","?:S2"],["$ctxt","?:Tense4",1]]]]],
Example for "rel2"+"in":
"Elephants contain trunks"
[forall,[?:S2],[[isa,elephant,?:S2],=>,[exists,[?:O1],[and,[isa,trunk,?:O1],
[rel2,in,?:O1,?:S2,[$ctxt,Pres,1]]]]]]
which is normalized to
["or",
["rel2","in",["cs1","?:S2"],"?:S2",["$ctxt","Pres",1]],
["-isa","elephant","?:S2"],
["$block",["$","elephant",1],["$not",["rel2","in",["cs1","?:S2"],"?:S2",["$ctxt","Pres",1]]]]],
---------
"subjectto" what happens to it (can include events)
------------
Have not thought about it: needs work asap.
------------
"location" where is it normally found
------------
For actual location the
schema is "rel2" in combination with "in","on","at","near","above","under":
["rel2",in_on_at_etc,object_in_location,object_where_is_located,context]
However, for typical location we should think a bit more, see below.
Example:
"John is in a room."
[and
[isa,room,the_c2_room]
[rel2,in,c1_John,the_c2_room,[$ctxt,Pres,1]]]
which is normalized to
["isa","room","the_c2_room"]
["rel2","in","c1_John","the_c2_room",["$ctxt","Pres",1]]
NB! I propose the typical generic location to be represented like this with a low probability and blocker attached:
[forall,[?:S1,?:Ctxt],[[isa,dog,?:S1],=>,[exists,[?:O2],[rel2,in,?:S1,?:O2,?:Ctxt]]]]
This latter thing is currently not properly implemented in the parser.
Alternative ideas are also welcome.
-----------------
These need thought, no clear ideas yet:
meta stuff.
mostly clear how these can connect events (X, Y)
mostly unclear how to combine e.g. causes and property
"causes" causes X
"prevents" Y prevents doing X
"dependency" X requires Y
"usedfor" subject is used for X
"createdby" subject is created by X
"madeof" subject is made of object (substance)
"have_goal" subject wants to do X / X to happen
------------
"time" X happens at time
-------------
Time is represented (a) in a context, (b) like location above, with words "in","at","on","during","before","after",
plus the "$time" constructor:
["rel2",in_at_etc,event,time_object,context]
where the time constructed element is used as a special typed variable:
[$time,type_of_time_indicator,time_indicator]
where the "type_of_time_indicator" can be "$generic".
Example:
"On Monday, John jumped in a house."
[exists,[?:A1],[and,[exists,[[$time,$generic,Monday]],
[rel2,on,?:A1,[$time,$generic,Monday],[$conf,1,False],[$ctxt,Past,1]]],
[exists,[?:O6],[and,[isa,house,?:O6,[$conf,1,False]],
[rel2,in,?:A1,?:O6,[$conf,1,False],[$ctxt,Past,1]]]],
[act1,jump,c1_John,[$conf,1,True],?:A1,[$ctxt,Past,1]]]]
which is normalized as
[rel2,on,cs2,[$time,$generic,Monday],[$ctxt,Past,1]],
[isa,house,cs3],
[rel2,in,cs2,cs3,[$ctxt,Past,1]],
[act1,jump,c1_John,cs2,[$ctxt,Past,1]],
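In the normalized facts above, the time and the place hang off the action id cs2 rather than being arguments of act1. A small Python sketch that collects everything attached to an action id (describe_action is a hypothetical helper, written only to illustrate this layout):

```python
def describe_action(action_id, facts):
    """Collect the act1/act2 fact and all rel2 modifiers of an action id."""
    result = {"action": None, "modifiers": []}
    for fact in facts:
        # In both act1 and act2 the action id is the argument before the context.
        if fact[0] in ("act1", "act2") and fact[-2] == action_id:
            result["action"] = fact
        # Modifiers are rel2 facts whose first object is the action id.
        elif fact[0] == "rel2" and fact[2] == action_id:
            result["modifiers"].append((fact[1], fact[3]))
    return result


past = ["$ctxt", "Past", 1]
facts = [
    ["rel2", "on", "cs2", ["$time", "$generic", "Monday"], past],
    ["isa", "house", "cs3"],
    ["rel2", "in", "cs2", "cs3", past],
    ["act1", "jump", "c1_John", "cs2", past],
]
jump = describe_action("cs2", facts)
print(jump["action"])
print(jump["modifiers"])
```

The same lookup works for act2 facts, since there too the action id is the argument just before the context.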
---------------
event roles
"event_type" stab
"event_actor" senators
"event_theme" Caesar
"event_method" brutally
"event_instrument" knife
"event_type_modifier" if type is go: go IN, go OUT, ...
--------------
These may need more thought, but for now we have:
* type,actor,theme are given as act1/act2 arguments, see above
* method and instrument are properties of the action id, indicated with "rel2"
in combination with the actual word like "with": what does the "with" mean,
needs additional reasoning rules or procedural derivation of new facts.
Example:
"Senators stabbed Caesar with a knife in curia"
is normalized as
[isa,senator,the_c2_senator],
[isa,knife,the_c3_knife],
[isa,curia,the_c4_curia],
[rel2,in,the_c3_knife,the_c4_curia,[$ctxt,Pres,1]],
[isa,knife,cs6],
[rel2,with,cs5,cs6,[$ctxt,Past,1]],
[act2,stab,the_c2_senator,c1_Caesar,cs5,[$ctxt,Past,1]],
Observe that the fact that there were several senators should be given,
but currently is not done for that example.
-----------
These need further thinking:
event meta
"event_parallel" X and Y are simultaneous
"event_after" Y happens after X
"event_content" Y is subevent of X (may be broken, other mixed use in db)
special use
"similar" semantic similarity
---------
I am attaching the current small ruleset I am using while debugging the parser:
it is intentionally small.
very high level commonsense rules
transitivity of "be"
symmetry of "similar"
inference using taxonomy of object (can leap |- can jump)
--------