Nlpsolver knowledge representation principles: draft
Source: Lambda

-------- taxonomy -------

The schema predicate is "isa", used like this:

["isa",class,object_in_class]

Note: no context is currently added to the "isa".

Examples:

"John is a man.":

["isa","man","c1_John"]

"Bears are animals.":

[["isa","bear","?:S2"], "=>", ["isa","animal","?:S2"]]

or, equivalently,

["or", ["-isa","bear","?:S2"], ["isa","animal","?:S2"]]

Note: variables are anything starting with ?:, but I use a readability-enhancing convention:

?:S subject
?:O object
?:A action
?:Tense tense (past/present)
?:Fv situation number for a given tense

-------- "property" permanent (big) and temporary (angry) ------

The schema predicate is "prop", with these arguments:

["prop",
 actual_property,
 object,
 strength of property (1 small / $generic not indicated / 3 strong),
 class of property (a la small bear): $generic if missing,
 context]

About context:

* In case the statement always holds, use a variable for the context, like
  ["prop","nice","c1_John","$generic","$generic","?:Ctxt"]
* The context structure will be extended in the future with more parameters
* The current context structure is
  ["$ctxt",past_pres_or_future (either "Past" or "Pres"),concrete_situation_number in past/present/future: separate enumerations]

"John is nice":

["prop","nice","c1_John","$generic","$generic",["$ctxt","Pres",1]]

"John is somewhat nice":

["prop","nice","c1_John",1,"$generic",["$ctxt","Pres",1]]

"John is very nice":

["prop","nice","c1_John",3,"$generic",["$ctxt","Pres",1]]

"John is a big mouse":

["prop","big","c1_John","$generic","mouse",["$ctxt","Pres",1]]

"John is a nice mouse":

["prop","nice","c1_John","$generic","$generic",["$ctxt","Pres",1]]

Notice that in the last example, "nice mouse", the "nice" is not considered to be class-related. Only a fixed list of property words like "big", "small", etc. is considered to be class-related.
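
The "isa" and "prop" schemas above can be sketched as small term builders. This is a minimal Python sketch, assuming terms are plain nested lists as in the JSON-style examples; the helper names (is_var, isa, ctxt, prop, implication_to_clause) are hypothetical and not part of Nlpsolver, and the implication rewrite only handles a single atomic premise.

```python
GENERIC = "$generic"


def is_var(term):
    """Variables are strings starting with '?:' (the convention described above)."""
    return isinstance(term, str) and term.startswith("?:")


def isa(cls, obj):
    """Build a taxonomy atom; no context argument, as noted in the text."""
    return ["isa", cls, obj]


def ctxt(tense, situation):
    """Build the current context structure: tense plus situation number."""
    return ["$ctxt", tense, situation]


def prop(name, obj, strength=GENERIC, cls=GENERIC, context="?:Ctxt"):
    """Build a property atom; a variable context means 'holds always'."""
    return ["prop", name, obj, strength, cls, context]


def implication_to_clause(premise, conclusion):
    """Rewrite [premise, "=>", conclusion] as ["or", -premise, conclusion].

    Assumption: the premise is a single positive atom, so negation is
    written by prefixing its predicate name with "-".
    """
    negated = ["-" + premise[0]] + premise[1:]
    return ["or", negated, conclusion]


bears = implication_to_clause(isa("bear", "?:S2"), isa("animal", "?:S2"))
big_mouse = prop("big", "c1_John", cls="mouse", context=ctxt("Pres", 1))
very_nice = prop("nice", "c1_John", strength=3, context=ctxt("Pres", 1))
```

Here bears, big_mouse and very_nice reproduce the "Bears are animals", "John is a big mouse" and "John is very nice" terms from the examples.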

--------- "hasa" possessions and body parts ---------

The schema predicate is "rel2" with the first argument "have", for an undetermined number of things:

["rel2","have",object_having,what_does_it_have,context]

plus functions for single values, counted sets and measures:

["$theof1",class_of_object_haved,object_who_has,context]

and

["$count",["$setof",logic_expression,object_who_has]]

where logic_expression may contain pseudo-lambda-parameters $arg1, $arg2 etc., and

["$count",["$measure1",type_of_measure,object_measured,unit_of_measurement,context]]

where "unit_of_measurement" may be "$generic" if not relevant, and the type words are limited, currently:

heavy,light, long,short, tall,short, wide,narrow, deep,shallow, warm,hot,cold,cost,cheap

Example for undetermined/unmeasured number of things: "John has a car.":

First we make a formula:

[exists,[?:O3],[and,[isa,car,?:O3],[rel2,have,c1_John,?:O3,[$ctxt,Pres,1]]]]

and normalize it to

["isa","car","cs2"]
["rel2","have","c1_John","cs2",["$ctxt","Pres",1]]

Another example: "Elephants have a trunk."

[forall,[?:S2],[[isa,elephant,?:S2],=>,[exists,[?:O1],[and,[isa,trunk,?:O1],[rel2,have,?:S2,?:O1,[$ctxt,Pres,1]]]]]]

and normalize it (observe skolemizing the "exists ?:O1" to ["cs1","?:S2"]):

["or", ["isa","trunk",["cs1","?:S2"]], ["-isa","elephant","?:S2"]]
["or", ["rel2","have","?:S2",["cs1","?:S2"],["$ctxt","Pres",1]], ["-isa","elephant","?:S2"],
       ["$block",["$","elephant",1],["$not",["rel2","have","?:S2",["cs1","?:S2"],["$ctxt","Pres",1]]]]]

A full example, which illustrates that for questions and conditions we cannot simply pre-skolemize, but have to use the formula with quantifiers before the final normalization:

"John has a red car. John has a car?":

[and
  [prop,red,cs2,$generic,$generic,[$ctxt,Pres,1]]
  [isa,car,cs2]
  [rel2,have,c1_John,cs2,[$ctxt,Pres,1]]]

[[$def0],<=>,[exists,[?:O3],[and,[isa,car,?:O3],[rel2,have,c1_John,?:O3,[$ctxt,Pres,?:Fv5]]]]]

{ @question: [$def0] }

clausified:

[
{"@logic": ["prop","red","cs2","$generic","$generic",["$ctxt","Pres",1]]},
{"@logic": ["isa","car","cs2"]},
{"@logic": ["rel2","have","c1_John","cs2",["$ctxt","Pres",1]]},
{"@logic": ["or", ["isa","car","cs3"], ["-$def0"]]},
{"@logic": ["or", ["rel2","have","c1_John","cs3",["$ctxt","Pres","?:Fv5"]], ["-$def0"]]},
{"@logic": ["or", ["$def0"], ["-isa","car","?:O3"], ["-rel2","have","c1_John","?:O3",["$ctxt","Pres","?:Fv5"]]]},
{"@question": ["$def0"]}
]

Next the functional having:

[and
  [isa,elephant,the_c1_elephant]
  [rel2,have,the_c1_elephant,[$theof1,trunk,the_c1_elephant,[$ctxt,Pres,1]],[$ctxt,Pres,1]]
  [isa,trunk,[$theof1,trunk,the_c1_elephant,[$ctxt,Pres,1]]]
  [prop,heavy,[$theof1,trunk,the_c1_elephant,[$ctxt,Pres,1]],$generic,trunk,[$ctxt,Pres,1]]]

Next the countable having: "John has three red cars":

[exists,[?:O1],
  [and,[=,3,[$count,[$setof,[and,[isa,car,$arg1],
                                  [prop,red,$arg1,$generic,$generic,[$ctxt,Pres,1]]],c1_John]]],
       [=,3,[$count,?:O1],[$conf,1,False]],[prop,red,?:O1,$generic,$generic,[$ctxt,Pres,1]],
       [isa,car,?:O1],
       [rel2,have,c1_John,?:O1,[$ctxt,Pres,1]]]]

with the main statement there normalized as

["=",3,
  ["$count",["$setof",
     ["and",["isa","car","$arg1"],["prop","red","$arg1","$generic","$generic",["$ctxt","Pres",1]]],
     "c1_John"]]]

Next the measurable having: "Nile has the length 10 kilometers" or "The length of Nile is 10 kilometers" etc.

[and
  [rel2,have,c1_Nile,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]],[$ctxt,Pres,1]]
  [isa,length,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]]]
  [=,10,[$count,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]]]]
  [isa,kilometer,[$measure1,length,c1_Nile,kilometer,[$ctxt,Pres,1]]]]

with the main statement there normalized as

["=",10,["$count",["$measure1","length","c1_Nile","kilometer",["$ctxt","Pres",1]]]]

---------- "capability" what can it do (verbs?) ------

The schema predicates are can1 and can2:

[can1, verb_which_can_do, subject_who_can, action_or_capability_id, context]
[can2, verb_which_can_do, subject_who_can, object_of_action, action_or_capability_id, context]

and closely related predicates act1 and act2 with the same arguments, for actually doing something:

[act1, verb_which_can_do, subject_who_can, action_or_capability_id, context]
[act2, verb_which_can_do, subject_who_can, object_of_action, action_or_capability_id, context]

NB! The action verb (eat), doer/subject, context and optionally object are present as arguments, but location, helpers, qualities of action etc. are indicated separately as properties of the action/capability id.

Example for can1: "John can fly":

[exists,[?:A3],[can1,fly,c1_John,?:A3,[$ctxt,?:Tense4,1]]]

which is normalized to

["can1","fly","c1_John","cs2",["$ctxt","?:Tense2",1]]

Another example for can1: "Birds can fly."

[forall,[?:S1],[[isa,bird,?:S1],=>,[exists,[?:A2],[can1,fly,?:S1,?:A2,[$ctxt,?:Tense3,1]]]]]

which is normalized to

["or", ["-isa","bird","?:S1"],
       ["can1","fly","?:S1",["cs1","?:S1"],["$ctxt","?:Tense3",1]],
       ["$block",["$","bird",1],["$not",["can1","fly","?:S1",["cs1","?:S1"],["$ctxt","?:Tense3",1]]]]]

Yet another example for can1: "Penguins cannot fly."

[forall,[?:S1],[[isa,penguin,?:S1],=>,[not,[exists,[?:A2],[can1,fly,?:S1,?:A2,[$ctxt,?:Tense3,1]]]]]]

which is normalized to

["or", ["-isa","penguin","?:S1"],
       ["-can1","fly","?:S1","?:A2",["$ctxt","?:Tense3",1]],
       ["$block",["$","penguin",1],["can1","fly","?:S1","?:A2",["$ctxt","?:Tense3",1]]]]

Example for can2: "John can drive a car."

[and
  [isa,car,the_c2_car,[$conf,1,False]]
  [exists,[?:A2],[can2,drive,c1_John,the_c2_car,?:A2,[$ctxt,?:Tense3,1]]]]

which is normalized to

["isa","car","the_c2_car"]
["can2","drive","c1_John","the_c2_car","cs3",["$ctxt","?:Tense3",1]]

Another example for can2: "Bears can eat honey."

[forall,[?:S2],[[isa,bear,?:S2],=>,[exists,[?:O1],[and,[isa,honey,?:O1],[exists,[?:A3],[can2,eat,?:S2,?:O1,?:A3,[$ctxt,?:Tense4,1]]]]]]]

which is normalized to

["or", ["isa","honey",["cs1","?:S2"]], ["-isa","bear","?:S2"]],
["or", ["can2","eat","?:S2",["cs1","?:S2"],["cs2","?:S2"],["$ctxt","?:Tense4",1]],
       ["-isa","bear","?:S2"],
       ["$block",["$","bear",1],["$not",["can2","eat","?:S2",["cs1","?:S2"],["cs2","?:S2"],["$ctxt","?:Tense4",1]]]]]

Full example illustrating properties of the action/verb: "John can fly fast. John can fly?"

[
{"@logic": ["prop","fast","cs2","$generic","$generic",["$ctxt","?:Tense2",1]]},
{"@logic": ["can1","fly","c1_John","cs2",["$ctxt","?:Tense3",1]]},
{"@logic": ["or", ["-$def0"], ["can1","fly","c1_John","cs3",["$ctxt","cs4","?:Fv6"]]]},
{"@logic": ["or", ["$def0"], ["-can1","fly","c1_John","?:A4",["$ctxt","?:Tense5","?:Fv6"]]]},
{"@question": ["$def0"]}
]

NB! There is also actually _doing_ something: "John drove the red car":

[and
  [prop,red,the_c2_car,$generic,$generic,[$ctxt,Past,1]]
  [isa,car,the_c2_car]
  [exists,[?:A2],[act2,drive,c1_John,the_c2_car,?:A2,[$ctxt,Past,1]]]]

------------ "comparative" arity 3 subject bigger subject2 ------------

The schema predicate is rel2_than for non-measurable comparisons and "=", "$less", "$lesseq", "$greater", "$greatereq" for measurable ones:

[rel2_than,property_compared,more_object,less_object,action_id,context]
["=", counted_measure1, counted_measure2]

where the "counted_measure" has the same structure/meaning as above for the "having" relation.

NB!! We should probably modify rel2_than to contain the somewhat/much distinction, or add the distinction to the action id, or drop the action id.
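
The two comparative forms can be sketched as term builders. This is a minimal Python sketch, assuming terms are plain nested lists; the helper names (ctxt, measure1, count, rel2_than, compare) are hypothetical and not part of Nlpsolver. It builds the terms for the two comparison examples that follow.

```python
GENERIC = "$generic"

# The comparison operators listed in the schema for measurable comparisons.
MEASURE_OPS = {"=", "$less", "$lesseq", "$greater", "$greatereq"}


def ctxt(tense, situation):
    """Build the context structure: tense plus situation number."""
    return ["$ctxt", tense, situation]


def measure1(kind, obj, unit, context):
    """Build a $measure1 term; unit may be "$generic" if not relevant."""
    return ["$measure1", kind, obj, unit, context]


def count(term):
    """Wrap a set or measure term in $count."""
    return ["$count", term]


def rel2_than(prop_name, more_obj, less_obj, action_id, context):
    """Non-measurable comparison: more_obj has more of prop_name than less_obj."""
    return ["rel2_than", prop_name, more_obj, less_obj, action_id, context]


def compare(op, left, right):
    """Measurable comparison between two counted measures."""
    if op not in MEASURE_OPS:
        raise ValueError("unknown comparison operator: %r" % (op,))
    return [op, left, right]


nicer = rel2_than("nice", "c2_John", "c1_Eve", "cs3", ctxt("?:Tense2", 1))

now = ctxt("Pres", 1)
same_length = compare(
    "=",
    count(measure1("length", "c1_Nile", GENERIC, now)),
    count(measure1("length", "c2_Amazon", GENERIC, now)),
)
```

Keeping the operator check in compare makes a misspelled operator fail early rather than producing a term the prover would silently ignore; that check is a design choice of this sketch, not something the draft prescribes.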

Example for non-measurable comparison: "John is nicer than Eve."

[exists,[?:A1],[rel2_than,nice,c2_John,c1_Eve,?:A1,[$ctxt,?:Tense2,1]]]

which is normalized to

["rel2_than","nice","c2_John","c1_Eve","cs3",["$ctxt","?:Tense2",1]]

Example for measurable: "The length of Nile is equal to the length of Amazon."

which is normalized to

["rel2","have","c1_Nile",["$measure1","length","c1_Nile","$generic",["$ctxt","Pres",1]],["$ctxt","Pres",1]]
["isa","length",["$measure1","length","c1_Nile","$generic",["$ctxt","Pres",1]]]
["rel2","have","c2_Amazon",["$measure1","length","c2_Amazon","$generic",["$ctxt","Pres",1]],["$ctxt","Pres",1]]
["isa","length",["$measure1","length","c2_Amazon","$generic",["$ctxt","Pres",1]]]
["=",["$count",["$measure1","length","c1_Nile","$generic",["$ctxt","Pres",1]]],
     ["$count",["$measure1","length","c2_Amazon","$generic",["$ctxt","Pres",1]]]]

where probably only the last one is actually needed and the rest can be skipped.

--------------- "partof" membership ---------------

The schema predicate is rel2_of in combination with "part", or "rel2" in combination with "in":

["rel2_of","part",what_is_the_part,who_has_the_part,action_relation_id,ctxt]
["rel2","in",what_is_in,in_what,context]

NB! Maybe the action_relation_id should be dropped, or maybe some sensible use can be found?
NB! Also, maybe a special relation should be created?

Example for "rel2_of"+"part": "Trunks are a part of an elephant."

[forall,[?:S2],[[isa,trunk,?:S2],=>,[and,[isa,elephant,the_c1_elephant],
  [exists,[?:A3],[rel2_of,part,?:S2,the_c1_elephant,?:A3,[$ctxt,?:Tense4,1]]]]]]

normalized to

["or", ["rel2_of","part","?:S2","the_c1_elephant",["cs2","?:S2"],["$ctxt","?:Tense4",1]],
       ["-isa","trunk","?:S2"],
       ["$block",["$","trunk",1],["$not",["rel2_of","part","?:S2","the_c1_elephant",["cs2","?:S2"],["$ctxt","?:Tense4",1]]]]]

Example for "rel2"+"in": "Elephants contain trunks"

[forall,[?:S2],[[isa,elephant,?:S2],=>,[exists,[?:O1],[and,[isa,trunk,?:O1],
  [rel2,in,?:O1,?:S2,[$ctxt,Pres,1]]]]]]

which is normalized to

["or", ["rel2","in",["cs1","?:S2"],"?:S2",["$ctxt","Pres",1]],
       ["-isa","elephant","?:S2"],
       ["$block",["$","elephant",1],["$not",["rel2","in",["cs1","?:S2"],"?:S2",["$ctxt","Pres",1]]]]]

--------- "subjectto" what happens to it (can include events) ------------

Have not thought about it: needs work asap.

------------ "location" where is it normally found ------------

For actual location the schema is "rel2" in combination with "in","on","at","near","above","under":

["rel2",in_on_at_etc,object_in_location,object_where_is_located,context]

However, for typical location we should think a bit more, see below.

Example: "John is in a room."

[and
  [isa,room,the_c2_room]
  [rel2,in,c1_John,the_c2_room,[$ctxt,Pres,1]]]

which is normalized to

["isa","room","the_c2_room"]
["rel2","in","c1_John","the_c2_room",["$ctxt","Pres",1]]

NB! I propose that the typical generic location be represented like this, with a low probability and a blocker attached:

[forall,[?:S1,?:Ctxt],[[isa,dog,?:S1],=>,[exists,[?:O2],[rel2,in,?:S1,?:O2,?:Ctxt]]]]

This latter form is currently not properly implemented in the parser. Alternative ideas are also welcome.

-----------------

These need thought, no clear ideas yet:

meta stuff.
mostly clear how these can connect events (X, Y)
mostly unclear how to combine e.g. causes and property

"causes" causes X
"prevents" Y prevents doing X
"dependency" X requires Y
"usedfor" subject is used for X
"createdby" subject is created by X
"madeof" subject is made of object (substance)
"have_goal" subject wants to do X / X to happen

------------ "time" X happens at time -------------

Time is represented (a) in a context, (b) like location above, with the words "in","at","on","during","before","after", plus the "$time" constructor:

["rel2",in_at_etc,event,time_object,context]

where the time-constructed element is used as a special typed variable:

[$time,type_of_time_indicator,time_indicator]

where the "type_of_time_indicator" can be "$generic".

Example: "On Monday, John jumped in a house."

[exists,[?:A1],[and,[exists,[[$time,$generic,Monday]],
                      [rel2,on,?:A1,[$time,$generic,Monday],[$conf,1,False],[$ctxt,Past,1]]],
                    [exists,[?:O6],[and,[isa,house,?:O6,[$conf,1,False]],
                                        [rel2,in,?:A1,?:O6,[$conf,1,False],[$ctxt,Past,1]]]],
                    [act1,jump,c1_John,[$conf,1,True],?:A1,[$ctxt,Past,1]]]]

which is normalized as

[rel2,on,cs2,[$time,$generic,Monday],[$ctxt,Past,1]],
[isa,house,cs3],
[rel2,in,cs2,cs3,[$ctxt,Past,1]],
[act1,jump,c1_John,cs2,[$ctxt,Past,1]]

--------------- event roles

"event_type" stab
"event_actor" senators
"event_theme" Caesar
"event_method" brutally
"event_instrument" knife
"event_type_modifier" if type is go: go IN, go OUT, ...

--------------

These may need more thought, but for now we have:

* type, actor, theme are given as act1/act2 arguments, see above
* method and instrument are properties of the action id, indicated with "rel2" in combination with the actual word like "with": what the "with" means needs additional reasoning rules or procedural derivation of new facts.

Example: "Senators stabbed Caesar with a knife in curia" is normalized as

[isa,senator,the_c2_senator],
[isa,knife,the_c3_knife],
[isa,curia,the_c4_curia],
[rel2,in,the_c3_knife,the_c4_curia,[$ctxt,Pres,1]],
[isa,knife,cs6],
[rel2,with,cs5,cs6,[$ctxt,Past,1]],
[act2,stab,the_c2_senator,c1_Caesar,cs5,[$ctxt,Past,1]]

Observe that the fact that there were several senators should be represented, but currently this is not done for that example.

-----------

These need further thinking:

event meta

"event_parallel" X and Y are simultaneous
"event_after" Y happens after X
"event_content" Y is subevent of X (may be broken, other mixed use in db)

special use

"similar" semantic similarity

---------

I am attaching the current small ruleset I am using while debugging the parser: it is intentionally small.

very high level commonsense rules
transitivity of "be"
symmetry of "similar"
inference using taxonomy of object (can leap |- can jump)

--------
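
To illustrate what inference using the taxonomy amounts to, here is a minimal Python sketch that applies a two-literal clause such as the "Bears are animals" rule to a ground fact. It is an illustration under stated assumptions, not the prover Nlpsolver uses: the function names (is_var, match, substitute, apply_rule) and the constant c1_Bruno are hypothetical, only one-way matching against ground facts is done (no full unification), and "$block" literals are not handled.

```python
def is_var(term):
    """Variables are strings starting with '?:'."""
    return isinstance(term, str) and term.startswith("?:")


def match(pattern, fact, binding):
    """One-way matching: extend binding so that pattern equals the ground fact.

    Returns the extended binding (a dict), or None if matching fails.
    """
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == fact else None
        extended = dict(binding)
        extended[pattern] = fact
        return extended
    if isinstance(pattern, list) and isinstance(fact, list):
        if len(pattern) != len(fact):
            return None
        for sub_pattern, sub_fact in zip(pattern, fact):
            binding = match(sub_pattern, sub_fact, binding)
            if binding is None:
                return None
        return binding
    return binding if pattern == fact else None


def substitute(term, binding):
    """Replace bound variables in term by their values."""
    if is_var(term):
        return binding.get(term, term)
    if isinstance(term, list):
        return [substitute(sub, binding) for sub in term]
    return term


def apply_rule(clause, fact):
    """Apply ["or", ["-pred", ...], positive_literal] to a ground fact.

    Returns the derived positive literal, or None if the fact does not
    match the negated literal.
    """
    _, negative, positive = clause
    premise = [negative[0][1:]] + negative[1:]  # strip the leading "-"
    binding = match(premise, fact, {})
    if binding is None:
        return None
    return substitute(positive, binding)


rule = ["or", ["-isa", "bear", "?:S2"], ["isa", "animal", "?:S2"]]
derived = apply_rule(rule, ["isa", "bear", "c1_Bruno"])
not_applicable = apply_rule(rule, ["isa", "man", "c1_John"])
```

With the rule and facts shown, derived is ["isa","animal","c1_Bruno"] and not_applicable is None.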