ITV0060 2014 archive
Code: ITV0060
NB! This is the 2014 archive, not the material collection of the currently running course
- 14 May: we go over the exam materials + the lecture on indexes.
- 16 May is the last official day for presenting the practical work, but ...
- 21 May is a reserve day for practical work + the lecture on indexes.
The exam has four questions in the form of small exercises: one from each of the
major topics below.
Exam times:
- 26. May, Monday: in room ICT-A2 at 10:00
- 3. June, Tuesday: in ICT-A2 at 10:00
Time, place, result
Semester: spring
Grading: exam
Lectures: every Wednesday 16:00-17:30, room ICT-A2
Practical work: Fridays on odd weeks (7 Feb, 21 Feb, ...) 12:00-13:30, room ICT-403
Practical work will give 40% and exam 60% of points underlying the final grade.
Focus
The main focus of the course is on KR (knowledge representation): how to represent nontrivial information in programs and databases, how to build and use indexes for efficient search through large sets of knowledge.
The course contains four blocks built on each other:
- Background and basics. Representing simple facts.
- Representing rules.
- Time, planning and uncertain knowledge.
- Indexes and search.
About half of the course themes are covered in this book.
Practical work
There are three labs: the first two are obligatory, the third is optional. Each lab has to be presented to the professor and to all students present at the lab session.
First lab 2014
The goal of the first lab is to write a software system able to scrape factual raw data about a person with a name given to the program.
Basically, search for web pages containing the name and extract relevant words from a list you create: words that are closer to the name and occur more often are more important, i.e. a better match. It may be a good idea to run some searches with interesting keywords already included. Also, whenever you do a search or pull a page, it is a good idea to store the search result / HTML source in a file, to avoid exhausting your search quota and wasting bandwidth and time.
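A minimal sketch of the two ideas above, not the required solution. The function names `cached_fetch` and `score_words`, the `fetch` callback, the cache directory, the window size and the 1/distance weighting are all assumptions made for illustration; any scoring that rewards closeness and frequency would do.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path


def cached_fetch(url, fetch, cache_dir="cache"):
    """Return the page source for url, calling fetch(url) only on a cache miss.

    fetch is a hypothetical callback (e.g. a wrapper around your HTTP library).
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = Path(cache_dir) / (digest + ".html")
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetch(url)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return html


def score_words(text, name, keywords, window=10):
    """Score each keyword found near the name.

    Every occurrence within `window` tokens of the name adds 1/distance,
    so closer and more frequent words get a higher score.
    """
    tokens = re.findall(r"\w+", text.lower())
    name_tokens = name.lower().split()
    n = len(name_tokens)
    # Token positions where the full name starts.
    name_positions = [i for i in range(len(tokens) - n + 1)
                      if tokens[i:i + n] == name_tokens]
    wanted = {k.lower() for k in keywords}
    scores = defaultdict(float)
    for pos in name_positions:
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + n + window)
        for j in range(lo, hi):
            if pos <= j < pos + n:
                continue  # skip the tokens of the name itself
            if tokens[j] in wanted:
                # Distance in tokens to the nearest end of the name.
                dist = pos - j if j < pos else j - (pos + n - 1)
                scores[tokens[j]] += 1.0 / dist
    return dict(scores)


if __name__ == "__main__":
    sample = ("Jane Doe is a professor of physics. "
              "The physics lab of Jane Doe studies lasers.")
    print(score_words(sample, "Jane Doe", ["professor", "physics", "lasers", "chef"]))
```

In the sample text "physics" occurs twice near the name and ends up with the highest score, while "chef" never occurs and gets no score at all.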
Deadline: recommended mid-March, absolutely latest end of March (after this there will be a penalty).
See also notes for KR lab 1.
Second lab 2014
The goal of the second lab is to write and use rules to categorize/tag people according to the data obtained during the first lab. A person should get a number of tags, each with a numerical indicator of how much we trust that the tag really applies to the person, plus (sometimes) a number indicating the degree to which the tag applies.
As a concrete scenario, imagine we want to show ads to people and choose the concrete ads based on their tags/categories.
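One possible shape for such rules, as a sketch only. The `Rule` class, the trigger-word format and the way trust values are combined (a noisy-or, which assumes the rules are independent evidence) are illustrative assumptions, not the prescribed design for the lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    tag: str                 # tag to assign, e.g. "academic"
    trigger_words: frozenset # words from lab 1 that make the rule fire
    trust: float             # assumed trust in the tag when the rule fires


def tag_person(word_scores, rules):
    """Return {tag: (trust, degree)} for one person.

    word_scores maps words found for the person to a relevance score.
    trust:  how sure we are that the tag applies at all.
    degree: how strongly it applies (here: capped sum of matched word scores).
    """
    tags = {}
    for rule in rules:
        matched = [w for w in rule.trigger_words if w in word_scores]
        if not matched:
            continue
        degree = min(1.0, sum(word_scores[w] for w in matched))
        old_trust, old_degree = tags.get(rule.tag, (0.0, 0.0))
        # Noisy-or: several independent rules for one tag raise the trust.
        trust = 1.0 - (1.0 - old_trust) * (1.0 - rule.trust)
        tags[rule.tag] = (trust, max(old_degree, degree))
    return tags


if __name__ == "__main__":
    rules = [
        Rule("academic", frozenset({"professor", "physics"}), 0.8),
        Rule("academic", frozenset({"university"}), 0.5),
        Rule("cook", frozenset({"chef"}), 0.9),
    ]
    scores = {"professor": 0.4, "physics": 0.3, "university": 0.2}
    print(tag_person(scores, rules))
```

For the ad scenario, one could then show an ad only when both the trust and the degree of the matching tag exceed chosen thresholds.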
See details for KR lab 2.
Third lab 2014
This lab is optional and will simply give as many points as lab 1 or 2 towards the final result and the grade: practical work 60% and exam 40%.
The goal of the lab is to use WordNet or a thesaurus to create an additional ruleset and to use it in addition to your own rules from lab 2.
Student ideas are also welcome, but they have to be agreed with Tanel first.
Lecture block 1: basics and representing simple facts.
Overview of the course. Background and basics. First lab.
Lecture materials:
- Intro lecture: declarative and procedural representations
- Core logic refresher.
- Details of the first lab.
Programming and databases. SQL: meaning and representation of facts
- Encoding data in programming languages.
- relation of plain data in databases to logic & representing complex structures in databases
Core ideas of non-relational databases, mostly RDF
Lecture materials:
HTML annotations. Microformats, microdata, RDFa
Lecture materials:
- HTML annotations, RDFa, RDFS (the RDFS part is covered during the next block)
Understand main parts (not part of exam):
- Google rich snippets
- wiki intro
- RDFa Lite and W3C RDFa primer
- open graph protocol (i.e. Facebook stuff)
- microformats
- portaalidekoosvoime.ppt or as pdf: portaalidekoosvoime.pdf
Data extraction from the web
not part of exam
- Overview
- Some details about Nell as a nice example.
Lecture block 2: representing rules
RDFS and logic
Understand RDFS:
Lecture material:
- RDFS: rdf schema and as ppt
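To make the link between RDFS and logic concrete, here is a small forward-chaining sketch over triples. It covers only two of the standard RDFS entailment rules (rdfs9 and rdfs11); the `ex:` names and the function name `rdfs_closure` are made up for illustration.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply two RDFS rules until nothing new can be derived.

    rdfs11: (A subClassOf B), (B subClassOf C)  =>  (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)        =>  (x type B)
    """
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))   # rdfs11: transitivity
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))       # rdfs9: type inheritance
        if new <= inferred:
            return inferred  # fixpoint reached
        inferred |= new


if __name__ == "__main__":
    facts = {
        ("ex:Student", SUBCLASS, "ex:Person"),
        ("ex:Person", SUBCLASS, "ex:Agent"),
        ("ex:mari", TYPE, "ex:Student"),
    }
    for triple in sorted(rdfs_closure(facts) - facts):
        print(triple)
```

Each rule reads as a plain implication in first-order logic, which is the sense in which an RDFS schema is a small rule set over RDF facts.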
Additional details (not part of exam):
Important KR languages
Not necessary for the exam.
Several languages:
- RDFS
- OWL
- KIF
- CL
- ontologies
- wordnet
- cyc
- restricted english systems
- frame systems
See also:
RDFS and OWL
Understand basics of OWL:
not part of exam:
OWL background: description logics (not part of exam)
- brief intro to description logic
- detailed course in description logics (not necessary for exam)
Start looking at interesting ontologies:
Restricted english
- Attempto restricted English: http://attempto.ifi.uzh.ch/site/pubs/
Attempto details not necessary for the exam.
Lecture block 3: time, planning and uncertain knowledge
Rules in planning and robotics
Lecture material:
See also (not part of exam):
Logic for uncertain knowledge
- nonmonotonic logic. Not necessary for exam.
- default logic. For exam: main material for default logic.
For the exam: you should be able to create and solve small examples with default logic.
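A small example of the kind meant above, as a sketch under strong simplifying assumptions: all formulas are literals (a leading "-" means negation), and every default is normal, i.e. of the form "prerequisite : conclusion / conclusion". The function names are illustrative. Under these restrictions, trying the defaults in every possible order and collecting the resulting belief sets yields the extensions.

```python
from itertools import permutations


def neg(literal):
    """Negate a literal written as 'p' or '-p'."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def extensions(facts, defaults):
    """Return the set of extensions of a literal-only normal default theory.

    defaults is a list of (prerequisite, conclusion) pairs, each read as
    'if prerequisite holds and conclusion is consistent, assume conclusion'.
    """
    found = set()
    for order in permutations(defaults):
        beliefs = set(facts)
        changed = True
        while changed:
            changed = False
            for prerequisite, conclusion in order:
                applicable = prerequisite in beliefs
                consistent = neg(conclusion) not in beliefs
                if applicable and consistent and conclusion not in beliefs:
                    beliefs.add(conclusion)
                    changed = True
        found.add(frozenset(beliefs))
    return found


if __name__ == "__main__":
    # Birds fly by default; an explicit "-flies" blocks the default.
    print(extensions({"bird"}, [("bird", "flies")]))
    print(extensions({"bird", "-flies"}, [("bird", "flies")]))
    # Nixon diamond: two conflicting defaults give two extensions.
    print(extensions({"quaker", "republican"},
                     [("quaker", "pacifist"), ("republican", "-pacifist")]))
```

The Nixon diamond shows the characteristic behaviour: whichever default is applied first blocks the other, so the theory has two extensions rather than one.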
Fuzzy and probabilistic logic
- Uncertain_prob_fuzzy.ppt: intro.
- Vienna_tanel_2.pdf: additional examples and combining.
For the exam: understand the differences between fuzzy and probabilistic logic and be able to present small examples.
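One small example of the difference, as a sketch. It assumes the common min/max/1-x fuzzy operators (other choices exist) and uses illustrative function names. Fuzzy logic is truth-functional: the degree of "A and B" is computed from the degrees of A and B alone. Probability is not: P(A and B) depends on how A and B are related, and the two marginals only give bounds.

```python
def fuzzy_and(a, b):
    """Fuzzy conjunction with the min operator."""
    return min(a, b)


def fuzzy_or(a, b):
    """Fuzzy disjunction with the max operator."""
    return max(a, b)


def fuzzy_not(a):
    """Fuzzy negation."""
    return 1.0 - a


def prob_and_independent(pa, pb):
    """P(A and B), valid only if A and B are independent."""
    return pa * pb


def prob_and_bounds(pa, pb):
    """Lower and upper bound for P(A and B) when the dependence is unknown."""
    return (max(0.0, pa + pb - 1.0), min(pa, pb))


if __name__ == "__main__":
    tall = 0.5  # "John is tall" to degree 0.5 / with probability 0.5
    # Fuzzy: "tall and not tall" still holds to degree 0.5.
    print(fuzzy_and(tall, fuzzy_not(tall)))
    # Probability: with only the marginals known, P(A and B) lies in a range.
    print(prob_and_bounds(tall, 1.0 - tall))
    # If the two events happened to be independent, it would be the product.
    print(prob_and_independent(tall, 1.0 - tall))
```

For the actual event "A and not A" the probability is exactly 0, since the two cannot both happen, whereas the fuzzy degree above stays at 0.5. A fuzzy degree measures how much a vague property applies; a probability measures how likely a crisp statement is to be true.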
Logic of belief and knowledge
- Ijcai93.pdf: overview.
For exam: understand referential transparency and core ideas about encoding belief and knowledge. Modal logic not necessary for the exam.
Lecture block 4: indexes and search
Indexes: intro and mainstream
Traditional database indexes incl B+ tree:
Hash indexes:
- wiki.
Bitmap indexes:
- wiki.
For exam: understand the core usage scenarios and be able to create small examples.
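A small example of the hash and bitmap cases, as a sketch with made-up data and illustrative function names. A hash index maps a value directly to the rows holding it, which suits equality lookups. A bitmap index keeps one bit vector per distinct value, which suits columns with few distinct values and queries combining several conditions with AND/OR. Python integers stand in for the bit vectors here.

```python
from collections import defaultdict

ROWS = [
    {"city": "Tallinn", "status": "student"},  # row 0
    {"city": "Tartu",   "status": "staff"},    # row 1
    {"city": "Tallinn", "status": "staff"},    # row 2
    {"city": "Tallinn", "status": "student"},  # row 3
]


def build_hash_index(rows, column):
    """Map each value of the column to the list of row positions holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


def build_bitmap_index(rows, column):
    """Map each value of the column to an int whose bit i is set for row i."""
    index = defaultdict(int)
    for pos, row in enumerate(rows):
        index[row[column]] |= 1 << pos
    return index


def bitmap_to_positions(bitmap):
    """List the row positions whose bit is set."""
    return [pos for pos in range(bitmap.bit_length()) if (bitmap >> pos) & 1]


if __name__ == "__main__":
    city_hash = build_hash_index(ROWS, "city")
    print(city_hash["Tartu"])

    city_bits = build_bitmap_index(ROWS, "city")
    status_bits = build_bitmap_index(ROWS, "status")
    # city = 'Tallinn' AND status = 'student' is a single bitwise AND.
    both = city_bits["Tallinn"] & status_bits["student"]
    print(bitmap_to_positions(both))
```

Neither structure helps with range queries such as "value between x and y"; that is the usage scenario where an ordered index like the B+ tree is the usual choice.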
Multi-field and geoindexes, fulltext and term indexes
- A few words about both multi-field indexes and geoindexes.
- Fulltext indexes: good intro and overview
- Term indexes: mccune paper
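For the fulltext case, the core structure is an inverted index: a map from each word to the set of documents containing it. The sketch below is deliberately minimal (lowercasing and word splitting only, no stemming, ranking or positions); the function names and sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return ids of documents containing every word of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    docs = {
        1: "Default logic and nonmonotonic reasoning",
        2: "Fuzzy logic and probabilistic logic",
        3: "Bitmap and hash indexes",
    }
    index = build_inverted_index(docs)
    print(search_all(index, "logic"))
    print(search_all(index, "fuzzy logic"))
```

A multi-word query is answered by intersecting the sets for its words, so the cost depends on the size of those sets rather than on the total number of documents.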
not part of exam:
Fancier term indexes
- Path indexes
- On from here.
NoSQL indexes
- Document bases
- Graph bases