KR 2020 homework part 1

Source: Lambda

As the source of geographical data we use Yago. Investigate parts of it and download what you need from its download page. The second important taxonomy source is WordNet: Yago appears to contain at least part of it, but you can also use WordNet directly in your system.

Yago is large. You do not need to incorporate all of Yago into your database: the geographical facts for, say, one country, together with the taxonomies, are enough.
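One way to cut the data down to a single country is a two-pass filter over subject/predicate/object triples, sketched below. The tab-separated layout, the predicate names `locatedIn` and `hasPopulation`, the function `select_country_facts` and the population figures are all made up for illustration; the real Yago dump files use their own formats and identifiers and will need a matching parser.

```python
import csv
import io

# Hypothetical sample in a simplified subject<TAB>predicate<TAB>object layout.
# Predicate names and population values are illustrative, not taken from Yago.
SAMPLE = (
    "Tallinn\tlocatedIn\tEstonia\n"
    "Tartu\tlocatedIn\tEstonia\n"
    "Riga\tlocatedIn\tLatvia\n"
    "Tallinn\thasPopulation\t400000\n"
    "Riga\thasPopulation\t600000\n"
)


def select_country_facts(lines, country):
    """Keep every fact whose subject is located directly in `country`."""
    triples = [tuple(row) for row in csv.reader(lines, delimiter="\t")]
    # Pass 1: collect the entities located in the chosen country.
    entities = {s for s, p, o in triples if p == "locatedIn" and o == country}
    # Pass 2: keep all facts about those entities, in their original order.
    return [(s, p, o) for s, p, o in triples if s in entities]


facts = select_country_facts(io.StringIO(SAMPLE), "Estonia")
for fact in facts:
    print(fact)
```

For a full dump the triples would not fit comfortably in a list; reading the file twice, once per pass, keeps memory use low.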

Your task is to

  • select a sensible subset of the geographical facts in Yago (for example, one country) and store it in an SQL database, using both standard SQL and, where it seems useful, JSON. PostgreSQL is the best option because of its special capabilities for handling JSON. SQLite is the second-best option because it is simple to use.
  • select a sensible subset of the Yago taxonomies and store it in the same SQL database.
  • investigate whether your Yago taxonomy contains the relevant parts of WordNet: if not, incorporate WordNet into your database as well.
  • perform some sample queries to verify that you can actually find information: try searching for simple facts, and also use the taxonomies to search for more abstract concepts.
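
The tasks above could be prototyped along the following lines with SQLite from Python. This is only a minimal sketch: the table names `fact` and `taxonomy`, the class names such as `populated_place`, and the helper `instances_of` are invented for illustration and do not come from Yago or WordNet. Irregular per-entity data is kept as JSON text in the `extra` column and decoded in Python here; in PostgreSQL a `jsonb` column could hold the same data and be queried directly.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact (subject TEXT, predicate TEXT, object TEXT, extra TEXT);
CREATE TABLE taxonomy (child TEXT, parent TEXT);
""")

# Facts: regular columns for the triple, JSON text for irregular extras.
conn.executemany("INSERT INTO fact VALUES (?, ?, ?, ?)", [
    ("Tallinn", "type", "capital_city", json.dumps({"alt_names": ["Reval"]})),
    ("Tartu", "type", "city", json.dumps({"alt_names": ["Dorpat"]})),
    ("Pirita", "type", "river", None),
])

# Taxonomy: one row per subclass edge (hypothetical class names).
conn.executemany("INSERT INTO taxonomy VALUES (?, ?)", [
    ("capital_city", "city"),
    ("city", "populated_place"),
    ("river", "body_of_water"),
])


def instances_of(conn, concept):
    """Return subjects typed as `concept` or as any of its subclasses."""
    rows = conn.execute("""
        WITH RECURSIVE sub(name) AS (
            SELECT ?
            UNION
            SELECT t.child FROM taxonomy AS t JOIN sub ON t.parent = sub.name
        )
        SELECT f.subject
        FROM fact AS f JOIN sub ON f.object = sub.name
        WHERE f.predicate = 'type'
        ORDER BY f.subject
    """, (concept,)).fetchall()
    return [row[0] for row in rows]


# Simple fact lookup: the directly stored type of one entity.
tallinn_type = conn.execute(
    "SELECT object FROM fact WHERE subject = ? AND predicate = 'type'",
    ("Tallinn",),
).fetchone()[0]

# Abstract query: walks the taxonomy downwards from the given concept.
places = instances_of(conn, "populated_place")

# JSON part: decode the stored text to reach the nested values.
extra_text = conn.execute(
    "SELECT extra FROM fact WHERE subject = ?", ("Tartu",)
).fetchone()[0]
tartu_alt_names = json.loads(extra_text)["alt_names"]

print(tallinn_type, places, tartu_alt_names)
```

The recursive common table expression is what lets a query for an abstract concept find entities that are only typed with a more specific class; the same `WITH RECURSIVE` form is also available in PostgreSQL.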