Teadmispõhise tarkvaraarenduse meetodid / Methods of Knowledge Based Software Development - 2015
Course code: ITI8600
(Course card in Estonian: ITI8600)
Language: The default language of the course is English, but if all students understand Estonian, it will be in Estonian
Lecturers:
- Tanel Tammet, tanel.tammet@ttu.ee, 6203457, TTÜ ICT-426
- Juhan Ernits, juhan.ernits@ttu.ee, 6202326, TTÜ ICT-428 (handles ÕIS registrations)
Past editions
This course will be offered for the first time in 2015. It is the result of combining three previously offered courses: Knowledge Search, Formalization and Storing; Principles of Artificial Intelligence; and Applied Logic.
Time, place, result
- Lectures: Mondays 17:45-19:15, SOC-211B, SOC-211C
- Labs: Tuesdays 17:45-19:15, ICT-401
- Exam:
- Monday, Jan 11, 10:00-12:30, U05-103. Reg deadline in ÕIS: Jan 8, 2016.
- Monday, Jan 18, 10:00-12:30, U05-103. Reg deadline in ÕIS: Jan 15, 2016.
Please register for the appropriate exam in ÕIS.
Grading
The final grade will be based on 40% of points from homework assignments and 60% of the result of an exam.
There will be four homework assignments, one for each block. Assignments will give up to 10 points each. In order to successfully pass the course, at least three homeworks must be successfully defended.
Homeworks can be done alone or in pairs. Pairs will be formed randomly by the lecturers, separately for each homework. As said, you can always opt to do it alone.
Homework has to be presented during lab time to the lecturer on site: email submissions are not accepted. Both pair members must be present during presentation: in case one of them is not present, the homework of the missing person is not considered to be defended. It is also not guaranteed that both pair members get the same grade.
The homeworks have to be submitted to the university git and then defended: git details will be presented later by Juhan.
Homework deadline policy:
- Code to be defended must be submitted at the latest one day before the defence deadline (example: defence deadline 22. Sept, submission 21. Sept).
- In case the homework is defended in time, you have one extra week to add missing details/improvements without losing points.
- In case the homework is not defended in time, you have two extra weeks to defend it, but in this case you will get only half the points.
- No homeworks are accepted after the two extra weeks after the deadline have passed.
- In order to be admitted to the exam, you have to successfully defend at least three of the four homeworks.
Grades and additional homework info available at https://ained.ttu.ee
Exam
Exam dates:
- January 11, U05-103
- January 18, U05-103
Q&A sessions:
- January 8, at 14:00, ICT-411
- January 15, at 14:00, ICT-411
The exam will contain one block of questions for each of the four blocks below. The exact requirements and materials to read are these:
Materials for search algorithms
The search algorithms block was based on the following chapters from the book Artificial Intelligence: A Modern Approach, 3rd Edition, by Stuart Russell and Peter Norvig. (The book is available in TUT library as [1] and [2]):
- Chapter 3: Solving problems by searching
- Chapter 4: Beyond classical search
- Chapter 5: Adversarial search
- Chapter 6: Constraint satisfaction problems
In particular, it will be necessary to be able to choose the best methods from the ones mentioned in those chapters for solving particular problems. In addition, it is necessary to be able to characterize the properties of these approaches in terms of relevant criteria (branching factor, time complexity, space complexity, completeness).
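As a small illustration of characterizing approaches by branching factor and depth, the sketch below counts the nodes generated by breadth-first search and by iterative deepening search on a uniform tree with branching factor b and solution depth d. The function names are made up for this example, and the counts assume that the root is not counted and that the goal is the last node generated at depth d.

```python
def bfs_nodes(b: int, d: int) -> int:
    """Nodes generated by breadth-first search down to depth d: b + b^2 + ... + b^d."""
    return sum(b ** i for i in range(1, d + 1))


def ids_nodes(b: int, d: int) -> int:
    """Nodes generated by iterative deepening search down to depth d.

    Nodes at depth i are regenerated once per iteration that reaches them,
    that is (d - i + 1) times: d*b + (d-1)*b^2 + ... + 1*b^d.
    """
    return sum((d - i + 1) * b ** i for i in range(1, d + 1))


b, d = 10, 5
print(bfs_nodes(b, d))  # 111110
print(ids_nodes(b, d))  # 123450
```

Both counts grow as O(b^d), so the repeated work of iterative deepening costs only a small constant factor, while its memory use stays linear in the depth.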
Materials for knowledge representation
The exam question will consist of a small exercise in encoding given knowledge, not theoretical questions.
In order to do the exercise you need to:
- Understand the relationships and be able to transform between "standard relational databases" aka SQL bases, RDF, RDFS and 1st order logic. Encoding SQL tables in RDF, encoding RDF in logic, encoding SQL queries in 1st order logic. Understand all of the first lecture notes as ppt and as pdf.
- Understand the basics of RDF and RDFS from RDF, RDFs, beginning of w3c RDF primer. Understand the basics of RDFa, see RDFa in wikipedia and beginning of w3c RDFa primer (read it while avoiding complex details). OWL and the other complex special rule languages are not required.
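One common way to encode an SQL table in RDF can be sketched as follows: each row becomes a subject, each column a predicate, and each cell value an object. Everything concrete here is an assumption made for illustration: the `person` table, the `http://example.org/` base URI, the URI scheme and the helper name `table_to_triples` are not taken from the lecture notes. Under this encoding, a triple (s, p, o) can in turn be read as the first order atom p(s, o).

```python
import sqlite3


def table_to_triples(conn, table, key_column, base="http://example.org/"):
    """Encode every row of `table` as (subject, predicate, object) triples.

    Table and column names are trusted here; this is a sketch, not safe
    against SQL injection. NULL cells produce no triple at all.
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{base}{table}/{record[key_column]}"
        # The table name plays the role of a class.
        triples.append((subject, "rdf:type", f"{base}{table}"))
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.append((subject, f"{base}{table}#{column}", value))
    return triples


# Hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Mari", "Tallinn"), (2, "Jaan", None)],
)
triples = table_to_triples(conn, "person", "id")
for triple in triples:
    print(triple)
```

Note that the missing city of the second row simply yields no triple, which is one of the visible differences between SQL NULL values and RDF.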
Materials for reasoning and deduction
Again, the exam question will consist of a small exercise in encoding given knowledge and performing derivations, not theoretical questions.
In order to do the exercise you need to:
- Be able to do small derivations in first order logic using the basic resolution method. Using equality and special strategies is not required. For an intro, use the chapters "Resolution", "The Resolution Procedure" and "The ANL Loop" in a book by Geoff to check out and understand simple derivation examples. Be able to encode RDF and RDFS rules in 1st order logic and to perform derivations, see the materials from the previous section.
- Understand the core ideas of nonmonotonic logic (avoid complex details here, just scan to get an overview). Understand the basics of default logic (be able to encode small problems and perform derivations).
- Understand the core ideas and principal differences between probabilistic and fuzzy logic, using this material (thorough details, encoding examples, formulas and calculations not required).
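The basic resolution method can be sketched for the ground (propositional) case as below. This is a simplification: first order resolution additionally needs unification, which is not shown. Clauses are sets of literal strings, a leading "-" marks negation, and the literal names are invented for the example. To prove a goal, its negation is added to the clause set and the empty clause is searched for.

```python
from itertools import combinations


def negate(literal):
    """Return the complementary literal: p <-> -p."""
    return literal[1:] if literal.startswith("-") else "-" + literal


def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    return [
        frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
        for lit in c1
        if negate(lit) in c2
    ]


def is_tautology(clause):
    return any(negate(lit) in clause for lit in clause)


def refutes(clauses):
    """Return True if the empty clause is derivable, i.e. the set is unsatisfiable."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for resolvent in resolvents(c1, c2):
                if not resolvent:
                    return True  # empty clause: contradiction found
                if not is_tautology(resolvent):
                    new.add(resolvent)
        if new <= known:
            return False  # saturated without finding a contradiction
        known |= new


# man(socrates), man(socrates) -> mortal(socrates), negated goal -mortal(socrates)
kb = [{"man_socrates"}, {"-man_socrates", "mortal_socrates"}]
print(refutes(kb + [{"-mortal_socrates"}]))  # True
```

The loop saturates the clause set level by level, which is enough for small exercises; real provers use a given-clause loop and subsumption to keep the search manageable.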
Materials for learning
The learning exam questions will be based on the following materials:
- Chapter 18: Learning from examples. In Artificial Intelligence, a Modern Approach, 3rd Edition
- Chapter 3 (3.1, 3.2, 3.5): Linear regression. In Introduction to Statistical Learning (ISL, http://www-bcf.usc.edu/~gareth/ISL/) by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
- Chapter 4 (4.1, 4.2, 4.3, 4.6.5) Logistic regression, comparison with K-nearest neighbours, in ISL
- Chapter 5 (5.1, 5.2) Cross-Validation, Bootstrap. In ISL
- Chapter 8 (8.1, 8.2) Decision trees, Bagging, Random Forests, Boosting. In ISL
- Chapter 9 (9.1, 9.2 (9.3-5)) Support vector machines. In ISL
- Perceptron and neural networks (based on the lecture slides)
Course structure
The course will consist of four interconnected blocks covering crucial areas of the subject:
Search algorithms
Homework is available in Moodle. To log in you will need to use your TUT e-mail account in Office 365. The groups have been assigned to participants who registered for the course in Moodle on September 14. Homework defence deadline: 22. September.
NB! Consultation / Q&A session 1: September 17, 17.00-19.00 in room ICT-411. If the room is empty, please call 2326 from the door phone of the department of CS.
- Tree search, graph search, formulating problems to be solved by search (recap of what you know)
- Depth first search, breadth first search, depth limited search, iterative deepening search; and Heuristic search (A*), properties of heuristic functions
- Stochastic local search; and Constraint satisfaction problems
- Adversarial search (games, minimax, alpha-beta pruning)
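As a reminder of how heuristic search fits together, here is a minimal A* sketch on a 4-connected grid with unit step costs. The grid, the function name and the choice of the Manhattan distance as heuristic are assumptions made for this example; the Manhattan distance is admissible and consistent in this setting, so the first time the goal is popped its cost is optimal.

```python
import heapq


def astar(grid, start, goal):
    """Return the length of a shortest path from start to goal, or None.

    grid is a list of equal-length strings where '#' marks a wall;
    cells are (row, column) tuples.
    """
    def h(cell):
        # Manhattan distance to the goal.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    best_g = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue  # stale queue entry, a cheaper path was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_g = g + 1
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None


GRID = [
    "..#.",
    "..#.",
    "....",
]
print(astar(GRID, (0, 0), (0, 3)))  # 7
```

Replacing `h` with a function that always returns 0 turns the same code into uniform-cost search, which is a convenient way to compare the number of expanded nodes.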
Knowledge representation
Knowledge representation homework
Homework defence deadline: 3. November. Presentation after this deadline will give half of the points. Absolute deadline: 1. December.
Work should be submitted to git at the latest one day before the deadline.
New groups and repositories are available.
Useful in-depth material for reading as free pdf-s:
- Knowledge Representation and Reasoning
- Handbook of Knowledge Representation
- Interesting to browse: recent conference proceedings
- Interesting to browse: course materials, course materials
Block structure:
Intro: SQL, logic and RDF.
First lecture as ppt and as pdf
Natural language and restricted natural language
Lecture intro as ppt and as pdf
See also about IBM Watson
RDFa, RDFs, OWL and rules
Lecture intro as ppt and as pdf
- Start with the wikipedia take on RDF and RDFs
- RDFa in wikipedia and w3c RDFa primer
- Continue with w3c RDF primer which contains also RDFs aka RDF schema.
- OWL: OWL wikipedia, w3c OWL primer
Context, metainformation and rules.
- Other important KR languages: RIF, Common Logic, RuleML, TPTP language, KIF
Reasoning and deduction
Automated reasoning homework 2015
Homework defence deadline: 1. December. No presentations accepted after 8. December.
Useful books for reading:
- T.Tamme, T.Tammet, R.Prank. Loogika: mõtlemisest tõestamiseni. TÜ Kirjastus, 2002
- Geoff reasoning course notes
Test and compare simple propositional solver algorithms:
Block structure:
First order logic solvers
- one page about predicate logic provers
- intro for the propositional case
- wiki intro to the predicate logic case
- a book by Geoff created from course notes: best content IMHO despite weird formatting.
- Otter by McCune: use it for experimenting.
- tiny examples of problems for otter
Propositional solvers
- Read about propositional solvers
- Use http://logictools.org/ to experiment with random problems of various sizes and solver algorithms
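To get a feeling for what a simple propositional solver does before experimenting, here is a minimal DPLL sketch with unit propagation. It is not the implementation used at logictools.org. Clauses are lists of nonzero integers in the DIMACS style, where a negative number stands for a negated variable; the function names are made up for this example.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying assignment {variable: bool} or None if unsatisfiable.

    The returned assignment may be partial: unmentioned variables are free.
    """
    assignment = dict(assignment or {})

    # Simplify: drop satisfied clauses and remove false literals.
    simplified = []
    for clause in clauses:
        kept, satisfied = [], False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                kept.append(lit)
        if satisfied:
            continue
        if not kept:
            return None  # every literal of this clause is false: conflict
        simplified.append(kept)
    if not simplified:
        return assignment  # all clauses satisfied

    # Unit propagation: a one-literal clause forces its value.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    if unit is not None:
        assignment[abs(unit)] = unit > 0
        return dpll(simplified, assignment)

    # Otherwise branch on the first literal of the first clause.
    lit = simplified[0][0]
    for value in (lit > 0, not lit > 0):
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None


def satisfies(clauses, model):
    """Check that every clause has at least one literal made true by model."""
    return all(any(model.get(abs(l)) == (l > 0) for l in c) for c in clauses)


problem = [[1, 2], [-1, 3], [-2, -3]]
model = dpll(problem)
print(model, satisfies(problem, model))
```

The branching rule here is deliberately naive; trying other variable selection heuristics on random problems of growing size is one way to compare solver behaviour.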
SMT solvers
- Intro to SMT from wikipedia
- SMT tutorial slides (8 first slides were shown in the lecture)
- Z3 tutorial in Python, links to binaries etc as compiled by Juhan Ernits.
- The official binary releases of Z3 now include Java support. The example is available here.
- The sample code built during the lecture is available in Moodle.
Logic for uncertain knowledge
- nonmonotonic logic: long intro to main concepts
- default logic: main material for this lecture.
- belief and knowledge
Interesting to try out:
- dlv system for answer set programming: usable for implementing default logic
Things we looked at before:
- Uncertain_prob_fuzzy.ppt Intro to probabilistic and fuzzy logic.
- Vienna_tanel_2.pdf Additional examples and combining.
Learning
- Nov 23: Decision trees
- Nov 30: Linear regression. Chapter 3 of Introduction to Statistical Learning
- Dec 1: Lab: Cross validation, notes from lab.
- Dec 7: Logistic regression, introduction to SVMs, ensemble methods
- Dec 14: Neural networks, Summary
- Dec 15: GridSearch example (added to the end of the crossvalidation lab).
Homework task
(3 best results of 4 homeworks will be taken into account in marking.)
The task is to sign up to the "What's Cooking" challenge at Kaggle, download the data and evaluate at least 3 different learning algorithms in Scikit learn on that data. You should submit the appropriate scripts and a report of the results with explanation. NB! It is necessary to run cross-validation in all cases and to explain why you think one approach performs better than another.
It is encouraged but not mandated to submit your best result to Kaggle.
Homework defence deadline: 15. December.
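The idea of comparing learning algorithms by cross-validation can be sketched without any library as below. This is a standard-library illustration only, not the homework solution: the toy data set is invented, the two "algorithms" (a majority-class baseline and 1-nearest neighbour) and all function names are assumptions, and in Scikit learn the analogous helper is `sklearn.model_selection.cross_val_score`.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(fit, X, y, k=5):
    """Mean accuracy over k folds; fit(X_train, y_train) returns a predict(x) function."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        X_train = [x for i, x in enumerate(X) if i not in held_out]
        y_train = [t for i, t in enumerate(y) if i not in held_out]
        predict = fit(X_train, y_train)
        correct = sum(predict(X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


def fit_majority(X_train, y_train):
    """Baseline: always predict the most frequent training label (ties go to the smallest label)."""
    majority = max(sorted(set(y_train)), key=y_train.count)
    return lambda x: majority


def fit_1nn(X_train, y_train):
    """1-nearest neighbour with squared Euclidean distance."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xt)) for xt in X_train]
        return y_train[dists.index(min(dists))]
    return predict


# Invented toy data: two well separated clusters, classes interleaved.
X = [(0, 0), (5, 5), (0, 1), (5, 6), (1, 0), (6, 5), (1, 1), (6, 6), (0, 2), (5, 7)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

print(cross_val_accuracy(fit_majority, X, y, k=5))  # 0.5
print(cross_val_accuracy(fit_1nn, X, y, k=5))       # 1.0
```

On real data the folds should be shuffled (and usually stratified) before splitting, since contiguous folds only work here because the classes are interleaved.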
Additional reading
Introduction to Statistical Learning, a freely available book. The Elements of Statistical Learning, a freely available book including a bit more advanced treatment of topics in machine learning.