Data Mining (ITI8730)
Information for prospective students: Up-to-date information about the course will be added to this page by 31.08.2023. The slides below are from the previous year. Testing procedures will change! The course is open to students with a valid TalTech UniID and targets students of M.Sc. curricula. Students are expected to be familiar with calculus, linear algebra, probability, and statistics, and to have basic to intermediate knowledge of at least one programming language.
This course is not recommended for students of B.Sc. curricula.
Fall 2022/2023
ITI8730: Data Mining and network analysis
Old code for this course is IDN0110
Taught by: Sven Nõmm
EAP: 6.0
Lectures: Tuesdays
Labs (practices):
Link to join MS Teams
Consultation: by appointment only. Please do not hesitate to ask for an appointment!
For communication please use the following e-mail: sven.nomm@taltech.ee
Prerequisites to join the course
Students are expected to be familiar with the foundations of calculus, linear algebra, probability theory, and statistics, and to know at least one programming language.
Overview
The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around four "super problems" of data mining:
- Clustering
- Classification
- Association pattern mining
- Outlier analysis
Main topics of the course:
- Data types and Data Preparation
- Similarity and Distances, Association Pattern Mining,
- Cluster Analysis, Classification, Outlier analysis
- Data streams, Text Data, Time Series, Discrete Sequences,
- Spatial Data, Graph Data, Web Data, Social Network Analysis
Evaluation
- 2x mandatory open book tests. Each test gives 10% of the final grade. One make-up attempt for each test.
- 3x mandatory home assignments (computational assignment + short write-up). Each assignment gives 10% of the final grade. Late assignments (submitted after the deadline) are accepted with a penalty of 10% for each day late, except Saturdays and Sundays.
- Final exam (gives 50% of the final grade): written report on an assigned topic + discussion with the lecturer.
Exam prerequisites: both tests are accepted (graded 51 or higher) and all 3 home assignments are accepted (graded 51 or higher).
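The weights and rules above can be illustrated with a short Python sketch. It is not an official grading tool. It assumes that all scores are on a 0-100 scale and that the late penalty means 10 percentage points deducted per late weekday; neither is stated explicitly on this page. All function names are hypothetical.

```python
from datetime import date, timedelta

# Weights in percent, as listed in the Evaluation section.
TEST_WEIGHT = 10        # each of the 2 tests
ASSIGNMENT_WEIGHT = 10  # each of the 3 home assignments
EXAM_WEIGHT = 50        # final exam
PASS_MARK = 51          # minimum score for a test or assignment to be accepted


def late_weekdays(deadline: date, submitted: date) -> int:
    """Count days after the deadline up to the submission date,
    skipping Saturdays and Sundays."""
    count = 0
    day = deadline
    while day < submitted:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            count += 1
    return count


def penalised_score(raw_score: float, deadline: date, submitted: date) -> float:
    """Assumption: 10 points (on a 0-100 scale) are deducted per late weekday."""
    return max(0, raw_score - 10 * late_weekdays(deadline, submitted))


def exam_allowed(tests: list, assignments: list) -> bool:
    """Exam prerequisites: every test and assignment graded 51 or higher."""
    return all(score >= PASS_MARK for score in tests + assignments)


def final_grade(tests: list, assignments: list, exam: float) -> float:
    """Weighted sum: 2 x 10% + 3 x 10% + 50% = 100%."""
    if len(tests) != 2 or len(assignments) != 3:
        raise ValueError("expected 2 test scores and 3 assignment scores")
    total = (TEST_WEIGHT * sum(tests)
             + ASSIGNMENT_WEIGHT * sum(assignments)
             + EXAM_WEIGHT * exam)
    return total / 100
```

For example, under these assumptions an assignment due on a Friday and submitted the following Monday counts as one day late, because Saturday and Sunday are skipped.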
Home assignments, code examples, data files, and useful links will be distributed via the Moodle environment. The course enrollment process in Moodle is TBA.
Lectures
Week 1 30.08.22 Distance function
Week 2 06.09.22 Cluster analysis I
Week 3 13.09.22 Cluster analysis II
Week 4 20.09.22 Cluster analysis III
Week 5 27.09.22 Outlier analysis
Week 6 04.10.22 Classification I
Week 7 11.10.22 Classification II
Week 8 18.10.22 Regression
Week 9 25.10.22 Association Pattern mining
Week 9 27.10.22 Open book test I
Week 10 01.11.22 Distance and Similarity II
Week 11 08.11.22 Mining the Time series
Week 11 08.11.22 Mining data streams
Week 12 15.11.22 Text data mining
Week 13 22.11.22 Graph data mining and Social analysis
Home assignment 3 will be published on 24.11.2022
Make-up test 1 26.12.22
Week 14 29.11.22 Privacy preserving data mining
Week 15 06.12.22 and 08.12.22 Online practices devoted to text data mining and graph data mining
Note that practice material is available only in the TalTech Moodle environment!
Week 16 13.12.22 Open book test 2
A make-up test, if necessary, will be given on 15.12.2022.