Data Mining (ITI8730)

Source: Kursused
Revision as of 5 September 2022, 14:36, by user Sven

Fall 2022/2023

ITI8730: Data Mining and network analysis

Old code for this course is IDN0110

Taught by: Sven Nõmm

EAP: 6.0

Lectures: Tuesdays 16:30 - 18:00 ICT-315

Labs (practices): Thursdays 14:00 - 15:30 ICT-401

Link to join MS Teams. It is advisable to use the MS Teams client application and log in with a TalTech account.


Consultation: by appointment only. Please do not hesitate to ask for an appointment. For communication, please use the following e-mail: sven.nomm@taltech.ee

Prerequisites to join the course

Students are expected to be familiar with the foundations of calculus, linear algebra, probability theory and statistics, and to know at least one programming language.

Overview

The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around the four "super problems" of data mining:

  • Clustering
  • Classification
  • Association pattern mining
  • Outlier analysis

Main topics of the course:

  • Data types and Data Preparation
  • Similarity and Distances, Association Pattern Mining
  • Cluster Analysis, Classification, Outlier Analysis
  • Data Streams, Text Data, Time Series, Discrete Sequences
  • Spatial Data, Graph Data, Web Data, Social Network Analysis

Evaluation

  • 2x mandatory open book tests. Each test gives 10% of the final grade. One make-up attempt is allowed for each test.
  • 3x mandatory home assignments (computational assignment + short write-up). Each assignment gives 10% of the final grade. Late assignments (submitted after the deadline) are accepted with a penalty of 10% for each day late, not counting Saturdays and Sundays.
  • Final exam (gives 50% of the final grade): written report on an assigned topic + discussion with the lecturer.

Exam prerequisites: both open book tests are accepted (graded 51 or higher) and all 3 home assignments are accepted (graded 51 or higher).
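The arithmetic behind the evaluation rules above can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function names are hypothetical, scores are assumed to be on a 0-100 scale, the late penalty is assumed to be deducted as a fraction of the earned score, and a late day is assumed to be any weekday after the deadline date up to and including the submission date. The course page does not specify these details.

```python
from datetime import date, timedelta


def late_weekdays(deadline: date, submitted: date) -> int:
    """Count days after the deadline up to the submission date,
    skipping Saturdays and Sundays (assumed interpretation)."""
    days = 0
    current = deadline + timedelta(days=1)
    while current <= submitted:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
        current += timedelta(days=1)
    return days


def penalized_score(raw_score: float, deadline: date, submitted: date) -> float:
    """Apply a 10% penalty per late weekday, never going below zero.
    Assumption: the penalty is a fraction of the earned score."""
    penalty = 0.10 * late_weekdays(deadline, submitted)
    return max(0.0, raw_score * (1.0 - penalty))


def exam_allowed(tests, assignments) -> bool:
    """Exam prerequisite: every test and assignment graded 51 or higher."""
    return all(score >= 51 for score in list(tests) + list(assignments))


def final_grade(tests, assignments, exam: float) -> float:
    """Weighted sum: 2 tests at 10% each, 3 assignments at 10% each, exam 50%."""
    if len(tests) != 2 or len(assignments) != 3:
        raise ValueError("expected 2 test scores and 3 assignment scores")
    return 0.10 * sum(tests) + 0.10 * sum(assignments) + 0.50 * exam


if __name__ == "__main__":
    # Hypothetical dates: deadline on a Friday, submission the following Monday.
    score = penalized_score(80, date(2022, 10, 7), date(2022, 10, 10))
    print(score)
    print(exam_allowed([60, 70], [score, 90, 100]))
    print(final_grade([60, 70], [score, 90, 100], exam=75))
```

With these assumptions, a submission made on the Monday after a Friday deadline counts as one day late, because the weekend is skipped.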

Home assignments, code examples, data files and useful links will be distributed through the Moodle environment. The Moodle course enrollment process is TBA.

Lectures

Week 1 30.08.22 Distance function

Slides


Week 2 06.09.22 Cluster analysis I

Slides


Week 3 13.09.22 Cluster analysis II

Week 4 20.09.22 Cluster analysis III

Week 5 27.09.22 Outlier analysis

Week 6 04.10.22 Classification I

Week 7 11.10.22 Classification II

Week 8 18.10.22 Regression

Week 9 25.10.22 Association Pattern mining

Week 9 TBC 27.10.22 Open book test I

Week 10 TBC 01.11.22 Distance and Similarity II

Week 10 TBC 03.11.22 Mining the Time series

Week 11 TBC 08.11.22 Mining data streams

Week 12 TBC 15.11.22 Text data mining

Week 13 TBC 22.11.22 Graph data mining

Week 14 TBC 29.11.22 Social networks analysis

Week 15 TBC 06.12.22 Privacy preserving data mining

Week 16 TBC 13.12.22 Open book test II