Data Mining (ITI8730)

Revision as of 16 December 2021, 09:31, by user Sven

Fall 2021/2022

ITI8730: Data Mining and network analysis

The old code for this course is IDN0110

Taught by: Sven Nõmm

EAP: 6.0

Lectures: Tuesdays 14:00 - 15:30 SOC-414

Labs (practices): Thursdays 16:00 - 17:30 ICT-403

Link to join MS Teams: https://teams.microsoft.com/l/channel/19%3a2PRNmKxRN9GR2oG68vo3_-25RYYTxAbZrA5dJ0YfoAA1%40thread.tacv2/General?groupId=9dde2da7-1d60-49a4-ac28-2d065338e369&tenantId=3efd4d88-9b88-4fc9-b6c0-c7ca50f1db57

It is advisable to use the MS Teams client application and to log in with a TalTech account.

To join the course in TalTech Moodle, please use the code "UseR!!!"

Consultation: by appointment only. Please do not hesitate to ask for an appointment! For communication, please use the following e-mail: sven.nomm@taltech.ee

Prerequisites to join the course

Students are expected to be familiar with the foundations of calculus, linear algebra, probability theory and statistics, and to know at least one programming language.

Overview

The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around four "super problems" of data mining:

  • Clustering
  • Classification
  • Association pattern mining
  • Outlier analysis

Main topics of the course:

  • Data types and Data Preparation
  • Similarity and Distances, Association Pattern Mining
  • Cluster Analysis, Classification, Outlier Analysis
  • Data Streams, Text Data, Time Series, Discrete Sequences
  • Spatial Data, Graph Data, Web Data, Social Network Analysis

Evaluation

  • Three mandatory closed-book tests. Each test gives 10% of the final grade. One make-up attempt is allowed for each test.
  • Three mandatory home assignments (computational assignment + short write-up). Each assignment gives 10% of the final grade. Late assignments (submitted after the deadline) are accepted with a penalty of 10% for each day, except Saturdays and Sundays.
  • Final exam (gives 40% of the final grade): written report on an assigned topic + discussion with the lecturer.

Exam prerequisites: all three closed-book tests and all three home assignments must be accepted (graded 51 or higher).
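
As an illustration of the arithmetic in the evaluation rules, here is a minimal Python sketch. It is not an official grade calculator. The function names, the 0-100 scale for each component, and the reading of the late penalty as 10 percentage points per counted day are all assumptions made for this example; the lecturer's actual procedure may differ.

```python
from datetime import date, timedelta

PASS_MARK = 51  # threshold from the exam prerequisites; a 0-100 scale is assumed


def late_weekdays(deadline: date, submitted: date) -> int:
    """Count the days after the deadline, up to and including the
    submission day, skipping Saturdays and Sundays."""
    count = 0
    day = deadline
    while day < submitted:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            count += 1
    return count


def penalized_score(score: int, deadline: date, submitted: date) -> int:
    """Assumption: each counted late day removes 10 points, never below 0."""
    return max(0, score - 10 * late_weekdays(deadline, submitted))


def exam_allowed(tests: list[int], assignments: list[int]) -> bool:
    """All three tests and all three assignments must reach the pass mark."""
    return all(s >= PASS_MARK for s in tests + assignments)


def final_grade(tests: list[int], assignments: list[int], exam: int) -> float:
    """Weights from the syllabus: 3 x 10% tests, 3 x 10% assignments, 40% exam."""
    return (10 * sum(tests) + 10 * sum(assignments) + 40 * exam) / 100


if __name__ == "__main__":
    # Hypothetical student: an assignment due Thursday 07.10.2021 is
    # submitted on Monday 11.10.2021, so Friday and Monday are counted.
    late = penalized_score(90, date(2021, 10, 7), date(2021, 10, 11))
    tests, assignments = [80, 70, 90], [100, 60, late]
    print(exam_allowed(tests, assignments))
    print(final_grade(tests, assignments, exam=75))
```

Under this reading, a weekend between the deadline and the submission adds no penalty, and the six 10% components plus the 40% exam sum to 100%.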

Home assignments, code examples, data files and useful links will be distributed via the Moodle environment. The course enrollment process in Moodle is TBA.

Lectures

Week 1 Distance function

Slides

Week 2 Cluster analysis

Slides

Week 3 Cluster analysis

Slides

Week 4 Outlier analysis

Slides

Week 5 Classification

Slides

Week 6 Closed Book Test 1 on 05.10.2021

Home assignment defense 07.10.2021

Week 7 The lecture on 12.10.2021 will be given in ONLINE mode only

General analysis of the results of closed book test 1, continuation of lecture 5, and a discussion of a few recent trends.

Week 8 Classification

Slides

Week 9 Regression

Slides

Week 10 Association pattern mining

Slides

Make-up test 1: online only, 04.11.2021 at 16:00

Week 11 Tests

09.11.2021 14:00 Defense of the home assignments (hybrid). Online options are also available in the evening!

11.11.2021 Open book test, online!

Week 12 Similarity and distance II

Slides

Week 13 Mining Time Series and Mining Data Streams

NB! The lecture on 23.11.2021 will start at 14:15

Slides


Slides

Week 14 Mining text data

Slides

Week 15 Graph data mining and social networks analysis

Slides Slides Slides

Week 16 Home Assignment defense on 14.12.2021 and Open Book Test 3 on 16.12.2021

No lecture or practice streaming this week!