Data Mining and network analysis IDN0110 2016

Source: Kursused
Revision as of 8 February 2016, 14:04 by Sven

Spring 2015/2016

IDN0110: Data Mining and network analysis

Taught by: Sven Nõmm

EAP: 6.0

Time and place:

 Lectures: Mondays  Weeks: 1,3,5,7,9,11,13,15   17:15-18:45  ICT-312
                    Weeks: 2,4,6,8,10,12,14,16  17:45-19:15  ICT-A2
 Labs:     Tuesdays                             14:00-15:30  ICT-401


Consultation: by appointment only, Thursdays 17:30-18:30

Additional information: sven.nomm@ttu.ee

Overview

The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around four "super problems" of data mining:

  • Clustering
  • Classification
  • Association pattern mining
  • Outlier analysis

Main topics of the course:

  • Data types and Data Preparation
  • Similarity and Distances, Association Pattern Mining
  • Cluster Analysis, Classification, Outlier Analysis
  • Data Streams, Text Data, Time Series, Discrete Sequences
  • Spatial Data, Graph Data, Web Data, Social Network Analysis
  • Privacy-Preserving Data Mining

Evaluation

  • 2x mandatory closed-book tests. Each test gives 10% of the final grade.
  • 4x mandatory home assignments (computational assignment + short write-up). Together they give 30% of the final grade (computed on the basis of the three best results).
  • Final exam (gives 50% of the final grade): written report on an assigned topic + discussion with the lecturer.

Exam prerequisites: both closed-book tests and all 4 home assignments must be accepted (graded as 51 or higher).

  • 91 ≤ score -- grade 5 (excellent)
  • 81 ≤ score ≤ 90 -- grade 4 (very good)
  • 71 ≤ score ≤ 80 -- grade 3 (good)
  • 61 ≤ score ≤ 70 -- grade 2 (satisfactory)
  • 51 ≤ score ≤ 60 -- grade 1 (acceptable)
  • score ≤ 50 -- the student has failed
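
The weighting and grade scale above can be sketched as a small calculation. This is only an illustrative sketch, not the official grading procedure. It assumes that every component is graded on a 0-100 scale, that the home assignment part is the mean of the three best results, that the grade boundaries are inclusive (51 is the lowest passing score), and that 0 stands for "failed". The function names are hypothetical.

```python
def exam_allowed(tests, assignments):
    """Exam prerequisite: every test and home assignment graded 51 or higher."""
    return all(score >= 51 for score in list(tests) + list(assignments))


def final_score(tests, assignments, exam):
    """Weighted final score on a 0-100 scale.

    Assumed weights: 10% per test, 30% for the mean of the three best
    home assignments, 50% for the exam.
    """
    if len(tests) != 2 or len(assignments) != 4:
        raise ValueError("expected 2 test scores and 4 assignment scores")
    best_three = sorted(assignments, reverse=True)[:3]
    assignment_part = sum(best_three) / 3
    return 0.10 * tests[0] + 0.10 * tests[1] + 0.30 * assignment_part + 0.50 * exam


def grade(score):
    """Map a final score to a grade 1-5; 0 means failed (assumed convention)."""
    for threshold, value in ((91, 5), (81, 4), (71, 3), (61, 2), (51, 1)):
        if score >= threshold:
            return value
    return 0


if __name__ == "__main__":
    tests = [80, 90]
    assignments = [60, 70, 80, 90]
    if exam_allowed(tests, assignments):
        score = final_score(tests, assignments, exam=100)
        print(score, grade(score))
```

With these assumptions, the weakest home assignment (60 in the example) does not affect the final score, but it still has to be accepted for the student to be admitted to the exam.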

Lectures

Lecture slides, files, links and other necessary information will appear here before each lecture or practice.

Lecture 1: Introduction and Data Preparation

Slides

Practice 1 (due to software problems, the lab will be repeated on 9.02.2016)

Lab 1 manual, Exercise 1, Exercise 2, Data file for exercise 1, Data file for exercise 2

Lecture 2: Distance and Similarity Part I

Slides