Data Mining and network analysis IDN0110 2016
Spring 2015/2016
IDN0110: Data Mining and network analysis
Taught by: Sven Nõmm
EAP: 6.0
Time and place: NB! The times and places of the lectures on even weeks have been changed!
Lectures: Mondays 17:15-18:45 ICT-312
Labs: Tuesdays 14:00-15:30 ICT-401
Consultations and Examinations (Preliminary).
Place ICT-405
Tuesday 24.05 Consultation 14:00 - 15:30
Tuesday 31.05 Examination 1 14:00 - 15:30
Friday 10.06 Examination 2 14:00 - 15:30
Tuesday 14.06 Make up Examination 14:00 - 15:30
Consultation: by appointment only, Thursdays 17:30-18:30
Additional information: sven.nomm@ttu.ee
Overview
The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around four "super problems" of data mining:
- Clustering
- Classification
- Association pattern mining
- Outlier analysis
Main topics of the course:
- Data types and Data Preparation
- Similarity and Distances, Association Pattern Mining
- Cluster Analysis, Classification, Outlier analysis
- Data streams, Text Data, Time Series, Discrete Sequences
- Spatial Data, Graph Data, Web Data, Social Network Analysis
- Privacy-Preserving Data Mining
Evaluation
- 2 mandatory closed-book tests. Each test contributes 10% of the final grade.
- 4 mandatory home assignments (computational assignment + short write-up): 30% of the final grade, computed on the basis of the three best results.
- Final exam (50% of the final grade): written report on an assigned topic + discussion with the lecturer.
Exam prerequisites: both closed-book tests are accepted (graded as 51 or higher) and all 4 home assignments are accepted (graded as 51 or higher).
- 91 ≤ score -- grade 5 (excellent)
- 81 ≤ score ≤ 90 -- grade 4 (very good)
- 71 ≤ score ≤ 80 -- grade 3 (good)
- 61 ≤ score ≤ 70 -- grade 2 (satisfactory)
- 51 ≤ score ≤ 60 -- grade 1 (acceptable)
- score ≤ 50 -- the student has failed
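As an illustration only, the weights and grade bands above can be sketched in Python. The function names are hypothetical, and several details are assumptions not stated on this page: all components are scored on a 0-100 scale, the 30% home assignment component is the mean of the three best results, and each grade band starts at its lower bound (so a score between two bands falls into the lower one).

```python
def final_score(tests, assignments, exam):
    """Combine component scores (each assumed to be on a 0-100 scale)."""
    if len(tests) != 2 or len(assignments) != 4:
        raise ValueError("expected 2 test scores and 4 assignment scores")
    # Exam prerequisite: every test and assignment graded 51 or higher.
    if min(tests) < 51 or min(assignments) < 51:
        raise ValueError("exam prerequisites not met")
    # Assumption: the 30% component is the mean of the three best assignments.
    best_three = sorted(assignments, reverse=True)[:3]
    homework = sum(best_three) / 3
    return 0.10 * tests[0] + 0.10 * tests[1] + 0.30 * homework + 0.50 * exam


def grade(score):
    """Map a final score to a grade from 5 to 1, or 0 for a fail."""
    # Assumption: each band is inclusive at its lower bound.
    for threshold, mark in ((91, 5), (81, 4), (71, 3), (61, 2), (51, 1)):
        if score >= threshold:
            return mark
    return 0


score = final_score(tests=[80, 90], assignments=[70, 85, 95, 60], exam=88)
print(round(score, 1), grade(score))
```

In this made-up example the weakest assignment (60) is dropped, so only the three best results enter the 30% component. The actual computation is at the lecturer's discretion.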
Lectures
Lecture slides, files, links and other necessary information will appear here before each lecture or practice.
Lecture 1: Introduction and Data Preparation
Practice 1 (due to software problems, the lab will be repeated on 09.02.2016)
Lab 1 manual, Exercise 1, Exercise 2, Data file for exercise 1, Data file for exercise 2
Lecture 2: Distance and Similarity Part I
Practice 2
Lecture 3: Distance and Similarity Part I
NB! The Moodle environment for the course has been activated.
If you need the enrollment code, please contact the teacher by e-mail. I will continue to upload lecture slides here. All other resources, including home assignments, will be available through Moodle only!
Lecture 4: Distance and Similarity Part I
Lecture 5: Cluster Analysis
Home Assignment 1
NB! Home Assignment 1 is available in the Moodle environment of this course! To access it, you need a login and password for ained.ttu.ee and must enroll in the course.
Lecture 6: Cluster Analysis
Lecture 7: Outlier Analysis
Lecture 8: Outlier Analysis
Lecture 9: Closed Book Test
07.04.2016
There will be no consultation today.
Lecture 10: Mining Data Streams
Lecture 11: Mining Time Series
April 25th: The lecture is cancelled! Please accept my apologies. The practice on Tuesday the 26th will take place according to the schedule.