Difference between revisions of page "Data Mining (ITI8730)"

Source: Kursused

Latest revision: 4 November 2024, 15:55

Information for prospective students:

The lecture schedule and slide content are tentative. The order of the topics has changed compared to last year. Please follow the course page in TalTech Moodle for up-to-date information and lecture content!

The course is open to students with a valid TalTech UniID! The course targets M.Sc. curricula students. Students are expected to be familiar with calculus, linear algebra, probability, and statistics, and to possess basic to intermediate knowledge of at least one programming language. This course is not recommended for students of B.Sc. curricula.

The code to join the course page in Moodle and MS Teams will be provided to students via ÕIS e-mail on Monday, September 2nd.

Those planning to use their own computers, please install "R" and "RStudio".


Fall 2024

ITI8730: Data Mining and network analysis

Old code for this course is IDN0110

Taught by: Prof. Sven Nõmm

Teaching assistants: Mihhail Daniljuk, Anton Osvald Kuusk, Jaak Kapten.

EAP: 6.0

Lectures: Tuesdays 12:00 - 13:30 ICO-217 (IT college building)

Labs (practices): Thursdays 14:00 - 15:30 ICT-121

Link to join MS Teams (will be provided to the students registered via ÕIS on Monday, September 2nd, by 17:00)

Consultation: by appointment only. Please do not hesitate to ask for an appointment! For communication, please use the following e-mail: sven.nomm@taltech.ee

Prerequisites to join the course

Students are expected to be familiar with the foundations of calculus, linear algebra, probability theory, and statistics, and to know at least one programming language.

Overview

The course aims to provide knowledge of the theory behind different data mining methods and to develop practical skills in applying those methods. It is built around four "super problems" of data mining:

  • Classification
  • Clustering
  • Association pattern mining
  • Outlier analysis

Main topics of the course:

  • Data types and Data Preparation
  • Similarity and Distances, Association Pattern Mining
  • Classification, Cluster Analysis, Outlier Analysis
  • Data Streams, Text Data, Time Series, Discrete Sequences
  • Graph Data, Social Network Analysis

Evaluation

  • 2x mandatory closed book tests. Each test gives 10% of the final grade. One make-up attempt is allowed for each test.
  • 3x mandatory home assignments (computational assignment + short write-up). Each assignment gives 10% of the final grade. Late (after deadline) assignments are accepted with a penalty of 10% for each day, except Saturdays and Sundays.
  • Final exam (gives 50% of the final grade): written report on an assigned topic + discussion with the lecturer.

Exam prerequisites: both closed book tests are accepted (graded 51 or higher) and all 3 home assignments are accepted (graded 51 or higher).
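
The grading arithmetic above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator. It assumes that all scores are on a 0-100 scale, that the final grade is the weighted sum of the components, and that the late penalty subtracts 10 points per late weekday (the rule above does not say whether the 10% is absolute or relative). All function names are hypothetical.

```python
from datetime import date, timedelta

PASS_MARK = 51  # "graded 51 or higher"


def late_weekdays(deadline: date, submitted: date) -> int:
    """Count days after the deadline up to submission, skipping Sat/Sun."""
    days = 0
    current = deadline
    while current < submitted:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def apply_late_penalty(score: float, deadline: date, submitted: date) -> float:
    """Assumption: each late weekday costs 10 points on a 0-100 scale."""
    return max(0.0, score - 10.0 * late_weekdays(deadline, submitted))


def exam_admitted(tests: list[float], assignments: list[float]) -> bool:
    """Both tests and all three assignments must be graded 51 or higher."""
    return (
        len(tests) == 2
        and len(assignments) == 3
        and all(s >= PASS_MARK for s in [*tests, *assignments])
    )


def final_grade(tests: list[float], assignments: list[float], exam: float) -> float:
    """Weights in percent: 2 tests x 10, 3 assignments x 10, exam 50."""
    return (10 * sum(tests) + 10 * sum(assignments) + 50 * exam) / 100
```

Under these assumptions, an assignment due on a Friday and submitted the following Monday counts as one day late, because the weekend is skipped.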

Home assignments, code examples, data files, and useful links will be distributed via the Moodle environment. The course enrollment process in Moodle is TBA. Please note that the slides below are from the previous year; the ordering of some topics and some content have changed. Use them for reference purposes only. Up-to-date slides are provided via the TalTech Moodle environment.

Lectures and Timeline

03.09.24 Distance function

Slides

10.09.24 Classification I

Slides


17.09.24 Classification II

Slides

24.09.24 Regression analysis and data preparation

Slides

01.10.24 Cluster analysis I

Slides

08.10.24 Association pattern mining

Slides

15.10.24 Clustering II

Slides

22.10.24 Anomaly and Outlier Analysis

Slides


05.11.24 Similarity and Distance II

Slides