Course Syllabus

 

Instructor: Yangqiu Song

Email: yqsong@cse.ust.hk

Telephone: 6987

Office: 3518 (lift 25/26)

 

Course code: COMP5222/MATH5471

 

Course Description:

This course will introduce a number of important statistical methods and modeling principles for analyzing large-scale data sets, with a focus on complex data structures such as text and graph data. Topics covered include sequential models, structure prediction models, deep learning models, etc., as well as open research problems in related areas.

 

Course Outcomes:

On successful completion of this course, students should be able to:

  • Demonstrate machine learning algorithm design skills for related applications;
  • Analyze the quality of results on domain problems;
  • Develop a program that can handle existing real-world problems.

 

Course Prerequisites:  

  • Computer science: object-oriented programming and data structures, design and analysis of algorithms.
  • Mathematics: multivariable calculus, linear algebra and matrix analysis, probability and statistics.
  • Students are expected to have a background in probability, linear algebra, and machine learning. Taking an introductory machine learning course before this course is recommended.

 

Course Topics:

Statistical learning will be an integral part of postgraduate student training. Text and graph data are emerging data structures that are useful in many practical applications, and knowing how to work with them will benefit many students in their future careers. This course will introduce basic and advanced statistical learning models, algorithms, and applications for text and graph data. It will provide a comprehensive body of knowledge for dealing with real data analytics problems in the era of big data.

Topics:

  • Introduction
  • Sequence Modeling
  • Featurized Sequence Modeling
  • Neural Sequence Modeling
  • SGD Optimization
  • Word Embedding
  • Topic Models
  • Graph Models
  • Sequence Tagging
  • Constraint Models
  • Knowledge Graphs

 

 

Performance Evaluation: The course grade is based on the total points earned across the following activities, weighted as shown:

 

Activity or Task               Max Point Value

Four reading notes             20%
Mid-term project proposal      10%
Project report                 30%
Final project presentation     10%
Final exam                     30%

Total                          100%

 

 

 
