CS 8803 DMM: Data Management and Machine Learning
(Fall 2020)
Announcements:
- This will be a fully online course due to COVID-19.
- Please join the course Slack workspace. Slack will be used for all communications between students, the TA, and the instructor. The link to the workspace can be found on Canvas.
- All lectures and student presentations will be conducted in BlueJeans. The meeting link can be found on Canvas.
Motivation
Big data processing poses many challenges, which are often
characterized by the three V's (volume, velocity, and variety). On
the other hand, machine learning is increasingly used by all kinds
of data-driven applications. This course explores the interactions
between these two exciting fields. This blog post provides one
perspective on such interactions.
Topics
In keeping with this purpose, the course will cover topics
broadly categorized as follows:
- Utilizing machine learning technologies to solve hard data
management challenges, such as data cleaning.
- Utilizing data management technologies to solve hard machine
learning challenges, such as model interpretation, debugging,
and feature engineering.
Objectives
The course covers a wide range of modern challenges and
sub-topics in both data management and machine learning.
Students will become familiar with these sub-topics and gain a
deep understanding of one of them through presentations and
course projects.
Furthermore, since this is a graduate seminar, another
important objective is to train students in the basic skills
of a researcher. The course will create a number of
opportunities for students to learn how to read a paper, how
to write a paper review, how to give a research talk, and how
to write a research paper.
Logistics
- Instructor: Xu Chu
- Email: xu.chu@cc.gatech.edu
- TA: Hantian Zhang
- Email: hantian.zhang@gatech.edu
- Time: Mon and Wed, 3:30 - 4:45pm
- Location: Online using BlueJeans. The link can be found on Canvas.
- Office Hours: We will use Slack for student-teacher interactions. The link to the Slack workspace can be found on Canvas.
Prerequisites
- Students should have a basic understanding of data analytics
and machine learning. Though not required, an undergraduate
course in relational database systems and an undergraduate
course in machine learning would be helpful. The References
section lists some relevant courses and materials.
Academic Honesty:
Grading
- Paper Presentations: 20%
- Paper Reviews: 15%
- Class Participation: 15%
- Course Project: 50%
References