CS 8803 DMM: Data Management and Machine Learning
        (Fall 2019)
      
      
      
        
      
      
      
      Announcements:
      
        - No classes on 09/02, 10/14, and 11/27 due to Labor Day Break, Fall Recess, and Thanksgiving, respectively.
         
        - Major course announcements will be
            made available here and on GT Canvas.
         
        - Lecture slides have been uploaded to the Files section on Canvas.
         
        - Include your preferred teammates for the paper presentation in your "paper preference" email. We will try to accommodate your request. Otherwise, students who express interest in the same paper will be asked to form a presentation group.
          Note that your paper presentation group can be different from your project group.
         
      
      Motivation
      
      Big data processing poses many challenges, which are often
      characterized by the three V's (volume, velocity, and variety). On
      the other hand, machine learning is increasingly used by all kinds
      of data-driven applications. This course explores the interactions
      between these two exciting fields. This blog post provides one
      perspective on such interactions.
      
Topics
      Given the motivation above, the course will cover topics
        broadly categorized as follows:
      
      
        - Utilizing machine learning technologies to solve hard data
          management challenges, such as data cleaning.
         
        - Utilizing data management technologies to solve hard machine
          learning challenges, such as model interpretation, debugging,
          and feature engineering.
         
      
      Objectives
      The course covers a wide range of modern challenges and
        sub-topics in both data management and machine learning.
        Students will become familiar with these sub-topics and gain a
        deep understanding of one sub-topic through presentations and
        course projects.
      Furthermore, since this is a graduate seminar, another
        important objective is to train students in the basic skills
        of a researcher. The course will create a number of
        opportunities for students to learn how to read a paper, how
        to write a paper review, how to give a research talk, and how
        to write a research paper.
      Logistics
      We will be using Canvas for course announcements, uploading
      materials that should not be made public, and student discussions
      such as forming project groups.
      
        - Instructor:   Xu Chu 
 
        
          - Email: xu.chu@cc.gatech.edu
           
          - Office: Klaus 3322
           
        
        - TA: Aman Shishodia
 
        
          - Email: aman.shishodia@gatech.edu
 
        
        - Time: Mon and Wed, 3:00 - 4:15pm
 
        - Location: College of Computing 102
 
        - Office Hours: 3:30 - 4:30pm every Tue, starting Aug 20th
        
 
        - Appointments: Email me to book a
          slot. The subject of the email should be "CS 8803".
         
      
      Prerequisites
      
        - Students should have a basic understanding of data analytics
          and machine learning. Though not required, an undergraduate
          course in relational database systems and an undergraduate
          course in machine learning would be helpful. The References
          section lists some relevant courses and materials.
 
      
      Academic Honesty:
      
      
      Grading
      
      
        - One Paper Presentation: 20%
 
        - Paper Reviews: 15%
 
        - Class Participation: 15%
 
        - Course Project: 50%
 
      
      References