Welcome to the Chu Data Lab

We are an academic research group headed by Prof. Xu Chu, and we are part of the School of Computer Science at Georgia Tech. We are members of the Database Group and affiliate members of the ML center and the Institute for Data Engineering and Science at Georgia Tech.

We are broadly interested in data management and machine learning, and in particular in practical, challenging problems at the intersection of these two fields. Problems we are actively working on include machine learning for data cleaning and integration, data cleaning for machine learning, training data generation for image and tabular data, automatic feature engineering, and systems for managing machine learning analytics pipelines. For more information, please visit our research page and see the related graduate course we offer every year.

We are looking for passionate new PhD students, postdocs, and Master's students to join the team (more info)!

Follow us on Twitter

News

04/05/2020

Our ZeroER work on entity resolution with zero labeled examples has been accepted to SIGMOD 2020

12/03/2019

Our GOGGLES work on domain-agnostic training data labeling has been accepted to SIGMOD 2020

08/07/2019

Our team, in collaboration with Alibaba, won third place in the KDD 2019 AutoML Challenge

08/01/2019

Our ACM book on data cleaning is up for sale on Amazon

04/25/2019

We are releasing CleanML, a benchmark for data cleaning for ML

03/13/2019

We are releasing GOGGLES, a system for automatic generation of training data!

03/12/2019

Our PIClean system demo has been accepted to SIGMOD 2019!

02/08/2019

After two years in the making, we have finally finished the manuscript for our data cleaning book! It will be published by ACM Books, hopefully in early 2019. Stay tuned!

02/01/2019

We are excited to learn that we have been granted a 2019 JP Morgan Faculty Research Award!

... see all News