ZeroER: Entity Resolution using Zero Labeled Examples
Renzhi Wu, Sanya Chaba, Saurabh Sawlani, Xu Chu, Saravanan Thirumuruganathan
In Proceedings of the 2020 ACM SIGMOD Conference on Management of Data
GOGGLES: Automatic Training Data Generation with Affinity Coding
Nilaksh Das, Sanya Chaba, Sakshi Gandhi, Duen Horng Chau, Xu Chu
In Proceedings of the 2020 ACM SIGMOD Conference on Management of Data
Data Cleaning
Ihab F. Ilyas, Xu Chu
ACM Book Series
CleanML: A Benchmark for Joint Data Cleaning and Machine Learning [Experiments and Analysis]
Peng Li, Xi Rao, Jennifer Blase, Yue Zhang, Xu Chu, Ce Zhang
arXiv, 2019
PIClean: a Probabilistic and Interactive Data Cleaning System
Zhuoran Yu, Xu Chu
In Proceedings of the 2019 ACM SIGMOD Conference on Management of Data, Amsterdam, Netherlands
Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformation
Yeye He, Xu Chu, Kris Ganjam, Yudian Zheng, Vivek Narasayya, Surajit Chaudhuri
In the 44th International Conference on Very Large Databases, VLDB 2018, Brazil
Data Cleaning (Book Chapter)
Xu Chu
In Encyclopedia of Big Data Technologies
Transform-Data-by-Example (TDE): Extensible Data Transformation in Excel
Yeye He, Kris Ganjam, Yue Wang, Vivek Narasayya, Surajit Chaudhuri, Xu Chu, Yudian Zheng
In Proceedings of the 2018 ACM SIGMOD Conference on Management of Data, Houston, USA
HoloClean: Holistic Data Repairs with Probabilistic Inference
Theodoros Rekatsinas, Xu Chu, Ihab F. Ilyas, Christopher Ré
In the 43rd International Conference on Very Large Databases, VLDB 2017, Munich, Germany
Detecting Data Errors: Where are we and what needs to be done?
Ziawasch Abedjan, Xu Chu, Dong Deng, Raul Castro Fernandez, Ihab F. Ilyas, Mourad Ouzzani, Paolo Papotti, Michael Stonebraker, Nan Tang
In the 42nd International Conference on Very Large Databases, VLDB 2016, New Delhi, India
Distributed Data Deduplication
Xu Chu, Ihab F. Ilyas, Paraschos Koutris
In the 42nd International Conference on Very Large Databases, VLDB 2016, New Delhi, India
Qualitative Data Cleaning
Xu Chu, Ihab F. Ilyas
In the 42nd International Conference on Very Large Databases, VLDB 2016, New Delhi, India
Data Cleaning: Overview and Emerging Challenges
Xu Chu, Ihab F. Ilyas, Sanjay Krishnan, Jiannan Wang
In Proceedings of the 2016 ACM SIGMOD Conference on Management of Data, San Francisco, USA
CLAMS: Bringing Quality to Data Lakes
Mina Farid, Alexandra Roatis, Ihab F. Ilyas, Hella-Franziska Hoffmann, Xu Chu
In Proceedings of the 2016 ACM SIGMOD Conference on Management of Data, San Francisco, USA
Trends in Cleaning Relational Data: Consistency and Deduplication
Ihab F. Ilyas, Xu Chu
In Foundations and Trends® in Databases, Volume 5, Issue 4, 2015
SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora
Yeye He, Kris Ganjam, Xu Chu
In the 41st International Conference on Very Large Databases, VLDB 2015, Kohala Coast, Hawai‘i, USA
KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing
Xu Chu, John Morcos, Ihab F. Ilyas, Mourad Ouzzani, Paolo Papotti, Nan Tang, Yin Ye
In Proceedings of the 2015 ACM SIGMOD Conference on Management of Data, Melbourne, Australia
TEGRA: Table Extraction by Global Record Alignment
Xu Chu, Yeye He, Kaushik Chakrabarti, Kris Ganjam
In Proceedings of the 2015 ACM SIGMOD Conference on Management of Data, Melbourne, Australia
KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing
Xu Chu, John Morcos, Ihab F. Ilyas, Mourad Ouzzani, Paolo Papotti, Nan Tang, Yin Ye
In the 41st International Conference on Very Large Databases, VLDB 2015, Kohala Coast, Hawai‘i, USA
Discovering Denial Constraints
Xu Chu, Ihab F. Ilyas, Paolo Papotti
In the 40th International Conference on Very Large Databases, VLDB 2014, Hangzhou, China
RuleMiner: Data Quality Rules Discovery
Xu Chu, Ihab F. Ilyas, Paolo Papotti, Yin Ye
In Proceedings of the IEEE International Conference on Data Engineering, ICDE 2014, Chicago, USA
Holistic Data Cleaning: Putting Violations into Context
Xu Chu, Ihab F. Ilyas, Paolo Papotti
In Proceedings of the IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia