Deep entity matching with pre-trained language models Y Li, J Li, Y Suhara, AH Doan, WC Tan arXiv preprint arXiv:2004.00584, 2020 | 369 | 2020 |
Annotating columns with pre-trained language models Y Suhara, J Li, Y Li, D Zhang, Ç Demiralp, C Chen, WC Tan Proceedings of the 2022 International Conference on Management of Data, 1493 …, 2022 | 100 | 2022 |
Snippext: Semi-supervised opinion mining with augmented data Z Miao, Y Li, X Wang, WC Tan Proceedings of The Web Conference 2020, 617-628, 2020 | 99 | 2020 |
Deep entity matching: Challenges and opportunities Y Li, J Li, Y Suhara, J Wang, W Hirota, WC Tan Journal of Data and Information Quality (JDIQ) 13 (1), 1-17, 2021 | 66 | 2021 |
Rotom: A meta-learned data augmentation framework for entity matching, data cleaning, text classification, and beyond Z Miao, Y Li, X Wang Proceedings of the 2021 International Conference on Management of Data, 1303 …, 2021 | 56 | 2021 |
Verification of hierarchical artifact systems A Deutsch, Y Li, V Vianu ACM Transactions on Database Systems (TODS) 44 (3), 1-68, 2019 | 55 | 2019 |
VERIFAS: A practical verifier for artifact systems Y Li, A Deutsch, V Vianu arXiv preprint arXiv:1705.10007, 2017 | 46 | 2017 |
Semantics-aware dataset discovery from data lakes with contextualized column-based representation learning G Fan, J Wang, Y Li, D Zhang, R Miller arXiv preprint arXiv:2210.01922, 2022 | 42 | 2022 |
Subjective databases Y Li, AX Feng, J Li, S Mumick, A Halevy, V Li, WC Tan arXiv preprint arXiv:1902.09661, 2019 | 42 | 2019 |
Automatic verification of database-centric systems A Deutsch, R Hull, Y Li, V Vianu ACM SIGLOG News 5 (2), 37-56, 2018 | 37 | 2018 |
Empowering business-level blockchain users with a rules framework for smart contracts T Astigarraga, X Chen, Y Chen, J Gu, R Hull, L Jiao, Y Li, P Novotny Service-Oriented Computing: 16th International Conference, ICSOC 2018 …, 2018 | 30 | 2018 |
Machamp: A generalized entity matching benchmark J Wang, Y Li, W Hirota Proceedings of the 30th ACM International Conference on Information …, 2021 | 26 | 2021 |
Sudowoodo: Contrastive self-supervised learning for multi-purpose data integration and preparation R Wang, Y Li, J Wang 2023 IEEE 39th International Conference on Data Engineering (ICDE), 1502-1515, 2023 | 25 | 2023 |
Data augmentation for ML-driven data preparation and integration Y Li, X Wang, Z Miao, WC Tan Proceedings of the VLDB Endowment 14 (12), 3182-3185, 2021 | 25 | 2021 |
Table discovery in data lakes: State-of-the-art and future directions G Fan, J Wang, Y Li, RJ Miller Companion of the 2023 International Conference on Management of Data, 69-75, 2023 | 22 | 2023 |
Mining order-preserving submatrices from probabilistic matrices Q Fang, W Ng, J Feng, Y Li ACM Transactions on Database Systems (TODS) 39 (1), 1-43, 2014 | 19 | 2014 |
Mining bucket order-preserving submatrices in gene expression data Q Fang, W Ng, J Feng, Y Li IEEE Transactions on Knowledge and Data Engineering 24 (12), 2218-2231, 2011 | 16 | 2011 |
Effective entity matching with transformers Y Li, J Li, Y Suhara, AH Doan, WC Tan The VLDB Journal 32 (6), 1215-1235, 2023 | 14 | 2023 |
Machop: An end-to-end generalized entity matching framework J Wang, Y Li, W Hirota, E Kandogan Proceedings of the Fifth International Workshop on Exploiting Artificial …, 2022 | 13 | 2022 |
Teddy: A system for interactive review analysis X Zhang, J Engel, S Evensen, Y Li, Ç Demiralp, WC Tan Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems …, 2020 | 12 | 2020 |