Publications
2025
- NAACL: Fighting Spurious Correlations in Text Classification via a Causal Learning Perspective. Yuqing Zhou and Ziwei Zhu. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Apr 2025.
In text classification tasks, models often rely on spurious correlations for predictions, incorrectly associating irrelevant features with the target labels. This issue limits the robustness and generalization of models, especially when faced with out-of-distribution data where such spurious correlations no longer hold. To address this challenge, we propose the Causally Calibrated Robust Classifier (CCR), which aims to reduce models’ reliance on spurious correlations and improve model robustness. Our approach integrates a causal feature selection method based on counterfactual reasoning, along with an unbiased inverse propensity weighting (IPW) loss function. By focusing on selecting causal features, we ensure that the model relies less on spurious features during prediction. We theoretically justify our approach and empirically show that CCR achieves state-of-the-art performance among methods without group labels, and in some cases, it can compete with the models that utilize group labels. Our code can be found at: https://github.com/yuqing-zhou/Causal-Learning-For-Robust-Classifier.
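The inverse propensity weighting idea mentioned in the abstract can be sketched as follows. This is an illustrative, minimal version of an IPW-weighted loss, not the exact CCR objective from the paper; the function name and arguments are assumptions for the example.

```python
import numpy as np

def ipw_cross_entropy(true_class_probs, propensities):
    """Inverse-propensity-weighted negative log-likelihood.

    Each example's loss is reweighted by 1 / propensity, so examples
    that are over-represented (e.g. because of a spurious feature)
    contribute less per-occurrence, approximating an unbiased loss
    over the target distribution.
    """
    p = np.asarray(true_class_probs, dtype=float)
    w = 1.0 / np.asarray(propensities, dtype=float)
    return float(np.mean(w * (-np.log(p))))
```

With uniform propensities of 1.0 this reduces to the ordinary mean cross-entropy, which is a quick sanity check on the weighting.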
@inproceedings{zhou-zhu-2025-fighting,
  title     = {Fighting Spurious Correlations in Text Classification via a Causal Learning Perspective},
  author    = {Zhou, Yuqing and Zhu, Ziwei},
  editor    = {Chiruzzo, Luis and Ritter, Alan and Wang, Lu},
  booktitle = {Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)},
  month     = apr,
  year      = {2025},
  address   = {Albuquerque, New Mexico},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.naacl-long.215/},
  doi       = {10.18653/v1/2025.naacl-long.215},
  pages     = {4264--4274},
  isbn      = {979-8-89176-189-6},
}
2024
- Findings of EMNLP: Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models. Yuqing Zhou, Ruixiang Tang, Ziyu Yao, and Ziwei Zhu. In Findings of the Association for Computational Linguistics: EMNLP 2024, Nov 2024.
Language models (LMs), despite their advances, often depend on spurious correlations, undermining their accuracy and generalizability. Beyond the oversimplified shortcuts examined in prior work, this study addresses the overlooked impact of subtler, more complex shortcuts that compromise model reliability. We introduce a comprehensive benchmark that categorizes shortcuts into occurrence, style, and concept, aiming to explore the nuanced ways in which these shortcuts influence the performance of LMs. Through extensive experiments across traditional LMs, large language models, and state-of-the-art robust models, our research systematically investigates models’ resilience and susceptibilities to sophisticated shortcuts. Our benchmark and code can be found at: https://github.com/yuqing-zhou/shortcut-learning-in-text-classification.
@inproceedings{zhou-etal-2024-navigating-shortcut,
  title     = {Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models},
  author    = {Zhou, Yuqing and Tang, Ruixiang and Yao, Ziyu and Zhu, Ziwei},
  editor    = {Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2024},
  month     = nov,
  year      = {2024},
  address   = {Miami, Florida, USA},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2024.findings-emnlp.146/},
  doi       = {10.18653/v1/2024.findings-emnlp.146},
  pages     = {2586--2614},
}
2023
- CIKM: A Generalized Propensity Learning Framework for Unbiased Post-click Conversion Rate Estimation. Yuqing Zhou, Tianshu Feng, Mingrui Liu, and Ziwei Zhu. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, United Kingdom, Nov 2023.
This paper addresses the critical gap in the unbiased estimation of post-click conversion rate (CVR) in recommender systems. Existing CVR prediction methods, such as Inverse Propensity Score (IPS) and various Doubly Robust (DR) based estimators, overlook the impact of propensity estimation on model bias and variance, leading to a debiasing performance gap. We propose a Generalized Propensity Learning (GPL) framework to directly minimize the bias and variance of CVR prediction models. The proposed method works as a complement to existing methods like IPS, DR, MRDR, and DRMSE, improving their prediction performance by reducing their bias and variance. Extensive experiments on real-world and semi-synthetic datasets demonstrate the significant performance improvement brought by our proposed method. Data and code can be found at: https://github.com/yuqing-zhou/GPL.
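The classical IPS estimator that GPL builds on can be sketched in a few lines. This is the standard textbook form, not the GPL objective itself; `ips_risk` and its argument names are illustrative.

```python
import numpy as np

def ips_risk(loss, observed, propensity):
    """Inverse Propensity Score estimate of full-population risk.

    R_IPS = (1/N) * sum_i  o_i * loss_i / p_i

    where o_i indicates whether item i was clicked (so its conversion
    label is observed) and p_i is the estimated click propensity.
    The estimate is unbiased when p_i matches the true propensity,
    which is why errors in propensity estimation translate directly
    into bias and variance of the CVR model.
    """
    loss = np.asarray(loss, dtype=float)
    observed = np.asarray(observed, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    return float(np.mean(observed * loss / propensity))
```

For example, if half the items are clicked with propensity 0.5 each, the clicked items' losses are doubled and the unclicked items contribute zero, recovering the population-average loss in expectation.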
@inproceedings{zhou2023generalized,
  title     = {A generalized propensity learning framework for unbiased post-click conversion rate estimation},
  author    = {Zhou, Yuqing and Feng, Tianshu and Liu, Mingrui and Zhu, Ziwei},
  booktitle = {Proceedings of the 32nd ACM International Conference on Information and Knowledge Management},
  pages     = {3554--3563},
  year      = {2023},
  url       = {https://doi.org/10.1145/3583780.3614760},
  doi       = {10.1145/3583780.3614760},
  location  = {Birmingham, United Kingdom},
}