About me

Hi, I am Yongchan Kwon, an Assistant Professor in the Department of Statistics at Columbia University. My research focuses on developing more interpretable and rigorous machine learning methods, directly motivated by scientific questions. I received my Ph.D. from Seoul National University (Advisor: Prof. Myunghee Cho Paik) and completed a postdoc at Stanford University (Mentor: Prof. James Zou).


  • Assistant Professor, Department of Statistics, Columbia University, 2022 - Current
  • Postdoc Researcher, Department of Biomedical Data Science, Stanford University, 2020 - 2022


  • Ph.D. in Statistics, Seoul National University, 2013 - 2020
  • B.S. in Mathematical Sciences, Korea Advanced Institute of Science and Technology (KAIST), 2009 - 2013

Selected Publications

  • Kwon, Y.*, Wu, E.*, Wu, K.*, and Zou, J. (2024). DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models. International Conference on Learning Representations (ICLR 2024). [URL]. [GitHub].

  • Jiang, K.*, Liang, W.*, Zou, J. and Kwon, Y.* (2023). OpenDataVal: a Unified Benchmark for Data Valuation. Advances in Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. [URL]. [Website]. [GitHub].

  • Kwon, Y. and Zou, J. (2023). Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value. International Conference on Machine Learning (ICML 2023). [URL]. [GitHub].

  • Liang, W.*, Mao, Y.*, Kwon, Y.*, Yang, X., and Zou, J. (2023). On the nonlinear correlation of ML performance between data subpopulations. International Conference on Machine Learning (ICML 2023). [URL]. [Website].

  • Kwon, Y. and Zou, J. (2022). WeightedSHAP: analyzing and improving Shapley based feature attributions. Advances in Neural Information Processing Systems (NeurIPS 2022). [URL]. [GitHub].

  • Liang, W.*, Zhang, Y.*, Kwon, Y.*, Yeung, S., and Zou, J. (2022). Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning. Advances in Neural Information Processing Systems (NeurIPS 2022). [URL]. [Website]. [GitHub].

  • Kwon, Y. and Zou, J. (2022). Beta-Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning. Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS 2022), PMLR 151:8780-8802. [URL]. [GitHub]. (selected for oral presentation, Top 2.6%).

  • Kwon, Y., Kim, W., Won, J.-H., and Paik, M.C. (2020). Principled learning method for Wasserstein distributionally robust optimization with local perturbations. Proceedings of the 37th International Conference on Machine Learning (ICML 2020), PMLR 119:5567-5576. [URL]. [GitHub].


Email: yk3012 (at) columbia (dot) edu