
Thu, Mar 5: Joint Session of the Political Science Methodology Research Group and the Waseda Causal Inference Workshop


Date
Thu, 5 Mar 2026
Place
Room 802, Building 3, Waseda Campus, Waseda University
Time
13:30-15:00
Posted
Fri, 20 Feb 2026

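We are holding a workshop with an invited researcher active in the field of political science methodology. Everyone is welcome to attend. This session is held jointly with the Waseda Causal Inference Workshop (https://causal.w.waseda.jp/). No advance registration is required. We look forward to your participation.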

Speaker: 勝又裕斗 (Hiroto Katsumata), The University of Tokyo

Date: Thursday, March 5, 13:30-15:00

Venue: Room 802, Building 3, Waseda Campus, Waseda University
https://www.waseda.jp/inst/student/assets/uploads/2020/08/15_campusmap_2020.pdf

Title: Design and Analysis with Machine Learning-Generated Variables: A Unified Framework for Prediction Bias and the Illusory Sample Size

Abstract: Machine learning (ML) has revolutionized the social sciences by enabling researchers to extract variables from massive unstructured datasets, such as text and images. However, using these ML-predicted variables in downstream statistical analyses leads to substantial bias and invalid inference. To overcome this, we propose a unified statistical framework for valid inference with ML-predicted variables that utilizes a small set of hand-labeled observations. Unlike existing model-specific corrections, our generic framework accommodates predicted outcomes, treatments, or covariates across a wide range of estimators, including linear regression, fixed effects, survival analysis, and instrumental variable estimation. Crucially, we uncover an "illusory sample size" problem: contrary to common intuition, massive unlabeled datasets do not reduce estimation variance without sufficient hand-labeled data. Accordingly, our framework uses a small pilot dataset to optimize data collection, balancing labeling costs against estimation precision. We demonstrate the framework's utility by revisiting a study on election fraud using ballot image data.

Presentation language: Japanese