Thu, Mar 5: Joint Session of the Political Science Methodology Research Group (政治学方法論研究部会) and the Waseda Causal Inference Workshop
- Dates
- Thu, 5 Mar 2026
- Place
- Room 802, Building 3, Waseda Campus, Waseda University
- Time
- 13:30-15:00
- Posted
- Fri, 20 Feb 2026
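We will hold a workshop featuring a researcher active in the field of political science methodology. Anyone is welcome to attend. This session is held jointly with the Waseda Causal Inference Workshop ( https://causal.w.waseda.jp/ ). No advance registration is required. We look forward to your participation.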
Speaker: Hiroto Katsumata (勝又裕斗, The University of Tokyo)
Date: Thursday, March 5, 13:30-15:00
Venue: Room 802, Building 3, Waseda Campus, Waseda University
https://www.waseda.jp/inst/student/assets/uploads/2020/08/15_campusmap_2020.pdf
Title: Design and Analysis with Machine Learning-Generated Variables: A Unified Framework for Prediction Bias and the Illusory Sample Size
Abstract: Machine learning (ML) has revolutionized the social sciences by enabling researchers to extract variables from massive unstructured datasets, such as text and images. However, using these ML-predicted variables in downstream statistical analyses leads to substantial bias and invalid inference. To overcome this, we propose a unified statistical framework for valid inference with ML-predicted variables that utilizes a small set of hand-labeled observations. Unlike existing model-specific corrections, our generic framework accommodates predicted outcomes, treatments, or covariates across a wide range of estimators, including linear regression, fixed effects, survival analysis, and instrumental variable estimation. Crucially, we uncover an ‘illusory sample size’ problem: contrary to common intuition, massive unlabeled datasets do not reduce estimation variance without sufficient hand-labeled data. Accordingly, our framework uses a small pilot dataset to optimize data collection, balancing labeling costs against estimation precision. We demonstrate the framework’s utility by revisiting a study on election fraud using ballot image data.
Presentation language: Japanese