Essex Summer School at Waseda (ESS@Waseda), which started in 2016, was held every summer through 2019 in collaboration with the University of Essex. Due to the global COVID-19 pandemic, the Center for Positive/Empirical Analysis of Political Economy had to seek alternative educational opportunities to replace the face-to-face summer course this year. Despite the challenges that 2020 has presented, the 53rd ESS was held online by the Essex Summer School in Social Science Data Analysis (SSDA). To help advance ESS@Waseda to the next level, the Center encouraged graduate students to take this online course by subsidizing the tuition.
Fourteen students received the subsidy from the Top Global University Project and completed the course successfully. Our student survey revealed that students worked very hard during the two-week sessions despite the time difference, and that the instructors and TAs were always very helpful and sincere. Students met classmates from various countries online and enjoyed the discussions. They also felt that the overall course content and outcomes were satisfactory and would be helpful for their future research. The Center is glad that ESS online was successful and became a valuable experience for the students.
Below, two students share their experiences.
3I: Web Scraping and Data Management for Social Scientists
The contents covered in this course were as follows (extracted and partially modified from the course syllabus):
• Day 1: Review of R and tidyverse (dplyr, tidyr)
• Day 2: Review of R programming (functions, conditionals, iteration)
• Day 3: Introduction to web APIs (concepts of APIs and collecting Twitter data with rtweet)
• Day 4: Writing API queries (The New York Times Article API)
• Day 5: Introduction to web scraping (how websites work, HTML and CSS, SelectorGadget)
• Day 6: Web scraping using rvest
• Day 8: Text Analysis 1 (preprocessing, dictionary methods, distinctive words)
• Day 9: Text Analysis 2 (Structural Topic Modelling)
• Day 10: Introduction to GitHub
The course combined lecture-based and exercise-based sessions. As a result, we were able to grasp new concepts without great difficulty and apply the newly learned skills to real-life situations. By taking this course, I learned different ways of collecting data from the internet as well as how to pre-process and organize those data. Now that I have completed the course, I feel more confident in applying data analysis skills to my own research. Since the data I am interested in for my research are published in different formats (some as PDF and others as HTML), I would like to collect those raw data and transform them into an appropriate form using everything I learnt in this course.
Frankly speaking, taking the course at midnight for 10 days (due to an 8-hour time difference) was sometimes quite challenging, but the course was very productive and fruitful, so overall it was definitely worth taking. Moreover, if the course had been conducted on campus, I would not have been able to attend it this year due to time constraints. I am therefore really grateful that I was given this opportunity to learn, online, the skills I need to complete my master's thesis. If this course is available again next year, I would definitely recommend it to anyone who is interested in web scraping or data analysis in general.
3N: Advanced Machine Learning for Social Scientists
The lecture introduced the basic methods of machine learning. By comparing multiple methods, I was able to understand which method is suitable for which purpose. The comments of the professor, who conducts research in machine learning, were also very useful. Although I was already familiar with some of the machine learning methods, I feel that I am now able to put them into practice after re-learning them with examples and exercises. For example, the validation-set approach and the k-fold cross-validation approach were used many times in the exercises, and I would like to use these approaches again in the future.
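The two resampling approaches mentioned above can be sketched as follows. This is only a minimal illustration in Python rather than the R code used in the course; the synthetic data, the simple linear model, and all function names are assumptions made for this example and are not taken from the course material.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mse(a, b, xs, ys):
    """Mean squared prediction error of the line a + b*x."""
    return mean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])


def validation_set_mse(xs, ys, train_share=0.5, seed=0):
    """Fit on one random training part and score on the held-out part."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_share)
    train, test = idx[:cut], idx[cut:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return mse(a, b, [xs[i] for i in test], [ys[i] for i in test])


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out MSE over k folds; each fold is the test set once."""
    scores = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mse(a, b, [xs[i] for i in fold], [ys[i] for i in fold]))
    return mean(scores)


# Synthetic data (an assumption for illustration): y = 2 + 3x plus noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

print("validation-set MSE:", round(validation_set_mse(xs, ys), 3))
print("5-fold CV MSE:", round(k_fold_mse(xs, ys, k=5), 3))
```

The validation-set estimate depends on a single random split, whereas the k-fold estimate averages over k splits in which every observation is held out exactly once, which usually makes it less sensitive to how the data happen to be divided.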
In addition, most of the lecture time was spent on R code and packages. I was not familiar with R coding before, but the explanations helped me to understand it better.
In my field of research, explanation is more important than the predictive accuracy of the model. Therefore, I will basically continue to do my analysis with the linear models that I have used in the past. Rather than writing a paper using machine learning, I’ll be using what I’ve learned in these lectures for the selection of variables and models. In the meantime, I would like to continue to learn and understand other machine learning methods on my own. I would also like to learn more about the mathematical background of machine learning, which was not covered in this lecture.
Lastly, ESS online enabled me to participate in classes and join talk sessions that were not available in the usual ESS@Waseda sessions in the past, even while staying in Japan. I sincerely recommend that everyone who is thinking about taking these classes in the future take this precious opportunity to learn more. Don’t miss the chance and step forward for the future!