Introduction to Programming for Social Data Science


Instructor Name

Blake Miller

Instructor Biography

Blake Miller is an Assistant Professor of Political Science at the University of Pennsylvania. They are also affiliated with the Center for the Study of Contemporary China. Blake's research examines how states control media, the internet, and civil society organizations. They are currently working on a book manuscript on information control. Blake received their PhD in Political Science and Scientific Computing from the University of Michigan. Prior to joining Penn, they were an Assistant Professor of Computational Social Science at the London School of Economics and Political Science, and a Post-Doctoral Fellow at the Dartmouth College Program in Quantitative Social Science.


Course Description

The abundance of large-scale data and the computational tools to analyze them have transformed social science research, dramatically broadening the scope of social scientific enquiry. This course covers a variety of computational approaches to studying human behavior from text data, digital traces of behavior, social media data, and other sources. Each class will be broken into two main sections. First, we will introduce examples of computational social science research, outlining key methodological strategies, challenges, and best practices for researchers. Second, we will cover the basics of computer programming and data management necessary for carrying out one’s own computational social science research. The course will introduce basic coding in Python; no prior coding experience is required.


Learning Outcomes

  1. To introduce students to important concepts and methodologies in computational social science research.

  2. To teach students practical considerations and best practices for data management and data collection.

  3. To build the foundational skills students need to construct useful datasets for their research from unstructured, semi-structured, and secondary data.

  4. To build a roadmap for continued learning by promoting awareness of more advanced and specialized tools and of where to look for problem-solving help and reference material.


Course Schedule

Lecture   Topic (2.5 teaching hours)

1         Course Intro, What is Computational Social Science?
2         Survey of Computational Social Science Research
3         Basic Programming: Variables, Operators
4         Basic Programming: Conditionals, Loops
5         Basic Programming: Functions
6         Data Types and Data Wrangling
7         Data and Research Best Practices
8         Text Data
9         Social Media Data and APIs
10        Gathering Data from the Web


Grading

  1. Quizzes (60%, in class): There will be four in-class quizzes. The quizzes will test knowledge of the material covered in the previous classes and readings.

  2. Final Problem Set (40%, due July 19): You will have one final problem set in which to apply what we have learned. For the problem set, you will analyze a dataset that we will discuss in class during the final lectures.