Social Data Analysis: An Introduction to Python Applications


Instructor

Blake Miller

Instructor Bio

Blake Miller is an Assistant Professor of Computational Social Science in the Methodology Department at the London School of Economics and Political Science. He received his PhD in Political Science and Scientific Computing from the University of Michigan in 2018, where he was also a graduate research affiliate in the Lieberthal-Rogel Center for Chinese Studies. Before coming to LSE, he was a Post-Doctoral Fellow at the Dartmouth College Program in Quantitative Social Science. Blake has also spent several years in Silicon Valley as an executive for tech start-up companies. For more information, please visit www.blakeapm.com.


Course Description

The massive amount of data available online continues to expand the bounds of social scientific inquiry. Researchers in both academia and the private sector can gain a greater understanding of human behavior by analyzing the abundant social data stored online. To make use of these data, one must first master the technical skills necessary to gather and process them, which can be quite challenging to do properly.

The main goal of this course is to provide students with the necessary tools for the construction, processing, and cleaning of data found online. After taking this course, students will have mastered the requisite tools needed to construct datasets out of unstructured, semi-structured, and structured online data.
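As a rough illustration of what constructing a dataset from semi-structured data can look like, the sketch below flattens a small JSON array into rows with fixed columns and writes them out as CSV text. The records, the field names (`id`, `user`, `text`, `likes`), and the helper functions are hypothetical and are not taken from the course materials; only the Python standard library is used.

```python
import csv
import io
import json

# Hypothetical semi-structured input: the records are nested and
# do not all carry the same fields (the second one has no "likes").
RAW_JSON = """
[
  {"id": 1, "user": {"name": "ana"}, "text": "hello", "likes": 3},
  {"id": 2, "user": {"name": "ben"}, "text": "hi there"}
]
"""

COLUMNS = ["id", "user_name", "text", "likes"]


def flatten_records(raw):
    """Turn a JSON array of nested records into flat rows with fixed columns."""
    rows = []
    for record in json.loads(raw):
        rows.append({
            "id": record.get("id"),
            "user_name": record.get("user", {}).get("name"),
            "text": record.get("text"),
            # A missing field stays None so it is not mistaken for a real value.
            "likes": record.get("likes"),
        })
    return rows


def rows_to_csv(rows):
    """Serialize flat rows to CSV text; None values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_records(RAW_JSON)
print(rows_to_csv(rows))
```

Keeping a missing value as an empty cell, rather than filling in a default such as 0, is one possible design choice; it leaves the decision about how to treat missing data to the later cleaning step.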


Course Objectives

  1. to introduce students to important concepts and methodologies related to the management, collection, processing, and cleaning of data for social science and public policy research

  2. to teach students practical concerns and best practices for data management and data collection

  3. to build foundational skills necessary to construct useful datasets for their research from unstructured, semi-structured, and secondary data

  4. to build a roadmap for continued learning by promoting awareness of more advanced and specialized tools and of where to look for problem-solving help and reference material.


Course Schedule

  • Day 1: Course Intro, Computational Social Science

    Lazer, David, and Jason Radford. 2017. “Data Ex Machina: Introduction to Big Data.” Annual Review of Sociology 43(1): 19–39. (Download Paper)

  • Day 4: Basic Python (Part 1)

    Think Python chapters 1-4

  • Day 5: Basic Python (Part 2)

    Think Python chapters 5-9

  • Day 6: Basic Python (Part 3)

    Think Python chapters 10-14

  • Day 7: Intro to Data and Data Ethics

    Willis, D. (2014). Professors’ research project stirs political outrage in Montana. New York Times. (Download Article)

    Optional readings:

    * Leslie, D. (2023). The ethics of computational social science. In Handbook of Computational Social Science for Policy, pages 57–104. Springer International Publishing, Cham. (Download Book)

    * Kramer, A. D., Guillory, J. E., and Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences of the United States of America, 111(24):8788. (Download Paper)

    * Lewis, K., Kaufman, J., Gonzalez, M., Wimmer, A., and Christakis, N. (2008). Tastes, ties, and time: A new social network dataset using facebook.com. Social Networks, 30(4):330–342. (Download Paper)

    * Munger, K. (2017). Tweetment effects on the tweeted: Experimentally reducing racist harassment. Political Behavior, 39:629–649. (Download Paper)

  • Lecture 10: Obtaining Data from the Web

    No readings.


Textbooks and Course Materials

Readings for each session are listed in the course schedule.

  • Textbooks: Think Python: How to Think Like a Computer Scientist, 2nd Edition (English, Chinese)

  • Other resources:

    * Unix Shell Tutorial from Software Carpentry

    * Version Control with Git Tutorial from Software Carpentry


Required Software

This course is taught in Python 3. You will need a computer with Python 3 installed, and you should bring that computer to class each session. If you have not yet installed Python 3, please follow the steps below for your operating system:

For Windows users:

  1. Install Sublime Text from the official Sublime Text website.

  2. Install Git Bash.

  3. Install Python 3 (click “Latest Python 3 Release”).

  4. Open Git Bash and run python3 --version to verify Python 3 is accessible. If that does not work, try python --version.

For Mac users:

  1. Open Terminal on Mac (Applications → Utilities → Terminal).

  2. Install Xcode Command Line Tools by running xcode-select --install in Terminal.

  3. Install Homebrew by pasting the installation command from the Homebrew website into Terminal. Alternatively, download the installer here.

  4. Install Sublime Text from the official Sublime Text website and follow the installation instructions.

  5. Use Homebrew to install Python 3 by running brew install python in Terminal.

  6. Run python3 --version in Terminal to verify Python 3 is accessible.
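The version check at the end of the installation steps can also be done from inside Python itself. The following is a minimal sketch, not part of the official course setup; the function name python_version_ok and the file name check_python.py are illustrative assumptions.

```python
import sys


def python_version_ok(version_info=sys.version_info, minimum=(3, 0)):
    """Return True if the given interpreter version is at least `minimum`.

    Only the (major, minor) components are compared.
    """
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # sys.version begins with the version number, e.g. "3.x.y ...".
    print("Running Python", sys.version.split()[0])
    if python_version_ok():
        print("Python 3 is available.")
    else:
        print("This interpreter is not Python 3; please revisit the installation steps.")
```

Saved under a name such as check_python.py (hypothetical), the script can be run with python3 check_python.py, or with python check_python.py if the python3 command is not found.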


Grading

  1. Quizzes (60%, in class): There will be four in-class quizzes. The quizzes will test knowledge of the material covered in the previous classes and readings.

  2. Final Problem Set (40%, due July 19): You will have one final problem set in which you apply everything covered in the course. For the problem set, you will analyze a dataset that we will discuss in class during the final lectures.