Methodology
Terminology
Throughout this report, I use a few words that are often colloquial synonyms but have distinct meanings here.
- A subject area is a logical grouping of courses. A department can offer one or many subject areas.
- A course is a quarter-long unit of teaching.
- A section is a particular offering of a course for a given term.
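The relationship between these three terms can be sketched as a small data model. This is only an illustration of the definitions above: the class names, field names, and example values are hypothetical and are not taken from the actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    """A particular offering of a course for a given term."""
    term: str        # e.g. "20W"; the term code format is an assumption
    section_id: str  # hypothetical label such as "LEC 1"


@dataclass
class Course:
    """A quarter-long unit of teaching, offered as one or more sections."""
    number: str
    title: str
    sections: List[Section] = field(default_factory=list)


@dataclass
class SubjectArea:
    """A logical grouping of courses, offered by a department."""
    name: str
    department: str
    courses: List[Course] = field(default_factory=list)


# Made-up example data, purely for illustration.
course = Course(number="100", title="Example Course")
course.sections.append(Section(term="19F", section_id="LEC 1"))
course.sections.append(Section(term="20W", section_id="LEC 1"))
area = SubjectArea(name="Example Studies", department="Example Department",
                   courses=[course])
```

Here one course is offered as two sections, one per term, and both hang off a single subject area.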
Enrollment Data
All enrollment data used in this project was scraped from the UCLA Registrar's publicly accessible Schedule of Classes. The Schedule of Classes provides a listing of all sections offered by UCLA from Winter 1999 to present. UCLA Extension classes are listed in a separate schedule and thus are not included in this dataset.
The scraper was written in Go and deployed as multiple AWS Lambda functions.1 Course listings were updated every day, and section and enrollment information for courses was updated every hour.
For sections from Fall 2019 to Spring 2020, I scraped enrollment data hourly in order to track trends in section enrollment.
The data covers enrollment trends during the first few weeks of Fall 2019, Winter 2020 enrollment both before and during the quarter, and Spring 2020 enrollment before the quarter.
- Fall 2019 data was scraped from September 11, 2019 to November 2, 2019.
- Winter 2020 data was scraped from November 2, 2019 to February 9, 2020.
- Spring 2020 data was scraped from February 9, 2020 to March 8, 2020.
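When working with the hourly data, it can help to map a timestamp's date back to the term that was being scraped at the time. The sketch below encodes the three windows listed above; the function name is hypothetical, and assigning each shared boundary date (November 2 and February 9) to the later term is an assumption, since the text does not say which term a boundary day belongs to.

```python
from datetime import date
from typing import Optional

# Scrape windows as listed above: (term, first day, last day).
SCRAPE_WINDOWS = [
    ("Fall 2019", date(2019, 9, 11), date(2019, 11, 2)),
    ("Winter 2020", date(2019, 11, 2), date(2020, 2, 9)),
    ("Spring 2020", date(2020, 2, 9), date(2020, 3, 8)),
]


def term_scraped_on(day: date) -> Optional[str]:
    """Return the term being scraped on `day`, or None if outside all windows."""
    # Walk the windows from latest to earliest so that a shared boundary
    # date resolves to the later term (an assumption, see lead-in).
    for term, start, end in reversed(SCRAPE_WINDOWS):
        if start <= day <= end:
            return term
    return None


print(term_scraped_on(date(2019, 10, 1)))  # Fall 2019
print(term_scraped_on(date(2019, 11, 2)))  # Winter 2020
```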
I was also able to scrape historical data for courses from Winter 1999 to Spring 2020. This data only provides the most recent enrollment count for each section, which for a completed quarter is the number of students who were enrolled in the section at the end of the quarter.2
Enrollment data for 300- and 500-level courses was not collected for most subject areas, as these courses are for graduate-level training in teaching and research, which I did not find to be very interesting data. In addition, many subject areas listed 300- or 500-level courses that no one enrolled in, which added noise to the database.
Courses numbered in the 300s or 500s were collected for Law, Dentistry, and Medicine, as these schools use different numbering systems. Note that enrollment data may be incomplete for these schools: they use academic calendars separate from the rest of the university, so the scraping periods listed above may not line up with their actual enrollment periods.
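The collection rule described above can be approximated as a simple filter. This is a rough sketch, not the scraper's actual logic (which was written in Go): the function names are hypothetical, matching schools by these exact labels is an assumption, and the text says the exclusion applied to "most" subject areas, so the real rule likely had exceptions this sketch does not capture.

```python
import re
from typing import Optional

# Schools whose 300- and 500-level courses were kept, per the text above.
# Matching on these exact labels is an assumption.
EXEMPT_SCHOOLS = {"Law", "Dentistry", "Medicine"}


def course_level(number: str) -> Optional[int]:
    """Return the hundreds level of a course number, e.g. 'M495A' -> 400."""
    match = re.search(r"\d+", number)  # skip letter prefixes such as 'M' or 'C'
    if match is None:
        return None
    return (int(match.group()) // 100) * 100


def should_collect(school: str, number: str) -> bool:
    """Skip 300- and 500-level courses unless the school is exempt."""
    if course_level(number) in (300, 500) and school not in EXEMPT_SCHOOLS:
        return False
    return True
```

For example, `should_collect("Engineering", "375")` returns `False`, while `should_collect("Law", "300")` returns `True`.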
There are occasionally some missing enrollment numbers or entries in the data, particularly for Fall 2019, as it was my first time scraping this dataset. I encourage you to let me know if you find any errors in the data.
Buildings and Classrooms
Additional information about classrooms and buildings was scraped from the official UCLA Map, building list, and Classroom Grid Search. Over the past 20 years at UCLA, classroom sizes have changed with renovations, and buildings have been both built and demolished. All classroom data was scraped in February 2020 and is current only as of that date; it doesn't take into account any historical changes.
The scraper itself was written in Python.
Information about buildings and classrooms is stored as two tables in the database: buildings and rooms, respectively.
buildings contains the id, name, abbreviation, and coordinates of every building listed by the registrar. rooms contains one row per room, with an id, a reference to a building in the buildings table, the room number, and the maximum capacity of the room.
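To make the relationship between the two tables concrete, here is a minimal sketch using an in-memory SQLite database. The table names come from the description above, but the column names, column types, and example rows are assumptions for illustration; the real data ships as a PostgreSQL dump and its schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names and types are assumed from the prose description,
# not copied from the real schema.
conn.executescript("""
CREATE TABLE buildings (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    abbreviation TEXT,
    latitude     REAL,
    longitude    REAL
);
CREATE TABLE rooms (
    id          INTEGER PRIMARY KEY,
    building_id INTEGER NOT NULL REFERENCES buildings(id),
    room_number TEXT NOT NULL,
    capacity    INTEGER
);
""")

# Made-up rows, purely for illustration.
conn.execute("INSERT INTO buildings VALUES (1, 'Example Hall', 'EXH', 0.0, 0.0)")
conn.executemany(
    "INSERT INTO rooms VALUES (?, ?, ?, ?)",
    [(1, 1, "100", 120), (2, 1, "200A", 40)],
)

# Join each room to its building and list rooms by capacity, largest first.
rows = conn.execute("""
    SELECT b.abbreviation, r.room_number, r.capacity
    FROM rooms AS r
    JOIN buildings AS b ON b.id = r.building_id
    ORDER BY r.capacity DESC
""").fetchall()

for abbreviation, room_number, capacity in rows:
    print(abbreviation, room_number, capacity)
```

Room numbers are stored as text in this sketch because they can contain letters, as in the made-up "200A".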
Schools and Divisions
Information about the schools and divisions of UCLA was manually compiled from Wikipedia's list, cross-referenced with information from UCLA's website about undergraduate and graduate academics. Data about departments and subject areas was compiled from the Schedule of Classes, the Registrar's list of Department and Subject Area Codes, and a spreadsheet of historical subject areas provided to me by the Registrar's office. For some departments in the College of Letters and Science, it wasn't clear which division they fell under. These departments were classified under L&S - Misc.2
Data
The data set contains:
- 65 terms (99W to 20S)
- 147 departments
- 282 subject areas
- 40,239 courses
- 353,255 sections
- Over 30,000,000 rows of hourly enrollment data
In total, the data takes up around 2 gigabytes as a PostgreSQL dump file. You can download the data from the UCLA Dataverse.
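As a sanity check on the term count listed above, the short codes 99W and 20S can be expanded and counted. The sketch below assumes three regular terms per calendar year (Winter, Spring, Fall) with summer sessions excluded, and a two-digit-year pivot at 90; both are assumptions, chosen because they reproduce the stated figure of 65 terms. The function names are hypothetical.

```python
from typing import Tuple

# Term letters in calendar order. Excluding summer is an assumption.
SEASONS = ["W", "S", "F"]


def parse_term(code: str) -> Tuple[int, str]:
    """Split a code like '99W' into (four-digit year, season letter)."""
    yy, season = int(code[:2]), code[2]
    # Assumed pivot: two-digit years of 90 and above belong to the 1900s.
    year = 1900 + yy if yy >= 90 else 2000 + yy
    return year, season


def count_terms(first: str, last: str) -> int:
    """Count terms from `first` to `last`, inclusive."""
    first_year, first_season = parse_term(first)
    last_year, last_season = parse_term(last)
    first_index = first_year * len(SEASONS) + SEASONS.index(first_season)
    last_index = last_year * len(SEASONS) + SEASONS.index(last_season)
    return last_index - first_index + 1


print(count_terms("99W", "20S"))  # 65
```

Under these assumptions, 1999 through 2019 contribute 21 × 3 = 63 terms, and Winter and Spring 2020 add two more.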