The Hellscape that is Scraping Legislative Data as an Open Source Project
Collecting complex legislative data from hundreds of government websites isn't all sunshine & rainbows. We'll cover some of the strangest things we've encountered while scraping these sites, from awful APIs to entire government offices disappearing. Most official sources have been easy enough to find, but others have been trickier, with at least one coming from an IP address scribbled on a napkin. And because we're an open source project, we also struggle with things like getting helpful issue submissions and keeping the project ecosystem clear to contributors.
Open States strives to improve civic engagement at the state and federal level by providing data and tools about state legislatures. We aggregate legislative information from all 50 states, Congress, Washington, D.C., and Puerto Rico, then standardize it, clean it, and publish it for free. At the cost of our sanity.
This talk is recommended for folks interested in scraping government data or contributing to open source civic tech projects, and for those who enjoy hearing about other people’s misery.