OpenStates Integration & Contribution Opportunities
This document outlines our integration with OpenStates/Plural Policy and potential opportunities to contribute code back to the open-source community.
📚 References Added to Citations
We have properly cited and referenced the following OpenStates resources:
In Root Citations (CITATIONS.md)
- ✅ OpenStates/Plural Policy main site
- ✅ Bulk data downloads (CSV, JSON, PostgreSQL)
- ✅ Scrapers repository: https://github.com/openstates/openstates-scrapers
- ✅ Local database documentation: https://docs.openstates.org/contributing/local-database/
- ✅ Code of Conduct: https://docs.openstates.org/code-of-conduct/
- ✅ Schema documentation: https://github.com/openstates/people/blob/master/schema.md
In Website Documentation (website/docs/data-sources/citations.md)
- ✅ Comprehensive OpenStates/Plural Policy section
- ✅ PostgreSQL dump setup instructions
- ✅ Contribution guidelines
- ✅ BibTeX citation
In Contributing Guide (CONTRIBUTING.md)
- ✅ Code of Conduct alignment with OpenStates
- ✅ Upstream contribution guidelines
- ✅ Testing requirements for scraper contributions
🔄 Our Current OpenStates Integration
Data We Use
- PostgreSQL Monthly Dumps (9.8 GB+)
  - Complete legislative database for all 50 states
  - Script: scripts/bulk_legislative_download.py --postgres --month 2026-04
  - Setup: scripts/setup_openstates_db.sh
  - Use: Local SQL queries, no API rate limits
- CSV/JSON Session Data
  - Per-state legislative sessions
  - Bill text, votes, sponsors
  - Committee assignments
- Video Sources
  - YouTube channel URLs from the sources field
  - Granicus video portal links
  - Meeting archive locations
Our Implementation
File: discovery/openstates_sources.py
- Fetches jurisdiction data via API
- Extracts video sources (YouTube, Vimeo, Granicus)
- Maps to our jurisdiction database
File: scripts/bulk_legislative_download.py
- Downloads PostgreSQL dumps
- Downloads CSV/JSON session data
- Handles all 50 states + DC + PR
🤝 Code We Could Contribute to OpenStates Scrapers
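The openstates-scrapers repository contains the per-state Python scrapers that collect legislative data. These are built on Open States' own tooling (such as scrapelib and spatula) rather than Scrapy, so contributions should follow the patterns already used in that repository. We have complementary code that could enhance the project: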
1. Video Source Discovery Patterns
Our Code: discovery/youtube_channel_discovery.py
What it does:
- Finds all YouTube channels for a jurisdiction (not just first match)
- Scrapes homepages for YouTube links
- Uses YouTube Data API for verification
- Discovers Vimeo and Granicus portals
Potential Contribution:
- Add video source extraction to OpenStates scrapers
- Enhance the sources field with verified YouTube channels
- Automate Granicus portal discovery
Example Pattern:
# Our code searches page content for these patterns
patterns = {
    "youtube_channel": r"youtube\.com/(?:c/|channel/|user/|@)([\w-]+)",
    "vimeo_channel": r"vimeo\.com/([\w-]+)",
    # Capture the jurisdiction subdomain, e.g. "alabama" in alabama.granicus.com
    "granicus": r"([\w-]+)\.granicus\.com",
}
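As a minimal sketch of the "all channels, not just the first match" behaviour described above, the patterns can be applied with `findall` and de-duplicated. The helper name `find_video_sources` and the sample HTML are hypothetical, and the Granicus pattern in this sketch captures the subdomain (the jurisdiction slug), which is an assumption about what is wanted rather than a description of discovery/youtube_channel_discovery.py.

```python
import re

# Compiled once at import time; the names are illustrative.
PATTERNS = {
    "youtube_channel": re.compile(r"youtube\.com/(?:c/|channel/|user/|@)([\w-]+)"),
    "vimeo_channel": re.compile(r"vimeo\.com/([\w-]+)"),
    # Assumption: the useful part of a Granicus URL is the subdomain.
    "granicus": re.compile(r"([\w-]+)\.granicus\.com"),
}


def find_video_sources(html: str) -> dict[str, list[str]]:
    """Return every distinct match per platform, in order of first appearance."""
    found: dict[str, list[str]] = {}
    for name, pattern in PATTERNS.items():
        # dict.fromkeys removes duplicates while preserving order.
        matches = list(dict.fromkeys(pattern.findall(html)))
        if matches:
            found[name] = matches
    return found


# Hypothetical homepage fragment with a repeated link.
sample = """
<a href="https://www.youtube.com/@ExampleSenate">Senate</a>
<a href="https://www.youtube.com/channel/UC123-abc">House</a>
<a href="https://www.youtube.com/@ExampleSenate">Senate (footer)</a>
<a href="https://example.granicus.com/ViewPublisher.php?view_id=6">Archive</a>
"""
print(find_video_sources(sample))
```

Matches found this way are only candidates; they would still need verification (for example through the YouTube Data API, as noted above) before being proposed upstream.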
2. Legistar/Granicus Platform Detection
Our Code: discovery/url_discovery_agent.py
What it does:
- Identifies Legistar instances across cities
- Maps Granicus video portals
- Extracts meeting URLs and agendas
Potential Contribution:
- Enhance OpenStates scrapers with meeting video links
- Add Legistar meeting agenda extraction
- Contribute URL validation patterns
Platform Patterns We Use:
platforms = {
    "granicus": ["granicus.com", "legistar.com"],
    "youtube": ["youtube.com", "youtu.be"],
    "vimeo": ["vimeo.com"],
}
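One way such a domain table could be used is to compare it against the parsed host of a URL rather than the raw string, which avoids false positives from look-alike domains. This is a sketch under that assumption; `detect_platform` is a hypothetical helper, not necessarily how discovery/url_discovery_agent.py works.

```python
from typing import Optional
from urllib.parse import urlparse

PLATFORMS = {
    "granicus": ["granicus.com", "legistar.com"],
    "youtube": ["youtube.com", "youtu.be"],
    "vimeo": ["vimeo.com"],
}


def detect_platform(url: str) -> Optional[str]:
    """Return the platform whose domain matches the URL's host, or None."""
    host = (urlparse(url).hostname or "").lower()
    for platform, domains in PLATFORMS.items():
        for domain in domains:
            # Accept the domain itself or any subdomain, but not
            # look-alikes such as "notyoutube.com".
            if host == domain or host.endswith("." + domain):
                return platform
    return None


print(detect_platform("https://alabama.granicus.com/ViewPublisher.php?view_id=6"))
print(detect_platform("https://notyoutube.com/watch"))
```

A URL without a scheme has no parsed hostname and returns None here, so callers would need to normalize such inputs first.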
3. Meeting Archive Scraping
Our Code: agents/scraper.py
What it does:
- Scrapes PDF meeting minutes
- Extracts text from scanned documents (OCR)
- Parses meeting dates and types
- Handles multiple document formats
Potential Contribution:
- Add meeting minutes text extraction to OpenStates
- Enhance bill analysis with meeting context
- Link bills to meeting discussions
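To illustrate the date and type parsing step only (PDF extraction and OCR are out of scope here), the sketch below pulls a long-form date and a meeting type from already-extracted text. The date format, the `MEETING_TYPES` list, and the function name are assumptions made for the example; real minutes vary widely, and this does not describe agents/scraper.py.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumption: dates appear in long form, e.g. "April 7, 2026".
DATE_RE = re.compile(
    r"\b(January|February|March|April|May|June|July|August|September|"
    r"October|November|December) (\d{1,2}), (\d{4})\b"
)
# Checked in order, so "special" wins over "committee" when both appear.
MEETING_TYPES = ("special", "regular", "work session", "committee")


def parse_meeting_header(text: str) -> tuple[Optional[date], Optional[str]]:
    """Return (meeting date, meeting type); either may be None if not found."""
    meeting_date = None
    match = DATE_RE.search(text)
    if match:
        try:
            meeting_date = datetime.strptime(
                " ".join(match.groups()), "%B %d %Y"
            ).date()
        except ValueError:
            # Impossible dates such as "February 30" are ignored.
            meeting_date = None
    lowered = text.lower()
    meeting_type = next((t for t in MEETING_TYPES if t in lowered), None)
    return meeting_date, meeting_type


print(parse_meeting_header("MINUTES OF THE REGULAR MEETING\nApril 7, 2026"))
```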
📝 How to Contribute to OpenStates Scrapers
Following their local database documentation:
1. Setup OpenStates Development Environment
# Clone the scrapers repository
git clone https://github.com/openstates/openstates-scrapers.git
cd openstates-scrapers
# Install dependencies (the repository is managed with Poetry; check its README if this has changed)
poetry install
# Set up a local PostgreSQL database
createdb openstates
# Import schema (if needed)
psql -d openstates -f schema/openstates.sql
2. Test Your Scraper Locally
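# Run a specific state scraper, limited to 10 requests per minute
os-update al --scrape --rpm 10
# Validate data (confirm the available flags with os-update --help before relying on this)
os-update al --scrape --validate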
3. Follow Their Code of Conduct
All contributions must follow the OpenStates Code of Conduct:
- Be respectful and professional
- Welcome diverse perspectives
- Focus on what's best for the community
- Show empathy towards other contributors
4. Submit Pull Request
# Create feature branch
git checkout -b feature/video-sources
# Make changes (add video discovery to a state scraper)
# Example: scrapers/al/videos.py
# Test thoroughly
os-update al --scrape --rpm 10
# Commit and push
git commit -m "Add video source discovery for Alabama legislature"
git push origin feature/video-sources
# Open PR on GitHub
🎯 Specific Contribution Ideas
Priority 1: Add Video Sources to Scrapers
Goal: Enhance the sources field with verified video links
States to Start With:
- Alabama - Has YouTube channel, needs verification
- California - @CALegislature (well-documented)
- Texas - Multiple chambers on YouTube
- New York - Both Assembly and Senate channels
Implementation:
# In scrapers/al/__init__.py
# Illustrative only: BaseScraper and scrape_sources stand in for whatever
# hook the maintainers prefer for publishing video sources.
class AlabamaScraper(BaseScraper):
    def scrape_sources(self):
        """Add video sources for Alabama legislature."""
        return {
            "youtube": "https://www.youtube.com/@AlabamaLegislature",
            "granicus": "https://alabama.granicus.com/ViewPublisher.php?view_id=6",
        }
Priority 2: Meeting Minutes Integration
Goal: Link bills to meeting discussions
Use Case:
- When bill HB123 is discussed in committee
- Link to YouTube timestamp of discussion
- Extract quotes from meeting minutes
- Connect legislators' comments to votes
Implementation:
# Add meeting metadata to bill objects
bill.add_source(
    url="https://www.youtube.com/watch?v=xyz&t=1234s",
    note="Committee discussion at 20:34",
)
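The URL and note above have to agree (1234 seconds is 20:34), so it may be safer to derive both from a single offset. The following is a small sketch of that idea; `format_timestamp` and `timestamped_source` are hypothetical helpers, and the returned dict simply mirrors the `url` and `note` arguments shown above.

```python
from urllib.parse import urlencode


def format_timestamp(seconds: int) -> str:
    """Format an offset in seconds as M:SS, or H:MM:SS past one hour."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def timestamped_source(video_id: str, seconds: int, label: str) -> dict[str, str]:
    """Build matching url/note values for a moment in a YouTube video."""
    query = urlencode({"v": video_id, "t": f"{seconds}s"})
    return {
        "url": f"https://www.youtube.com/watch?{query}",
        "note": f"{label} at {format_timestamp(seconds)}",
    }


source = timestamped_source("xyz", 1234, "Committee discussion")
print(source["url"])   # https://www.youtube.com/watch?v=xyz&t=1234s
print(source["note"])  # Committee discussion at 20:34
```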
Priority 3: Granicus Portal Scraping
Goal: Automate discovery of Granicus video portals
Pattern:
- Many jurisdictions use Granicus for meeting videos
- URLs follow the pattern: {jurisdiction}.granicus.com/ViewPublisher.php?view_id={id}
- Could automate discovery and link to OpenStates jurisdictions
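As a sketch of how the URL pattern above could be recognized, the function below splits a publisher URL into its jurisdiction slug and view id using only the standard library. The function name is hypothetical, and treating the subdomain as the jurisdiction identifier is an assumption; mapping that slug to an OpenStates jurisdiction would be a separate step.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

GRANICUS_SUFFIX = ".granicus.com"


def parse_granicus_publisher(url: str) -> Optional[tuple[str, int]]:
    """Return (jurisdiction slug, view_id) for a ViewPublisher URL, else None."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if not host.endswith(GRANICUS_SUFFIX):
        return None
    if not parts.path.endswith("/ViewPublisher.php"):
        return None
    view_ids = parse_qs(parts.query).get("view_id", [])
    if not view_ids or not view_ids[0].isdecimal():
        return None
    # Strip the shared suffix to leave the jurisdiction subdomain.
    return host[: -len(GRANICUS_SUFFIX)], int(view_ids[0])


print(parse_granicus_publisher(
    "https://alabama.granicus.com/ViewPublisher.php?view_id=6"
))
```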
🔒 License Compatibility
Our License
- Code: Open source (check root LICENSE file)
- Data: Citations required (see CITATIONS.md)
OpenStates License
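- Code: Open source; the license differs between OpenStates repositories, so check the LICENSE file of the repository you are contributing to
- Data: Public domain (bulk downloads)
- Content: Varies by state (some restrictions)
⏳ Compatibility: To be confirmed. Contributions are licensed under the target repository's terms, so compare their LICENSE with our root LICENSE before opening a PR.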
📚 Required Reading Before Contributing
Before submitting any code to OpenStates, review:
- Local Database Setup: https://docs.openstates.org/contributing/local-database/
  - How to set up PostgreSQL locally
  - How to run scrapers in development
  - How to test data quality
- Scraper Development Guide: https://docs.openstates.org/contributing/scrapers/
  - Scraping patterns used in the repository
  - Data validation requirements
  - Testing procedures
- Code of Conduct: https://docs.openstates.org/code-of-conduct/
  - Community standards
  - Communication guidelines
  - Enforcement policies
- Schema Documentation: https://github.com/openstates/people/blob/master/schema.md
  - Data model structure
  - Required vs optional fields
  - Relationship patterns
🚀 Next Steps
For This Project
- ✅ Citations Added - OpenStates properly credited
- ✅ Code of Conduct - Aligned with their standards
- ✅ Local Database - PostgreSQL dumps integrated
- ⏳ Test Contributions - Validate our code works with their schema
For Community Contribution
- Identify Target State - Choose state needing video sources
- Test Locally - Set up OpenStates dev environment
- Develop Scraper - Add video discovery code
- Submit PR - Follow their contribution guidelines
- Iterate - Respond to code review feedback
💡 Benefits of Contributing
For OpenStates:
- Enhanced video source coverage
- Better meeting-to-bill linkage
- More comprehensive legislative tracking
For Our Project:
- Upstream improvements benefit us
- Community recognition
- Better data quality for all users
For Civic Tech:
- Shared infrastructure improvements
- Reduced duplication of effort
- Stronger open-source ecosystem
📞 Questions?
- OpenStates Discord: https://discord.gg/openstates
- GitHub Discussions: https://github.com/openstates/openstates-scrapers/discussions
- Email: Open States team (check repository for contact info)
Last Updated: April 29, 2026
Maintained By: Open Navigator team