Skip to main content

Enterprise Tech Integration Guide

This guide documents the enterprise technology platforms and programs that support Open Navigator's data infrastructure.

Implementation Status Legend

  • Active - Fully implemented and in production use
  • 🔄 Recommended - Implementation recommended for enhancement
  • 📚 Reference - Used as inspiration for data modeling
  • 🔍 Evaluation - Under consideration for future adoption

1. Cloud & Data Platforms

✅ Microsoft: Tech for Social Impact

Status: ACTIVE - Nonprofit CDM fully implemented

What we use:

  • Nonprofit Common Data Model (CDM) for constituent management
  • 8 core entities: CONSTITUENT, DONATION, CAMPAIGN, DESIGNATION, MEMBERSHIP, VOLUNTEER_ACTIVITY, PROGRAM_DELIVERY, PROGRAM_OUTCOME

Files:

Resources:


🔄 Google: Data Commons

Status: RECOMMENDED - Implementation available, not yet deployed

What we use:

  • Knowledge Graph API for jurisdiction demographics
  • 100+ variables per jurisdiction (income, education, health, housing)
  • Simplifies Census Bureau data access

Implementation:

Next Steps:

  1. Install dependencies: pip install datacommons datacommons-pandas
  2. Update discovery/census_ingestion.py to use Data Commons client
  3. Replace manual Census API calls with simplified DC API
  4. Add time-series enrichment for historical trends

Example Usage:

from discovery.google_data_commons import DataCommonsClient

client = DataCommonsClient()

# Enrich a single jurisdiction
data = client.enrich_jurisdiction("01073") # Jefferson County, AL
print(data["Median_Income_Household"]) # $65,000

# Bulk enrich multiple jurisdictions
fips_codes = ["01073", "01089", "01097"]
df = client.enrich_jurisdictions_bulk(fips_codes)

# Get time series
df_ts = client.get_time_series("01073", start_year=2015)

Benefits:

  • ✅ Simpler API than raw Census Bureau
  • ✅ 100+ pre-integrated variables
  • ✅ Automatic data quality validation
  • ✅ Time series support
  • ✅ No API key required (free tier)

🔄 AWS: Open Data for Good

Status: PLANNED - Best practices for dataset exports

What we use:

  • Parquet format best practices
  • S3 storage patterns
  • AWS Glue Data Catalog

Recommendations for /exports folder:

  1. Format: Use Parquet with Snappy compression
  2. Partitioning: Partition by state/county/year
  3. Versioning: Enable S3 versioning for lineage
  4. Catalog: Use AWS Glue for schema management
  5. Querying: Athena for SQL without ETL

Next Steps:

  1. Review AWS Registry examples: https://registry.opendata.aws
  2. Update export scripts to generate Parquet
  3. Document partitioning strategy
  4. Consider AWS Glue for metadata

2. Data Engineering Platforms

✅ Databricks: Databricks for Good

Status: ACTIVE - Full implementation

What we use:

  • Unity Catalog: Model registry and data governance
  • Delta Lake: Bronze/Silver/Gold lakehouse architecture
  • MLflow: Agent deployment and experiment tracking
  • Model Serving: Auto-scaling REST endpoints for agents
  • Agent Bricks: Mosaic AI Agent Framework

Files:

  • pipeline/delta_lake.py - Delta Lake pipeline
  • agents/mlflow_classifier.py - Policy classifier agent
  • agents/mlflow_base.py - Base MLflow agent class
  • databricks/deployment.py - Unity Catalog deployment
  • databricks/evaluation.py - Agent evaluation framework
  • databricks/notebooks/01_agent_bricks_quickstart.py - Quickstart notebook

Resources:

Delta Sharing for Public Exports:

from databricks import delta_sharing

# Share Gold layer tables
share = delta_sharing.SharingClient()
share.create_share(
name="one_civic_data",
tables=["gold.jurisdictions", "gold.meetings", "gold.nonprofits"]
)

🔍 Snowflake: Snowflake for Good

Status: EVALUATION - Consider for enterprise data sharing

What we use:

  • Data Marketplace for Census/ESG data
  • Data sharing capabilities

Evaluation Criteria:

  • Cost vs. Databricks
  • Data Marketplace value-add
  • Enterprise collaboration needs

📚 Oracle: NetSuite Social Impact

Status: REFERENCE - Inspiration for nonprofit accounting

What we use:

  • Fund accounting model patterns
  • Grant tracking workflows

Resources:


📚 Salesforce: Nonprofit Success Pack (NPSP)

Status: REFERENCE - Inspiration for constituent management

What we use:

  • Household accounts model
  • Recurring donations pattern
  • Program engagement tracking

NPSP → ONE Mappings:

NPSP ObjectOur EntityUse Case
ContactCONSTITUENTDonor, volunteer, beneficiary
OpportunityDONATIONFinancial contributions
CampaignCAMPAIGNFundraising campaigns
Engagement PlanVOLUNTEER_ACTIVITYVolunteer tracking
Program CohortPROGRAM_DELIVERYProgram participants

Resources:


3. Infrastructure & AI

📚 Cisco: Crisis Response

Status: REFERENCE - Inspiration for platform resilience

Focus:

  • Network connectivity during emergencies
  • System resilience patterns

Resources:


📚 IBM: Science for Social Good

Status: REFERENCE - AI/ML use case patterns

Focus:

  • Watson AI for civic applications
  • Blockchain for transparency
  • Quantum computing potential

Resources:


🔍 Meta: Data for Good

Status: EVALUATION - Population mapping potential

What we use:

  • High-Resolution Population Density Maps
  • Social Connectedness Index

Evaluation:

  • Integration with demographics
  • Use for underserved area identification

Resources:


Summary: Current vs. Planned Integrations

PlatformStatusPriorityEffortValue
Microsoft CDM✅ Active--HIGH
Databricks✅ Active--HIGH
Google Data Commons🔄 RecommendedHIGHLowHIGH
AWS Best Practices🔄 PlannedMEDIUMMediumMEDIUM
Snowflake🔍 EvaluationLOWMediumMEDIUM
Meta Data for Good🔍 EvaluationLOWMediumMEDIUM
Salesforce NPSP📚 Reference---
Oracle NetSuite📚 Reference---
Cisco📚 Reference---
IBM📚 Reference---
  1. Google Data Commons (Immediate - Low effort, High value)

    • Install dependencies
    • Update census ingestion
    • Test with sample jurisdictions
    • Deploy to production
  2. AWS Export Optimization (Next sprint - Medium effort, Medium value)

    • Convert exports to Parquet
    • Implement partitioning
    • Document patterns
  3. Databricks Delta Sharing (Future - Medium effort, Medium value)

    • Configure sharing
    • Create public share
    • Document access
  4. Snowflake/Meta Evaluation (Backlog - TBD)

    • POC evaluation
    • Cost-benefit analysis
    • Decision by end of quarter

How to Cite These Partnerships

All enterprise technology partnerships are properly cited in:

Citations & Data Sources - Enterprise Tech for Social Good

Includes:

  • Full program URLs
  • Implementation status
  • License information
  • BibTeX citations (where applicable)
  • Code examples