Enterprise Tech Integration Guide
This guide documents the enterprise technology platforms and programs that support Open Navigator's data infrastructure.
Implementation Status Legend
- ✅ Active - Fully implemented and in production use
- 🔄 Recommended - Implementation recommended for enhancement
- 📚 Reference - Used as inspiration for data modeling
- 🔍 Evaluation - Under consideration for future adoption
1. Cloud & Data Platforms
✅ Microsoft: Tech for Social Impact
Status: ACTIVE - Nonprofit CDM fully implemented
What we use:
- Nonprofit Common Data Model (CDM) for constituent management
- 8 core entities: CONSTITUENT, DONATION, CAMPAIGN, DESIGNATION, MEMBERSHIP, VOLUNTEER_ACTIVITY, PROGRAM_DELIVERY, PROGRAM_OUTCOME
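To make the relationship between two of these entities concrete, the following is a minimal sketch in Python. The class layout and every field name (`constituent_id`, `amount`, `received_on`, and so on) are illustrative assumptions, not the actual Nonprofit CDM attribute names or Open Navigator's schema.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class Donation:
    """Hypothetical DONATION record; field names are assumptions."""
    donation_id: str
    constituent_id: str
    amount: Decimal
    received_on: date
    campaign_id: Optional[str] = None  # optional link to a CAMPAIGN


@dataclass
class Constituent:
    """Hypothetical CONSTITUENT record with its donations attached."""
    constituent_id: str
    display_name: str
    donations: List[Donation] = field(default_factory=list)

    def total_given(self) -> Decimal:
        # Start from Decimal("0") so an empty list still returns a Decimal.
        return sum((d.amount for d in self.donations), Decimal("0"))
```

Amounts are stored as `Decimal` rather than `float` so that currency totals do not pick up binary rounding errors.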
Files:
- See Nonprofit & Philanthropy section
- ERD: Data Model
Resources:
- GitHub: https://github.com/microsoft/Industry-Accelerator-Nonprofit
- License: MIT
🔄 Google: Data Commons
Status: RECOMMENDED - Implementation available, not yet deployed
What it provides:
- Knowledge Graph API for jurisdiction demographics
- 100+ variables per jurisdiction (income, education, health, housing)
- Simplifies Census Bureau data access
Implementation:
- Code: discovery/google_data_commons.py
- Install: pip install datacommons datacommons-pandas
- Documentation: https://docs.datacommons.org/api/
Next Steps:
- Install dependencies: pip install datacommons datacommons-pandas
- Update discovery/census_ingestion.py to use the Data Commons client
- Replace manual Census API calls with the simpler Data Commons API
- Add time-series enrichment for historical trends
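One small piece of the migration can be sketched without any network access: Data Commons addresses US states and counties by DCIDs of the form `geoId/<FIPS>`, so FIPS codes already used by the census ingestion need to be converted before a lookup. The helper names below are hypothetical and are not taken from `discovery/google_data_commons.py`.

```python
import re

# 2-digit state FIPS or 5-digit county FIPS.
_FIPS_PATTERN = re.compile(r"\d{2}(\d{3})?")


def fips_to_dcid(fips: str) -> str:
    """Convert a state or county FIPS code to a Data Commons DCID."""
    code = fips.strip()
    if not _FIPS_PATTERN.fullmatch(code):
        raise ValueError(f"Expected a 2- or 5-digit FIPS code, got {fips!r}")
    return f"geoId/{code}"


def fips_list_to_dcids(fips_codes):
    """Map FIPS codes to DCIDs, preserving order and dropping duplicates."""
    seen = {}
    for code in fips_codes:
        seen.setdefault(fips_to_dcid(code), None)
    return list(seen)
```

Validating the code up front keeps a FIPS value that lost its leading zero (for example `1073` instead of `01073`) from silently resolving to the wrong place or to nothing.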
Example Usage:
from discovery.google_data_commons import DataCommonsClient
client = DataCommonsClient()
# Enrich a single jurisdiction
data = client.enrich_jurisdiction("01073") # Jefferson County, AL
print(data["Median_Income_Household"]) # e.g. 65000 (illustrative value)
# Bulk enrich multiple jurisdictions
fips_codes = ["01073", "01089", "01097"]
df = client.enrich_jurisdictions_bulk(fips_codes)
# Get time series
df_ts = client.get_time_series("01073", start_year=2015)
Benefits:
- ✅ Simpler API than raw Census Bureau
- ✅ 100+ pre-integrated variables
- ✅ Automatic data quality validation
- ✅ Time series support
- ✅ No API key required (free tier)
🔄 AWS: Open Data for Good
Status: RECOMMENDED - Planned adoption of best practices for dataset exports
What we plan to adopt:
- Parquet format best practices
- S3 storage patterns
- AWS Glue Data Catalog
Recommendations for /exports folder:
- Format: Use Parquet with Snappy compression
- Partitioning: Partition by state/county/year
- Versioning: Enable S3 versioning for lineage
- Catalog: Use AWS Glue for schema management
- Querying: Athena for SQL without ETL
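The partitioning recommendation can be sketched as follows. This example assumes Hive-style `key=value` directory names, which AWS Glue crawlers and Athena recognize as partitions. The function names and the `exports` root are hypothetical, and the record fields `state`, `county`, and `year` are assumed to exist on each exported row.

```python
from pathlib import PurePosixPath


def partition_prefix(root: str, state: str, county: str, year: int) -> str:
    """Build a Hive-style key prefix: <root>/state=../county=../year=../"""
    path = PurePosixPath(root) / f"state={state}" / f"county={county}" / f"year={year}"
    return f"{path}/"


def group_by_partition(records):
    """Group record dicts by their (state, county, year) partition key."""
    groups = {}
    for rec in records:
        key = (rec["state"], rec["county"], rec["year"])
        groups.setdefault(key, []).append(rec)
    return groups
```

If the export scripts use pandas with pyarrow, `DataFrame.to_parquet(path, compression="snappy", partition_cols=["state", "county", "year"])` writes the same directory layout without hand-built paths.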
Next Steps:
- Review AWS Registry examples: https://registry.opendata.aws
- Update export scripts to generate Parquet
- Document partitioning strategy
- Consider AWS Glue for metadata
2. Data Engineering Platforms
✅ Databricks: Databricks for Good
Status: ACTIVE - Full implementation
What we use:
- Unity Catalog: Model registry and data governance
- Delta Lake: Bronze/Silver/Gold lakehouse architecture
- MLflow: Agent deployment and experiment tracking
- Model Serving: Auto-scaling REST endpoints for agents
- Agent Bricks: Mosaic AI Agent Framework
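The Bronze/Silver/Gold layering can be illustrated with a plain-Python stand-in. This sketch does not use Delta Lake or Spark and is not the logic in `pipeline/delta_lake.py`. The row fields (`meeting_id`, `jurisdiction`) and function names are assumptions chosen only to show what each layer is responsible for.

```python
from collections import Counter


def to_silver(bronze_rows):
    """Clean raw rows: drop rows without an id, normalise text, de-duplicate by id."""
    silver = {}
    for row in bronze_rows:
        meeting_id = row.get("meeting_id")
        if not meeting_id:
            continue  # a row without a key cannot be de-duplicated or joined
        silver[meeting_id] = {
            "meeting_id": meeting_id,
            "jurisdiction": (row.get("jurisdiction") or "").strip().upper(),
        }  # a later row with the same id overwrites the earlier one
    return list(silver.values())


def to_gold(silver_rows):
    """Aggregate cleaned rows into meetings-per-jurisdiction counts."""
    return dict(Counter(row["jurisdiction"] for row in silver_rows))
```

Bronze keeps the raw input untouched, Silver holds validated and de-duplicated rows, and Gold holds aggregates ready for reporting or sharing.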
Files:
- pipeline/delta_lake.py - Delta Lake pipeline
- agents/mlflow_classifier.py - Policy classifier agent
- agents/mlflow_base.py - Base MLflow agent class
- databricks/deployment.py - Unity Catalog deployment
- databricks/evaluation.py - Agent evaluation framework
- databricks/notebooks/01_agent_bricks_quickstart.py - Quickstart notebook
Resources:
- Documentation: https://docs.databricks.com/
- Unity Catalog: https://docs.databricks.com/en/data-governance/unity-catalog/
- Solution Accelerators: https://www.databricks.com/solutions/accelerators
Delta Sharing for Public Exports:
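# Run in a Databricks notebook with Unity Catalog enabled, where `spark` is predefined.
# Shares are created on the provider side with SQL; the open-source `delta_sharing`
# Python package is a recipient-side client and does not create shares.
spark.sql("CREATE SHARE IF NOT EXISTS one_civic_data")
# Share Gold layer tables (qualify with the catalog name if it is not the current catalog)
for table in ["gold.jurisdictions", "gold.meetings", "gold.nonprofits"]:
    spark.sql(f"ALTER SHARE one_civic_data ADD TABLE {table}")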
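On the consuming side, a recipient of a share such as `one_civic_data` typically reads tables with the open-source `delta-sharing` Python package, which addresses a table as `<profile-file>#<share>.<schema>.<table>`. The sketch below is a minimal illustration under that assumption. The helper names and the `config.share` profile file name are hypothetical, and the recipient would receive the actual profile file from the data provider.

```python
def sharing_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address used by delta-sharing."""
    for part in (share, schema, table):
        if not part or "." in part or "#" in part:
            raise ValueError(f"Invalid name component: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table as a pandas DataFrame (needs `pip install delta-sharing`)."""
    import delta_sharing  # imported lazily so the URL helper works without the package

    url = sharing_table_url(profile_path, share, schema, table)
    return delta_sharing.load_as_pandas(url)
```

For example, `load_shared_table("config.share", "one_civic_data", "gold", "jurisdictions")` would fetch the shared jurisdictions table, provided the share has been granted to that recipient.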
🔍 Snowflake: Snowflake for Good
Status: EVALUATION - Consider for enterprise data sharing
What we are evaluating:
- Data Marketplace for Census/ESG data
- Data sharing capabilities
Evaluation Criteria:
- Cost vs. Databricks
- Data Marketplace value-add
- Enterprise collaboration needs
📚 Oracle: NetSuite Social Impact
Status: REFERENCE - Inspiration for nonprofit accounting
What we use:
- Fund accounting model patterns
- Grant tracking workflows
Resources:
📚 Salesforce: Nonprofit Success Pack (NPSP)
Status: REFERENCE - Inspiration for constituent management
What we use:
- Household accounts model
- Recurring donations pattern
- Program engagement tracking
NPSP → ONE Mappings:
| NPSP Object | Our Entity | Use Case |
|---|---|---|
| Contact | CONSTITUENT | Donor, volunteer, beneficiary |
| Opportunity | DONATION | Financial contributions |
| Campaign | CAMPAIGN | Fundraising campaigns |
| Engagement Plan | VOLUNTEER_ACTIVITY | Volunteer tracking |
| Program Cohort | PROGRAM_DELIVERY | Program participants |
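The mapping table above can be expressed as a simple lookup. This is a sketch only: the dictionary mirrors the table rows, while the constant and function names are hypothetical and not part of any NPSP or Open Navigator API.

```python
# NPSP object name -> entity name, copied from the mapping table.
NPSP_TO_ENTITY = {
    "Contact": "CONSTITUENT",
    "Opportunity": "DONATION",
    "Campaign": "CAMPAIGN",
    "Engagement Plan": "VOLUNTEER_ACTIVITY",
    "Program Cohort": "PROGRAM_DELIVERY",
}


def map_npsp_object(npsp_object: str) -> str:
    """Return the entity name for an NPSP object, or raise KeyError if unmapped."""
    try:
        return NPSP_TO_ENTITY[npsp_object.strip()]
    except KeyError:
        raise KeyError(f"No mapping defined for NPSP object {npsp_object!r}") from None
```

Raising on an unmapped object, rather than returning a default, makes gaps in the mapping visible as soon as a new NPSP object type appears in the input.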
Resources:
- GitHub: https://github.com/SalesforceFoundation/NPSP
- License: BSD-3-Clause
3. Infrastructure & AI
📚 Cisco: Crisis Response
Status: REFERENCE - Inspiration for platform resilience
Focus:
- Network connectivity during emergencies
- System resilience patterns
Resources:
📚 IBM: Science for Social Good
Status: REFERENCE - AI/ML use case patterns
Focus:
- Watson AI for civic applications
- Blockchain for transparency
- Quantum computing potential
Resources:
🔍 Meta: Data for Good
Status: EVALUATION - Population mapping potential
What we are evaluating:
- High-Resolution Population Density Maps
- Social Connectedness Index
Evaluation:
- Integration with demographics
- Use for underserved area identification
Resources:
Summary: Current vs. Planned Integrations
| Platform | Status | Priority | Effort | Value |
|---|---|---|---|---|
| Microsoft CDM | ✅ Active | - | - | HIGH |
| Databricks | ✅ Active | - | - | HIGH |
| Google Data Commons | 🔄 Recommended | HIGH | Low | HIGH |
| AWS Best Practices | 🔄 Recommended | MEDIUM | Medium | MEDIUM |
| Snowflake | 🔍 Evaluation | LOW | Medium | MEDIUM |
| Meta Data for Good | 🔍 Evaluation | LOW | Medium | MEDIUM |
| Salesforce NPSP | 📚 Reference | - | - | - |
| Oracle NetSuite | 📚 Reference | - | - | - |
| Cisco | 📚 Reference | - | - | - |
| IBM | 📚 Reference | - | - | - |
Recommended Implementation Order
1. Google Data Commons (Immediate - Low effort, High value)
   - Install dependencies
   - Update census ingestion
   - Test with sample jurisdictions
   - Deploy to production
2. AWS Export Optimization (Next sprint - Medium effort, Medium value)
   - Convert exports to Parquet
   - Implement partitioning
   - Document patterns
3. Databricks Delta Sharing (Future - Medium effort, Medium value)
   - Configure sharing
   - Create public share
   - Document access
4. Snowflake/Meta Evaluation (Backlog - TBD)
   - POC evaluation
   - Cost-benefit analysis
   - Decision by end of quarter
How to Cite These Partnerships
All enterprise technology partnerships are properly cited in:
Citations & Data Sources - Enterprise Tech for Social Good
Includes:
- Full program URLs
- Implementation status
- License information
- BibTeX citations (where applicable)
- Code examples