Enterprise Tech Integration Guide

This guide documents the enterprise technology platforms and programs that support Open Navigator's data infrastructure.

Implementation Status Legend

✅ Active - Fully implemented and in production use
🔄 Recommended - Implementation recommended for enhancement
📚 Reference - Used as inspiration for data modeling
🔍 Evaluation - Under consideration for future adoption

1. Cloud & Data Platforms

✅ Microsoft: Tech for Social Impact

Status: ACTIVE - Nonprofit CDM fully implemented

What we use:

Nonprofit Common Data Model (CDM) for constituent management
8 core entities: CONSTITUENT, DONATION, CAMPAIGN, DESIGNATION, MEMBERSHIP, VOLUNTEER_ACTIVITY, PROGRAM_DELIVERY, PROGRAM_OUTCOME
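
To make the entity list concrete, here is a minimal Python sketch of how two of the eight entities might relate. The field names (`constituent_id`, `amount`, `received_on`, and so on) and the `total_given` helper are illustrative assumptions, not the actual CDM schema.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Entity(str, Enum):
    """The eight core entities named in this guide."""
    CONSTITUENT = "CONSTITUENT"
    DONATION = "DONATION"
    CAMPAIGN = "CAMPAIGN"
    DESIGNATION = "DESIGNATION"
    MEMBERSHIP = "MEMBERSHIP"
    VOLUNTEER_ACTIVITY = "VOLUNTEER_ACTIVITY"
    PROGRAM_DELIVERY = "PROGRAM_DELIVERY"
    PROGRAM_OUTCOME = "PROGRAM_OUTCOME"


@dataclass(frozen=True)
class Constituent:
    # Hypothetical fields, for illustration only.
    constituent_id: str
    display_name: str


@dataclass(frozen=True)
class Donation:
    # Hypothetical fields: a donation points at its donor and, optionally, a campaign.
    donation_id: str
    constituent_id: str
    amount: float
    received_on: date
    campaign_id: Optional[str] = None


def total_given(donations: list[Donation], constituent_id: str) -> float:
    """Sum all donation amounts recorded for one constituent."""
    return sum(d.amount for d in donations if d.constituent_id == constituent_id)
```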

Files:

See Nonprofit & Philanthropy section
ERD: Data Model

Resources:

GitHub: https://github.com/microsoft/Industry-Accelerator-Nonprofit
License: MIT

🔄 Google: Data Commons

Status: RECOMMENDED - Implementation available, not yet deployed

What we use:

Knowledge Graph API for jurisdiction demographics
100+ variables per jurisdiction (income, education, health, housing)
Simplifies Census Bureau data access

Implementation:

Code: discovery/google_data_commons.py
Install: pip install datacommons datacommons-pandas
Documentation: https://docs.datacommons.org/api/

Next Steps:

Install dependencies: pip install datacommons datacommons-pandas
Update discovery/census_ingestion.py to use Data Commons client
Replace manual Census API calls with simplified DC API
Add time-series enrichment for historical trends

Example Usage:

from discovery.google_data_commons import DataCommonsClient

client = DataCommonsClient()

# Enrich a single jurisdiction
data = client.enrich_jurisdiction("01073")  # Jefferson County, AL
print(data["Median_Income_Household"])  # $65,000

# Bulk enrich multiple jurisdictions
fips_codes = ["01073", "01089", "01097"]
df = client.enrich_jurisdictions_bulk(fips_codes)

# Get time series
df_ts = client.get_time_series("01073", start_year=2015)
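
Data Commons identifies US places by a DCID of the form `geoId/<FIPS>`, so a wrapper that accepts bare FIPS codes has to translate them first. The sketch below shows one way that step could look. `fips_to_dcid` and `fetch_variables` are hypothetical helpers and are not taken from discovery/google_data_commons.py; `build_multivariate_dataframe` is the `datacommons_pandas` call they are assumed to wrap.

```python
import re

# Two digits for a state, five for a county.
_FIPS_PATTERN = re.compile(r"\d{2}(\d{3})?")


def fips_to_dcid(fips: str) -> str:
    """Convert a 2-digit state or 5-digit county FIPS code to a Data Commons DCID."""
    if not _FIPS_PATTERN.fullmatch(fips):
        raise ValueError(f"Expected a 2- or 5-digit FIPS code, got {fips!r}")
    return f"geoId/{fips}"


def fetch_variables(fips_codes, stat_vars):
    """Fetch the latest value of each statistical variable for each jurisdiction.

    The import is deferred so fips_to_dcid works without the package installed.
    """
    import datacommons_pandas as dcpd  # pip install datacommons-pandas

    dcids = [fips_to_dcid(code) for code in fips_codes]
    return dcpd.build_multivariate_dataframe(dcids, stat_vars)
```

Validating the code before building the DCID catches a common ingestion bug: FIPS codes read as integers lose their leading zero, and "1073" would otherwise be sent as a place that does not exist.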

Benefits:

✅ Simpler API than raw Census Bureau
✅ 100+ pre-integrated variables
✅ Automatic data quality validation
✅ Time series support
✅ No API key required (free tier)

🔄 AWS: Open Data for Good

Status: RECOMMENDED - Best practices for dataset exports

What we use:

Parquet format best practices
S3 storage patterns
AWS Glue Data Catalog

Recommendations for /exports folder:

Format: Use Parquet with Snappy compression
Partitioning: Partition by state/county/year
Versioning: Enable S3 versioning for lineage
Catalog: Use AWS Glue for schema management
Querying: Athena for SQL without ETL
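
The format and partitioning recommendations above can be sketched as follows. This is a minimal illustration, assuming the export is a pandas DataFrame with `state`, `county`, and `year` columns and that `pyarrow` is installed; `partition_path` and `write_partitioned` are hypothetical names, not existing export scripts.

```python
def partition_path(root: str, state: str, county: str, year: int,
                   filename: str = "part-0.parquet") -> str:
    """Build a Hive-style path: <root>/state=<s>/county=<c>/year=<y>/<file>.

    Plain string joining is used so that S3 URIs such as s3://bucket/exports
    keep their double slash.
    """
    return "/".join([
        root.rstrip("/"),
        f"state={state}",
        f"county={county}",
        f"year={year}",
        filename,
    ])


def write_partitioned(df, root: str) -> None:
    """Write a DataFrame as Snappy-compressed Parquet partitioned by state/county/year."""
    df.to_parquet(
        root,
        engine="pyarrow",
        compression="snappy",
        partition_cols=["state", "county", "year"],
    )
```

`partition_path` only illustrates the directory layout that `write_partitioned` produces; the file names pyarrow generates inside each partition will differ. Athena and AWS Glue both recognize this `key=value` layout, which lets queries that filter on state, county, or year skip unrelated files.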

Next Steps:

Review AWS Registry examples: https://registry.opendata.aws
Update export scripts to generate Parquet
Document partitioning strategy
Consider AWS Glue for metadata

2. Data Engineering Platforms

✅ Databricks: Databricks for Good

Status: ACTIVE - Full implementation

What we use:

Unity Catalog: Model registry and data governance
Delta Lake: Bronze/Silver/Gold lakehouse architecture
MLflow: Agent deployment and experiment tracking
Model Serving: Auto-scaling REST endpoints for agents
Agent Bricks: Mosaic AI Agent Framework
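
As a rough illustration of the Bronze/Silver/Gold flow, the sketch below walks a few records through the three layers in plain Python. It does not use the Delta Lake API, and the field names (`fips`, `meetings`) are hypothetical; the production pipeline in pipeline/delta_lake.py is assumed to do the equivalent work on Delta tables.

```python
def to_silver(bronze_rows):
    """Clean raw Bronze rows: drop rows without a FIPS code and normalize types."""
    silver = []
    for row in bronze_rows:
        fips = (row.get("fips") or "").strip()
        if not fips:
            continue  # unusable without a jurisdiction key
        silver.append({
            "fips": fips.zfill(5),  # restore leading zeros lost upstream
            "meetings": int(row.get("meetings") or 0),
        })
    return silver


def to_gold(silver_rows):
    """Aggregate Silver rows into one Gold record per jurisdiction."""
    totals = {}
    for row in silver_rows:
        totals[row["fips"]] = totals.get(row["fips"], 0) + row["meetings"]
    return [{"fips": fips, "total_meetings": n} for fips, n in sorted(totals.items())]
```

Bronze keeps data exactly as ingested, Silver holds validated and typed records, and Gold holds the aggregates that downstream consumers query.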

Files:

pipeline/delta_lake.py - Delta Lake pipeline
agents/mlflow_classifier.py - Policy classifier agent
agents/mlflow_base.py - Base MLflow agent class
databricks/deployment.py - Unity Catalog deployment
databricks/evaluation.py - Agent evaluation framework
databricks/notebooks/01_agent_bricks_quickstart.py - Quickstart notebook

Resources:

Documentation: https://docs.databricks.com/
Unity Catalog: https://docs.databricks.com/en/data-governance/unity-catalog/
Solution Accelerators: https://www.databricks.com/solutions/accelerators

Delta Sharing for Public Exports:

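# Run in a Databricks notebook, where `spark` is predefined.
# The open-source `delta_sharing` package is a recipient-side client and
# cannot create shares; the provider creates them with Databricks SQL.

# Share Gold layer tables
share_name = "one_civic_data"
# Two-level names resolve against the current Unity Catalog catalog.
tables = ["gold.jurisdictions", "gold.meetings", "gold.nonprofits"]

spark.sql(f"CREATE SHARE IF NOT EXISTS {share_name}")
for table in tables:
    spark.sql(f"ALTER SHARE {share_name} ADD TABLE {table}")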
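
On the consuming side, a recipient could read a shared table with the open-source `delta-sharing` Python connector, which addresses tables as `<profile-file>#<share>.<schema>.<table>`. The sketch below assumes the recipient has been issued a profile file; `table_url` and `load_shared_table` are hypothetical helpers, and `config.share` in the usage note is a placeholder file name.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the <profile>#<share>.<schema>.<table> URL used by delta-sharing clients."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load one shared table into a pandas DataFrame."""
    import delta_sharing  # pip install delta-sharing

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))
```

For example, `load_shared_table("config.share", "one_civic_data", "gold", "jurisdictions")` would fetch the jurisdictions table, provided the share exposes it under that schema name.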

🔍 Snowflake: Snowflake for Good

Status: EVALUATION - Consider for enterprise data sharing

What we are evaluating:

Data Marketplace for Census/ESG data
Data sharing capabilities

Evaluation Criteria:

Cost vs. Databricks
Data Marketplace value-add
Enterprise collaboration needs

📚 Oracle: NetSuite Social Impact

Status: REFERENCE - Inspiration for nonprofit accounting

What we use:

Fund accounting model patterns
Grant tracking workflows

Resources:

https://netsuite.com/social-impact

📚 Salesforce: Nonprofit Success Pack (NPSP)

Status: REFERENCE - Inspiration for constituent management

What we use:

Household accounts model
Recurring donations pattern
Program engagement tracking

NPSP → ONE Mappings:

NPSP Object	Our Entity	Use Case
Contact	CONSTITUENT	Donor, volunteer, beneficiary
Opportunity	DONATION	Financial contributions
Campaign	CAMPAIGN	Fundraising campaigns
Engagement Plan	VOLUNTEER_ACTIVITY	Volunteer tracking
Program Cohort	PROGRAM_DELIVERY	Program participants
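
If NPSP records were ever imported, the mapping table above could be expressed as a simple lookup. This is a sketch only: `NPSP_TO_ONE` and `map_npsp_object` are hypothetical names, and the guide lists NPSP as a reference model rather than an active integration.

```python
# Mirrors the NPSP Object and Our Entity columns of the mapping table.
NPSP_TO_ONE = {
    "Contact": "CONSTITUENT",
    "Opportunity": "DONATION",
    "Campaign": "CAMPAIGN",
    "Engagement Plan": "VOLUNTEER_ACTIVITY",
    "Program Cohort": "PROGRAM_DELIVERY",
}


def map_npsp_object(npsp_object: str) -> str:
    """Return the ONE entity for an NPSP object, or raise if it is unmapped."""
    try:
        return NPSP_TO_ONE[npsp_object]
    except KeyError:
        raise KeyError(f"No ONE entity mapped for NPSP object {npsp_object!r}") from None
```

Raising on unmapped objects, instead of returning a default, keeps unknown NPSP types from being silently filed under the wrong entity.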

Resources:

GitHub: https://github.com/SalesforceFoundation/NPSP
License: BSD-3-Clause

3. Infrastructure & AI

📚 Cisco: Crisis Response

Status: REFERENCE - Inspiration for platform resilience

Focus:

Network connectivity during emergencies
System resilience patterns

Resources:

https://cisco.com/crisis-response

📚 IBM: Science for Social Good

Status: REFERENCE - AI/ML use case patterns

Focus:

Watson AI for civic applications
Blockchain for transparency
Quantum computing potential

Resources:

https://ibm.com/social-good

🔍 Meta: Data for Good

Status: EVALUATION - Population mapping potential

What we are evaluating:

High-Resolution Population Density Maps
Social Connectedness Index

Evaluation:

Integration with demographics
Use for underserved area identification

Resources:

https://dataforgood.facebook.com

Summary: Current vs. Planned Integrations

Platform	Status	Priority	Effort	Value
Microsoft CDM	✅ Active	-	-	HIGH
Databricks	✅ Active	-	-	HIGH
Google Data Commons	🔄 Recommended	HIGH	Low	HIGH
AWS Best Practices	🔄 Recommended	MEDIUM	Medium	MEDIUM
Snowflake	🔍 Evaluation	LOW	Medium	MEDIUM
Meta Data for Good	🔍 Evaluation	LOW	Medium	MEDIUM
Salesforce NPSP	📚 Reference	-	-	-
Oracle NetSuite	📚 Reference	-	-	-
Cisco	📚 Reference	-	-	-
IBM	📚 Reference	-	-	-

Recommended Implementation Order

1. Google Data Commons (Immediate - Low effort, High value)
   - Install dependencies
   - Update census ingestion
   - Test with sample jurisdictions
   - Deploy to production
2. AWS Export Optimization (Next sprint - Medium effort, Medium value)
   - Convert exports to Parquet
   - Implement partitioning
   - Document patterns
3. Databricks Delta Sharing (Future - Medium effort, Medium value)
   - Configure sharing
   - Create public share
   - Document access
4. Snowflake/Meta Evaluation (Backlog - TBD)
   - POC evaluation
   - Cost-benefit analysis
   - Decision by end of quarter

How to Cite These Partnerships

All enterprise technology partnerships are properly cited in:

Citations & Data Sources - Enterprise Tech for Social Good

Includes:

Full program URLs
Implementation status
License information
BibTeX citations (where applicable)
Code examples
