Skip to main content

Schema Migration Guide

Overview

CommunityOne has migrated from a domain-specific oral health schema to a generic community engagement data platform. This enables broader civic tech applications beyond health policy.

What Changed

File Rename

  • Old: databricks/oral_health_schema.sql
  • New: databricks/communityone_schema.sql
  • Status: Legacy file renamed to .deprecated suffix

Table Renames

Old NameNew NamePurpose
fact_oral_health_observationfact_communityone_observationGeneric community outcome measurements
(no oral health prefix in other tables)All dimensions remain the sameGeography, date, measure, etc.

Dimension Table Updates

New: dim_jurisdiction

Replaces inline jurisdiction data with proper dimension table:

CREATE TABLE dim_jurisdiction (
jurisdiction_key string NOT NULL,
jurisdiction_id string, -- OCD-ID format
jurisdiction_name string,
jurisdiction_type string, -- city, county, state, district
geography_key string,
ocd_id string,
website_url string,
population int,
...
)

New: dim_organization

Nonprofit and foundation master dimension (IRS EO-BMF):

CREATE TABLE dim_organization (
organization_key string NOT NULL,
ein string,
organization_name string,
ntee_code string,
foundation_code string, -- 10-13=Foundation, 15=Public Charity
is_private_foundation boolean, -- 990-PF filers
asset_amount decimal(18, 2),
income_amount decimal(18, 2),
...
)

New Fact Tables (Previously Missing)

1. fact_grant - Grant Transactions

Purpose: Track individual grants between funders and recipients

Data Sources:

  • IRS Form 990 Schedule I (grants paid by nonprofits)
  • IRS Form 990-PF (private foundation giving)
  • USASpending.gov API (federal grants)
  • State grant databases
CREATE TABLE fact_grant (
grant_key string NOT NULL,
recipient_org_key string, -- FK to dim_organization
recipient_jurisdiction_key string, -- FK to dim_jurisdiction
funder_org_key string, -- FK to dim_organization
funder_jurisdiction_key string, -- FK to dim_jurisdiction
grant_amount decimal(18, 2),
grant_purpose string,
program_area string,
award_date_key int,
start_date_key int,
end_date_key int,
is_multi_year boolean,
funding_source string, -- federal, state, foundation, corporate
...
)

Example Queries:

-- Find all federal grants to dental nonprofits in Alabama
SELECT
g.grant_amount,
g.grant_purpose,
o.organization_name,
j.jurisdiction_name
FROM fact_grant g
JOIN dim_organization o ON g.recipient_org_key = o.organization_key
JOIN dim_jurisdiction j ON j.jurisdiction_key = g.recipient_jurisdiction_key
WHERE o.ntee_code LIKE 'E%' -- Health services
AND j.state_code = 'AL'
AND g.funding_source = 'federal'
AND g.grant_purpose LIKE '%dental%';

-- Track foundation giving patterns (990-PF data)
SELECT
funder.organization_name,
COUNT(*) as grant_count,
SUM(g.grant_amount) as total_giving,
AVG(g.grant_amount) as avg_grant_size
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
WHERE funder.is_private_foundation = TRUE
GROUP BY funder.organization_name
ORDER BY total_giving DESC;

2. fact_nonprofit_finance - Annual 990 Filings

Purpose: Detailed nonprofit financial health and revenue sources

CREATE TABLE fact_nonprofit_finance (
filing_key string NOT NULL,
organization_key string,
ein string,
tax_year int,
total_revenue decimal(18, 2),
total_expenses decimal(18, 2),
grants_paid decimal(18, 2),
government_grants decimal(18, 2), -- Revenue source
foundation_grants decimal(18, 2), -- Revenue source
corporate_donations decimal(18, 2), -- Revenue source
individual_donations decimal(18, 2), -- Revenue source
program_service_revenue decimal(18, 2), -- Earned income
overhead_ratio decimal(8, 4), -- Calculated metric
fundraising_efficiency decimal(8, 4), -- Calculated metric
...
)

Example Queries:

-- Compare revenue sources for health vs education nonprofits
SELECT
SUBSTR(o.ntee_code, 1, 1) as sector,
AVG(f.government_grants / f.total_revenue * 100) as govt_pct,
AVG(f.foundation_grants / f.total_revenue * 100) as foundation_pct,
AVG(f.individual_donations / f.total_revenue * 100) as individual_pct
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE o.ntee_code IN ('E', 'B') -- Health, Education
AND f.total_revenue > 0
GROUP BY SUBSTR(o.ntee_code, 1, 1);

-- Find most efficient nonprofits
SELECT
o.organization_name,
f.total_revenue,
f.overhead_ratio,
f.fundraising_efficiency
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE f.tax_year = 2023
AND f.overhead_ratio < 0.25 -- Less than 25% overhead
AND f.fundraising_efficiency > 4.0 -- $4+ raised per $1 spent
ORDER BY f.total_revenue DESC;

3. fact_jurisdiction_budget - Government Finances

Purpose: Track government budgets and spending priorities

CREATE TABLE fact_jurisdiction_budget (
budget_key string NOT NULL,
jurisdiction_key string,
fiscal_year int,
total_revenue decimal(18, 2),
total_expenditures decimal(18, 2),
federal_grants decimal(18, 2),
state_grants decimal(18, 2),
property_tax_revenue decimal(18, 2),
...
)

4. fact_meeting - Meetings & Public Hearings

Purpose: Track government transparency and public engagement

CREATE TABLE fact_meeting (
meeting_key string NOT NULL,
jurisdiction_key string,
meeting_date_key int,
meeting_type string,
has_agenda boolean,
has_minutes boolean,
has_video boolean,
topic_tags array<string>,
...
)

5. bridge_grant_program_area - Grant Multi-Purpose Support

Purpose: Handle grants supporting multiple program areas

CREATE TABLE bridge_grant_program_area (
grant_key string NOT NULL,
program_area_code string NOT NULL,
program_area_desc string,
allocation_pct decimal(5, 2), -- % of grant to this area
...
)

Migration Steps

1. For Databricks Users

Update your Unity Catalog schema creation scripts:

-- Old approach (DEPRECATED)
-- CREATE TABLE catalog.schema.fact_oral_health_observation ...

-- New approach
CREATE TABLE catalog.schema.fact_communityone_observation ...;
CREATE TABLE catalog.schema.fact_grant ...;
CREATE TABLE catalog.schema.fact_nonprofit_finance ...;
CREATE TABLE catalog.schema.dim_organization ...;
CREATE TABLE catalog.schema.dim_jurisdiction ...;

2. For Existing Data

If you have data in fact_oral_health_observation:

-- Rename table
ALTER TABLE catalog.schema.fact_oral_health_observation
RENAME TO fact_communityone_observation;

-- Or migrate data
INSERT INTO fact_communityone_observation
SELECT
observation_key,
measure_key,
geography_key,
NULL as jurisdiction_key, -- NEW column
stratification_key,
...
FROM fact_oral_health_observation;

3. Update Application Code

Python/SQL queries:

# Old
df = spark.table("fact_oral_health_observation")

# New
df = spark.table("fact_communityone_observation")

Documentation references:

  • Update ERD diagrams
  • Update API documentation
  • Update data dictionary

New Capabilities Enabled

1. Grant Flow Analysis

Track money flow from funders to recipients:

SELECT
funder.organization_name as funder,
recipient.organization_name as recipient,
SUM(g.grant_amount) as total_grants,
COUNT(*) as grant_count
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
JOIN dim_organization recipient ON g.recipient_org_key = recipient.organization_key
WHERE funder.is_private_foundation = TRUE
GROUP BY funder.organization_name, recipient.organization_name;

2. Nonprofit-Government Relationships

Which nonprofits receive the most government funding?

SELECT
o.organization_name,
SUM(CASE WHEN g.funding_source IN ('federal', 'state')
THEN g.grant_amount ELSE 0 END) as govt_grants,
COUNT(CASE WHEN g.funding_source IN ('federal', 'state')
THEN 1 END) as govt_grant_count
FROM dim_organization o
LEFT JOIN fact_grant g ON o.organization_key = g.recipient_org_key
GROUP BY o.organization_name
HAVING govt_grants > 0
ORDER BY govt_grants DESC;

3. Foundation Investment Patterns

990-PF Schedule I analysis:

-- Where are private foundations investing?
SELECT
g.program_area,
COUNT(DISTINCT funder.organization_key) as foundation_count,
SUM(g.grant_amount) as total_investment,
AVG(g.grant_amount) as avg_grant_size
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
WHERE funder.is_private_foundation = TRUE
AND g.program_area IS NOT NULL
GROUP BY g.program_area
ORDER BY total_investment DESC;

4. Financial Health Benchmarking

-- Compare your nonprofit to sector averages
WITH sector_avg AS (
SELECT
SUBSTR(o.ntee_code, 1, 1) as sector,
AVG(f.overhead_ratio) as avg_overhead,
AVG(f.fundraising_efficiency) as avg_efficiency
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE f.tax_year = 2023
GROUP BY SUBSTR(o.ntee_code, 1, 1)
)
SELECT
o.organization_name,
f.overhead_ratio,
s.avg_overhead as sector_avg_overhead,
f.fundraising_efficiency,
s.avg_efficiency as sector_avg_efficiency
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
JOIN sector_avg s ON SUBSTR(o.ntee_code, 1, 1) = s.sector
WHERE f.tax_year = 2023
AND o.ein = 'YOUR-EIN-HERE';

Backward Compatibility

Deprecated Fields

The following fields in dim_measure are renamed for generic use:

Old FieldNew FieldNotes
nohss_indicator_nbrindicator_nbrGeneric indicator number
nohss_indicator_group_typeindicator_group_typeGeneric grouping
nohss_indicator_descindicator_descGeneric description

Views for Compatibility

Create views to maintain old query compatibility:

CREATE VIEW fact_oral_health_observation AS
SELECT * FROM fact_communityone_observation
WHERE measure_key IN (
SELECT measure_key FROM dim_measure
WHERE indicator_group_type = 'oral_health'
);

Questions?