Schema Migration Guide
Overview
CommunityOne has migrated from a domain-specific oral health schema to a generic community engagement data platform. This enables broader civic tech applications beyond health policy.
What Changed
File Rename
- Old:
databricks/oral_health_schema.sql - New:
databricks/communityone_schema.sql - Status: Legacy file renamed to
.deprecatedsuffix
Table Renames
| Old Name | New Name | Purpose |
|---|---|---|
fact_oral_health_observation | fact_communityone_observation | Generic community outcome measurements |
| (no oral health prefix in other tables) | All dimensions remain the same | Geography, date, measure, etc. |
Dimension Table Updates
New: dim_jurisdiction
Replaces inline jurisdiction data with proper dimension table:
CREATE TABLE dim_jurisdiction (
jurisdiction_key string NOT NULL,
jurisdiction_id string, -- OCD-ID format
jurisdiction_name string,
jurisdiction_type string, -- city, county, state, district
geography_key string,
ocd_id string,
website_url string,
population int,
...
)
New: dim_organization
Nonprofit and foundation master dimension (IRS EO-BMF):
CREATE TABLE dim_organization (
organization_key string NOT NULL,
ein string,
organization_name string,
ntee_code string,
foundation_code string, -- 10-13=Foundation, 15=Public Charity
is_private_foundation boolean, -- 990-PF filers
asset_amount decimal(18, 2),
income_amount decimal(18, 2),
...
)
New Fact Tables (Previously Missing)
1. fact_grant - Grant Transactions
Purpose: Track individual grants between funders and recipients
Data Sources:
- IRS Form 990 Schedule I (grants paid by nonprofits)
- IRS Form 990-PF (private foundation giving)
- USASpending.gov API (federal grants)
- State grant databases
CREATE TABLE fact_grant (
grant_key string NOT NULL,
recipient_org_key string, -- FK to dim_organization
recipient_jurisdiction_key string, -- FK to dim_jurisdiction
funder_org_key string, -- FK to dim_organization
funder_jurisdiction_key string, -- FK to dim_jurisdiction
grant_amount decimal(18, 2),
grant_purpose string,
program_area string,
award_date_key int,
start_date_key int,
end_date_key int,
is_multi_year boolean,
funding_source string, -- federal, state, foundation, corporate
...
)
Example Queries:
-- Find all federal grants to dental nonprofits in Alabama
SELECT
g.grant_amount,
g.grant_purpose,
o.organization_name,
j.jurisdiction_name
FROM fact_grant g
JOIN dim_organization o ON g.recipient_org_key = o.organization_key
JOIN dim_jurisdiction j ON j.jurisdiction_key = g.recipient_jurisdiction_key
WHERE o.ntee_code LIKE 'E%' -- Health services
AND j.state_code = 'AL'
AND g.funding_source = 'federal'
AND g.grant_purpose LIKE '%dental%';
-- Track foundation giving patterns (990-PF data)
SELECT
funder.organization_name,
COUNT(*) as grant_count,
SUM(g.grant_amount) as total_giving,
AVG(g.grant_amount) as avg_grant_size
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
WHERE funder.is_private_foundation = TRUE
GROUP BY funder.organization_name
ORDER BY total_giving DESC;
2. fact_nonprofit_finance - Annual 990 Filings
Purpose: Detailed nonprofit financial health and revenue sources
CREATE TABLE fact_nonprofit_finance (
filing_key string NOT NULL,
organization_key string,
ein string,
tax_year int,
total_revenue decimal(18, 2),
total_expenses decimal(18, 2),
grants_paid decimal(18, 2),
government_grants decimal(18, 2), -- Revenue source
foundation_grants decimal(18, 2), -- Revenue source
corporate_donations decimal(18, 2), -- Revenue source
individual_donations decimal(18, 2), -- Revenue source
program_service_revenue decimal(18, 2), -- Earned income
overhead_ratio decimal(8, 4), -- Calculated metric
fundraising_efficiency decimal(8, 4), -- Calculated metric
...
)
Example Queries:
-- Compare revenue sources for health vs education nonprofits
SELECT
SUBSTR(o.ntee_code, 1, 1) as sector,
AVG(f.government_grants / f.total_revenue * 100) as govt_pct,
AVG(f.foundation_grants / f.total_revenue * 100) as foundation_pct,
AVG(f.individual_donations / f.total_revenue * 100) as individual_pct
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE o.ntee_code IN ('E', 'B') -- Health, Education
AND f.total_revenue > 0
GROUP BY SUBSTR(o.ntee_code, 1, 1);
-- Find most efficient nonprofits
SELECT
o.organization_name,
f.total_revenue,
f.overhead_ratio,
f.fundraising_efficiency
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE f.tax_year = 2023
AND f.overhead_ratio < 0.25 -- Less than 25% overhead
AND f.fundraising_efficiency > 4.0 -- $4+ raised per $1 spent
ORDER BY f.total_revenue DESC;
3. fact_jurisdiction_budget - Government Finances
Purpose: Track government budgets and spending priorities
CREATE TABLE fact_jurisdiction_budget (
budget_key string NOT NULL,
jurisdiction_key string,
fiscal_year int,
total_revenue decimal(18, 2),
total_expenditures decimal(18, 2),
federal_grants decimal(18, 2),
state_grants decimal(18, 2),
property_tax_revenue decimal(18, 2),
...
)
4. fact_meeting - Meetings & Public Hearings
Purpose: Track government transparency and public engagement
CREATE TABLE fact_meeting (
meeting_key string NOT NULL,
jurisdiction_key string,
meeting_date_key int,
meeting_type string,
has_agenda boolean,
has_minutes boolean,
has_video boolean,
topic_tags array<string>,
...
)
5. bridge_grant_program_area - Grant Multi-Purpose Support
Purpose: Handle grants supporting multiple program areas
CREATE TABLE bridge_grant_program_area (
grant_key string NOT NULL,
program_area_code string NOT NULL,
program_area_desc string,
allocation_pct decimal(5, 2), -- % of grant to this area
...
)
Migration Steps
1. For Databricks Users
Update your Unity Catalog schema creation scripts:
-- Old approach (DEPRECATED)
-- CREATE TABLE catalog.schema.fact_oral_health_observation ...
-- New approach
CREATE TABLE catalog.schema.fact_communityone_observation ...;
CREATE TABLE catalog.schema.fact_grant ...;
CREATE TABLE catalog.schema.fact_nonprofit_finance ...;
CREATE TABLE catalog.schema.dim_organization ...;
CREATE TABLE catalog.schema.dim_jurisdiction ...;
2. For Existing Data
If you have data in fact_oral_health_observation:
-- Rename table
ALTER TABLE catalog.schema.fact_oral_health_observation
RENAME TO fact_communityone_observation;
-- Or migrate data
INSERT INTO fact_communityone_observation
SELECT
observation_key,
measure_key,
geography_key,
NULL as jurisdiction_key, -- NEW column
stratification_key,
...
FROM fact_oral_health_observation;
3. Update Application Code
Python/SQL queries:
# Old
df = spark.table("fact_oral_health_observation")
# New
df = spark.table("fact_communityone_observation")
Documentation references:
- Update ERD diagrams
- Update API documentation
- Update data dictionary
New Capabilities Enabled
1. Grant Flow Analysis
Track money flow from funders to recipients:
SELECT
funder.organization_name as funder,
recipient.organization_name as recipient,
SUM(g.grant_amount) as total_grants,
COUNT(*) as grant_count
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
JOIN dim_organization recipient ON g.recipient_org_key = recipient.organization_key
WHERE funder.is_private_foundation = TRUE
GROUP BY funder.organization_name, recipient.organization_name;
2. Nonprofit-Government Relationships
Which nonprofits receive the most government funding?
SELECT
o.organization_name,
SUM(CASE WHEN g.funding_source IN ('federal', 'state')
THEN g.grant_amount ELSE 0 END) as govt_grants,
COUNT(CASE WHEN g.funding_source IN ('federal', 'state')
THEN 1 END) as govt_grant_count
FROM dim_organization o
LEFT JOIN fact_grant g ON o.organization_key = g.recipient_org_key
GROUP BY o.organization_name
HAVING govt_grants > 0
ORDER BY govt_grants DESC;
3. Foundation Investment Patterns
990-PF Schedule I analysis:
-- Where are private foundations investing?
SELECT
g.program_area,
COUNT(DISTINCT funder.organization_key) as foundation_count,
SUM(g.grant_amount) as total_investment,
AVG(g.grant_amount) as avg_grant_size
FROM fact_grant g
JOIN dim_organization funder ON g.funder_org_key = funder.organization_key
WHERE funder.is_private_foundation = TRUE
AND g.program_area IS NOT NULL
GROUP BY g.program_area
ORDER BY total_investment DESC;
4. Financial Health Benchmarking
-- Compare your nonprofit to sector averages
WITH sector_avg AS (
SELECT
SUBSTR(o.ntee_code, 1, 1) as sector,
AVG(f.overhead_ratio) as avg_overhead,
AVG(f.fundraising_efficiency) as avg_efficiency
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
WHERE f.tax_year = 2023
GROUP BY SUBSTR(o.ntee_code, 1, 1)
)
SELECT
o.organization_name,
f.overhead_ratio,
s.avg_overhead as sector_avg_overhead,
f.fundraising_efficiency,
s.avg_efficiency as sector_avg_efficiency
FROM fact_nonprofit_finance f
JOIN dim_organization o ON f.organization_key = o.organization_key
JOIN sector_avg s ON SUBSTR(o.ntee_code, 1, 1) = s.sector
WHERE f.tax_year = 2023
AND o.ein = 'YOUR-EIN-HERE';
Backward Compatibility
Deprecated Fields
The following fields in dim_measure are renamed for generic use:
| Old Field | New Field | Notes |
|---|---|---|
nohss_indicator_nbr | indicator_nbr | Generic indicator number |
nohss_indicator_group_type | indicator_group_type | Generic grouping |
nohss_indicator_desc | indicator_desc | Generic description |
Views for Compatibility
Create views to maintain old query compatibility:
CREATE VIEW fact_oral_health_observation AS
SELECT * FROM fact_communityone_observation
WHERE measure_key IN (
SELECT measure_key FROM dim_measure
WHERE indicator_group_type = 'oral_health'
);
Questions?
- Schema issues: See Data Model ERD
- Grant data sources: See Nonprofit Data Sources
- 990-PF parsing: See Form 990 XML Guide