Quick Start Guide
Three Services
This project runs three separate services. Launch all three at once with ./start-all.sh:
| Service | Port (Local) | Live URL | Description |
|---|---|---|---|
| Open Navigator (web_app) | 5173 | www.communityone.com | Main application: search, filters, heatmap, data exploration |
| Documentation (web_docs) | 3000 | www.communityone.com/docs | Docusaurus site with complete guides and tutorials |
| API Backend (api) | 8000 | www.communityone.com/api/docs | FastAPI server with AI agents |

LIVE DEMO: Visit www.communityone.com to use the hosted application.
LOCAL DEV: After running ./start-all.sh, visit http://localhost:5173.
Prerequisites
- Python 3.11+
- Node.js 18+
- A local Postgres warehouse on `localhost:5433` (databases `open_navigator` and `openstates`); see Configuration below
- PostgreSQL client tools 17 (`psql`, `pg_dump`, `pg_restore`), the recommended version. The local warehouse server runs PG 16, but a 17 client dumps it and the (PG 16/17) Neon serving DB, and reads both dump formats. Keep `pg_dump` and `pg_restore` on the same major version (≥ any server you back up) to avoid `unsupported version` errors.
- Docker (optional; only for the containerized deployment)
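The client-version rule above can be checked mechanically. The following is a minimal Python sketch, not part of the repository: the helper names `pg_major` and `clients_compatible` are hypothetical, and it assumes the `--version` output has the usual `pg_dump (PostgreSQL) 17.2` shape.

```python
import re


def pg_major(version_output: str) -> int:
    """Extract the major version from `pg_dump --version`-style output."""
    match = re.search(r"\(PostgreSQL\)\s+(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return int(match.group(1))


def clients_compatible(dump_out: str, restore_out: str, server_major: int) -> bool:
    """True when both clients share one major version that is >= the server's."""
    dump_major = pg_major(dump_out)
    return dump_major == pg_major(restore_out) and dump_major >= server_major


# Example strings; in practice they would come from running
# `pg_dump --version` and `pg_restore --version`.
print(clients_compatible("pg_dump (PostgreSQL) 17.2",
                         "pg_restore (PostgreSQL) 17.2",
                         server_major=16))
```

Feeding it the real command output (for example via `subprocess.run(..., capture_output=True, text=True)`) turns this into a quick pre-flight check before a backup.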
Installation
Option 1: Start Everything at Once (Recommended)
# Clone repository
git clone https://github.com/getcommunityone/open-navigator.git
cd open-navigator
# Install dependencies
./install.sh # Python backend (creates .venv + .env from template)
cd web_app && npm install && cd .. # React app
cd web_docs && npm install && cd .. # Documentation
# Start all three services in tmux
./start-all.sh
start-all.sh auto-installs the web_app/web_docs node_modules if they're missing,
so the two npm install steps above are optional on first run.
Option 2: Using Makefile
# Install
make install # Python backend
make install-web_app # React app
make install-docs # Documentation
# Start all services
make start-all
# ...or individually:
make dev # API only
make dev-web_app # React app only
make dev-docs # Docs only
Option 3: Manual Setup
# Python backend
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
# Optional: Spark + Delta Lake (only if you'll run Databricks/Spark scripts).
# Requires a Java runtime (e.g. `sudo apt install openjdk-17-jre-headless`).
# pip install -r requirements-spark.txt
# React app + documentation
cd web_app && npm install && cd ..
cd web_docs && npm install && cd ..
# Configure environment (see Configuration below)
cp .env.example .env
# Start services in separate terminals:
source .venv/bin/activate && python main.py serve # Terminal 1: API (8000)
cd web_app && npm run dev # Terminal 2: App (5173)
cd web_docs && npm start # Terminal 3: Docs (3000)
Option 4: Windows (PowerShell)
The .sh scripts and the make targets are Unix-oriented (start-all.sh uses
tmux, which Windows lacks). Use the PowerShell equivalents instead; they create
the same .venv, install from the same requirements.txt, and launch the same three
services, each in its own window:
# From the repo root, in PowerShell:
.\install.ps1 # Python backend: creates .venv, installs deps, seeds .env
cd web_app; npm install; cd ..
cd web_docs; npm install; cd ..
.\start-all.ps1 # API (8000) + App (5173) + Docs (3000), one window each
If you see "running scripts is disabled on this system", PowerShell's execution policy is blocking the script. Either allow local scripts once for your user:
Set-ExecutionPolicy -Scope CurrentUser RemoteSigned
...or run each script without changing the policy:
powershell -ExecutionPolicy Bypass -File .\install.ps1
powershell -ExecutionPolicy Bypass -File .\start-all.ps1
⚠️ Don't use `uv sync` to set up the backend on any OS. The root `pyproject.toml` is a uv workspace whose members are only `packages/*`, so `uv sync` installs those workspace libraries but not the top-level `requirements.txt`, leaving out the dev tooling (`pytest`, `black`, `ruff`) and runtime deps like `yt-dlp`. Install the backend from `requirements.txt` (`.\install.ps1`, `./install.sh`, or `pip install -r requirements.txt`), which is the complete set. `uv sync` is only for working inside the `packages/*` libraries.
Tesseract / OCR on Windows: `install.ps1` installs Tesseract via `winget` (or Chocolatey) when available; otherwise grab the UB-Mannheim build. OCR is optional; the app runs without it.
Configuration
.env.example is organized in tiers so you only set what you actually use.
Copy it to .env and fill in values as needed:
cp .env.example .env
Required (minimum to run locally)
The site needs exactly one variable to boot. It points the API at the
already-running local Postgres warehouse on port 5433, and the API reads it for
both the civic-data warehouse and the auth/user tables:
NEON_DATABASE_URL_DEV=postgresql://postgres:password@localhost:5433/open_navigator
The web app (5173) and docs (3000) need no env vars to boot. With just the line
above, ./start-all.sh brings up all three services.
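To confirm that the variable points where you expect before starting the services, the connection URL can be split into its parts. This is a small illustrative sketch using only the Python standard library; `describe_dsn` is a hypothetical helper, and the fallback value is simply the example line above.

```python
import os
from urllib.parse import urlsplit


def describe_dsn(dsn: str) -> dict:
    """Split a postgresql:// URL into the parts that matter for local setup."""
    parts = urlsplit(dsn)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Falls back to the documented example when the variable is not exported.
dsn = os.environ.get(
    "NEON_DATABASE_URL_DEV",
    "postgresql://postgres:password@localhost:5433/open_navigator",
)
info = describe_dsn(dsn)
print(info["host"], info["port"], info["database"])
```

For the local warehouse described here, the host should be `localhost`, the port `5433`, and the database `open_navigator`.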
Optional for local development
Set these only for the features that need them; the site runs without all of them:
# OAuth login (omit a provider to disable just that login button)
FRONTEND_URL=http://localhost:5173
API_BASE_URL=http://localhost:8000
# GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET, HUGGINGFACE_CLIENT_ID / _SECRET, etc.
# Stable JWT signing across restarts (auto-generated if unset)
# JWT_SECRET_KEY=$(openssl rand -hex 32)
# Bill / legislator / vote features (restore the Open States dump first)
# OPENSTATES_DATABASE_URL=postgresql://postgres:postgres@localhost:5433/openstates
Data-source API keys (ingestion only)
Keys such as OPENAI_API_KEY, the GEMINI_API_KEY_* pool, CENSUS_API_KEY, and the
other source keys are needed only to run the ingestion/enrichment pipelines that
populate the warehouse, not to view the site.
Deployment
HF_TOKEN / HF_ORGANIZATION (HuggingFace Spaces), NEON_DATABASE_URL (production
Neon), and the DATABRICKS_* variables are for deployment only and are not needed for
local development.
See `.env.example` for the full, commented list of every variable across all five tiers.
Restore the Database
The three services give you the UI, but the app shows no civic data until you load a
warehouse snapshot into the local Postgres on localhost:5433. Restoring a shared dump
is far faster than rebuilding every dbt model from scratch.
If you have the Google Drive backup folder synced, it's one command:
make restore VERSION=snapshot-20260609 # dev only
Otherwise, restore a dump someone shared with you directly:
# Create the DB if needed, then restore (rebuilds the `public` serving schema the API reads):
PGPASSWORD=password createdb -h localhost -p 5433 -U postgres open_navigator 2>/dev/null || true
PGPASSWORD=password pg_restore -h localhost -p 5433 -U postgres -d open_navigator \
--clean --if-exists open_navigator.dump
The API serves the public schema in open_navigator by default (API_DB_SCHEMA=public).
After the restore, refresh http://localhost:5173; search, maps, and the heatmap will be populated.
Restore only into a local/dev warehouse, never into a production database.
Access Points
Local development:
- Main App: http://localhost:5173
- Documentation: http://localhost:3000
- API Docs: http://localhost:8000/docs
Live application:
- Open Navigator: https://www.communityone.com
- Documentation: https://www.communityone.com/docs
- API Docs: https://www.communityone.com/api/docs
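If one of the local URLs does not respond, a quick way to see which service is missing is to probe the three ports. The sketch below is an illustration only (the `port_open` helper and the `SERVICES` mapping are not part of the project); it only tests whether something accepts TCP connections on each port, not whether the service is healthy.

```python
import socket

# Local ports from the services table in this guide.
SERVICES = {"web_app": 5173, "web_docs": 3000, "api": 8000}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


for name, port in SERVICES.items():
    state = "up" if port_open("localhost", port) else "down"
    print(f"{name:<9} {port}  {state}")
```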
Stop Services
./stop-all.sh
# or
make stop-all
Running the System
Start the API Server
# Using the virtual environment
source .venv/bin/activate
python main.py serve
# Or using make
make run
Visit http://localhost:8000 for the API and http://localhost:8000/docs for interactive documentation.
Common Commands
# Activate virtual environment (required for all commands)
source .venv/bin/activate
# Start API server
python main.py serve
# Run with auto-reload (development)
python main.py serve --reload
# Check system status
python main.py status
# Run tests
pytest
# Or using make
make test
Troubleshooting
"ModuleNotFoundError: No module named 'click'"
You need to activate the virtual environment first:
source .venv/bin/activate
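To tell whether the interpreter you are running is the one inside `.venv`, you can compare `sys.prefix` with `sys.base_prefix`; they differ only inside a virtual environment. A minimal sketch (the function name `in_virtualenv` is just illustrative):

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtualenv active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

If this prints `False`, activate the environment and rerun the command that failed.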
"Tesseract binary not found" or OCR errors
The install.sh script automatically installs tesseract-ocr on Linux (via apt) and macOS (via brew). If it failed or you're on a different system, install manually:
Linux (Debian/Ubuntu):
sudo apt-get update && sudo apt-get install -y tesseract-ocr
macOS:
brew install tesseract
Verify installation:
tesseract --version
OCR is optional but enables text extraction from scanned PDFs and images.
"error: externally-managed-environment"
Don't run pip install against the system Python; install inside the virtual environment instead:
# Create venv if not exists
python3 -m venv .venv
# Activate it
source .venv/bin/activate
# Now install
pip install -r requirements.txt
Permission denied when running install.sh
chmod +x install.sh
./install.sh
Releases & Data Versioning
Open Navigator follows Semantic Versioning (MAJOR.MINOR.PATCH).
Every release is tied to a Postgres backup so that a given version of the code can
always be paired with the warehouse state it was built and tested against.
Given a version MAJOR.MINOR.PATCH (e.g. 1.4.2):
| Bump | When | Example |
|---|---|---|
| MAJOR | Breaking API/schema change, dropped table or endpoint, incompatible dbt model | 1.4.2 → 2.0.0 |
| MINOR | New data source, new endpoint, new dbt mart, backward-compatible feature | 1.4.2 → 1.5.0 |
| PATCH | Bug fix, data backfill, doc change, no schema or contract change | 1.4.2 → 1.4.3 |
A release bundles three things at the same version number:
- Code: a git tag (`vMAJOR.MINOR.PATCH`).
- Schema/marts: the dbt models as built at that tag.
- Data: a Postgres backup snapshot stored off-machine (see Database Backups below).
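The bump rules in the table above amount to a few lines of arithmetic. As a hedged illustration (the `bump` function is hypothetical and not a project utility), each bump resets every lower-order component to zero:

```python
def bump(version: str, part: str) -> str:
    """Return the next version after bumping 'major', 'minor', or 'patch'."""
    major, minor, patch = (int(n) for n in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


# Reproduces the three examples from the table.
for part in ("major", "minor", "patch"):
    print(part, bump("1.4.2", part))
```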
Database Backups
Backup and restore are Makefile targets, so no manual pg_dump/pg_restore is needed. Each
stamps the dump files with version/date/git SHA and syncs them off-machine through Google
Drive. Pick the scope that fits, and note that two of the three are free of personal
user data:
| Command | Scope | Personal user data? |
|---|---|---|
| make backup VERSION=v1.5.0 | Full: entire open_navigator warehouse (bronze/gold/staging/intermediate/public) + openstates. Self-contained; ~170 GB. | ⚠️ Yes: includes the user/auth/social tables. Keep private. |
| make backup-neon VERSION=v1.5.0 | Neon serving (recommended): dumps the production Neon serving DB; civic data as standalone materialized tables. Small (~0.5 GB). | ✅ No |
| make backup-public VERSION=v1.5.0 | Local public: dumps only the local public serving schema; civic views + event_documents. | ✅ No (personal tables excluded) |
| make restore VERSION=v1.5.0 | Restore the full backup into the local warehouse. Dev only. | n/a |
| make restore-neon VERSION=v1.5.0 | Restore a Neon snapshot into a separate local db (open_navigator_serving). Dev only. | n/a |
| make restore-public VERSION=v1.5.0 | Restore the local public schema (needs gold present). Dev only. | n/a |
The VERSION label decides where the dump is filed inside the backup folder:
- A semver tag (`v1.5.0`) → `open-navigator-backups/releases/v1.5.0/`, for tagged releases.
- Any other label (e.g. `2026-06-09`, `snapshot-20260609`) → `open-navigator-backups/snapshots/<label>/`, for ad-hoc point-in-time backups.
restore* searches both folders, so you restore with the same label you backed up with
regardless of which one it landed in. Exact commands live in the
backup targets in the Makefile.
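The routing rule can be pictured as a single pattern test. The sketch below is an assumption about the behavior, not a copy of the Makefile logic: it treats only a strict `vMAJOR.MINOR.PATCH` label as a release (the real target may check less, for example just the leading `v`), and `backup_folder` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath

# Assumed definition of a "semver tag": v<digits>.<digits>.<digits>
SEMVER_TAG = re.compile(r"v\d+\.\d+\.\d+")


def backup_folder(label: str, root: str = "open-navigator-backups") -> PurePosixPath:
    """Map a VERSION label to the folder its dumps would be filed under."""
    kind = "releases" if SEMVER_TAG.fullmatch(label) else "snapshots"
    return PurePosixPath(root) / kind / label


for label in ("v1.5.0", "2026-06-09", "snapshot-20260609"):
    print(label, "->", backup_folder(label))
```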
Backing up the serving data without user PII
To share or version the public civic data without shipping personal user information
(accounts, OAuth state, social graph, feed prefs, saved locations), use one of the
PII-free targets. Both exclude the same app/runtime tables that the Neon serving DB never
mirrors (user, contact_oauth_state, social_follows, user_lens_prefs,
user_locations, user_signal_prefs, meeting_document_gap_cache).
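For readers curious what excluding those tables looks like at the pg_dump level, here is a rough sketch that only builds an argument list. It is not the Makefile's actual command (this guide does not show it): the flag selection, the `public_dump_args` helper, and the output filename are assumptions, and connection options such as host and port are omitted.

```python
# Tables named above as personal/runtime data.
PERSONAL_TABLES = [
    "user", "contact_oauth_state", "social_follows", "user_lens_prefs",
    "user_locations", "user_signal_prefs", "meeting_document_gap_cache",
]


def public_dump_args(database: str, outfile: str) -> list[str]:
    """Build a pg_dump argv for the public schema minus the personal tables."""
    args = [
        "pg_dump",
        "--format=custom",      # archive format readable by pg_restore
        "--schema=public",      # only the serving schema
        "--no-owner",
        "--no-privileges",
        f"--file={outfile}",
    ]
    # One --exclude-table switch per personal table.
    args += [f"--exclude-table=public.{table}" for table in PERSONAL_TABLES]
    args.append(database)
    return args


print(" ".join(public_dump_args("open_navigator", "public.dump")))
```

In normal use the `make backup-public` and `make backup-neon` targets handle this for you; the sketch is only meant to make the exclusion list concrete.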
make backup-neon (recommended). Dumps the production Neon serving DB
(NEON_DATABASE_URL from .env). That DB is civic-only by construction (the
sync_public_to_neon.py loader never copies the user/auth tables), and its serving objects
are real materialized tables, so the dump is standalone (no dependency on gold):
make backup-neon VERSION=snapshot-20260609 # writes one small neon_serving_*.dump, no PII
pg_dump is read-only, so this is safe to run against prod. Restore into a separate
local database (never prod) and optionally point the API at it for a PII-free local
serving instance:
make restore-neon VERSION=snapshot-20260609 # → local db open_navigator_serving (dev only)
The Neon dump is taken with --no-owner --no-privileges, and restore-neon restores with
--role=$(PG_USER), so all objects end up owned by your local postgres user, not the
Neon role. (Same for make backup-public.)
make backup-public. Dumps the local public schema with the personal tables
excluded: civic views + event_documents only. It does not include the Postgres
extensions (pg_trgm, btree_gin, …) that live in public, so restoring it never drops
them and gold's indexes stay safe. Because the views reference gold, restore it onto a
warehouse that already has gold:
make backup-public VERSION=snapshot-20260609 # local public, personal tables excluded
Restore only into a local/dev warehouse, never into a production database. The full `make backup` is the only one that contains user accounts; treat its dumps as private and do not share them via a public Drive link.
Client version. The backup targets auto-select the newest PostgreSQL client under `/usr/lib/postgresql/*/bin` (so a PG 17 `pg_dump` is used for the PG 17 Neon server even when your `PATH` still points at PG 16). Install PG 17 client tools (see Prerequisites). Override the choice if needed with `make backup-neon VERSION=… PG_BIN=/usr/lib/postgresql/17/bin/` (or `PG_BIN=` to force `PATH`).
Google Drive folder (one-time setup)
make backup writes into a folder synced by Google Drive for Desktop, reached in WSL
through a symlink named open-navigator-backups in the repo root. Set it up once per machine:
# 1. Google Drive for Desktop must be running on Windows (so H:\My Drive is accessible).
# 2. Mount H: into WSL and make it persist across restarts:
sudo mkdir -p /mnt/h && sudo mount -t drvfs 'H:' /mnt/h
echo 'H: /mnt/h drvfs defaults 0 0' | sudo tee -a /etc/fstab
# 3. Create the Drive folder and link it into the repo (the symlink is gitignored):
mkdir -p "/mnt/h/My Drive/open-navigator-backups"
ln -sfn "/mnt/h/My Drive/open-navigator-backups" open-navigator-backups
# 4. Verify:
test -d open-navigator-backups/ && echo "✅ Drive backup folder ready"
Different Drive letter? Swap `H:` / `/mnt/h`. No Drive for Desktop? Point `BACKUP_DIR` at any folder and sync it with `rclone` instead.
Create a backup
With the Google Drive folder in place, dump both databases with one command. Pick any label: a semver tag for a release, or a date for an ad-hoc snapshot:
make backup VERSION=snapshot-20260609
This stages the dumps on local disk, copies them into
open-navigator-backups/snapshots/snapshot-20260609/ (a non-v label files under
snapshots/; each filename stamped with the version, date, and git SHA), and Google Drive
for Desktop syncs them off-machine automatically. Confirm they landed:
ls open-navigator-backups/snapshots/snapshot-20260609/
# open_navigator_snapshot-20260609_20260609_a1b2c3d.dump
# openstates_snapshot-20260609_20260609_a1b2c3d.dump
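The filenames in the listing follow a `<database>_<label>_<date>_<short SHA>.dump` pattern. The sketch below reconstructs that pattern from the example output only; the real stamping is done by the Makefile, and `dump_filename` plus the 7-character SHA length are assumptions drawn from the sample names.

```python
from datetime import date


def dump_filename(database: str, label: str, day: date, git_sha: str) -> str:
    """Compose a dump name as <database>_<label>_<YYYYMMDD>_<sha7>.dump."""
    return f"{database}_{label}_{day:%Y%m%d}_{git_sha[:7]}.dump"


# Reproduces the two names shown in the listing above.
for database in ("open_navigator", "openstates"):
    print(dump_filename(database, "snapshot-20260609", date(2026, 6, 9), "a1b2c3d"))
```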
Share a snapshot with a collaborator
- Run `make backup VERSION=<label>`.
- At drive.google.com, open the `open-navigator-backups` folder → right-click the `<label>` folder → Share → set "Anyone with the link → Viewer" and copy the link.
- The recipient either shares the same Drive folder and runs `make restore VERSION=<label>`, or downloads the `.dump` and restores it manually (see Restore the Database).
Because the full `make backup` dump contains user accounts, give the link only to trusted collaborators, or share a PII-free `make backup-neon` / `make backup-public` dump instead.
For a tagged release, also push the matching git tag and record the backup link + SHA in the Release History:
git tag -a v1.5.0 -m "feat: add grants.gov opportunities to search" && git push origin v1.5.0
Next Steps
- Configure your `.env` file (see Configuration; only `NEON_DATABASE_URL_DEV` is required)
- Start all three services: `./start-all.sh`
- Open the app at http://localhost:5173
- Check out the interactive API docs: http://localhost:8000/docs
For more details, see the main README.md.