File Migration to Events Naming Convention

This guide shows how to use the migration script to rename old meeting/contact files to the new events_ naming convention.

Quick Start

# 1. Dry run to see what would be renamed (safe, no changes)
python scripts/migrate_to_events_naming.py --dry-run

# 2. Perform the migration WITH backups (recommended)
python scripts/migrate_to_events_naming.py

# 3. Skip backups if you already have external backups
python scripts/migrate_to_events_naming.py --no-backup

# 4. Clean up backup directories after verifying migration
python scripts/migrate_to_events_naming.py --cleanup-backups

Migration Map

The script automatically renames:

Old Name                             New Name
meetings.parquet                     events.parquet
meetings_calendar.parquet            events.parquet
meetings_transcripts.parquet         event_documents.parquet
meetings_topics.parquet              event_agenda_items.parquet
meetings_demographics.parquet        event_participants.parquet
meetings_decisions.parquet           event_bills.parquet
contacts_meeting_attendance.parquet  event_participants.parquet
events_events.parquet                events.parquet
events_event_documents.parquet       event_documents.parquet
events_event_participants.parquet    event_participants.parquet
events_event_agenda_items.parquet    event_agenda_items.parquet
events_event_bills.parquet           event_bills.parquet
events_event_media.parquet           event_media.parquet
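
The table can be read as a plain lookup from old file name to new file name. The sketch below is not the script's actual code: `RENAME_MAP` and `plan_renames` are hypothetical names used only for illustration. It also makes one property of the table visible: several old names share a target (for example, both meetings.parquet and meetings_calendar.parquet map to events.parquet), so two such files in the same directory cannot both be renamed.

```python
from collections import defaultdict

# Old name -> new name, copied from the migration map table.
RENAME_MAP = {
    "meetings.parquet": "events.parquet",
    "meetings_calendar.parquet": "events.parquet",
    "meetings_transcripts.parquet": "event_documents.parquet",
    "meetings_topics.parquet": "event_agenda_items.parquet",
    "meetings_demographics.parquet": "event_participants.parquet",
    "meetings_decisions.parquet": "event_bills.parquet",
    "contacts_meeting_attendance.parquet": "event_participants.parquet",
    "events_events.parquet": "events.parquet",
    "events_event_documents.parquet": "event_documents.parquet",
    "events_event_participants.parquet": "event_participants.parquet",
    "events_event_agenda_items.parquet": "event_agenda_items.parquet",
    "events_event_bills.parquet": "event_bills.parquet",
    "events_event_media.parquet": "event_media.parquet",
}


def plan_renames(filenames):
    """Split the file names of ONE directory into safe renames and conflicts.

    Returns (renames, conflicts): renames maps old name -> new name, and
    conflicts maps a new name -> the old names competing for it.
    """
    present = set(filenames)
    by_target = defaultdict(list)
    for name in sorted(present):
        if name in RENAME_MAP:
            by_target[RENAME_MAP[name]].append(name)

    renames, conflicts = {}, {}
    for target, sources in by_target.items():
        # Conflict: two sources want the same target, or the target
        # already exists in the directory.
        if len(sources) > 1 or target in present:
            conflicts[target] = sources
        else:
            renames[sources[0]] = target
    return renames, conflicts
```

Calling `plan_renames(["meetings.parquet", "meetings_calendar.parquet"])` in this sketch reports both files as a conflict on events.parquet rather than picking one. How the real script handles that case is not stated in this guide.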

Options

--dry-run

Show what would be renamed without making changes:

python scripts/migrate_to_events_naming.py --dry-run

--no-backup

Skip creating backups (NOT recommended unless you have external backups):

python scripts/migrate_to_events_naming.py --no-backup

--cleanup-backups

Remove all .migration_backup/ directories after verifying the migration:

# Dry run to see what would be deleted
python scripts/migrate_to_events_naming.py --cleanup-backups --dry-run

# Actually delete backups (will prompt for confirmation)
python scripts/migrate_to_events_naming.py --cleanup-backups
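
Before confirming the deletion, it can help to see which backup directories exist and how much space they hold. The following is a minimal read-only sketch that relies only on the .migration_backup directory name given in this guide. `find_backup_dirs` is a hypothetical helper, not part of the migration script.

```python
from pathlib import Path


def find_backup_dirs(root):
    """Return (directory, size_in_bytes) for every .migration_backup under root."""
    results = []
    for backup_dir in sorted(Path(root).rglob(".migration_backup")):
        if not backup_dir.is_dir():
            continue
        # Sum the sizes of all regular files inside the backup directory.
        size = sum(p.stat().st_size for p in backup_dir.rglob("*") if p.is_file())
        results.append((backup_dir, size))
    return results
```

For example, `for d, size in find_backup_dirs("data/gold"): print(d, size)` lists each backup directory with its total size without deleting anything.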

--directory

Specify a different directory to scan (default: data/gold):

python scripts/migrate_to_events_naming.py --directory data/gold/states/AL

Safe Migration Process

  1. Verify current files:

    find data/gold -name "*.parquet" -type f | sort
  2. Run dry-run to preview changes:

    python scripts/migrate_to_events_naming.py --dry-run
  3. Perform migration with backups:

    python scripts/migrate_to_events_naming.py

    This creates backups in .migration_backup/ directories (automatically gitignored).

  4. Verify the migration worked:

    # Check new files exist (events.parquet, event_*.parquet), ignoring backups
    find data/gold -name "event*.parquet" -not -path "*/.migration_backup/*" -type f | sort

    # Check the API still works
    cd api && uvicorn main:app --reload
  5. Clean up backups (after verification):

    python scripts/migrate_to_events_naming.py --cleanup-backups
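
The verification step can also be approached from the other direction: rather than looking for new files, confirm that no old-named files remain. The sketch below assumes the old names are exactly those in the Migration Map. `OLD_NAMES` and `find_leftovers` are hypothetical names, not part of the script.

```python
from pathlib import Path

# Old file names from the migration map; none should remain after migrating.
OLD_NAMES = {
    "meetings.parquet",
    "meetings_calendar.parquet",
    "meetings_transcripts.parquet",
    "meetings_topics.parquet",
    "meetings_demographics.parquet",
    "meetings_decisions.parquet",
    "contacts_meeting_attendance.parquet",
    "events_events.parquet",
    "events_event_documents.parquet",
    "events_event_participants.parquet",
    "events_event_agenda_items.parquet",
    "events_event_bills.parquet",
    "events_event_media.parquet",
}


def find_leftovers(root):
    """Return old-named Parquet files still present under root, excluding backups."""
    return sorted(
        p
        for p in Path(root).rglob("*.parquet")
        if p.name in OLD_NAMES and ".migration_backup" not in p.parts
    )
```

An empty list from `find_leftovers("data/gold")` suggests every matching file was renamed. A non-empty list points to files the script skipped, for instance because the target already existed.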

Backup Location

Backups are stored in .migration_backup/ directories next to the original files:

data/gold/states/AL/
├── events.parquet                            # New file (renamed from meetings.parquet)
└── .migration_backup/
    └── meetings_20260429_153022.parquet      # Backup with timestamp

These directories are automatically ignored by git (see .gitignore).
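
The backup name in the example follows the pattern original stem, underscore, date, underscore, time, then the original extension. The sketch below reproduces that pattern under the assumption that the timestamp is formatted as `%Y%m%d_%H%M%S`, which matches the example but is not confirmed by the script itself. `backup_path` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path

BACKUP_DIR_NAME = ".migration_backup"


def backup_path(original, now=None):
    """Build <dir>/.migration_backup/<stem>_<YYYYMMDD_HHMMSS><suffix> for a file."""
    original = Path(original)
    # Assumed timestamp format, inferred from the example file name.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup_name = f"{original.stem}_{stamp}{original.suffix}"
    return original.parent / BACKUP_DIR_NAME / backup_name
```

Because the timestamp is part of the name, repeated runs produce distinct backup files instead of overwriting earlier ones.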

Troubleshooting

"Target already exists"

If a file with the new name already exists, the script skips that rename. Resolve the conflict manually:

# Option 1: Delete the old file if new one is correct
rm data/gold/states/AL/meetings.parquet

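# Option 2: Compare the two files before deciding whether to merge
# (substitute the real paths for old.parquet and new.parquet; prints True if the data is identical)
python -c "import pandas as pd; print(pd.read_parquet('old.parquet').equals(pd.read_parquet('new.parquet')))"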
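
A cheaper first check than loading both files with pandas is to compare their bytes. The sketch below uses only the standard library. `file_digest` and `same_bytes` are hypothetical helper names. Identical digests mean the old file is an exact copy and can be deleted. Different digests do not prove the data differs, because two Parquet files holding the same rows can differ in metadata or compression, so the pandas comparison above remains the deciding check in that case.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(old_path, new_path):
    """True only if both files are byte-for-byte identical."""
    return file_digest(old_path) == file_digest(new_path)
```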

"No files found"

If the script finds no files to rename, one of the following applies:

  • Files are already using new naming ✅
  • You're scanning the wrong directory (use --directory)
  • Files don't match the expected names

Reverting Migration

If you need to revert (and backups still exist):

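# Restore from backups manually
cd data/gold/states/AL/.migration_backup
for f in *.parquet; do
  # Strip the _YYYYMMDD_HHMMSS timestamp to recover the original file name
  original=$(printf '%s\n' "$f" | sed 's/_[0-9]\{8\}_[0-9]\{6\}//')
  cp "$f" "../$original"
done
# Note: this loop only copies the old files back; the new-named files are left in place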
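
The same restore can be sketched in Python, which makes it easier to preview the result before copying anything. This is an illustration, not a feature of the migration script: `restore_backups` is a hypothetical helper, and the `_YYYYMMDD_HHMMSS` suffix pattern is taken from the backup example in this guide. If several backups of one file exist, they are processed in sorted order, so the most recent timestamp is copied last and wins.

```python
import re
import shutil
from pathlib import Path

# Matches the _YYYYMMDD_HHMMSS suffix directly before the .parquet extension.
TIMESTAMP = re.compile(r"_\d{8}_\d{6}(?=\.parquet$)")


def restore_backups(backup_dir, dry_run=True):
    """Copy each timestamped backup to its original name in the parent directory.

    Returns the list of (backup, destination) pairs. With dry_run=True,
    nothing is copied.
    """
    backup_dir = Path(backup_dir)
    plan = []
    for backup in sorted(backup_dir.glob("*.parquet")):
        name, count = TIMESTAMP.subn("", backup.name)
        if count == 0:
            continue  # Not a timestamped backup; leave it alone.
        destination = backup_dir.parent / name
        if not dry_run:
            shutil.copy2(backup, destination)
        plan.append((backup, destination))
    return plan
```

Calling `restore_backups("data/gold/states/AL/.migration_backup")` returns the planned copies. Passing `dry_run=False` performs them. As with the shell loop, the new-named files are not removed.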