eBoard Platform Manual Download Guide

Issue: Incapsula Bot Protection

eBoard Solutions (https://simbli.eboardsolutions.com) uses Incapsula anti-bot protection that blocks automated scraping, even with advanced tools like Playwright. The platform requires manual interaction to access meeting documents.

Affected School Districts

| District (AL) | jurisdiction_id | Public hub / board page | Simbli agendas & minutes |
|---|---|---|---|
| Tuscaloosa City School District | school_district_0103360 | Board of Education (Finalsite) | Simbli meeting listing S=2088 · index s=2088 |
| Tuscaloosa County School District (TCSS) | school_district_0103390 | Board of Education (Finalsite; links to Simbli) | Simbli meeting listing S=2092 · index s=2092 |

The curated hub and Simbli URLs for Tuscaloosa city (school_district_0103360) and county (school_district_0103390) live in the dbt seed jurisdiction_website_url_overrides.csv. NCES does not emit Simbli links, so URLs matching %simbli% appear only via these overrides, and only after running dbt seed followed by dbt run --select int_jurisdiction_websites.

Query from Postgres

After dbt seed and dbt run --select int_jurisdiction_websites, use intermediate.int_jurisdiction_websites (see dbt_project/models/intermediate/int_jurisdiction_websites.sql; public.* may be wrong or stale).

Tuscaloosa County School District (hub + Simbli):

SELECT
    jurisdiction_id,
    organization_name,
    website_source,
    website_url
FROM intermediate.int_jurisdiction_websites
WHERE jurisdiction_id = 'school_district_0103390'
  AND (
      website_url ILIKE '%tcss.net%'
      OR website_url ILIKE '%simbli.eboardsolutions.com%'
  )
ORDER BY website_url;

Tuscaloosa City School District (board hub + Simbli):

SELECT
    jurisdiction_id,
    organization_name,
    website_source,
    website_url
FROM intermediate.int_jurisdiction_websites
WHERE jurisdiction_id = 'school_district_0103360'
  AND (
      website_url ILIKE '%tuscaloosacityschools.com%'
      OR website_url ILIKE '%simbli.eboardsolutions.com%'
  )
ORDER BY website_url;

Simbli URLs only (Tuscaloosa City). Prefer trim(website_url) and path patterns so the query still matches Simbli if the host string differs slightly:

SELECT
    jurisdiction_id,
    organization_name,
    website_source,
    trim(website_url) AS website_url
FROM intermediate.int_jurisdiction_websites
WHERE jurisdiction_id = 'school_district_0103360'
  AND (
      trim(website_url) ILIKE '%simbli%'
      OR trim(website_url) ILIKE '%SB_MeetingListing.aspx%'
      OR trim(website_url) ILIKE '%/SB_Meetings/%'
  )
ORDER BY website_url;

If this returns nothing, confirm the seed row exists and rebuild int_jurisdiction_websites (dbt seed + dbt run).
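When rows have already been pulled into Python, the same matching can be approximated client-side. The following is a minimal sketch, not part of the project: the function name is_simbli_url is hypothetical, and the regular expressions only mirror the three ILIKE patterns in the query above.

```python
import re

# Case-insensitive patterns mirroring the ILIKE clauses in the SQL query.
# Note: in SQL LIKE/ILIKE an underscore matches any single character;
# here the underscores are matched literally, so this check is slightly stricter.
_SIMBLI_PATTERNS = (
    re.compile(r"simbli", re.IGNORECASE),
    re.compile(r"SB_MeetingListing\.aspx", re.IGNORECASE),
    re.compile(r"/SB_Meetings/", re.IGNORECASE),
)


def is_simbli_url(website_url):
    """Return True if the trimmed URL matches any of the Simbli patterns."""
    if not website_url:
        return False
    # str.strip() removes all surrounding whitespace; SQL trim() removes spaces only.
    url = website_url.strip()
    return any(pattern.search(url) for pattern in _SIMBLI_PATTERNS)
```

A helper like this could be used to filter a list of website_url values fetched from int_jurisdiction_websites without issuing a second query.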

Both districts (debug):

SELECT jurisdiction_id, organization_name, website_source, website_url
FROM intermediate.int_jurisdiction_websites
WHERE jurisdiction_id IN ('school_district_0103360', 'school_district_0103390')
ORDER BY jurisdiction_id, website_url;

Seed table only:

SELECT jurisdiction_id, website_url
FROM seeds.jurisdiction_website_url_overrides
WHERE jurisdiction_id IN ('school_district_0103360', 'school_district_0103390')
ORDER BY jurisdiction_id, website_url;
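If Postgres is not available, the seed CSV can be inspected directly. This is a rough sketch under two assumptions: the CSV has at least the jurisdiction_id and website_url columns used in the query above, and the caller supplies the file contents (the location of the CSV inside the dbt project is not stated here). The function name seed_rows_for is hypothetical.

```python
import csv
import io

# The two Tuscaloosa school district IDs used throughout this guide.
TUSCALOOSA_IDS = frozenset({"school_district_0103360", "school_district_0103390"})


def seed_rows_for(csv_text, jurisdiction_ids=TUSCALOOSA_IDS):
    """Return sorted (jurisdiction_id, website_url) pairs from the seed CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (row["jurisdiction_id"], row["website_url"].strip())
        for row in reader
        if row["jurisdiction_id"] in jurisdiction_ids
    ]
    # Sorting tuples orders by jurisdiction_id, then website_url, like the SQL.
    return sorted(rows)
```

Typical use would be to read jurisdiction_website_url_overrides.csv with Path(...).read_text() and pass the result in; an empty list suggests the override rows are missing from the seed itself rather than from the model.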

Manual Download Steps

1. Access Meeting Listings

  1. Visit the meetings URL above in your browser
  2. You'll see a calendar or list of board meetings
  3. Each meeting shows the date and has document links

2. Download Documents

For each meeting:

  • Click on the meeting date to view details
  • Look for:
    • Agenda (usually PDF)
    • Minutes (usually PDF)
    • Packets (supporting materials)
  • Right-click each document → "Save As"

3. Organize Downloads

Save files using this naming pattern:

tuscaloosa_city_schools_YYYY-MM-DD_agenda.pdf
tuscaloosa_city_schools_YYYY-MM-DD_minutes.pdf
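To keep names consistent across many downloads, the pattern can be generated rather than typed. This is a small sketch; the helper name manual_filename and the district_slug parameter are hypothetical, and the only format it encodes is the one shown above.

```python
from datetime import date


def manual_filename(district_slug, meeting_date, doc_type):
    """Build '<district_slug>_YYYY-MM-DD_<doc_type>.pdf' for a downloaded file."""
    doc_type = doc_type.strip().lower()
    # Underscores separate the fields, so the document type must not contain one.
    if not doc_type or "_" in doc_type:
        raise ValueError(f"doc_type must be a single word, got {doc_type!r}")
    return f"{district_slug}_{meeting_date.isoformat()}_{doc_type}.pdf"
```

For example, manual_filename("tuscaloosa_city_schools", date(2024, 1, 9), "agenda") yields a name in the agenda form listed above.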

4. Import into System

Once downloaded, you can import them manually:

import asyncio
import hashlib
from pathlib import Path

from agents.scraper import ScraperAgent
from pipeline.delta_lake import DeltaLakePipeline


async def import_manual_pdfs(pdf_directory: str):
    """Import manually downloaded PDFs into the system."""
    documents = []
    scraper = ScraperAgent()
    async with scraper:
        for pdf_path in sorted(Path(pdf_directory).glob("*.pdf")):
            # Extract content from PDF
            content = await scraper._scrape_pdf_document(str(pdf_path))
            if not content:
                continue

            # Parse the filename from the right, because the district prefix
            # itself contains underscores:
            #   tuscaloosa_city_schools_YYYY-MM-DD_agenda
            parts = pdf_path.stem.rsplit('_', 2)
            date_str = parts[-2] if len(parts) == 3 else ""
            doc_type = parts[-1] if len(parts) == 3 else "document"

            doc = {
                'document_id': hashlib.md5(str(pdf_path).encode()).hexdigest(),
                'source_url': pdf_path.resolve().as_uri(),
                'municipality': 'Tuscaloosa City Schools',
                'state': 'AL',
                'meeting_date': date_str,
                'meeting_type': 'Board Meeting',
                'title': pdf_path.stem,
                'content': content,
                'metadata': {
                    'source': 'manual_download',
                    'platform': 'eboard',
                    'doc_type': doc_type,
                },
            }
            documents.append(doc)

    # Write to Delta Lake
    pipeline = DeltaLakePipeline()
    pipeline.write_raw_documents(documents)

    return documents

# Usage:
# asyncio.run(import_manual_pdfs('/path/to/downloaded/pdfs'))
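Because the import derives the meeting date from the filename, it can help to check the download folder before importing. The sketch below is one possible pre-flight check, not part of the pipeline; the regular expression is an assumption based on the naming pattern in step 3, and the names follows_naming_pattern and split_by_naming are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed pattern: <district_slug>_YYYY-MM-DD_<doc_type> (without the .pdf suffix).
FILENAME_RE = re.compile(
    r"^(?P<district>[a-z0-9_]+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<doc_type>[a-z]+)$"
)


def follows_naming_pattern(stem):
    """Return True if the stem matches the pattern and the date is a real date."""
    match = FILENAME_RE.match(stem)
    if not match:
        return False
    try:
        datetime.strptime(match.group("date"), "%Y-%m-%d")
    except ValueError:
        return False
    return True


def split_by_naming(pdf_directory):
    """Return (matching, nonmatching) lists of PDF filenames in the directory."""
    matching, nonmatching = [], []
    for pdf_path in sorted(Path(pdf_directory).glob("*.pdf")):
        target = matching if follows_naming_pattern(pdf_path.stem) else nonmatching
        target.append(pdf_path.name)
    return matching, nonmatching
```

Files in the second list would be imported with an empty or incorrect meeting date, so renaming them first is usually the cheaper fix.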

Alternative: RSS Feeds

Some eBoard installations offer RSS feeds or calendar exports:

  1. Look for RSS icon on meetings page
  2. Look for "Subscribe" or "Export to Calendar" options
  3. These may bypass the web interface restrictions
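If a district does expose a feed, its document links could be extracted with the standard library. This is only a sketch: whether either Tuscaloosa Simbli site offers RSS has not been verified here, the feed is assumed to be plain RSS 2.0 with item, title, link, and optional enclosure elements, and pdf_links_from_rss is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def pdf_links_from_rss(rss_xml):
    """Return (item title, url) pairs for links in the feed that point at PDFs."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        # Consider both the item link and any enclosure URLs.
        candidates = [item.findtext("link") or ""]
        candidates += [enc.get("url", "") for enc in item.findall("enclosure")]
        for url in candidates:
            url = url.strip()
            # Ignore any query string when checking the file extension.
            if url.lower().split("?")[0].endswith(".pdf"):
                links.append((title, url))
    return links
```

The feed text itself would still need to be saved from a normal browser session, since fetching it programmatically may run into the same protection as the meeting pages.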

Future Enhancement Ideas

  1. Browser Extension: Create a Chrome extension that scrapes while you browse
  2. API Discovery: Research if eBoard has any undocumented APIs
  3. Selenium Grid: Use residential proxy services for more sophisticated bot evasion
  4. Contact District: Request bulk export of meeting documents directly

Why Automation Fails

eBoard's Incapsula protection includes:

  • Browser fingerprinting (detects headless browsers)
  • IP reputation checking
  • JavaScript challenges (requires full browser execution)
  • Session tracking (blocks rapid sequential requests)
  • Rate limiting per IP address

Even with Playwright running in visible mode, subsequent page navigations get blocked once the system detects automated patterns.
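When a scraper does get blocked, it is useful to recognize the block page instead of storing it as a document. The heuristic below is a sketch under loud assumptions: the marker strings are ones commonly seen on Incapsula block and challenge pages in general and have not been confirmed against Simbli responses, and the length threshold is arbitrary. The name looks_like_incapsula_block is hypothetical.

```python
# Assumed marker on Incapsula "Request unsuccessful" pages (not verified for Simbli).
_INCIDENT_MARKER = "incapsula incident id"
# Assumed marker for the challenge script; it may also appear on normal pages,
# so it only counts when the page is nearly empty.
_RESOURCE_MARKER = "_incapsula_resource"
# Arbitrary cut-off for "nearly empty" responses, in characters.
_SHORT_PAGE_CHARS = 2000


def looks_like_incapsula_block(html):
    """Heuristically decide whether an HTML response is a block/challenge page."""
    lowered = html.lower()
    if _INCIDENT_MARKER in lowered:
        return True
    return _RESOURCE_MARKER in lowered and len(html) < _SHORT_PAGE_CHARS
```

A positive result is a reasonable signal to stop the run and fall back to the manual download steps rather than retrying, since repeated requests are likely to reinforce the block.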

For comprehensive school district data:

  1. Prioritize: Focus on city government data (working well)
  2. Manual collection: Download key school board meetings manually
  3. Selective import: Import only the most relevant documents
  4. Direct contact: Reach out to school district IT for data sharing agreement

Status

  • ✅ Tuscaloosa City Government: Automated scraping works (SuiteOne Media platform)
  • ❌ Tuscaloosa City Schools: Manual download required (eBoard + Incapsula)
  • ❌ Tuscaloosa County Schools: Manual download required (eBoard + Incapsula)