peer-recovery-events

Peer Recovery Events

A small static site plus a scheduled Python job that aggregates Massachusetts peer recovery events from selected Facebook pages. It publishes a searchable event list, an RSS feed, and CSV/DOCX exports, keeps history in Supabase, and can optionally sync events to Google Calendar.

MVP scope

Not in the first pass: user accounts, a database, payments, real-time scraping, full-text reposting of Facebook descriptions, or automatic deletion of stale calendar events.

Architecture

GitHub Pages serves the public frontend. The “backend” is a scheduled GitHub Action that runs the Python pipeline once a day, which keeps hosting simple and cheap. Each run proceeds as follows:

  1. GitHub Actions runs python -m peer_recovery_events.run.
  2. The job calls Bright Data when BRIGHT_DATA_API_KEY is configured.
  3. The job writes fresh JSON, CSV, DOCX, and RSS files into docs/data/.
  4. If Supabase secrets are configured, the job upserts the current event rows and appends observation rows.
  5. If Google Calendar secrets are configured, the job upserts future events into that calendar.
  6. The static site in docs/ reads docs/data/events.json.
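
The optional steps above can be pictured as a small planning function that decides what a run will do from the environment. This is a minimal sketch, not the project's actual code: the step names and plan_run are hypothetical, and GOOGLE_APPLICATION_CREDENTIALS stands in for whichever Google Calendar secrets the project really checks.

```python
def plan_run(env):
    """Return the ordered step names for one pipeline run (hypothetical names)."""
    steps = []
    # Scrape only when the Bright Data key is configured; otherwise use sample data.
    if env.get("BRIGHT_DATA_API_KEY"):
        steps.append("fetch_bright_data")
    else:
        steps.append("load_sample_data")
    # The exports in docs/data/ are always written; the site reads them.
    steps.append("write_exports")
    # Supabase sync is optional and needs both secrets.
    if env.get("SUPABASE_URL") and env.get("SUPABASE_SERVICE_ROLE_KEY"):
        steps.append("sync_supabase")
    # Assumption: the presence of this variable signals that calendar sync is wanted.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        steps.append("sync_google_calendar")
    return steps
```

Passing os.environ gives the plan for the current shell. With no secrets set, only the sample-data and export steps run, which matches the point that the website never depends on Supabase or Google Calendar.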

The generated JSON file remains the public read path. Supabase is used for history and analysis and is not required for the website to load. That gives you a cheap static site now and a data warehouse for future questions, such as which event types are common, where gaps exist, and which events get engagement.

Local run

python3 -m venv .venv
. .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env
direnv allow
python -m peer_recovery_events.run --sources config/sources.toml --output-dir docs/data
python3 -m http.server 8000 -d docs

Then open http://localhost:8000.

We use direnv locally: .envrc loads .env, and .env stays ignored by git. If you do not use direnv, export the same variables manually before running the pipeline. Without API secrets, the pipeline uses data/sample_bright_data_facebook_events.json so the UI has realistic development data.
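
The sample-data fallback described above could look roughly like this. It is a sketch under assumptions: load_raw_events and its fetch callback are hypothetical, and the sample file is assumed to hold a JSON array of raw event objects.

```python
import json
import os
from pathlib import Path

SAMPLE_PATH = Path("data/sample_bright_data_facebook_events.json")

def load_raw_events(fetch, env=None, sample_path=SAMPLE_PATH):
    """Return raw events from fetch(api_key) when a key is set, else from the sample file.

    fetch is a hypothetical callable wrapping the real Bright Data request.
    """
    env = os.environ if env is None else env
    api_key = env.get("BRIGHT_DATA_API_KEY")
    if api_key:
        return fetch(api_key)
    # No secret available: fall back to the bundled sample so the UI still has data.
    with open(sample_path, encoding="utf-8") as handle:
        return json.load(handle)
```

Keeping the network call behind a callback means the fallback path can be exercised locally without any credentials.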

Configuration

Edit config/sources.toml and set enabled = false only for pages you want to pause. All launch sources are enabled by default.

The Bright Data request uses:

GitHub Action secrets:

You can also use GOOGLE_APPLICATION_CREDENTIALS locally to point at a JSON key file.

To enable Supabase, run supabase/schema.sql in the Supabase SQL editor, then add SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY locally and as GitHub Action secrets. The two tables are peer_recovery_events for the latest event state and peer_recovery_event_observations for one observation per event per refresh.
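
The split between the two tables can be sketched as a pure function that turns one refresh into rows for each. The column names (event_id, last_seen_at, observed_at, interested_count) are assumptions for illustration; supabase/schema.sql is the authority. Sending the rows to Supabase is deliberately left out.

```python
from datetime import datetime, timezone

def build_supabase_rows(events, observed_at=None):
    """Return (event_rows, observation_rows) for one refresh.

    event_rows would be upserted into peer_recovery_events (latest state);
    observation_rows would be appended to peer_recovery_event_observations.
    """
    observed_at = observed_at or datetime.now(timezone.utc)
    stamp = observed_at.isoformat()
    latest = {}
    for event in events:
        # Later duplicates win, so each event_id appears once per refresh.
        latest[event["event_id"]] = {**event, "last_seen_at": stamp}
    event_rows = list(latest.values())
    observation_rows = [
        {
            "event_id": event_id,
            "observed_at": stamp,
            "interested_count": row.get("interested_count"),
        }
        for event_id, row in latest.items()
    ]
    return event_rows, observation_rows
```

Because observations are only ever appended, the second table accumulates a time series per event, while the first always reflects the most recent scrape.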

Bright Data’s current Facebook Events docs show detailed fields such as url, event_id, name, description, start_date, end_date, location, address, organizer, organizer_url, attendees_count, interested_count, is_online, ticket_url, cover_photo, and category. Public samples also show a leaner shape with title, event_date, people_responded, and main_image. The normalizer accepts both.
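
A normalizer that accepts both shapes can be as simple as trying the detailed field name first and the lean one second. The sketch below is illustrative only: the output keys and the choice to map people_responded onto the same slot as interested_count are assumptions, not the project's actual mapping.

```python
def first_present(raw, *keys):
    """Return the first non-empty value among keys, or None."""
    for key in keys:
        value = raw.get(key)
        if value is not None and value != "":
            return value
    return None

def normalize_event(raw):
    """Map a detailed or lean Bright Data record onto one common shape."""
    return {
        "title": first_present(raw, "name", "title"),
        "start": first_present(raw, "start_date", "event_date"),
        "image": first_present(raw, "cover_photo", "main_image"),
        # Assumption: any response count is good enough for a single number.
        "responses": first_present(
            raw, "interested_count", "attendees_count", "people_responded"
        ),
        "url": raw.get("url"),
    }
```

Fields that neither shape supplies come back as None, so downstream exports can decide how to render gaps.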

For Google Calendar, the lowest-friction setup is usually to create a dedicated Google Calendar, share it with the service account’s email address with the “Make changes to events” permission, and then make the calendar public if you want people to subscribe. Domain-wide delegation is only needed for impersonating users or managing Workspace domain calendars.
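
To make the calendar sync concrete, here is one way a normalized event might be turned into a Google Calendar event body. This is a sketch under assumptions: the input keys (title, start, end, url), the one-hour default duration, and the idea of deriving a stable id from the event URL are all hypothetical and not confirmed by this project. The id, summary, start, end, and source fields themselves are part of the Calendar API event resource, and a SHA-1 hex digest fits the allowed id characters.

```python
import hashlib
from datetime import datetime, timedelta, timezone

def calendar_event_body(event, now=None):
    """Return a Calendar API event body, or None for events that already started.

    Expects ISO 8601 timestamps with a UTC offset, e.g. 2030-01-01T18:00:00-05:00.
    """
    now = now or datetime.now(timezone.utc)
    start = datetime.fromisoformat(event["start"])
    if start <= now:
        return None  # only future events are synced
    if event.get("end"):
        end = datetime.fromisoformat(event["end"])
    else:
        end = start + timedelta(hours=1)  # assumed default duration
    return {
        # Same URL gives the same id, so repeated runs can update instead of duplicating.
        "id": hashlib.sha1(event["url"].encode("utf-8")).hexdigest(),
        "summary": event["title"],
        "start": {"dateTime": start.isoformat()},
        "end": {"dateTime": end.isoformat()},
        "source": {"title": "Facebook event", "url": event["url"]},
    }
```

A stable id is one common way to make a sync idempotent; the actual upsert call against the Calendar API is left out here.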

This is not legal advice. Publicly visible event data can still be subject to Facebook’s terms, Bright Data’s terms, copyright, and organizer expectations. The implementation intentionally stores a short excerpt, links back to the Facebook event/page, and does not try to replace the source page. Before broad launch, review the exact sources, get permission from organizations where practical, honor takedown requests quickly, and keep source attribution prominent.
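
The "short excerpt" idea can be illustrated with a small helper that collapses whitespace and cuts at a word boundary. The function name and the 280-character limit are assumptions for this sketch, not the project's actual values.

```python
def short_excerpt(description, limit=280):
    """Return description trimmed to at most limit characters plus an ellipsis."""
    text = " ".join((description or "").split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    # Cut at the last full word inside the limit, then tidy trailing punctuation.
    cut = text[:limit].rsplit(" ", 1)[0]
    return cut.rstrip(" ,.;:") + "..."
```

Pairing the excerpt with a link back to the Facebook event keeps the source page as the place to read the full description.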

Tests

PYTHONPATH=src python3 -m unittest discover -s tests