Management Commands Reference
All Django management commands for the Vibemap platform, organized by domain.
# Get help for any command
python manage.py <command> --help
# List all commands with search
python manage.py list_commands
python manage.py list_commands --search "events"
python manage.py list_commands --category "Data Collection"
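When scripting these commands from cron jobs or CI, a thin wrapper that builds the manage.py argv keeps invocations consistent. A minimal sketch; the project path is an assumption, adjust it to your checkout:

```python
import subprocess
from pathlib import Path

PROJECT_ROOT = Path("/srv/vibemap")  # assumption: your project location differs

def manage(*args: str) -> list[str]:
    """Build the argv for a manage.py invocation."""
    return ["python", str(PROJECT_ROOT / "manage.py"), *args]

# Example: a dry-run pipeline invocation (run from inside your virtualenv)
cmd = manage("event_pipeline_run", "--boundary", "Portland, OR", "--dry_run")
# subprocess.run(cmd, check=True)
```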
Pipelines
Vibemap uses multi-step pipelines for event collection and place enrichment. Each step can run independently or be orchestrated by a top-level *_pipeline_run command.
Event Pipeline
Collects events from venue websites and APIs through a discover-scrape-validate-store workflow.
| Step | Command | Status transition |
|---|---|---|
| 0 | event_pipeline_run | Orchestrates all steps |
| 1 | event_pipeline_discover | — → pending |
| 1b | event_pipeline_discover_web | — → pending |
| 2 | event_pipeline_scrape | pending → scraped |
| 3 | event_pipeline_validate | scraped → validated |
| 4 | event_pipeline_store | validated → stored |
| — | event_pipeline_export | Export EventLink records |
| — | event_pipeline_import_urls | Import URLs into pipeline |
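The status column on EventLink drives the pipeline: each step only consumes links in its expected input status and advances them one state. The transitions in the table above can be sketched as a simple state map (an illustration, not the actual implementation):

```python
# EventLink status transitions, one entry per pipeline step.
TRANSITIONS = {
    None: "pending",         # discover: new links enter as pending
    "pending": "scraped",    # scrape
    "scraped": "validated",  # validate
    "validated": "stored",   # store (terminal)
}

def next_status(current):
    """Return the status a link moves to, or None if it is terminal."""
    return TRANSITIONS.get(current)
```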
event_pipeline_run
Unified entry point that runs both API-based and direct-link event scraping strategies.
# Full pipeline for a boundary
python manage.py event_pipeline_run --boundary "Portland, OR"
# API-only strategy with parallel processing
python manage.py event_pipeline_run --strategy api --boundary "Oakland, CA" --parallel
# Links-only, output discovered links to CSV
python manage.py event_pipeline_run --strategy links --discover_only --output_file events.csv
# Dry run with debug logging
python manage.py event_pipeline_run --boundary "San Francisco, CA" --dry_run --debug
| Argument | Description |
|---|---|
| --strategy | both (default), api, or links |
| --venue_search_term | Single search term for filtering venues |
| --venue_search_terms | Multiple search terms (space-separated) |
| --editorial_category | Editorial category filter (default: events_feed) |
| --address | Address filter for venues |
| --boundary | Geographic boundary filter |
| --num_per_batch | Venues per batch |
| --max_workers | Concurrent workers for parallel processing |
| --parallel / --no-parallel | Toggle parallel processing (default: on) |
| --debug | Verbose logging |
| --dry_run | Run without saving |
| --discover_only | Only discover links, don't scrape |
| --output_file | Output path for discovered links (CSV or JSON) |
| --links_file | Input file of links to scrape |
event_pipeline_discover
Crawls venue websites to find event URLs. Saves discovered links as EventLink records with status pending.
# Discover events for venues in a boundary
python manage.py event_pipeline_discover --boundary "Oakland, CA"
# Include DataForSEO API results as fallback
python manage.py event_pipeline_discover --boundary "Oakland, CA" --include_api
# API-only discovery
python manage.py event_pipeline_discover --editorial_category events_feed --api_only
# Output to file without saving to DB
python manage.py event_pipeline_discover --boundary "Oakland, CA" --output_file links.csv --dry_run
| Argument | Description |
|---|---|
| --venue_search_term | Search term for filtering venues |
| --editorial_category | Editorial category (default: events_feed) |
| --num_per_batch | Venues per batch (default: 10) |
| --include_api | Also search via DataForSEO API |
| --api_only | Skip website crawling, use API only |
| --parallel | Enable parallel processing |
| --max_depth | Max crawl depth per site (default: 2) |
| --output_file | Save links to file |
| --progress_csv | Write progress CSV |
| --dry_run | Preview without saving |
event_pipeline_discover_web
Discovers event links by searching the web using DataForSEO. Generates search queries per city and extracts event URLs from results.
# Discover events in Portland
python manage.py event_pipeline_discover_web --city "Portland, OR"
# Multiple cities with zip code expansion
python manage.py event_pipeline_discover_web --city "Oakland, CA" "San Francisco, CA" --search-by-zip
# Custom search queries, save to CSV
python manage.py event_pipeline_discover_web --city "Austin, TX" \
--search-queries "live music Austin" "Austin events this week" \
--csv-file austin_events.csv
# Dry run with city validation
python manage.py event_pipeline_discover_web --city "Seattle, WA" --validate-city --dry-run
| Argument | Description |
|---|---|
| --city | City or cities to search |
| --search-queries | Custom search queries |
| --max-results-per-query | Max search results per query |
| --max-links-per-site | Max event links per discovered site |
| --output-file | JSON output (relative to scrapers/out/) |
| --urls-file | URLs-only text output |
| --csv-file | CSV output with metadata |
| --dry-run | Preview without saving |
| --max-workers | Concurrent workers |
| --search-by-zip | Search across zip codes in the city area |
| --zip-radius | Radius in miles for zip code search (default: 10) |
| --max-zipcodes | Max zip codes to search (default: 20) |
| --validate-city | Validate links are in the target city (slower) |
| --no-enhance-queries | Use exact search queries only |
event_pipeline_scrape
Fetches pending EventLink URLs and extracts structured event data (title, date, description, location, price, images). Stores results as JSON in the EventLink scraped_data field.
# Scrape the next 100 pending links
python manage.py event_pipeline_scrape
# Scrape links for a specific venue
python manage.py event_pipeline_scrape --venue_id <uuid>
# Scrape specific URLs directly (no EventLink records needed)
python manage.py event_pipeline_scrape --urls "https://example.com/events/123" "https://example.com/events/456"
# Multi-event mode for calendar pages
python manage.py event_pipeline_scrape --urls "https://venue.com/calendar" --multi --max_events 20
# Import from CSV and scrape
python manage.py event_pipeline_scrape --input_csv links.csv --create
# Export progress to CSV
python manage.py event_pipeline_scrape --limit 50 --export_csv scraped_results.csv
| Argument | Description |
|---|---|
| --status | Filter by status (default: pending) |
| --limit | Max links to process (default: 100) |
| --venue_id | Filter by venue UUID |
| --venue_ids | Multiple venue UUIDs |
| --timeout | Per-URL timeout in seconds (default: 120) |
| --max_retries | Retry attempts (default: 3) |
| --urls | Scrape specific URLs directly |
| --input_csv | Input CSV with URLs |
| --multi | Extract all events from calendar/listing pages |
| --max_events | Max events per page in multi mode (default: 50) |
| --create | Create EventLink records from scraped data |
| --export_csv | Export results to CSV |
| --progress_csv | Write progress CSV |
| --boundary | Geographic boundary filter |
| --theme | Theme filter |
event_pipeline_validate
Validates scraped event data: checks for duplicates, validates location data, and marks links as validated or skipped.
# Validate all scraped links
python manage.py event_pipeline_validate
# Disable duplicate detection (faster)
python manage.py event_pipeline_validate --no-skip-duplicates
# Validate for a specific venue
python manage.py event_pipeline_validate --venue_id <uuid> --limit 50
# Export validation results
python manage.py event_pipeline_validate --export_csv validation_report.csv
| Argument | Description |
|---|---|
| --status | Filter by status (default: scraped) |
| --limit | Max links to validate (default: 100) |
| --venue_id | Filter by venue UUID |
| --venue_ids | Multiple venue UUIDs |
| --skip_location | Skip location validation |
| --skip_duplicates / --no-skip-duplicates | Toggle duplicate detection (default: on) |
| --export_csv | Export results to CSV |
| --progress_csv | Write progress CSV |
event_pipeline_store
Creates HotspotsEvent database records from validated EventLink data. Downloads and attaches images, verifies persistence.
# Store validated events
python manage.py event_pipeline_store
# Dry run to preview
python manage.py event_pipeline_store --dry_run
# Store without images (faster)
python manage.py event_pipeline_store --no-add-images
# Send email notification after storing
python manage.py event_pipeline_store --send_email
| Argument | Description |
|---|---|
| --status | Filter by status (default: validated) |
| --limit | Max events to store (default: 100) |
| --dry_run | Preview without creating records |
| --add_images / --no-add-images | Toggle image download (default: on) |
| --export_csv | Export results to CSV |
| --progress_csv | Write progress CSV |
| --send_email | Send notification email after storing |
event_pipeline_export
Exports EventLink records to file for analysis or transfer.
# Export all validated links to JSON
python manage.py event_pipeline_export --output events.json --status validated
# Export as CSV with metadata
python manage.py event_pipeline_export --output events.csv --format csv --include_metadata
# Export just URLs
python manage.py event_pipeline_export --output urls.txt --format urls
| Argument | Description |
|---|---|
| --output | Output file path (required) |
| --format | json (default), csv, or urls |
| --status | Filter by status |
| --venue_id | Filter by venue UUID |
| --limit | Max records to export |
| --include_metadata / --no-include-metadata | Include extra metadata (default: on) |
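A CSV export can be consumed with standard tooling; this sketch filters exported rows by status. The column names (url, status) are assumptions — inspect an actual export to confirm them:

```python
import csv
import io

# Synthetic stand-in for an exported CSV; real column names may differ.
sample = "url,status\nhttps://example.com/events/1,validated\nhttps://example.com/events/2,skipped\n"
rows = list(csv.DictReader(io.StringIO(sample)))
validated = [r["url"] for r in rows if r["status"] == "validated"]
```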
event_pipeline_import_urls
Imports event URLs from a list or file and optionally runs them through the pipeline.
# Import URLs from command line
python manage.py event_pipeline_import_urls --urls "https://example.com/event1" "https://example.com/event2"
# Import from file and run full pipeline
python manage.py event_pipeline_import_urls --input_file urls.txt --run_pipeline
# Import and link to a venue
python manage.py event_pipeline_import_urls --input_file urls.txt --venue_id <uuid>
| Argument | Description |
|---|---|
| --urls | Space-separated URLs |
| --input_file | File with one URL per line |
| --venue_id | Associate with a venue |
| --save_links / --no-save-links | Save as EventLink records (default: on) |
| --run_pipeline | Run full pipeline after import |
| --timeout | Per-URL timeout in seconds (default: 60) |
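The --input_file format is one URL per line, so a file can be generated from any list of links, for example:

```python
from pathlib import Path

urls = [
    "https://example.com/event1",
    "https://example.com/event2",
]
# One URL per line, trailing newline included.
Path("urls.txt").write_text("\n".join(urls) + "\n")
```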
Place Pipeline
Enriches place records through a 4-step workflow: import, enrich, categorize, and extract images.
| Step | Command | Status transition |
|---|---|---|
| 0 | place_pipeline_run | Orchestrates all steps |
| 1 | place_pipeline_import | — → imported |
| 2 | place_pipeline_enrich | imported → enriched |
| 3 | place_pipeline_categorize | enriched → categorized |
| 4 | place_pipeline_images | categorized → complete |
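For recurring enrichment, the orchestrator can be scheduled. A crontab sketch; the path, schedule, and log location are assumptions:

```cron
# Nightly place pipeline at 02:00, logging to a file
0 2 * * * cd /srv/vibemap && python manage.py place_pipeline_run --boundary "Oakland, CA" >> /var/log/vibemap/place_pipeline.log 2>&1
```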
place_pipeline_run
Orchestrates the full 4-step place enrichment pipeline.
# Full pipeline for a city
python manage.py place_pipeline_run --city "Oakland" --boundary "Oakland, CA"
# Enrich-only (skip import, start from step 2)
python manage.py place_pipeline_run --strategy enrich_only --boundary "Oakland, CA"
# Import only
python manage.py place_pipeline_run --strategy import_only --city "Portland"
# Dry run with limited scope
python manage.py place_pipeline_run --boundary "Oakland, CA" --limit 50 --dry_run
| Argument | Description |
|---|---|
| --strategy | full (default), enrich_only, or import_only |
| --skip_import | Skip import step |
| --city | City name for Overture Maps import |
| --address | Address filter |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category filter |
| --limit | Max places to process (default: 5000) |
| --max_completeness | Only enrich places below this score (default: 0.6) |
| --dry_run | Preview without changes |
| --parallel | Enable parallel processing (default: on) |
| --max_workers | Concurrent workers |
place_pipeline_import
Step 1: Imports places from Overture Maps and enriches with DataForSEO data. Creates PlaceEnrichmentTask tracking records.
# Import for a city
python manage.py place_pipeline_import --city "Oakland" --boundary "Oakland, CA"
# Skip Overture, only use DataForSEO
python manage.py place_pipeline_import --boundary "Oakland, CA" --skip_overture
# Import specific places
python manage.py place_pipeline_import --place_ids <uuid1> <uuid2>
| Argument | Description |
|---|---|
| --city | City for Overture Maps import |
| --address | Address filter |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category |
| --limit | Max places (default: 5000) |
| --skip_overture | Skip Overture Maps import |
| --skip_datasources | Skip DataForSEO enrichment |
| --dry_run | Preview only |
| --place_ids | Specific place UUIDs |
place_pipeline_enrich
Step 2: Scrapes venue websites using Crawl4AI/Gemini to fill in missing data (hours, phone, description, etc.) and improve completeness scores.
# Enrich imported places with low completeness
python manage.py place_pipeline_enrich --boundary "Oakland, CA" --max_completeness 0.5
# Enrich specific places
python manage.py place_pipeline_enrich --place_ids <uuid1> <uuid2>
# Dry run
python manage.py place_pipeline_enrich --boundary "Oakland, CA" --dry_run
| Argument | Description |
|---|---|
| --status | Task status to process (default: imported) |
| --limit | Max places to enrich (default: 100) |
| --max_completeness | Only enrich below this score (default: 0.6) |
| --address | Address filter |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category |
| --place_ids | Specific place UUIDs |
| --dry_run | Preview only |
place_pipeline_categorize
Step 3: Applies NLP-based category and vibe classification to enriched places.
# Categorize enriched places
python manage.py place_pipeline_categorize --boundary "Oakland, CA"
# Categorize specific places
python manage.py place_pipeline_categorize --place_ids <uuid1> <uuid2>
| Argument | Description |
|---|---|
| --status | Task status to process (default: enriched) |
| --limit | Max places (default: 5000) |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category |
| --dry_run | Preview only |
| --place_ids | Specific place UUIDs |
place_pipeline_images
Step 4: Extracts and scores business images from websites. Uses DataForSEO as fallback. Supports parallel processing.
# Extract images for categorized places
python manage.py place_pipeline_images --boundary "Oakland, CA"
# With parallel processing
python manage.py place_pipeline_images --boundary "Oakland, CA" --parallel --max_workers 8
| Argument | Description |
|---|---|
| --status | Task status to process (default: categorized) |
| --limit | Max places (default: 5000) |
| --parallel | Enable parallel processing (default: on) |
| --max_workers | Concurrent workers (default: 8) |
| --dry_run | Preview only |
| --place_ids | Specific place UUIDs |
Data Collection
scrape_data_for_seo
Bulk import of places and events from the DataForSEO API. Supports searching by category, address, or boundary with fuzzy matching and validation.
# Search for places in a boundary
python manage.py scrape_data_for_seo --search_type places --boundary "Oakland, CA"
# Search events for a specific venue type
python manage.py scrape_data_for_seo --search_type events --venue_search_term "music venues" --address "Portland, OR"
# Search all Google Maps categories for an area
python manage.py scrape_data_for_seo --search_type places --boundary "Tulsa, OK" \
--use_google_maps_categories --target_categories arts_culture,shopping
# Thorough search (don't stop early on existing places)
python manage.py scrape_data_for_seo --search_type places --boundary "Oakland, CA" \
--skip_early_termination --debug
| Argument | Description |
|---|---|
| --search_type | places, events, or all |
| --venue_search_term | Search term for venues |
| --editorial_category | Editorial category filter |
| --address | Address filter |
| --boundary | Geographic boundary |
| --search_all_categories | Search all activity categories |
| --search_all_vibes | Search all vibes |
| --use_google_maps_categories | Use organized Google Maps categories |
| --target_categories | Comma-separated categories |
| --num_per_batch | Venues per batch |
| --skip_early_termination | Don't stop when finding existing places |
| --skip_after_existing_places | Existing-place threshold before stopping (default: 200) |
| --debug | Verbose logging |
| --no_confirm | Skip confirmation prompts |
scrape_places_for_completeness
Scrapes venue websites to fill in missing data for places with low completeness scores. Uses Gemini AI for intelligent extraction and optionally generates AI descriptions.
# Scrape places below 50% completeness in Oakland
python manage.py scrape_places_for_completeness --max-completeness 0.5 --boundary "Oakland, CA"
# Limit to 20 places, ordered by rating count
python manage.py scrape_places_for_completeness --limit 20 --order-by "-aggregate_rating_count"
# Use Claude for description generation instead of Gemini
python manage.py scrape_places_for_completeness --boundary "Oakland, CA" --use-claude-description
# Scrape a single place
python manage.py scrape_places_for_completeness --place <uuid>
# Preview what would be scraped
python manage.py scrape_places_for_completeness --boundary "Oakland, CA" --dry-run
| Argument | Description |
|---|---|
| --max-completeness | Only scrape places below this score (default: 0.6) |
| --min-sources | Minimum data sources required (default: 0) |
| --limit | Max places to process (default: 100) |
| --address | Address filter |
| --boundary | Geographic boundary |
| --editorial_category | Editorial category |
| --place | Single place UUID |
| --use-gemini / --no-gemini | Toggle Gemini AI scraping (default: on) |
| --use-structured / --no-structured | Toggle schema.org extraction (default: on) |
| --generate-ai-description / --no-ai-description | Toggle AI description (default: on) |
| --use-claude-description | Use Claude instead of Gemini for descriptions |
| --order-by | Sort field (default: -aggregate_rating_count) |
| --dry-run | Preview without scraping |
seed_osm_places
Imports place records from OpenStreetMap data within a bounding box.
# Seed places for a bounding box (minx,miny,maxx,maxy)
python manage.py seed_osm_places -b "-122.3,37.7,-122.2,37.8"
| Argument | Description |
|---|---|
| -b / --bounds | Bounding box as minx,miny,maxx,maxy |
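The bounds string must parse to four floats with min < max on each axis (longitude first). A small validator sketch showing the expected ordering:

```python
def parse_bounds(raw: str) -> tuple[float, float, float, float]:
    """Parse 'minx,miny,maxx,maxy' and sanity-check the ordering."""
    minx, miny, maxx, maxy = (float(part) for part in raw.split(","))
    if not (minx < maxx and miny < maxy):
        raise ValueError("bounds must satisfy minx < maxx and miny < maxy")
    return minx, miny, maxx, maxy

bounds = parse_bounds("-122.3,37.7,-122.2,37.8")
```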
discover_vacancies_web
Discovers retail vacancies and commercial real estate listings by searching the web via DataForSEO.
# Find vacancies in a city
python manage.py discover_vacancies_web --city "Oakland, CA"
# Search for specific property types with AI detail extraction
python manage.py discover_vacancies_web --city "Portland, OR" \
--property-type retail --fetch-details
# Multiple cities, output to CSV
python manage.py discover_vacancies_web --city "Oakland, CA" "San Francisco, CA" \
--csv-file vacancies.csv
| Argument | Description |
|---|---|
| --city | City or cities to search |
| --search-queries | Custom search queries |
| --property-type | Property type filter (default: all) |
| --fetch-details | Use AI to extract details from each listing |
| --max-detail-workers | Workers for detail fetching (default: 3) |
| --output-file | JSON output |
| --csv-file | CSV output |
| --dry-run | Preview without saving |
enrich_vacancy_listings
Enriches vacancy listings with property details by scraping each listing page. Tries JSON-LD and HTML selectors first, with AI extraction as a fallback.
# Enrich listings from a CSV
python manage.py enrich_vacancy_listings --input-csv vacancies.csv --output-csv enriched.csv
# With AI extraction enabled
python manage.py enrich_vacancy_listings --input-csv vacancies.csv --output-csv enriched.csv --use-ai
# Limit and control rate
python manage.py enrich_vacancy_listings --input-csv vacancies.csv --output-csv enriched.csv \
--limit 50 --delay 2.0 --max-workers 1
| Argument | Description |
|---|---|
| --input-csv | Input CSV file (required) |
| --output-csv | Output CSV file (required) |
| --processed-file | Track processed URLs for resume |
| --max-workers | Concurrent workers (default: 2) |
| --delay | Delay between requests in seconds (default: 1.0) |
| --limit | Max listings to process |
| --use-ai | Enable AI extraction |
| --skip-scraper | Skip web scraping |
run_scrapper
General-purpose scraper runner for various data sources.
python manage.py run_scrapper
Data Sync
sync_categories
Updates the Category model from the categories YAML file in vibemap-constants.
python manage.py sync_categories
sync_vibes
Synchronizes the Vibe model to match the vibes YAML file. Adds new vibes and removes obsolete ones.
python manage.py sync_vibes
sync_vibes_from_text
Extracts vibes from place descriptions using the Vibemap NLP API and assigns them to places.
python manage.py sync_vibes_from_text
sync_subcategory_parents
Updates parent-child relationships in the subcategory hierarchy from YAML config.
python manage.py sync_subcategory_parents
sync_google_ratings
Updates Google ratings and review data for places via DataForSEO. Uses fuzzy matching to find the best Google Maps match.
python manage.py sync_google_ratings
sync_ratings
Recalculates and synchronizes aggregate ratings across all places.
python manage.py sync_ratings
sync_instagram
Pulls and updates Instagram data (followers, posts, profile info) for places with linked Instagram accounts.
python manage.py sync_instagram
sync_boundaries
Updates geographic boundary data from configured sources.
python manage.py sync_boundaries
sync_simpleview
Syncs place and event data with SimpleView CMS (Peoria.org). Creates and updates listings and events in the WordPress-based CMS.
python manage.py sync_simpleview
sync_mailchimp_cities
Synchronizes city lists with Mailchimp interest categories for email marketing segmentation.
python manage.py sync_mailchimp_cities
Data Quality
check_listings
Validates venue listings for completeness and data quality. Reports on missing fields and data inconsistencies.
# Check all listings
python manage.py check_listings
# Check listings in a specific area
python manage.py check_listings --address "Oakland, CA" --num_rows 100
# Filter by editorial category
python manage.py check_listings --ed_cat "Downtown Tulsa"
| Argument | Description |
|---|---|
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --num_rows | Max rows to check |
| --ed_cat | Editorial category |
| --cat | Category filter |
| --tag | Tag filter |
check_events
Validates recent event data for completeness and sends a quality report.
# Check last 50 events
python manage.py check_events --num_recent 50
# Check events in a city
python manage.py check_events --city "Portland"
| Argument | Description |
|---|---|
| --num_recent | Number of recent events to check |
| --type | Record type |
| --city | City filter |
check_event_details
Validates event detail completeness (descriptions, images, dates, venues).
python manage.py check_event_details --city "Oakland" --num_rows 100
| Argument | Description |
|---|---|
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --cat | Category |
| --tag | Tag |
check_recent
Quick validation of recently added/modified listings and events.
# Check last 20 records
python manage.py check_recent --num_recent 20
# Check recent records in a city
python manage.py check_recent --city "Oakland" --type places
| Argument | Description |
|---|---|
| --num_recent | Number of recent records |
| --type | places or events |
| --city | City filter |
check_quality_gemini
Batch quality assessment of place data using the Gemini API. Sends batches of places to Gemini for data accuracy evaluation.
# Check quality for a boundary
python manage.py check_quality_gemini --boundary "Oakland, CA" --num-rows 20
# Export results to CSV
python manage.py check_quality_gemini --boundary "Oakland, CA" --output-dir ./reports --output-format csv
# Small batches for testing
python manage.py check_quality_gemini --boundary "Oakland, CA" --batch-size 2 --num-rows 5
| Argument | Description |
|---|---|
| --boundary | Boundary name filter |
| --ed-cat | Editorial category |
| --num-rows | Places to check (default: 20) |
| --batch-size | Places per API call, 1-10 (default: 5) |
| --output-dir | Directory for exports |
| --output-format | Output format |
check_django_matches
Verifies consistency between Django model records and external data sources. Useful for auditing CSV imports.
python manage.py check_django_matches --csv data.csv --address "Oakland, CA"
| Argument | Description |
|---|---|
| --csv | CSV file path |
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --cat | Category |
| --tag | Tag |
check_yelp
Validates and enriches place data using Yelp business information. Uses fuzzy matching to find Yelp matches, validates coordinates, and extracts business hours.
python manage.py check_yelp
merge_duplicates
Identifies and merges duplicate place or event records based on name and location similarity.
python manage.py merge_duplicates
merge_duplicate_tags
Consolidates tags that differ only in case (e.g., "Jazz" and "jazz") into a single canonical tag.
# Preview merges
python manage.py merge_duplicate_tags --dry-run
# Merge, preferring lowercase
python manage.py merge_duplicate_tags --prefer-case lower
| Argument | Description |
|---|---|
| --dry-run | Preview without merging |
| --prefer-case | Preferred case for canonical tag |
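Conceptually, the merge groups tags case-insensitively and keeps one canonical spelling per group. An illustrative sketch of that grouping, not the command's actual implementation:

```python
from collections import defaultdict

def plan_merges(tags, prefer_case="lower"):
    """Group tags that differ only in case and map variants to a canonical form."""
    groups = defaultdict(list)
    for tag in tags:
        groups[tag.lower()].append(tag)
    plan = {}
    for lowered, variants in groups.items():
        canonical = lowered if prefer_case == "lower" else sorted(variants)[0]
        for variant in variants:
            if variant != canonical:
                plan[variant] = canonical  # variant -> canonical tag
    return plan

plan = plan_merges(["Jazz", "jazz", "JAZZ"])
```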
Data Updates
update_data_sources
Enriches places with data from external APIs (DataForSEO, Google, etc.). Supports parallel processing with rate limiting.
# Update all places in a boundary
python manage.py update_data_sources --boundary "Oakland, CA"
# Only incomplete places, with parallelism
python manage.py update_data_sources --boundary "Oakland, CA" --incomplete_only --parallel --max_workers 4
# Rate-limited processing
python manage.py update_data_sources --boundary "Portland, OR" --rate_limit 2 --max_retries 3
# Only places with few data sources
python manage.py update_data_sources --boundary "Oakland, CA" --max_data_sources 2
| Argument | Description |
|---|---|
| --address | Address filter |
| --boundary | Boundary filter |
| --ed_cat | Editorial category |
| --num_rows | Max rows |
| --type | Record type |
| --parallel | Enable parallel processing |
| --max_workers | Concurrent workers |
| --rate_limit | Maximum concurrent API calls allowed |
| --max_retries | Retry attempts for failures |
| --max_data_sources | Only process places with ≤ N data sources |
| --incomplete_only | Only places missing key fields |
| --threshold | Completeness threshold |
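Rate limiting here means capping how many API calls are in flight at once, independent of the worker count. Conceptually it is a semaphore around the worker pool, as in this sketch (an illustration of the technique, not the command's code):

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Semaphore

RATE_LIMIT = 2            # mirrors --rate_limit: concurrent API calls allowed
gate = Semaphore(RATE_LIMIT)

def enrich(place):
    with gate:            # at most RATE_LIMIT calls in flight at a time
        return f"enriched:{place}"

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(enrich, ["a", "b", "c", "d"]))
```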
enrich_places_with_ai
Enriches place records using AI-based data analysis and generation.
python manage.py enrich_places_with_ai --queryset "places_needing_enrichment"
| Argument | Description |
|---|---|
| --queryset | Named queryset to process |
update_event_links
Replaces event URLs with affiliate links where applicable.
python manage.py update_event_links
update_events_recurring
Updates scheduling data for recurring events, extending end dates and creating new instances.
python manage.py update_events_recurring
update_timezone
Corrects timezone data for places and events based on their geographic coordinates.
python manage.py update_timezone
update_vibe_list
Migrates vibes from the legacy vibes field to vibe_list and other_vibes fields. Processes in chunks of 1000.
python manage.py update_vibe_list
update_wordpress
Syncs place and event data to WordPress. Handles media attachments and cleans up old events.
python manage.py update_wordpress --type places --address "Oakland, CA"
python manage.py update_wordpress --type events --ed_cat "Downtown Tulsa"
| Argument | Description |
|---|---|
| --type | places or events |
| --city | City filter |
| --address | Address filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --tag | Tag filter |
resave_vibes
Re-triggers the vibe save logic for all places and events, refreshing computed fields.
python manage.py resave_vibes
map_old_categories
Migrates legacy old_categories JSON field data to the MPTT-based category model.
python manage.py map_old_categories
ML & Classification
nlp_categories
Applies NLP-based category classification to places or events using GPT models.
# Categorize places in a boundary
python manage.py nlp_categories --boundary "Oakland, CA" --type places
# Categorize events
python manage.py nlp_categories --type events --ed_cat "events_feed"
| Argument | Description |
|---|---|
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --ed_cat | Editorial category |
| --type | places or events |
predict_place_events
Predicts whether places are likely to host events based on descriptions, categories, and other attributes.
# Predict for places in an area
python manage.py predict_place_events --address "San Francisco, CA" --limit 100
# Check a specific place
python manage.py predict_place_events --place_id <uuid> --verbose
# Update database with predictions
python manage.py predict_place_events --boundary "Oakland, CA" --update --min_confidence 0.5
| Argument | Description |
|---|---|
| --limit | Max places to analyze (default: 100) |
| --address | Address filter |
| --boundary | Boundary filter |
| --place_id | Specific place UUID |
| --min_confidence | Confidence threshold (default: 0.3) |
| --update | Save predictions to database |
| --verbose | Show detailed info |
train_vibe_classifier
Trains external Nyckel ML classifiers for vibe taxonomies. Supports exporting training data and uploading to Nyckel.
# List available taxonomies
python manage.py train_vibe_classifier --list-taxonomies
# Export training data for a taxonomy
python manage.py train_vibe_classifier --taxonomy energy_level --export --output training.csv
# Create and upload a classifier
python manage.py train_vibe_classifier --taxonomy aesthetic_decor --create --name "Aesthetic Classifier"
python manage.py train_vibe_classifier --taxonomy aesthetic_decor --upload --function-id <nyckel-id>
# Preview without uploading
python manage.py train_vibe_classifier --taxonomy energy_level --upload --dry-run
| Argument | Description |
|---|---|
| --taxonomy | Taxonomy name (e.g., energy_level, aesthetic_decor) |
| --list-taxonomies | List all available taxonomies |
| --info | Show taxonomy details (requires --taxonomy) |
| --export | Export training data to CSV |
| --output | Output file path for CSV |
| --limit-per-vibe | Max training samples per vibe (default: 50) |
| --create | Create a new Nyckel classifier |
| --name | Classifier name |
| --upload | Upload training samples to Nyckel |
| --function-id | Nyckel function ID |
| --min-text-length | Min text length for samples (default: 200) |
| --dry-run | Preview without changes |
analyze_vibe_clusters
Clusters vibes using embeddings to identify redundancies and relationships. Supports multiple dimensionality reduction methods.
# Basic clustering with visualization
python manage.py analyze_vibe_clusters --visualize
# UMAP method with custom cluster count
python manage.py analyze_vibe_clusters --method umap --clusters 30 --visualize
# Export to custom directory
python manage.py analyze_vibe_clusters --output-dir ./analysis/vibes
| Argument | Description |
|---|---|
| --clusters | Number of clusters (default: 50) |
| --visualize | Generate visualization plots |
| --method | tsne (default), umap, or pca |
| --min-cluster-size | Min cluster size to report (default: 2) |
| --output-dir | Output directory (default: ./vibe_analysis) |
Import & Export
import_overture_places
Imports places from the Overture Maps dataset within a bounding box. Supports validation, GeoJSON export, and batch processing.
# Import for a predefined city
python manage.py import_overture_places --city Oakland
# Import with custom bounding box
python manage.py import_overture_places --bbox "-122.3,37.7,-122.2,37.8"
# Preview and export to GeoJSON
python manage.py import_overture_places --city Oakland --dry-run --export-geojson oakland.geojson
# Filter by category with confidence threshold
python manage.py import_overture_places --city Oakland --category restaurant --min-confidence 0.7
# List available predefined cities
python manage.py import_overture_places --list-cities
# Validate imported data against CSV
python manage.py import_overture_places --city Oakland --validate-csv --match-threshold 0.8
| Argument | Description |
|---|---|
--bbox | Bounding box as west,south,east,north |
--city | Predefined city name |
--min-confidence | Confidence threshold 0-1 (default: 0.5) |
--category | Filter by primary category |
--dry-run | Preview without saving |
--limit | Max places to process |
--export-geojson | Export results to GeoJSON |
--skip-existing | Skip existing places |
--list-cities | List predefined city bounding boxes |
--batch-size | Processing batch size |
--validate-csv | Validate against CSV |
--match-threshold | Match threshold for validation |
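The `--bbox` format above ("west,south,east,north") can be parsed and sanity-checked in a few lines. This is an illustrative sketch, not the command's actual implementation; the function name and error messages are invented:

```python
def parse_bbox(value):
    """Parse 'west,south,east,north' into floats, validating ranges."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 4:
        raise ValueError("bbox needs exactly four comma-separated values")
    west, south, east, north = map(float, parts)
    # Longitude runs west->east, latitude south->north
    if not (-180 <= west < east <= 180):
        raise ValueError("longitudes must satisfy -180 <= west < east <= 180")
    if not (-90 <= south < north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    return west, south, east, north

print(parse_bbox("-122.3,37.7,-122.2,37.8"))  # → (-122.3, 37.7, -122.2, 37.8)
```

The same validation catches the common mistake of passing coordinates in lat,lon order, since swapped values fall outside the latitude range.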
import_addresses
Bulk imports and geocodes address data from a CSV file. Includes a persistent cache for geocoding results.
# Import from CSV
python manage.py import_addresses --csv addresses.csv --address "Oakland, CA"
# Check cache status
python manage.py import_addresses --cache-info
# Fresh import (skip cache)
python manage.py import_addresses --csv addresses.csv --skip-cache
| Argument | Description |
|---|---|
--csv | CSV file path |
--type | Record type |
--city | City filter |
--address | Address filter |
--boundary | Boundary filter |
--num_rows | Max rows |
--ed_cat | Editorial category |
--cat | Category |
--tag | Tag |
--clear-cache | Clear geocoding cache |
--skip-cache | Start fresh without cache |
--cache-info | Show cache info and exit |
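The persistent geocoding cache can be sketched as a small JSON-backed lookup keyed by address string. The file name and layout below are assumptions for illustration; the command's real cache format may differ:

```python
import json
import tempfile
from pathlib import Path

class GeocodeCache:
    """Toy persistent cache: address -> [lat, lon], stored as JSON."""

    def __init__(self, path):
        self.path = Path(path)
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def get(self, address):
        return self.data.get(address)

    def set(self, address, lat, lon):
        self.data[address] = [lat, lon]
        self.path.write_text(json.dumps(self.data))  # persist on every write

cache = GeocodeCache(Path(tempfile.gettempdir()) / "geocode_cache_demo.json")
cache.set("Oakland, CA", 37.8044, -122.2712)
print(cache.get("Oakland, CA"))
```

A cache like this is what makes `--skip-cache` and `--clear-cache` meaningful: repeated imports of the same CSV avoid re-geocoding addresses already resolved.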
import_frontend_themes
Imports hardcoded frontend theme definitions into the Django Theme model.
# Preview what would be imported
python manage.py import_frontend_themes --dry-run
# Import themes
python manage.py import_frontend_themes
| Argument | Description |
|---|---|
--dry-run | Preview without saving |
airtable_export
Exports place records to an Airtable base for external collaboration.
python manage.py airtable_export -c "Oakland" -t "Places"
| Argument | Description |
|---|---|
-c / --city | City to export |
-t / --table-name | Airtable table name |
Images
get_business_images
Scrapes business images from venue websites. Supports parallel fetching.
# Fetch images for places in a boundary
python manage.py get_business_images --boundary "Oakland, CA" --parallel --max_workers 8
# Limit scope
python manage.py get_business_images --address "Portland, OR" --num_rows 50
| Argument | Description |
|---|---|
--type | Record type |
--city | City filter |
--address | Address filter |
--boundary | Boundary filter |
--num_rows | Max rows |
--ed_cat | Editorial category |
--cat | Category |
--tag | Tag |
--parallel | Enable parallel fetching |
--max_workers | Concurrent workers (default: 8) |
generate_blurhashes
Backfills blurhash placeholder values for images created before the auto-generation signal was added.
# Backfill all missing blurhashes
python manage.py generate_blurhashes
# Only for events, in batches
python manage.py generate_blurhashes --type events --batch-size 50
# Preview
python manage.py generate_blurhashes --dry-run
| Argument | Description |
|---|---|
--type | events, places, or all (default: all) |
--batch-size | Processing batch size (default: 100) |
--limit | Max images |
--dry-run | Preview without changes |
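The `--batch-size` behavior can be sketched as a simple chunked iterator; a backfill command presumably pages through its image queryset the same way to bound memory use. This is a generic pattern sketch, not the command's code:

```python
from itertools import islice

def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

image_ids = range(1, 8)
for batch in batched(image_ids, 3):
    print(batch)
# [1, 2, 3]
# [4, 5, 6]
# [7]
```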
User & Admin
list_boundaries
Lists available geographic boundaries for use in pipeline filtering and Dagster configuration.
# List all boundaries
python manage.py list_boundaries
# Filter by type and admin level
python manage.py list_boundaries --type official --admin-level city
# JSON output for scripting
python manage.py list_boundaries --format json
| Argument | Description |
|---|---|
--type | early, official, or hidden |
--admin-level | neighborhood, district, city, state, or county |
--format | table (default), list, or json |
manage_tag_types
Manages tag type classifications. Tags can be typed as category, geography, or customer.
# View statistics
python manage.py manage_tag_types --stats
# List untyped tags
python manage.py manage_tag_types --list-untyped
# Set a tag type
python manage.py manage_tag_types --set san-francisco geography
# Bulk import from CSV
python manage.py manage_tag_types --bulk-import tag_types.csv
# Export current types
python manage.py manage_tag_types --export tag_types_export.csv
| Argument | Description |
|---|---|
--list | List all tags with types |
--list-untyped | List tags without a type |
--set | Set type: --set TAG_SLUG TYPE |
--bulk-import | Import from CSV |
--export | Export to CSV |
--stats | Show type statistics |
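A plausible shape for the `--bulk-import` CSV is one slug and one type per row, validated against the three known types. The column names here are guesses; check the command's `--help` for the authoritative format:

```python
import csv
import io

# Hypothetical CSV layout for tag_types.csv
SAMPLE = "slug,type\nsan-francisco,geography\nrestaurants,category\n"
VALID_TYPES = {"category", "geography", "customer"}

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    # Reject unknown types before touching the database
    assert row["type"] in VALID_TYPES, f"unknown type: {row['type']}"
print(rows)
```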
create_default_membership_tiers
Creates default membership tier records if they don't already exist.
python manage.py create_default_membership_tiers
badge_report
Generates a report on badge usage across users.
python manage.py badge_report
pull_user_vibes
Imports user-generated vibe selections and preferences.
python manage.py pull_user_vibes
zoho
Zoho CRM integration (experimental).
python manage.py zoho
Testing & Monitoring
run_tests
Executes the API test suite against live endpoints.
python manage.py run_tests
sentry_cron_monitor
Reports cron job health to Sentry for monitoring scheduled tasks.
python manage.py sentry_cron_monitor
Utilities
list_commands
Lists and searches all management commands with descriptions and arguments.
# List all commands
python manage.py list_commands
# Search by keyword
python manage.py list_commands --search "events"
# Filter by category
python manage.py list_commands --category "Data Collection"
# Generate COMMANDS.md
python manage.py list_commands --generate-docs
# Verbose output with all arguments
python manage.py list_commands --verbose --markdown
| Argument | Description |
|---|---|
--verbose | Show all arguments |
--markdown / -m | Output as markdown |
--generate-docs | Generate docs/COMMANDS.md |
--category | Filter by category name |
--search | Search commands by name or description |
export_docs
Exports backend documentation as Docusaurus-compatible Markdown. Converts RST to MD, adds frontmatter, sanitizes for MDX, and generates OpenAPI schema.
# Build docs to default output dir
python manage.py export_docs
# Custom output with clean build
python manage.py export_docs --output-dir /tmp/docs --clean
# Preview without writing
python manage.py export_docs --dry-run
# Skip OpenAPI generation (faster)
python manage.py export_docs --skip-openapi
| Argument | Description |
|---|---|
--output-dir | Output directory (default: docs/build/docusaurus) |
--clean | Remove output dir before building |
--skip-openapi | Skip OpenAPI schema generation |
--dry-run | List files without writing |
extract_event_keywords
Extracts and counts keywords from event names and descriptions, for use in training classification models.
# Top 100 keywords from all events
python manage.py extract_event_keywords
# Export keywords for a boundary
python manage.py extract_event_keywords --boundary "Oakland, CA" --export keywords.csv
# Only upcoming events, minimum 10 occurrences
python manage.py extract_event_keywords --upcoming-only --min-count 10 --top 50
| Argument | Description |
|---|---|
--limit | Max events to process |
--min-count | Min keyword occurrences (default: 5) |
--min-length | Min keyword length (default: 3) |
--top | Top N keywords (default: 100) |
--export | Export to CSV |
--boundary | Boundary filter |
--upcoming-only | Only upcoming events |
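The `--min-count`, `--min-length`, and `--top` thresholds map naturally onto `collections.Counter`. A rough sketch (the real command's tokenization is surely more sophisticated than this regex):

```python
import re
from collections import Counter

def top_keywords(texts, min_count=1, min_length=3, top=100):
    counts = Counter(
        word
        for text in texts
        for word in re.findall(r"[a-z]+", text.lower())
        if len(word) >= min_length
    )
    # most_common() sorts by frequency; then drop anything under min_count
    return [(w, c) for w, c in counts.most_common(top) if c >= min_count]

events = ["Jazz Night at the Lake", "Lake Jazz Festival", "Art Walk"]
print(top_keywords(events, min_count=2, top=5))  # → [('jazz', 2), ('lake', 2)]
```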
misc_commands
Collection of miscellaneous maintenance utilities.
python manage.py misc_commands
logging_config
Displays and validates the current logging configuration.
python manage.py logging_config
ETL Commands
Commands in the etl app for data import, migration, and synchronization.
import_airtable
Imports place data from Airtable tables (East Bay Express, Shopkeepers, Timeout, Michelin, Zomato, Aggregate Ratings).
# Import from a specific table
python manage.py import_airtable -t "East Bay Express"
# With custom API key
python manage.py import_airtable -t "Michelin" -k "keyXXXXXX"
| Argument | Description |
|---|---|
-t / --table-name | Airtable table name (required) |
-k / --key | Airtable API key (default: env var) |
import_foursquare
Imports places from a Foursquare CSV export.
# Import from default CSV
python manage.py import_foursquare
# Import from custom file
python manage.py import_foursquare -f data/foursquare_export.csv
| Argument | Description |
|---|---|
-f / --file | CSV file path (default: etl/scrapers/data/foursquare_with_vibes.csv) |
import_osm_overpass
Imports places from Overpass API GeoJSON files. Maps OSM tags to Vibemap place types.
python manage.py import_osm_overpass -f overpass_export.geojson
| Argument | Description |
|---|---|
-f / --file | GeoJSON file path (required) |
--debug | Debug mode |
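Mapping OSM tags to place types can be sketched as a lookup on key/value pairs. The actual Vibemap mapping table is internal to the command, so these entries are illustrative only:

```python
# Hypothetical subset of an OSM-tag -> place-type mapping
OSM_TO_PLACE_TYPE = {
    ("amenity", "cafe"): "cafe",
    ("amenity", "restaurant"): "restaurant",
    ("leisure", "park"): "park",
}

def place_type_for(tags):
    """Return the first place type matched by a feature's OSM tags."""
    for key, value in tags.items():
        if (key, value) in OSM_TO_PLACE_TYPE:
            return OSM_TO_PLACE_TYPE[(key, value)]
    return None

print(place_type_for({"amenity": "cafe", "name": "Blue Bottle"}))  # → cafe
```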
import_osm_imposm
Imports places from PostGIS tables populated by the imposm tool.
python manage.py import_osm_imposm
scrapy_crawl
Runs Scrapy spiders from within the Django context.
python manage.py scrapy_crawl
backfill_api_records
Creates ETL records from user-created or staff-modified API records.
# Backfill all records
python manage.py backfill_api_records
# Clean and rebuild
python manage.py backfill_api_records --clean
# Only staff-modified records
python manage.py backfill_api_records --modified --limit 100
| Argument | Description |
|---|---|
-c / --clean | Clean all ETL records and rebuild |
-l / --limit | Max records to process |
--modified | Only staff-modified records |
add_association_places
Imports places from member organization CSVs stored in Azure Blob Storage.
python manage.py add_association_places
migrate_mongo
Migrates events and places from a legacy MongoDB database to PostGIS.
python manage.py migrate_mongo --batch-size 500
| Argument | Description |
|---|---|
-b / --batch-size | Records per batch (default: 1000) |
migrate_images_azure_imagekit
Migrates images from Azure CDN to ImageKit.
python manage.py migrate_images_azure_imagekit
store_imagekit_name
Backfills imagekit_url and thumbnail_url for existing images by querying the ImageKit API.
python manage.py store_imagekit_name
sync_spotify_artists
Finds and links Spotify profiles for performer records using fuzzy matching.
# Sync all unlinked performers
python manage.py sync_spotify_artists
# Sync a specific performer
python manage.py sync_spotify_artists --performer-id <id>
# Preview without saving
python manage.py sync_spotify_artists --dry-run --limit 20
| Argument | Description |
|---|---|
--client-id | Spotify client ID |
--client-secret | Spotify client secret |
--limit | Max performers (default: 50) |
--performer-id | Specific performer ID |
--dry-run | Preview without saving |
--force | Re-sync already linked performers |
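Fuzzy name matching like this can be sketched with the standard library's `difflib`; the command may well use a different similarity library and threshold, but the core idea is the same — pick the best-scoring candidate above a cutoff:

```python
from difflib import SequenceMatcher

def best_match(name, candidates, cutoff=0.8):
    """Return the candidate most similar to `name`, or None if all score below cutoff."""
    scored = [
        (SequenceMatcher(None, name.lower(), c.lower()).ratio(), c)
        for c in candidates
    ]
    score, match = max(scored)
    return match if score >= cutoff else None

artists = ["Khruangbin", "Kruangbin Tribute", "Kraftwerk"]
print(best_match("khruangbin", artists))  # → Khruangbin
```

A cutoff around 0.8 is a common starting point; too low and tribute acts or similarly named artists get linked, too high and minor spelling variants are missed.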
sync_boundary_population
Fetches population data from GeoNames and US Census APIs for boundary records.
# Sync city populations
python manage.py sync_boundary_population --admin-level city
# Dry run for a specific boundary
python manage.py sync_boundary_population --boundary-id <id> --dry-run
# Force update existing data
python manage.py sync_boundary_population --admin-level city --force
| Argument | Description |
|---|---|
--admin-level | Filter by admin level |
--boundary-id | Specific boundary ID |
--limit | Max boundaries (default: 100) |
--force | Overwrite existing population data |
--dry-run | Preview without saving |
--geonames-username | GeoNames API username |
--census-api-key | US Census API key |
--country | Country code (default: US) |
load_taginfo
Loads OpenStreetMap tag metadata from a taginfo SQLite database into OSMMapFeature records.
python manage.py load_taginfo -d taginfo.db
| Argument | Description |
|---|---|
-d / --db | Path to taginfo SQLite database (required) |
load_utm_zones
Loads UTM zone boundaries from a shapefile.
python manage.py load_utm_zones
create_imposm_mapping
Generates an imposm3 mapping YAML file from the OSM map_features config.
python manage.py create_imposm_mapping -o mapping.yml
| Argument | Description |
|---|---|
-o / --outfile | Output YAML path (default: etl/config/imposm_mapping.yml) |
osmium_import
Experimental: parses OSM PBF files using PyOsmium. Not fully implemented for database import.
python manage.py osmium_import -f region.osm.pbf
| Argument | Description |
|---|---|
-f / --pbf-file | PBF file path (required) |
Other App Commands
sync_wrike_organizations (accounts)
Matches Wrike project folders to OrganizationProfile records using fuzzy matching. Can create missing profiles.
# Preview matches
python manage.py sync_wrike_organizations --dry-run
# Create missing organization profiles
python manage.py sync_wrike_organizations --create-missing
# Custom customers folder
python manage.py sync_wrike_organizations --customers-folder "FOLDER_ID"
| Argument | Description |
|---|---|
--dry-run | Preview without changes |
--create-missing | Create OrganizationProfiles for unmatched Wrike folders |
--customers-folder | Wrike folder ID (default: MQAAAABm87lD) |
sync_scraped_events (search_indexes)
Finds events added by scrapers that haven't been indexed in Elasticsearch and triggers indexing.
# Sync recent scraped events
python manage.py sync_scraped_events
# Include past events
python manage.py sync_scraped_events --all_events
# Preview what would be synced
python manage.py sync_scraped_events --dry-run
| Argument | Description |
|---|---|
--dry-run | Preview without indexing |
--all_events | Include past events |
Common Patterns
Geographic Filtering
Most commands support filtering by location:
# By address (geocoded at runtime)
python manage.py check_listings --address "San Francisco, CA"
# By boundary (pre-defined geographic region)
python manage.py update_data_sources --boundary "Oakland, CA"
# By city name
python manage.py sync_google_ratings --city "Oakland"
# By bounding box
python manage.py seed_osm_places -b "-122.3,37.7,-122.2,37.8"
Limiting and Pagination
Control how many records to process:
python manage.py check_events --num_recent 100
python manage.py predict_place_events --limit 50
python manage.py scrape_places_for_completeness --limit 20
Dry Runs
Preview changes before committing:
python manage.py merge_duplicates --dry-run
python manage.py merge_duplicate_tags --dry-run
python manage.py import_overture_places --city Oakland --dry-run
python manage.py place_pipeline_run --boundary "Oakland, CA" --dry_run
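The dry-run convention boils down to: compute the change set, report it, and only apply it when the flag is off. A generic pattern sketch (not code from any specific command):

```python
def merge_duplicates(records, dry_run=False):
    """Detect case-insensitive name duplicates; apply merges unless dry_run."""
    seen, changes = {}, []
    for rec in records:
        key = rec["name"].lower()
        if key in seen:
            changes.append(f"would merge {rec['id']} into {seen[key]}")
        else:
            seen[key] = rec["id"]
    if not dry_run:
        pass  # a real command would apply the merges here
    return changes

records = [{"id": 1, "name": "Cafe X"}, {"id": 2, "name": "cafe x"}]
print(merge_duplicates(records, dry_run=True))  # → ['would merge 2 into 1']
```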
Parallel Processing
Speed up batch operations:
python manage.py update_data_sources --parallel --max_workers 4
python manage.py get_business_images --parallel --max_workers 8
python manage.py event_pipeline_run --parallel --max_workers 6
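The `--parallel` / `--max_workers` flags most likely map onto a worker pool. A minimal sketch with `concurrent.futures` (the `fetch` function here is a stand-in for real network I/O such as image or listing requests):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    return f"fetched {url}"  # stand-in for an HTTP request

urls = [f"https://example.com/venue/{i}" for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map preserves input order even though work runs concurrently
    results = list(pool.map(fetch, urls))
print(results)
```

Threads suit I/O-bound work like scraping; for CPU-bound steps a process pool would be the analogous choice.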
Category and Tag Filtering
Filter by editorial category or tag:
python manage.py check_listings --ed_cat "Downtown Tulsa"
python manage.py scrape_data_for_seo --editorial_category "restaurants"
python manage.py misc_commands --tag "union_square"
Naming Conventions
| Prefix | Purpose |
|---|---|
check_* | Validation and quality control |
sync_* | External data synchronization |
import_* | Batch data imports |
scrape_* | Web scraping operations |
update_* | Data maintenance and updates |
predict_* | ML predictions |
*_pipeline_* | Multi-step workflow commands |
merge_* | Deduplication and consolidation |
train_* | ML model training |