
Management Commands Reference

All Django management commands for the Vibemap platform, organized by domain.

# Get help for any command
python manage.py <command> --help

# List all commands with search
python manage.py list_commands
python manage.py list_commands --search "events"
python manage.py list_commands --category "Data Collection"

Pipelines

Vibemap uses multi-step pipelines for event collection and place enrichment. Each step can run independently or be orchestrated by a top-level *_pipeline_run command.

Event Pipeline

Collects events from venue websites and APIs through a discover-scrape-validate-store workflow.

| Step | Command | Status transition |
| --- | --- | --- |
| 0 | event_pipeline_run | Orchestrates all steps |
| 1 | event_pipeline_discover | — → pending |
| 1b | event_pipeline_discover_web | — → pending |
| 2 | event_pipeline_scrape | pending → scraped |
| 3 | event_pipeline_validate | scraped → validated |
| 4 | event_pipeline_store | validated → stored |
| — | event_pipeline_export | Export EventLink records |
| — | event_pipeline_import_urls | Import URLs into pipeline |
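The status flow in the table above can be sketched as a tiny state machine. This is an illustrative model only, not the pipeline's actual implementation; the real commands enforce these transitions internally.

```python
# Illustrative sketch of the EventLink status flow; None represents
# "no record yet" before the discover step creates one.
TRANSITIONS = {
    None: "pending",         # discover: no record -> pending
    "pending": "scraped",    # scrape
    "scraped": "validated",  # validate
    "validated": "stored",   # store
}

def next_status(current):
    """Return the status the next pipeline step would assign."""
    if current not in TRANSITIONS:
        raise ValueError(f"no further transition from {current!r}")
    return TRANSITIONS[current]
```

Running a later step only picks up links already in the matching status, which is why each command defaults its --status filter to the previous step's output.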

event_pipeline_run

Unified entry point that runs both API-based and direct-link event scraping strategies.

# Full pipeline for a boundary
python manage.py event_pipeline_run --boundary "Portland, OR"

# API-only strategy with parallel processing
python manage.py event_pipeline_run --strategy api --boundary "Oakland, CA" --parallel

# Links-only, output discovered links to CSV
python manage.py event_pipeline_run --strategy links --discover_only --output_file events.csv

# Dry run with debug logging
python manage.py event_pipeline_run --boundary "San Francisco, CA" --dry_run --debug
| Argument | Description |
| --- | --- |
| --strategy | both (default), api, or links |
| --venue_search_term | Single search term for filtering venues |
| --venue_search_terms | Multiple search terms (space-separated) |
| --editorial_category | Editorial category filter (default: events_feed) |
| --address | Address filter for venues |
| --boundary | Geographic boundary filter |
| --num_per_batch | Venues per batch |
| --max_workers | Concurrent workers for parallel processing |
| --parallel / --no-parallel | Toggle parallel processing (default: on) |
| --debug | Verbose logging |
| --dry_run | Run without saving |
| --discover_only | Only discover links, don't scrape |
| --output_file | Output path for discovered links (CSV or JSON) |
| --links_file | Input file of links to scrape |

event_pipeline_discover

Crawls venue websites to find event URLs. Saves discovered links as EventLink records with status pending.

# Discover events for venues in a boundary
python manage.py event_pipeline_discover --boundary "Oakland, CA"

# Include DataForSEO API results as fallback
python manage.py event_pipeline_discover --boundary "Oakland, CA" --include_api

# API-only discovery
python manage.py event_pipeline_discover --editorial_category events_feed --api_only

# Output to file without saving to DB
python manage.py event_pipeline_discover --boundary "Oakland, CA" --output_file links.csv --dry_run
| Argument | Description |
| --- | --- |
| --venue_search_term | Search term for filtering venues |
| --editorial_category | Editorial category (default: events_feed) |
| --num_per_batch | Venues per batch (default: 10) |
| --include_api | Also search via DataForSEO API |
| --api_only | Skip website crawling, use API only |
| --parallel | Enable parallel processing |
| --max_depth | Max crawl depth per site (default: 2) |
| --output_file | Save links to file |
| --progress_csv | Write progress CSV |
| --dry_run | Preview without saving |

event_pipeline_discover_web

Discovers event links by searching the web using DataForSEO. Generates search queries per city and extracts event URLs from results.

# Discover events in Portland
python manage.py event_pipeline_discover_web --city "Portland, OR"

# Multiple cities with zip code expansion
python manage.py event_pipeline_discover_web --city "Oakland, CA" "San Francisco, CA" --search-by-zip

# Custom search queries, save to CSV
python manage.py event_pipeline_discover_web --city "Austin, TX" \
--search-queries "live music Austin" "Austin events this week" \
--csv-file austin_events.csv

# Dry run with city validation
python manage.py event_pipeline_discover_web --city "Seattle, WA" --validate-city --dry-run
| Argument | Description |
| --- | --- |
| --city | City or cities to search |
| --search-queries | Custom search queries |
| --max-results-per-query | Max search results per query |
| --max-links-per-site | Max event links per discovered site |
| --output-file | JSON output (relative to scrapers/out/) |
| --urls-file | URLs-only text output |
| --csv-file | CSV output with metadata |
| --dry-run | Preview without saving |
| --max-workers | Concurrent workers |
| --search-by-zip | Search across zip codes in the city area |
| --zip-radius | Radius in miles for zip code search (default: 10) |
| --max-zipcodes | Max zip codes to search (default: 20) |
| --validate-city | Validate links are in the target city (slower) |
| --no-enhance-queries | Use exact search queries only |
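Per-city query generation might work roughly as follows. The templates here are hypothetical; the command's real query enhancement logic is not documented above.

```python
# Hypothetical query templates -- illustrative only, not the command's
# actual search-query generation.
DEFAULT_TEMPLATES = [
    "events in {city} this week",
    "live music {city}",
    "{city} concerts and shows",
]

def build_queries(cities, templates=None):
    """Expand each city into a set of search queries."""
    templates = templates or DEFAULT_TEMPLATES
    return [t.format(city=c) for c in cities for t in templates]
```

Passing --search-queries together with --no-enhance-queries would then skip the template expansion and send your queries verbatim.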

event_pipeline_scrape

Fetches pending EventLink URLs and extracts structured event data (title, date, description, location, price, images). Stores results as JSON in the EventLink scraped_data field.

# Scrape the next 100 pending links
python manage.py event_pipeline_scrape

# Scrape links for a specific venue
python manage.py event_pipeline_scrape --venue_id <uuid>

# Scrape specific URLs directly (no EventLink records needed)
python manage.py event_pipeline_scrape --urls "https://example.com/events/123" "https://example.com/events/456"

# Multi-event mode for calendar pages
python manage.py event_pipeline_scrape --urls "https://venue.com/calendar" --multi --max_events 20

# Import from CSV and scrape
python manage.py event_pipeline_scrape --input_csv links.csv --create

# Export progress to CSV
python manage.py event_pipeline_scrape --limit 50 --export_csv scraped_results.csv
| Argument | Description |
| --- | --- |
| --status | Filter by status (default: pending) |
| --limit | Max links to process (default: 100) |
| --venue_id | Filter by venue UUID |
| --venue_ids | Multiple venue UUIDs |
| --timeout | Per-URL timeout in seconds (default: 120) |
| --max_retries | Retry attempts (default: 3) |
| --urls | Scrape specific URLs directly |
| --input_csv | Input CSV with URLs |
| --multi | Extract all events from calendar/listing pages |
| --max_events | Max events per page in multi mode (default: 50) |
| --create | Create EventLink records from scraped data |
| --export_csv | Export results to CSV |
| --progress_csv | Write progress CSV |
| --boundary | Geographic boundary filter |
| --theme | Theme filter |
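A downstream consumer of scraped_data needs at least a title and start date before validation is worthwhile. The field names below are illustrative assumptions, not the command's actual JSON schema.

```python
# Hypothetical minimum-fields check on an EventLink.scraped_data payload.
# "title" and "start_date" are assumed field names for illustration.
REQUIRED = {"title", "start_date"}

def has_minimum_fields(scraped: dict) -> bool:
    """True when the fields needed for validation are present and non-empty."""
    return REQUIRED.issubset(k for k, v in scraped.items() if v)
```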

event_pipeline_validate

Validates scraped event data: checks for duplicates, validates location data, and marks links as validated or skipped.

# Validate all scraped links
python manage.py event_pipeline_validate

# Disable duplicate checking (faster)
python manage.py event_pipeline_validate --no-skip-duplicates

# Validate for a specific venue
python manage.py event_pipeline_validate --venue_id <uuid> --limit 50

# Export validation results
python manage.py event_pipeline_validate --export_csv validation_report.csv
| Argument | Description |
| --- | --- |
| --status | Filter by status (default: scraped) |
| --limit | Max links to validate (default: 100) |
| --venue_id | Filter by venue UUID |
| --venue_ids | Multiple venue UUIDs |
| --skip_location | Skip location validation |
| --skip_duplicates / --no-skip-duplicates | Toggle duplicate detection (default: on) |
| --export_csv | Export results to CSV |
| --progress_csv | Write progress CSV |
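One plausible way to detect duplicates is to key events on a normalized title plus start date, as sketched below. This is an assumption about the approach, not the command's exact logic.

```python
import re
import unicodedata

def dedup_key(title: str, start_date: str) -> tuple:
    """Normalize title + date into a comparison key (illustrative)."""
    t = unicodedata.normalize("NFKD", title).lower()
    t = re.sub(r"[^a-z0-9 ]+", "", t)      # drop punctuation
    t = re.sub(r"\s+", " ", t).strip()     # collapse whitespace
    return (t, start_date)

def find_duplicates(events):
    """Return events whose key was already seen earlier in the list."""
    seen, dups = set(), []
    for e in events:
        key = dedup_key(e["title"], e["start_date"])
        if key in seen:
            dups.append(e)
        else:
            seen.add(key)
    return dups
```

Under this scheme "Jazz Night!" and "jazz night" on the same date collapse to one key, which is why disabling the check with --no-skip-duplicates is faster but risks storing the same event twice.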

event_pipeline_store

Creates HotspotsEvent database records from validated EventLink data. Downloads and attaches images, verifies persistence.

# Store validated events
python manage.py event_pipeline_store

# Dry run to preview
python manage.py event_pipeline_store --dry_run

# Store without images (faster)
python manage.py event_pipeline_store --no-add-images

# Send email notification after storing
python manage.py event_pipeline_store --send_email
| Argument | Description |
| --- | --- |
| --status | Filter by status (default: validated) |
| --limit | Max events to store (default: 100) |
| --dry_run | Preview without creating records |
| --add_images / --no-add-images | Toggle image download (default: on) |
| --export_csv | Export results to CSV |
| --progress_csv | Write progress CSV |
| --send_email | Send notification email after storing |

event_pipeline_export

Exports EventLink records to file for analysis or transfer.

# Export all validated links to JSON
python manage.py event_pipeline_export --output events.json --status validated

# Export as CSV with metadata
python manage.py event_pipeline_export --output events.csv --format csv --include_metadata

# Export just URLs
python manage.py event_pipeline_export --output urls.txt --format urls
| Argument | Description |
| --- | --- |
| --output | Output file path (required) |
| --format | json (default), csv, or urls |
| --status | Filter by status |
| --venue_id | Filter by venue UUID |
| --limit | Max records to export |
| --include_metadata / --no-include-metadata | Include extra metadata (default: on) |

event_pipeline_import_urls

Imports event URLs from a list or file and optionally runs them through the pipeline.

# Import URLs from command line
python manage.py event_pipeline_import_urls --urls "https://example.com/event1" "https://example.com/event2"

# Import from file and run full pipeline
python manage.py event_pipeline_import_urls --input_file urls.txt --run_pipeline

# Import and link to a venue
python manage.py event_pipeline_import_urls --input_file urls.txt --venue_id <uuid>
| Argument | Description |
| --- | --- |
| --urls | Space-separated URLs |
| --input_file | File with one URL per line |
| --venue_id | Associate with a venue |
| --save_links / --no-save-links | Save as EventLink records (default: on) |
| --run_pipeline | Run full pipeline after import |
| --timeout | Per-URL timeout in seconds (default: 60) |

Place Pipeline

Enriches place records through a 4-step workflow: import, enrich, categorize, and extract images.

| Step | Command | Status transition |
| --- | --- | --- |
| 0 | place_pipeline_run | Orchestrates all steps |
| 1 | place_pipeline_import | — → imported |
| 2 | place_pipeline_enrich | imported → enriched |
| 3 | place_pipeline_categorize | enriched → categorized |
| 4 | place_pipeline_images | categorized → complete |

place_pipeline_run

Orchestrates the full 4-step place enrichment pipeline.

# Full pipeline for a city
python manage.py place_pipeline_run --city "Oakland" --boundary "Oakland, CA"

# Enrich-only (skip import, start from step 2)
python manage.py place_pipeline_run --strategy enrich_only --boundary "Oakland, CA"

# Import only
python manage.py place_pipeline_run --strategy import_only --city "Portland"

# Dry run with limited scope
python manage.py place_pipeline_run --boundary "Oakland, CA" --limit 50 --dry_run
| Argument | Description |
| --- | --- |
| --strategy | full (default), enrich_only, or import_only |
| --skip_import | Skip import step |
| --city | City name for Overture Maps import |
| --address | Address filter |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category filter |
| --limit | Max places to process (default: 5000) |
| --max_completeness | Only enrich places below this score (default: 0.6) |
| --dry_run | Preview without changes |
| --parallel | Enable parallel processing (default: on) |
| --max_workers | Concurrent workers |

place_pipeline_import

Step 1: Imports places from Overture Maps and enriches with DataForSEO data. Creates PlaceEnrichmentTask tracking records.

# Import for a city
python manage.py place_pipeline_import --city "Oakland" --boundary "Oakland, CA"

# Skip Overture, only use DataForSEO
python manage.py place_pipeline_import --boundary "Oakland, CA" --skip_overture

# Import specific places
python manage.py place_pipeline_import --place_ids <uuid1> <uuid2>
| Argument | Description |
| --- | --- |
| --city | City for Overture Maps import |
| --address | Address filter |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category |
| --limit | Max places (default: 5000) |
| --skip_overture | Skip Overture Maps import |
| --skip_datasources | Skip DataForSEO enrichment |
| --dry_run | Preview only |
| --place_ids | Specific place UUIDs |

place_pipeline_enrich

Step 2: Scrapes venue websites using Crawl4AI/Gemini to fill in missing data (hours, phone, description, etc.) and improve completeness scores.

# Enrich imported places with low completeness
python manage.py place_pipeline_enrich --boundary "Oakland, CA" --max_completeness 0.5

# Enrich specific places
python manage.py place_pipeline_enrich --place_ids <uuid1> <uuid2>

# Dry run
python manage.py place_pipeline_enrich --boundary "Oakland, CA" --dry_run
| Argument | Description |
| --- | --- |
| --status | Task status to process (default: imported) |
| --limit | Max places to enrich (default: 100) |
| --max_completeness | Only enrich below this score (default: 0.6) |
| --address | Address filter |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category |
| --place_ids | Specific place UUIDs |
| --dry_run | Preview only |
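A completeness score like the one --max_completeness filters on could be as simple as the fraction of key fields that are filled. The field list and unweighted scoring below are assumptions for illustration; the platform's real scoring may weight fields differently.

```python
# Illustrative completeness score: fraction of assumed key fields filled.
KEY_FIELDS = ["description", "hours", "phone", "website", "images"]

def completeness(place: dict) -> float:
    """Score a place record between 0.0 (empty) and 1.0 (all fields set)."""
    filled = sum(1 for f in KEY_FIELDS if place.get(f))
    return filled / len(KEY_FIELDS)
```

With this model, a place that has only a description and hours scores 0.4 and would be picked up by the default --max_completeness 0.6 cutoff.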

place_pipeline_categorize

Step 3: Applies NLP-based category and vibe classification to enriched places.

# Categorize enriched places
python manage.py place_pipeline_categorize --boundary "Oakland, CA"

# Categorize specific places
python manage.py place_pipeline_categorize --place_ids <uuid1> <uuid2>
| Argument | Description |
| --- | --- |
| --status | Task status to process (default: enriched) |
| --limit | Max places (default: 5000) |
| --boundary | Geographic boundary |
| --ed_cat | Editorial category |
| --dry_run | Preview only |
| --place_ids | Specific place UUIDs |

place_pipeline_images

Step 4: Extracts and scores business images from websites. Uses DataForSEO as fallback. Supports parallel processing.

# Extract images for categorized places
python manage.py place_pipeline_images --boundary "Oakland, CA"

# With parallel processing
python manage.py place_pipeline_images --boundary "Oakland, CA" --parallel --max_workers 8
| Argument | Description |
| --- | --- |
| --status | Task status to process (default: categorized) |
| --limit | Max places (default: 5000) |
| --parallel | Enable parallel processing (default: on) |
| --max_workers | Concurrent workers (default: 8) |
| --dry_run | Preview only |
| --place_ids | Specific place UUIDs |

Data Collection

scrape_data_for_seo

Bulk import of places and events from the DataForSEO API. Supports searching by category, address, or boundary with fuzzy matching and validation.

# Search for places in a boundary
python manage.py scrape_data_for_seo --search_type places --boundary "Oakland, CA"

# Search events for a specific venue type
python manage.py scrape_data_for_seo --search_type events --venue_search_term "music venues" --address "Portland, OR"

# Search all Google Maps categories for an area
python manage.py scrape_data_for_seo --search_type places --boundary "Tulsa, OK" \
--use_google_maps_categories --target_categories arts_culture,shopping

# Thorough search (don't stop early on existing places)
python manage.py scrape_data_for_seo --search_type places --boundary "Oakland, CA" \
--skip_early_termination --debug
| Argument | Description |
| --- | --- |
| --search_type | places, events, or all |
| --venue_search_term | Search term for venues |
| --editorial_category | Editorial category filter |
| --address | Address filter |
| --boundary | Geographic boundary |
| --search_all_categories | Search all activity categories |
| --search_all_vibes | Search all vibes |
| --use_google_maps_categories | Use organized Google Maps categories |
| --target_categories | Comma-separated categories |
| --num_per_batch | Venues per batch |
| --skip_early_termination | Don't stop when finding existing places |
| --skip_after_existing_places | Existing-place threshold before stopping (default: 200) |
| --debug | Verbose logging |
| --no_confirm | Skip confirmation prompts |

scrape_places_for_completeness

Scrapes venue websites to fill in missing data for places with low completeness scores. Uses Gemini AI for intelligent extraction and optionally generates AI descriptions.

# Scrape places below 50% completeness in Oakland
python manage.py scrape_places_for_completeness --max-completeness 0.5 --boundary "Oakland, CA"

# Limit to 20 places, ordered by rating count
python manage.py scrape_places_for_completeness --limit 20 --order-by "-aggregate_rating_count"

# Use Claude for description generation instead of Gemini
python manage.py scrape_places_for_completeness --boundary "Oakland, CA" --use-claude-description

# Scrape a single place
python manage.py scrape_places_for_completeness --place <uuid>

# Preview what would be scraped
python manage.py scrape_places_for_completeness --boundary "Oakland, CA" --dry-run
| Argument | Description |
| --- | --- |
| --max-completeness | Only scrape places below this score (default: 0.6) |
| --min-sources | Minimum data sources required (default: 0) |
| --limit | Max places to process (default: 100) |
| --address | Address filter |
| --boundary | Geographic boundary |
| --editorial_category | Editorial category |
| --place | Single place UUID |
| --use-gemini / --no-gemini | Toggle Gemini AI scraping (default: on) |
| --use-structured / --no-structured | Toggle schema.org extraction (default: on) |
| --generate-ai-description / --no-ai-description | Toggle AI description (default: on) |
| --use-claude-description | Use Claude instead of Gemini for descriptions |
| --order-by | Sort field (default: -aggregate_rating_count) |
| --dry-run | Preview without scraping |

seed_osm_places

Imports place records from OpenStreetMap data within a bounding box.

# Seed places for a bounding box (minx,miny,maxx,maxy)
python manage.py seed_osm_places -b "-122.3,37.7,-122.2,37.8"
| Argument | Description |
| --- | --- |
| -b / --bounds | Bounding box as minx,miny,maxx,maxy |
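Parsing and sanity-checking the bounds string might look like this. It is a sketch of the expected input format, not the command's actual parser.

```python
def parse_bounds(raw: str):
    """Parse a "minx,miny,maxx,maxy" string into four floats."""
    parts = [float(p) for p in raw.split(",")]
    if len(parts) != 4:
        raise ValueError("expected minx,miny,maxx,maxy")
    minx, miny, maxx, maxy = parts
    if not (minx < maxx and miny < maxy):
        raise ValueError("min values must be less than max values")
    return minx, miny, maxx, maxy
```

Note the order is lon-min, lat-min, lon-max, lat-max, so for the example above minx is the western longitude -122.3.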

discover_vacancies_web

Discovers retail vacancies and commercial real estate listings by searching the web via DataForSEO.

# Find vacancies in a city
python manage.py discover_vacancies_web --city "Oakland, CA"

# Search for specific property types with AI detail extraction
python manage.py discover_vacancies_web --city "Portland, OR" \
--property-type retail --fetch-details

# Multiple cities, output to CSV
python manage.py discover_vacancies_web --city "Oakland, CA" "San Francisco, CA" \
--csv-file vacancies.csv
| Argument | Description |
| --- | --- |
| --city | City or cities to search |
| --search-queries | Custom search queries |
| --property-type | Property type filter (default: all) |
| --fetch-details | Use AI to extract details from each listing |
| --max-detail-workers | Workers for detail fetching (default: 3) |
| --output-file | JSON output |
| --csv-file | CSV output |
| --dry-run | Preview without saving |

enrich_vacancy_listings

Enriches vacancy listings with property details by scraping each listing page. Uses JSON-LD, HTML selectors, and AI extraction as fallback.

# Enrich listings from a CSV
python manage.py enrich_vacancy_listings --input-csv vacancies.csv --output-csv enriched.csv

# With AI extraction enabled
python manage.py enrich_vacancy_listings --input-csv vacancies.csv --output-csv enriched.csv --use-ai

# Limit and control rate
python manage.py enrich_vacancy_listings --input-csv vacancies.csv --output-csv enriched.csv \
--limit 50 --delay 2.0 --max-workers 1
| Argument | Description |
| --- | --- |
| --input-csv | Input CSV file (required) |
| --output-csv | Output CSV file (required) |
| --processed-file | Track processed URLs for resume |
| --max-workers | Concurrent workers (default: 2) |
| --delay | Delay between requests in seconds (default: 1.0) |
| --limit | Max listings to process |
| --use-ai | Enable AI extraction |
| --skip-scraper | Skip web scraping |

run_scrapper

General-purpose scraper runner for various data sources.

python manage.py run_scrapper

Data Sync

sync_categories

Updates the Category model from the categories YAML file in vibemap-constants.

python manage.py sync_categories

sync_vibes

Synchronizes the Vibe model to match the vibes YAML file. Adds new vibes and removes obsolete ones.

python manage.py sync_vibes

sync_vibes_from_text

Extracts vibes from place descriptions using the Vibemap NLP API and assigns them to places.

python manage.py sync_vibes_from_text

sync_subcategory_parents

Updates parent-child relationships in the subcategory hierarchy from YAML config.

python manage.py sync_subcategory_parents

sync_google_ratings

Updates Google ratings and review data for places via DataForSEO. Uses fuzzy matching to find the best Google Maps match.

python manage.py sync_google_ratings
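Fuzzy matching of a place name against Google Maps candidates could be done with the standard library's difflib, as sketched below. The scorer and 0.8 threshold are assumptions; the command's actual matcher may differ.

```python
from difflib import SequenceMatcher

def best_match(name: str, candidates, threshold=0.8):
    """Return the candidate most similar to `name`, or None below threshold.
    Illustrative stand-in for the command's fuzzy matching."""
    scored = [(SequenceMatcher(None, name.lower(), c.lower()).ratio(), c)
              for c in candidates]
    score, winner = max(scored)
    return winner if score >= threshold else None
```

A threshold guards against attaching ratings from a near-miss business with a similar name.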

sync_ratings

Recalculates and synchronizes aggregate ratings across all places.

python manage.py sync_ratings

sync_instagram

Pulls and updates Instagram data (followers, posts, profile info) for places with linked Instagram accounts.

python manage.py sync_instagram

sync_boundaries

Updates geographic boundary data from configured sources.

python manage.py sync_boundaries

sync_simpleview

Syncs place and event data with SimpleView CMS (Peoria.org). Creates and updates listings and events in the WordPress-based CMS.

python manage.py sync_simpleview

sync_mailchimp_cities

Synchronizes city lists with Mailchimp interest categories for email marketing segmentation.

python manage.py sync_mailchimp_cities

Data Quality

check_listings

Validates venue listings for completeness and data quality. Reports on missing fields and data inconsistencies.

# Check all listings
python manage.py check_listings

# Check listings in a specific area
python manage.py check_listings --address "Oakland, CA" --num_rows 100

# Filter by editorial category
python manage.py check_listings --ed_cat "Downtown Tulsa"
| Argument | Description |
| --- | --- |
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --num_rows | Max rows to check |
| --ed_cat | Editorial category |
| --cat | Category filter |
| --tag | Tag filter |

check_events

Validates recent event data for completeness and sends a quality report.

# Check last 50 events
python manage.py check_events --num_recent 50

# Check events in a city
python manage.py check_events --city "Portland"
| Argument | Description |
| --- | --- |
| --num_recent | Number of recent events to check |
| --type | Record type |
| --city | City filter |

check_event_details

Validates event detail completeness (descriptions, images, dates, venues).

python manage.py check_event_details --city "Oakland" --num_rows 100
| Argument | Description |
| --- | --- |
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --cat | Category |
| --tag | Tag |

check_recent

Quick validation of recently added/modified listings and events.

# Check last 20 records
python manage.py check_recent --num_recent 20

# Check recent records in a city
python manage.py check_recent --city "Oakland" --type places
| Argument | Description |
| --- | --- |
| --num_recent | Number of recent records |
| --type | places or events |
| --city | City filter |

check_quality_gemini

Batch quality assessment of place data using the Gemini API. Sends batches of places to Gemini for data accuracy evaluation.

# Check quality for a boundary
python manage.py check_quality_gemini --boundary "Oakland, CA" --num-rows 20

# Export results to CSV
python manage.py check_quality_gemini --boundary "Oakland, CA" --output-dir ./reports --output-format csv

# Small batches for testing
python manage.py check_quality_gemini --boundary "Oakland, CA" --batch-size 2 --num-rows 5
| Argument | Description |
| --- | --- |
| --boundary | Boundary name filter |
| --ed-cat | Editorial category |
| --num-rows | Places to check (default: 20) |
| --batch-size | Places per API call, 1-10 (default: 5) |
| --output-dir | Directory for exports |
| --output-format | Output format |

check_django_matches

Verifies consistency between Django model records and external data sources. Useful for auditing CSV imports.

python manage.py check_django_matches --csv data.csv --address "Oakland, CA"
| Argument | Description |
| --- | --- |
| --csv | CSV file path |
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --cat | Category |
| --tag | Tag |

check_yelp

Validates and enriches place data using Yelp business information. Uses fuzzy matching to find Yelp matches, validates coordinates, and extracts business hours.

python manage.py check_yelp

merge_duplicates

Identifies and merges duplicate place or event records based on name and location similarity.

python manage.py merge_duplicates

merge_duplicate_tags

Consolidates tags that differ only in case (e.g., "Jazz" and "jazz") into a single canonical tag.

# Preview merges
python manage.py merge_duplicate_tags --dry-run

# Merge, preferring lowercase
python manage.py merge_duplicate_tags --prefer-case lower
| Argument | Description |
| --- | --- |
| --dry-run | Preview without merging |
| --prefer-case | Preferred case for canonical tag |
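The core of a case-insensitive merge is grouping tags by their lowercased form and mapping every variant to one canonical spelling. A sketch under assumed semantics for --prefer-case (the real command also remaps the objects carrying each tag):

```python
from collections import defaultdict

def canonicalize_tags(tags, prefer_case="lower"):
    """Map each tag variant to a canonical form chosen per group."""
    groups = defaultdict(list)
    for t in tags:
        groups[t.lower()].append(t)
    canonical = {}
    for key, variants in groups.items():
        if prefer_case == "lower":
            chosen = key
        else:  # assumed fallback: keep the most common existing spelling
            chosen = max(set(variants), key=variants.count)
        for v in variants:
            canonical[v] = chosen
    return canonical
```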

Data Updates

update_data_sources

Enriches places with data from external APIs (DataForSEO, Google, etc.). Supports parallel processing with rate limiting.

# Update all places in a boundary
python manage.py update_data_sources --boundary "Oakland, CA"

# Only incomplete places, with parallelism
python manage.py update_data_sources --boundary "Oakland, CA" --incomplete_only --parallel --max_workers 4

# Rate-limited processing
python manage.py update_data_sources --boundary "Portland, OR" --rate_limit 2 --max_retries 3

# Only places with few data sources
python manage.py update_data_sources --boundary "Oakland, CA" --max_data_sources 2
| Argument | Description |
| --- | --- |
| --address | Address filter |
| --boundary | Boundary filter |
| --ed_cat | Editorial category |
| --num_rows | Max rows |
| --type | Record type |
| --parallel | Enable parallel processing |
| --max_workers | Concurrent workers |
| --rate_limit | Concurrent API calls allowed |
| --max_retries | Retry attempts for failures |
| --max_data_sources | Only process places with ≤ N data sources |
| --incomplete_only | Only places missing key fields |
| --threshold | Completeness threshold |

enrich_places_with_ai

Enriches place records using AI-based data analysis and generation.

python manage.py enrich_places_with_ai --queryset "places_needing_enrichment"
| Argument | Description |
| --- | --- |
| --queryset | Named queryset to process |

update_event_links

Replaces event URLs with affiliate links where applicable.

python manage.py update_event_links

update_events_recurring

Updates scheduling data for recurring events, extending end dates and creating new instances.

python manage.py update_events_recurring

update_timezone

Corrects timezone data for places and events based on their geographic coordinates.

python manage.py update_timezone

update_vibe_list

Migrates vibes from the legacy vibes field to vibe_list and other_vibes fields. Processes in chunks of 1000.

python manage.py update_vibe_list
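Chunked processing like the 1000-record batches described above can be expressed with a small generator. This is an illustrative helper, not the command's code (Django's queryset `iterator()` with a chunk size serves a similar purpose).

```python
def chunked(iterable, size=1000):
    """Yield lists of up to `size` items from any iterable."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial chunk
```

Processing in chunks keeps memory bounded and lets each batch be saved in its own transaction.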

update_wordpress

Syncs place and event data to WordPress. Handles media attachments and cleans up old events.

python manage.py update_wordpress --type places --address "Oakland, CA"
python manage.py update_wordpress --type events --ed_cat "Downtown Tulsa"
| Argument | Description |
| --- | --- |
| --type | places or events |
| --city | City filter |
| --address | Address filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --tag | Tag filter |

resave_vibes

Re-triggers the vibe save logic for all places and events, refreshing computed fields.

python manage.py resave_vibes

map_old_categories

Migrates legacy old_categories JSON field data to the MPTT-based category model.

python manage.py map_old_categories

ML & Classification

nlp_categories

Applies NLP-based category classification to places or events using GPT models.

# Categorize places in a boundary
python manage.py nlp_categories --boundary "Oakland, CA" --type places

# Categorize events
python manage.py nlp_categories --type events --ed_cat "events_feed"
| Argument | Description |
| --- | --- |
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --ed_cat | Editorial category |
| --type | places or events |

predict_place_events

Predicts whether places are likely to host events based on descriptions, categories, and other attributes.

# Predict for places in an area
python manage.py predict_place_events --address "San Francisco, CA" --limit 100

# Check a specific place
python manage.py predict_place_events --place_id <uuid> --verbose

# Update database with predictions
python manage.py predict_place_events --boundary "Oakland, CA" --update --min_confidence 0.5
| Argument | Description |
| --- | --- |
| --limit | Max places to analyze (default: 100) |
| --address | Address filter |
| --boundary | Boundary filter |
| --place_id | Specific place UUID |
| --min_confidence | Confidence threshold (default: 0.3) |
| --update | Save predictions to database |
| --verbose | Show detailed info |

train_vibe_classifier

Trains external Nyckel ML classifiers for vibe taxonomies. Supports exporting training data and uploading to Nyckel.

# List available taxonomies
python manage.py train_vibe_classifier --list-taxonomies

# Export training data for a taxonomy
python manage.py train_vibe_classifier --taxonomy energy_level --export --output training.csv

# Create and upload a classifier
python manage.py train_vibe_classifier --taxonomy aesthetic_decor --create --name "Aesthetic Classifier"
python manage.py train_vibe_classifier --taxonomy aesthetic_decor --upload --function-id <nyckel-id>

# Preview without uploading
python manage.py train_vibe_classifier --taxonomy energy_level --upload --dry-run
| Argument | Description |
| --- | --- |
| --taxonomy | Taxonomy name (e.g., energy_level, aesthetic_decor) |
| --list-taxonomies | List all available taxonomies |
| --info | Show taxonomy details (requires --taxonomy) |
| --export | Export training data to CSV |
| --output | Output file path for CSV |
| --limit-per-vibe | Max training samples per vibe (default: 50) |
| --create | Create a new Nyckel classifier |
| --name | Classifier name |
| --upload | Upload training samples to Nyckel |
| --function-id | Nyckel function ID |
| --min-text-length | Min text length for samples (default: 200) |
| --dry-run | Preview without changes |

analyze_vibe_clusters

Clusters vibes using embeddings to identify redundancies and relationships. Supports multiple dimensionality reduction methods.

# Basic clustering with visualization
python manage.py analyze_vibe_clusters --visualize

# UMAP method with custom cluster count
python manage.py analyze_vibe_clusters --method umap --clusters 30 --visualize

# Export to custom directory
python manage.py analyze_vibe_clusters --output-dir ./analysis/vibes
| Argument | Description |
| --- | --- |
| --clusters | Number of clusters (default: 50) |
| --visualize | Generate visualization plots |
| --method | tsne (default), umap, or pca |
| --min-cluster-size | Min cluster size to report (default: 2) |
| --output-dir | Output directory (default: ./vibe_analysis) |
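The redundancy idea reduces to flagging vibe pairs whose embeddings point in nearly the same direction. The toy version below uses plain cosine similarity on made-up 2-D vectors; the real command clusters full embeddings with t-SNE, UMAP, or PCA.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def redundant_pairs(embeddings, threshold=0.95):
    """Flag vibe pairs whose embeddings are nearly identical (toy sketch)."""
    names = list(embeddings)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if cosine(embeddings[a], embeddings[b]) >= threshold]
```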

Import & Export

import_overture_places

Imports places from the Overture Maps dataset within a bounding box. Supports validation, GeoJSON export, and batch processing.

# Import for a predefined city
python manage.py import_overture_places --city Oakland

# Import with custom bounding box
python manage.py import_overture_places --bbox "-122.3,37.7,-122.2,37.8"

# Preview and export to GeoJSON
python manage.py import_overture_places --city Oakland --dry-run --export-geojson oakland.geojson

# Filter by category with confidence threshold
python manage.py import_overture_places --city Oakland --category restaurant --min-confidence 0.7

# List available predefined cities
python manage.py import_overture_places --list-cities

# Validate imported data against CSV
python manage.py import_overture_places --city Oakland --validate-csv --match-threshold 0.8
| Argument | Description |
| --- | --- |
| --bbox | Bounding box as west,south,east,north |
| --city | Predefined city name |
| --min-confidence | Confidence threshold 0-1 (default: 0.5) |
| --category | Filter by primary category |
| --dry-run | Preview without saving |
| --limit | Max places to process |
| --export-geojson | Export results to GeoJSON |
| --skip-existing | Skip existing places |
| --list-cities | List predefined city bounding boxes |
| --batch-size | Processing batch size |
| --validate-csv | Validate against CSV |
| --match-threshold | Match threshold for validation |

import_addresses

Bulk imports and geocodes address data from a CSV file. Includes a persistent cache for geocoding results.

# Import from CSV
python manage.py import_addresses --csv addresses.csv --address "Oakland, CA"

# Check cache status
python manage.py import_addresses --cache-info

# Fresh import (skip cache)
python manage.py import_addresses --csv addresses.csv --skip-cache
| Argument | Description |
| --- | --- |
| --csv | CSV file path |
| --type | Record type |
| --city | City filter |
| --address | Address filter |
| --boundary | Boundary filter |
| --num_rows | Max rows |
| --ed_cat | Editorial category |
| --cat | Category |
| --tag | Tag |
| --clear-cache | Clear geocoding cache |
| --skip-cache | Start fresh without cache |
| --cache-info | Show cache info and exit |

import_frontend_themes

Imports hardcoded frontend theme definitions into the Django Theme model.

# Preview what would be imported
python manage.py import_frontend_themes --dry-run

# Import themes
python manage.py import_frontend_themes
| Argument | Description |
| --- | --- |
| --dry-run | Preview without saving |

airtable_export

Exports place records to an Airtable base for external collaboration.

python manage.py airtable_export -c "Oakland" -t "Places"
| Argument | Description |
| --- | --- |
| -c / --city | City to export |
| -t / --table-name | Airtable table name |

Images

get_business_images

Scrapes business images from venue websites. Supports parallel fetching.

# Fetch images for places in a boundary
python manage.py get_business_images --boundary "Oakland, CA" --parallel --max_workers 8

# Limit scope
python manage.py get_business_images --address "Portland, OR" --num_rows 50
Argument        Description
--type          Record type
--city          City filter
--address       Address filter
--boundary      Boundary filter
--num_rows      Max rows
--ed_cat        Editorial category
--cat           Category
--tag           Tag
--parallel      Enable parallel fetching
--max_workers   Concurrent workers (default: 8)

generate_blurhashes

Backfills blurhash placeholder values for images created before the auto-generation signal was added.

# Backfill all missing blurhashes
python manage.py generate_blurhashes

# Only for events, in batches
python manage.py generate_blurhashes --type events --batch-size 50

# Preview
python manage.py generate_blurhashes --dry-run
Argument       Description
--type         events, places, or all (default: all)
--batch-size   Processing batch size (default: 100)
--limit        Max images
--dry-run      Preview without changes

User & Admin

list_boundaries

Lists available geographic boundaries for use in pipeline filtering and Dagster configuration.

# List all boundaries
python manage.py list_boundaries

# Filter by type and admin level
python manage.py list_boundaries --type official --admin-level city

# JSON output for scripting
python manage.py list_boundaries --format json
Argument        Description
--type          early, official, or hidden
--admin-level   neighborhood, district, city, state, or county
--format        table (default), list, or json
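
The json format is intended for scripting. A sketch of consuming it in Python, assuming a hypothetical output shape (the real field names may differ):

```python
import json

# Hypothetical output of `list_boundaries --format json`; in practice
# you would capture it with subprocess and the fields may be named
# differently.
output = """[
  {"name": "Oakland, CA", "admin_level": "city", "type": "official"},
  {"name": "Temescal", "admin_level": "neighborhood", "type": "official"}
]"""

# Keep only city-level boundaries, e.g. to feed into a pipeline config.
cities = [b["name"] for b in json.loads(output)
          if b["admin_level"] == "city"]
```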

manage_tag_types

Manages tag type classifications. Tags can be typed as category, geography, or customer.

# View statistics
python manage.py manage_tag_types --stats

# List untyped tags
python manage.py manage_tag_types --list-untyped

# Set a tag type
python manage.py manage_tag_types --set san-francisco geography

# Bulk import from CSV
python manage.py manage_tag_types --bulk-import tag_types.csv

# Export current types
python manage.py manage_tag_types --export tag_types_export.csv
Argument         Description
--list           List all tags with types
--list-untyped   List tags without a type
--set            Set type: --set TAG_SLUG TYPE
--bulk-import    Import from CSV
--export         Export to CSV
--stats          Show type statistics
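
The CSV consumed by --bulk-import (and produced by --export) presumably pairs tag slugs with types; the two-column layout below is an assumption for illustration:

```python
import csv
import io

# Hypothetical tag_types.csv layout: one tag slug and one type per row.
csv_text = (
    "tag_slug,type\n"
    "san-francisco,geography\n"
    "live-music,category\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))
tag_types = {row["tag_slug"]: row["type"] for row in rows}
```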

create_default_membership_tiers

Creates default membership tier records if they don't already exist.

python manage.py create_default_membership_tiers

badge_report

Generates a report on badge usage across users.

python manage.py badge_report

pull_user_vibes

Imports user-generated vibe selections and preferences.

python manage.py pull_user_vibes

zoho

Zoho CRM integration (experimental).

python manage.py zoho

Testing & Monitoring

run_tests

Executes the API test suite against live endpoints.

python manage.py run_tests

sentry_cron_monitor

Reports cron job health to Sentry for monitoring scheduled tasks.

python manage.py sentry_cron_monitor

Utilities

list_commands

Lists and searches all management commands with descriptions and arguments.

# List all commands
python manage.py list_commands

# Search by keyword
python manage.py list_commands --search "events"

# Filter by category
python manage.py list_commands --category "Data Collection"

# Generate COMMANDS.md
python manage.py list_commands --generate-docs

# Verbose output with all arguments
python manage.py list_commands --verbose --markdown
Argument          Description
--verbose         Show all arguments
--markdown / -m   Output as markdown
--generate-docs   Generate docs/COMMANDS.md
--category        Filter by category name
--search          Search commands by name or description

export_docs

Exports backend documentation as Docusaurus-compatible Markdown. Converts RST to MD, adds frontmatter, sanitizes for MDX, and generates OpenAPI schema.

# Build docs to default output dir
python manage.py export_docs

# Custom output with clean build
python manage.py export_docs --output-dir /tmp/docs --clean

# Preview without writing
python manage.py export_docs --dry-run

# Skip OpenAPI generation (faster)
python manage.py export_docs --skip-openapi
Argument         Description
--output-dir     Output directory (default: docs/build/docusaurus)
--clean          Remove output dir before building
--skip-openapi   Skip OpenAPI schema generation
--dry-run        List files without writing

extract_event_keywords

Extracts and counts keywords from event names/descriptions for training classification models.

# Top 100 keywords from all events
python manage.py extract_event_keywords

# Export keywords for a boundary
python manage.py extract_event_keywords --boundary "Oakland, CA" --export keywords.csv

# Only upcoming events, minimum 10 occurrences
python manage.py extract_event_keywords --upcoming-only --min-count 10 --top 50
Argument          Description
--limit           Max events to process
--min-count       Min keyword occurrences (default: 5)
--min-length      Min keyword length (default: 3)
--top             Top N keywords (default: 100)
--export          Export to CSV
--boundary        Boundary filter
--upcoming-only   Only upcoming events
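
The counting step can be pictured as a tokenize-filter-count pass over event text. The sketch below shows the idea with a toy stopword list; the command's actual tokenizer and stopwords are not documented here:

```python
import re
from collections import Counter

STOPWORDS = {"the", "and", "at", "with", "for"}


def extract_keywords(texts, min_length=3, min_count=1, top=10):
    """Count keyword frequency across event names/descriptions.

    Simplified illustration of the --min-length / --min-count / --top
    filters, not the command's actual implementation.
    """
    counts = Counter(
        word
        for text in texts
        for word in re.findall(r"[a-z]+", text.lower())
        if len(word) >= min_length and word not in STOPWORDS
    )
    return [(w, c) for w, c in counts.most_common(top) if c >= min_count]


events = ["Jazz Night at the Lounge", "Open Mic Night", "Jazz Brunch"]
top_keywords = extract_keywords(events, min_count=2)
```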

misc_commands

Collection of miscellaneous maintenance utilities.

python manage.py misc_commands

logging_config

Displays and validates the current logging configuration.

python manage.py logging_config

ETL Commands

Commands in the etl app for data import, migration, and synchronization.

import_airtable

Imports place data from Airtable tables (East Bay Express, Shopkeepers, Timeout, Michelin, Zomato, Aggregate Ratings).

# Import from a specific table
python manage.py import_airtable -t "East Bay Express"

# With custom API key
python manage.py import_airtable -t "Michelin" -k "keyXXXXXX"
Argument            Description
-t / --table-name   Airtable table name (required)
-k / --key          Airtable API key (default: env var)

import_foursquare

Imports places from a Foursquare CSV export.

# Import from default CSV
python manage.py import_foursquare

# Import from custom file
python manage.py import_foursquare -f data/foursquare_export.csv
Argument      Description
-f / --file   CSV file path (default: etl/scrapers/data/foursquare_with_vibes.csv)

import_osm_overpass

Imports places from Overpass API GeoJSON files. Maps OSM tags to Vibemap place types.

python manage.py import_osm_overpass -f overpass_export.geojson
Argument      Description
-f / --file   GeoJSON file path (required)
--debug       Debug mode

import_osm_imposm

Imports places from PostGIS tables populated by the imposm tool.

python manage.py import_osm_imposm

scrapy_crawl

Runs Scrapy spiders from within the Django context.

python manage.py scrapy_crawl

backfill_api_records

Creates ETL records from user-created or staff-modified API records.

# Backfill all records
python manage.py backfill_api_records

# Clean and rebuild
python manage.py backfill_api_records --clean

# Only staff-modified records
python manage.py backfill_api_records --modified --limit 100
Argument       Description
-c / --clean   Clean all ETL records and rebuild
-l / --limit   Max records to process
--modified     Only staff-modified records

add_association_places

Imports places from member organization CSVs stored in Azure Blob Storage.

python manage.py add_association_places

migrate_mongo

Migrates events and places from a legacy MongoDB database to PostGIS.

python manage.py migrate_mongo --batch-size 500
Argument            Description
-b / --batch-size   Records per batch (default: 1000)
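
Batched migration boils down to chunking a record stream so each batch can be written (and committed) independently. A minimal sketch of the chunking idea behind --batch-size, not the command's actual code:

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of up to batch_size items from any iterable,
    e.g. a MongoDB cursor, so each chunk can be bulk-inserted."""
    it = iter(iterable)
    while batch := list(islice(it, batch_size)):
        yield batch


# Ten records migrated in batches of four.
batches = list(batched(range(10), 4))
```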

migrate_images_azure_imagekit

Migrates images from Azure CDN to ImageKit.

python manage.py migrate_images_azure_imagekit

store_imagekit_name

Backfills imagekit_url and thumbnail_url for existing images by querying the ImageKit API.

python manage.py store_imagekit_name

sync_spotify_artists

Finds and links Spotify profiles for performer records using fuzzy matching.

# Sync all unlinked performers
python manage.py sync_spotify_artists

# Sync a specific performer
python manage.py sync_spotify_artists --performer-id <id>

# Preview without saving
python manage.py sync_spotify_artists --dry-run --limit 20
Argument          Description
--client-id       Spotify client ID
--client-secret   Spotify client secret
--limit           Max performers (default: 50)
--performer-id    Specific performer ID
--dry-run         Preview without saving
--force           Re-sync already linked performers
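
Fuzzy matching of performer names against Spotify search results can be sketched with the standard library's difflib; the scorer and threshold below are illustrative assumptions, not the command's actual logic:

```python
from difflib import SequenceMatcher


def best_match(performer_name, candidates, threshold=0.8):
    """Return the candidate name most similar to performer_name,
    or None if nothing clears the threshold (hypothetical scorer)."""

    def norm(s):
        return s.strip().lower()

    scored = [
        (c, SequenceMatcher(None, norm(performer_name), norm(c)).ratio())
        for c in candidates
    ]
    name, score = max(scored, key=lambda pair: pair[1])
    return name if score >= threshold else None


match = best_match("Khruangbin", ["Khruangbin", "Khruangbin Tribute", "Unrelated Act"])
```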

sync_boundary_population

Fetches population data from GeoNames and US Census APIs for boundary records.

# Sync city populations
python manage.py sync_boundary_population --admin-level city

# Dry run for a specific boundary
python manage.py sync_boundary_population --boundary-id <id> --dry-run

# Force update existing data
python manage.py sync_boundary_population --admin-level city --force
Argument              Description
--admin-level         Filter by admin level
--boundary-id         Specific boundary ID
--limit               Max boundaries (default: 100)
--force               Overwrite existing population data
--dry-run             Preview without saving
--geonames-username   GeoNames API username
--census-api-key      US Census API key
--country             Country code (default: US)
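
A GeoNames lookup for a boundary's population might be built along these lines; the endpoint and parameters shown are illustrative, and the command's actual request shape may differ:

```python
from urllib.parse import urlencode


def geonames_search_url(place_name, username, country="US"):
    """Build a GeoNames search URL for a boundary name.

    Sketch only: parameter names here mirror the public GeoNames
    search API, but how the command assembles its requests is an
    assumption.
    """
    params = {
        "q": place_name,
        "country": country,
        "maxRows": 1,
        "username": username,
    }
    return "http://api.geonames.org/searchJSON?" + urlencode(params)


url = geonames_search_url("Oakland", username="demo")
```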

load_taginfo

Loads OpenStreetMap tag metadata from a taginfo SQLite database into OSMMapFeature records.

python manage.py load_taginfo -d taginfo.db
Argument    Description
-d / --db   Path to taginfo SQLite database (required)
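
Reading tag metadata out of a taginfo-style SQLite file is straightforward with the sqlite3 module. The tiny in-memory database below stands in for the real file; its schema (a tags table with key, value, and count_all columns) is an assumption for illustration:

```python
import sqlite3

# Stand-in for a taginfo database; the real file's schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (key TEXT, value TEXT, count_all INTEGER)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [("amenity", "cafe", 120000), ("amenity", "bar", 80000)],
)


def load_features(conn, min_count=100000):
    """Read tag rows above a usage threshold, roughly as load_taginfo
    might before mapping them to OSMMapFeature records (a sketch)."""
    rows = conn.execute(
        "SELECT key, value, count_all FROM tags WHERE count_all >= ?",
        (min_count,),
    )
    return [dict(zip(("key", "value", "count"), row)) for row in rows]


features = load_features(conn)
```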

load_utm_zones

Loads UTM zone boundaries from a shapefile.

python manage.py load_utm_zones

create_imposm_mapping

Generates an imposm3 mapping YAML file from the OSM map_features config.

python manage.py create_imposm_mapping -o mapping.yml
Argument         Description
-o / --outfile   Output YAML path (default: etl/config/imposm_mapping.yml)

osmium_import

Experimental: parses OSM PBF files using PyOsmium. Not fully implemented for database import.

python manage.py osmium_import -f region.osm.pbf
Argument          Description
-f / --pbf-file   PBF file path (required)

Other App Commands

sync_wrike_organizations (accounts)

Matches Wrike project folders to OrganizationProfile records using fuzzy matching. Can create missing profiles.

# Preview matches
python manage.py sync_wrike_organizations --dry-run

# Create missing organization profiles
python manage.py sync_wrike_organizations --create-missing

# Custom customers folder
python manage.py sync_wrike_organizations --customers-folder "FOLDER_ID"
Argument             Description
--dry-run            Preview without changes
--create-missing     Create OrganizationProfiles for unmatched Wrike folders
--customers-folder   Wrike folder ID (default: MQAAAABm87lD)

sync_scraped_events (search_indexes)

Finds events added by scrapers that haven't been indexed in Elasticsearch and triggers indexing.

# Sync recent scraped events
python manage.py sync_scraped_events

# Include past events
python manage.py sync_scraped_events --all_events

# Preview what would be synced
python manage.py sync_scraped_events --dry-run
Argument       Description
--dry-run      Preview without indexing
--all_events   Include past events

Common Patterns

Geographic Filtering

Most commands support filtering by location:

# By address (geocoded at runtime)
python manage.py check_listings --address "San Francisco, CA"

# By boundary (pre-defined geographic region)
python manage.py update_data_sources --boundary "Oakland, CA"

# By city name
python manage.py sync_google_ratings --city "Oakland"

# By bounding box
python manage.py seed_osm_places -b "-122.3,37.7,-122.2,37.8"
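
The bounding-box string uses west,south,east,north order. A sketch of how such a value could be parsed and validated (illustrative only, not the commands' actual argument handling):

```python
def parse_bbox(value):
    """Parse a west,south,east,north string into a float tuple,
    checking that the box is correctly oriented."""
    parts = [float(p) for p in value.split(",")]
    if len(parts) != 4:
        raise ValueError("expected west,south,east,north")
    west, south, east, north = parts
    if west >= east or south >= north:
        raise ValueError("west must be < east and south < north")
    return west, south, east, north


bbox = parse_bbox("-122.3,37.7,-122.2,37.8")
```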

Limiting and Pagination

Control how many records to process:

python manage.py check_events --num_recent 100
python manage.py predict_place_events --limit 50
python manage.py scrape_places_for_completeness --limit 20

Dry Runs

Preview changes before committing:

python manage.py merge_duplicates --dry-run
python manage.py merge_duplicate_tags --dry-run
python manage.py import_overture_places --city Oakland --dry-run
python manage.py place_pipeline_run --boundary "Oakland, CA" --dry_run

Parallel Processing

Speed up batch operations:

python manage.py update_data_sources --parallel --max_workers 4
python manage.py get_business_images --parallel --max_workers 8
python manage.py event_pipeline_run --parallel --max_workers 6
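
Conceptually, --parallel with --max_workers maps per-record work (scraping, geocoding, image fetching) onto a worker pool. A minimal sketch of that pattern; the commands' real executor setup is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor


def process(record):
    # Stand-in for per-record work such as a scrape or API call.
    return record * 2


records = list(range(10))

# max_workers here plays the same role as the --max_workers flag.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, records))
```

Threads suit this kind of I/O-bound work because most time is spent waiting on network responses rather than the CPU.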

Category and Tag Filtering

Filter by editorial category or tag:

python manage.py check_listings --ed_cat "Downtown Tulsa"
python manage.py scrape_data_for_seo --editorial_category "restaurants"
python manage.py misc_commands --tag "union_square"

Naming Conventions

Prefix         Purpose
check_*        Validation and quality control
sync_*         External data synchronization
import_*       Batch data imports
scrape_*       Web scraping operations
update_*       Data maintenance and updates
predict_*      ML predictions
*_pipeline_*   Multi-step workflow commands
merge_*        Deduplication and consolidation
train_*        ML model training