# CFTC COT Explorer A tool for ingesting, storing, and querying CFTC (Commodity Futures Trading Commission) Commitments of Traders (COT) report data. Provides a REST API backed by a SQLite database. ## Overview The CFTC publishes weekly COT reports showing the positioning of commercial, non-commercial, and non-reportable traders across futures markets. This project: - Parses the weekly HTML reports and historical ZIP archives from the CFTC website - Stores the data in a SQLite database - Exposes the data via a FastAPI REST API with analytics endpoints ## Project Structure ``` . ├── app/ │ ├── db.py # Database connection and initialization │ ├── ingestion/ │ │ ├── cli.py # Command-line ingestion tool │ │ ├── importer.py # Import logic (HTML, ZIP, download) │ │ └── parser.py # HTML/ZIP parser │ └── api/ │ ├── main.py # FastAPI application entry point │ ├── models.py # Pydantic response models │ └── routes/ │ ├── commodities.py # /api/commodities, /api/exchanges │ ├── positions.py # /api/positions/{code}/... │ ├── analytics.py # /api/analytics/... │ └── reports.py # /api/reports/... ├── data/ # SQLite database and downloaded HTML files ├── schema.sql # Database schema └── cftc_downloader.py # Standalone downloader script ``` ## Setup **Requirements:** Python 3.10+ ```bash pip install fastapi uvicorn requests beautifulsoup4 ``` **Initialize the database:** ```bash python -m app.ingestion.cli init-db ``` ## Data Ingestion ### Import local HTML files If you have weekly HTML files saved in `./data/`: ```bash python -m app.ingestion.cli import-local-html --data-dir ./data ``` ### Download and import the latest weekly report ```bash python -m app.ingestion.cli download-and-import ``` ### Import the full historical archive (1995–present) ```bash python -m app.ingestion.cli import-history --start-year 1995 --end-year 2026 ``` ### Import a specific file ```bash python -m app.ingestion.cli import-html data/2026-03-10_deacbtlof.htm python -m app.ingestion.cli import-zip deahistfo2024.zip ``` ### Check database status ```bash python -m app.ingestion.cli status ``` Ingestion is idempotent — re-running any import command will skip already-imported sources. ## Running the API ```bash uvicorn app.api.main:app --reload ``` The API will be available at `http://localhost:8000`. Interactive docs are at `http://localhost:8000/docs`. ### Environment Variables | Variable | Default | Description | |----------|---------|-------------| | `DB_PATH` | `app/data/cot.db` | Path to the SQLite database file | ## API Endpoints ### Commodities | Method | Path | Description | |--------|------|-------------| | `GET` | `/api/exchanges` | List all exchanges with commodity counts | | `GET` | `/api/commodities` | List all commodities (filter by `?exchange=CME`) | | `GET` | `/api/commodities/{cftc_code}` | Get metadata for a single commodity | ### Positions | Method | Path | Description | |--------|------|-------------| | `GET` | `/api/positions/{cftc_code}/latest` | Latest report with all row types and concentration data | | `GET` | `/api/positions/{cftc_code}/history` | Time series of positions (supports `from_date`, `to_date`, `row_type`) | | `GET` | `/api/positions/{cftc_code}/extremes` | All-time min/max for open interest and net positions | | `GET` | `/api/positions/compare` | Compare a metric across multiple commodities (comma-separated `codes`) | ### Analytics | Method | Path | Description | |--------|------|-------------| | `GET` | `/api/analytics/screener` | Rank markets by non-commercial net position percentile | | `GET` | `/api/analytics/{cftc_code}/net-position-percentile` | Percentile rank and z-score for current net position | | `GET` | `/api/analytics/{cftc_code}/concentration` | Largest-trader concentration data over time | #### Screener parameters | Parameter | Default | Description | |-----------|---------|-------------| | `exchange` | — | Filter by exchange abbreviation | | `lookback_weeks` | `156` | Historical window for percentile calculation (4–1560) | | `top_n` | `50` | Number of results to return | | `direction` | — | Filter to `long` (≥50th pct) or `short` (<50th pct) | ## Database Schema The SQLite database contains five tables: - **`commodities`** — one row per unique market (CFTC code, name, exchange) - **`reports`** — one row per (commodity, report date) - **`positions`** — position data per report, split into `All`, `Old`, and `Other` row types; includes open interest, long/short counts, week-over-week changes, percent of open interest, and trader counts - **`concentration`** — largest-trader concentration ratios (top 4 and top 8 traders, gross and net) - **`import_log`** — tracks which source files have been processed A convenience view `v_net_positions` joins all tables and pre-computes net positions (long minus short) for each trader category.