FastAPI application that ingests CFTC Commitments of Traders data into SQLite and exposes it via a REST API with analytics endpoints (screener, percentile rank, concentration). Includes CLI for historical and weekly data ingestion, Docker setup, and a frontend. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
148 lines
5.1 KiB
Markdown
148 lines
5.1 KiB
Markdown
# CFTC COT Explorer
|
||
|
||
A tool for ingesting, storing, and querying CFTC (Commodity Futures Trading Commission) Commitments of Traders (COT) report data. Provides a REST API backed by a SQLite database.
|
||
|
||
## Overview
|
||
|
||
The CFTC publishes weekly COT reports showing the positioning of commercial, non-commercial, and non-reportable traders across futures markets. This project:
|
||
|
||
- Parses the weekly HTML reports and historical ZIP archives from the CFTC website
|
||
- Stores the data in a SQLite database
|
||
- Exposes the data via a FastAPI REST API with analytics endpoints
|
||
|
||
## Project Structure
|
||
|
||
```
|
||
.
|
||
├── app/
|
||
│ ├── db.py # Database connection and initialization
|
||
│ ├── ingestion/
|
||
│ │ ├── cli.py # Command-line ingestion tool
|
||
│ │ ├── importer.py # Import logic (HTML, ZIP, download)
|
||
│ │ └── parser.py # HTML/ZIP parser
|
||
│ └── api/
|
||
│ ├── main.py # FastAPI application entry point
|
||
│ ├── models.py # Pydantic response models
|
||
│ └── routes/
|
||
│ ├── commodities.py # /api/commodities, /api/exchanges
|
||
│ ├── positions.py # /api/positions/{code}/...
|
||
│ ├── analytics.py # /api/analytics/...
|
||
│ └── reports.py # /api/reports/...
|
||
├── data/ # SQLite database and downloaded HTML files
|
||
├── schema.sql # Database schema
|
||
└── cftc_downloader.py # Standalone downloader script
|
||
```
|
||
|
||
## Setup
|
||
|
||
**Requirements:** Python 3.10+
|
||
|
||
```bash
|
||
pip install fastapi uvicorn requests beautifulsoup4
|
||
```
|
||
|
||
**Initialize the database:**
|
||
|
||
```bash
|
||
python -m app.ingestion.cli init-db
|
||
```
|
||
|
||
## Data Ingestion
|
||
|
||
### Import local HTML files
|
||
|
||
If you have weekly HTML files saved in `./data/`:
|
||
|
||
```bash
|
||
python -m app.ingestion.cli import-local-html --data-dir ./data
|
||
```
|
||
|
||
### Download and import the latest weekly report
|
||
|
||
```bash
|
||
python -m app.ingestion.cli download-and-import
|
||
```
|
||
|
||
### Import the full historical archive (1995–present)
|
||
|
||
```bash
|
||
python -m app.ingestion.cli import-history --start-year 1995 --end-year 2026
|
||
```
|
||
|
||
### Import a specific file
|
||
|
||
```bash
|
||
python -m app.ingestion.cli import-html data/2026-03-10_deacbtlof.htm
|
||
python -m app.ingestion.cli import-zip deahistfo2024.zip
|
||
```
|
||
|
||
### Check database status
|
||
|
||
```bash
|
||
python -m app.ingestion.cli status
|
||
```
|
||
|
||
Ingestion is idempotent — re-running any import command will skip already-imported sources.
|
||
|
||
## Running the API
|
||
|
||
```bash
|
||
uvicorn app.api.main:app --reload
|
||
```
|
||
|
||
The API will be available at `http://localhost:8000`. Interactive docs are at `http://localhost:8000/docs`.
|
||
|
||
### Environment Variables
|
||
|
||
| Variable | Default | Description |
|
||
|----------|---------|-------------|
|
||
| `DB_PATH` | `app/data/cot.db` | Path to the SQLite database file |
|
||
|
||
## API Endpoints
|
||
|
||
### Commodities
|
||
|
||
| Method | Path | Description |
|
||
|--------|------|-------------|
|
||
| `GET` | `/api/exchanges` | List all exchanges with commodity counts |
|
||
| `GET` | `/api/commodities` | List all commodities (filter by `?exchange=CME`) |
|
||
| `GET` | `/api/commodities/{cftc_code}` | Get metadata for a single commodity |
|
||
|
||
### Positions
|
||
|
||
| Method | Path | Description |
|
||
|--------|------|-------------|
|
||
| `GET` | `/api/positions/{cftc_code}/latest` | Latest report with all row types and concentration data |
|
||
| `GET` | `/api/positions/{cftc_code}/history` | Time series of positions (supports `from_date`, `to_date`, `row_type`) |
|
||
| `GET` | `/api/positions/{cftc_code}/extremes` | All-time min/max for open interest and net positions |
|
||
| `GET` | `/api/positions/compare` | Compare a metric across multiple commodities (comma-separated `codes`) |
|
||
|
||
### Analytics
|
||
|
||
| Method | Path | Description |
|
||
|--------|------|-------------|
|
||
| `GET` | `/api/analytics/screener` | Rank markets by non-commercial net position percentile |
|
||
| `GET` | `/api/analytics/{cftc_code}/net-position-percentile` | Percentile rank and z-score for current net position |
|
||
| `GET` | `/api/analytics/{cftc_code}/concentration` | Largest-trader concentration data over time |
|
||
|
||
#### Screener parameters
|
||
|
||
| Parameter | Default | Description |
|
||
|-----------|---------|-------------|
|
||
| `exchange` | — | Filter by exchange abbreviation |
|
||
| `lookback_weeks` | `156` | Historical window for percentile calculation (4–1560) |
|
||
| `top_n` | `50` | Number of results to return |
|
||
| `direction` | — | Filter to `long` (≥50th pct) or `short` (<50th pct) |
|
||
|
||
## Database Schema
|
||
|
||
The SQLite database contains five tables:
|
||
|
||
- **`commodities`** — one row per unique market (CFTC code, name, exchange)
|
||
- **`reports`** — one row per (commodity, report date)
|
||
- **`positions`** — position data per report, split into `All`, `Old`, and `Other` row types; includes open interest, long/short counts, week-over-week changes, percent of open interest, and trader counts
|
||
- **`concentration`** — largest-trader concentration ratios (top 4 and top 8 traders, gross and net)
|
||
- **`import_log`** — tracks which source files have been processed
|
||
|
||
A convenience view `v_net_positions` joins all tables and pre-computes net positions (long minus short) for each trader category.
|