# RSTAT - Reddit Stock Analyzer

Scan Reddit for stock ticker mentions, score sentiment, enrich with price/market cap, and explore the results in a clean web dashboard. Automate shareable images and post them to Reddit.

## Highlights

- CLI + Web UI: Collect data with `rstat`, browse it with `rstat-dashboard`.
- Smart ticker parsing: Prefer $TSLA/$AAPL “golden” matches; fall back to filtered ALL-CAPS words.
- Sentiment: VADER (NLTK) scores for titles and comments; “deep dive” averages per post.
- Storage: Local SQLite database `reddit_stocks.db` with de-duped mentions and post analytics.
- Enrichment: Yahoo Finance market cap + latest close fetched in batch and on-demand.
- Images: Export polished daily/weekly summary PNGs for subreddits or “overall”.
- Automation: Optional cron job plus one-command posting to Reddit with OAuth refresh tokens.

## Repository layout

```
.
├── Dockerfile                 # Multi-stage build (Tailwind -> Python + gunicorn)
├── docker-compose.yml         # Prod (nginx + varnish optional) + dashboard
├── docker-compose-dev.yml     # Dev compose (local nginx)
├── requirements.txt           # Python deps
├── setup.py                   # Installs console scripts
├── subreddits.json            # Default subreddits list
├── reddit_stocks.db           # SQLite database (generated/updated by CLI)
├── export_image.py            # Generate shareable PNGs (Playwright)
├── post_to_reddit.py          # Post latest PNG to Reddit
├── get_refresh_token.py       # One-time OAuth2 refresh token helper
├── fetch_close_price.py       # Utility for closing price (yfinance)
├── fetch_market_cap.py        # Utility for market cap (yfinance)
├── rstat_tool/
│   ├── main.py                # CLI entry (rstat)
│   ├── dashboard.py           # Flask app entry (rstat-dashboard)
│   ├── database.py            # SQLite schema + queries
│   ├── ticker_extractor.py    # Ticker parsing + blacklist
│   ├── sentiment_analyzer.py  # VADER sentiment
│   ├── cleanup.py             # Cleanup utilities (rstat-cleanup)
│   ├── flair_finder.py        # Fetch subreddit flair IDs (rstat-flairs)
│   ├── logger_setup.py        # Logging
│   └── setup_nltk.py          # One-time VADER download
├── templates/                 # Jinja2 templates (Tailwind 4 styling)
└── static/                    # Favicon + generated CSS (style.css)
```

## Requirements

- Python 3.10+ (Docker image uses Python 3.13-slim)
- Reddit API app (script type) for read + submit
- For optional image export: Playwright browsers
- For UI development (optional): Node 18+ to rebuild Tailwind CSS

## Setup

1) Clone and enter the repo

```bash
git clone <repo-url>
cd reddit_stock_analyzer
```

2) Create and activate a virtualenv

- bash/zsh:

```bash
python3 -m venv .venv
source .venv/bin/activate
```

- fish:

```fish
python3 -m venv .venv
source .venv/bin/activate.fish
```

3) Install Python dependencies and commands

```bash
pip install -r requirements.txt
pip install -e .
```

4) Configure environment

Create a `.env` file in the repo root with your Reddit app credentials:

```
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USER_AGENT=python:rstat:v1.0 (by u/yourname)
```

Optional (after OAuth step below):

```
REDDIT_REFRESH_TOKEN=your_refresh_token
```

5) One-time NLTK setup

```bash
python rstat_tool/setup_nltk.py
```

6) Configure subreddits (optional)

Edit `subreddits.json` to your liking. It ships with a sane default list.

## CLI usage (rstat)

The `rstat` command collects Reddit data and updates the database. Credentials are read from `.env`.

Common flags (see `rstat --help`):

- `--config FILE` Use a JSON file with `{"subreddits": [ ... ]}` (default: `subreddits.json`)
- `--subreddit NAME` Scan a single subreddit instead of the config
- `--days N` Only scan posts from the last N days (default 1)
- `--posts N` Max posts per subreddit to check (default 200)
- `--comments N` Max comments per post to scan (default 100)
- `--no-financials` Skip Yahoo Finance during the scan (faster)
- `--update-top-tickers` Update financials for tickers that are currently top daily/weekly
- `--update-financials-only [TICKER]` Update all or a single ticker’s market cap/close
- `--stdout` Log to console as well as file; `--debug` for verbose output

Examples:

```bash
# Scan configured subs for last 24h, including financials
rstat --days 1

# Target a single subreddit for the past week, scan more comments
rstat --subreddit wallstreetbets --days 7 --comments 250

# Skip financials during scan, then update only top tickers
rstat --no-financials
rstat --update-top-tickers

# Update financials for all tickers in DB
rstat --update-financials-only

# Update a single ticker (case-insensitive)
rstat --update-financials-only TSLA
```

How mentions are detected:

- If a post contains any $TICKER (e.g., `$TSLA`) anywhere, we use “golden-only” mode: only $-prefixed tickers are considered.
- Otherwise, we fall back to filtered ALL-CAPS 2–5 letter words, excluding a large blacklist to avoid false positives.
- If the title contains tickers, every comment in the thread is attributed to those tickers; otherwise, we scan each comment directly for mentions.

## Web dashboard (rstat-dashboard)

Start the dashboard and open http://127.0.0.1:5000

```bash
rstat-dashboard
```

Features:

- Overall top 10 (daily/weekly) across all subs
- Per-subreddit dashboards (daily/weekly)
- Deep Dive pages listing posts analyzed for a ticker
- Shareable image-friendly views (UI hides nav when `?image=true`)

The dashboard reads from `reddit_stocks.db`. Run `rstat` first so you have data.

## Image export (export_image.py)

Exports a high-res PNG of the dashboard views via Playwright.
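
For orientation, a Playwright-based export of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `export_image.py`: the helper names (`image_url`, `output_path`, `capture`), the 2x device scale factor, and the generalisation of the `overall_summary_daily_<timestamp>.png` naming to other views are all hypothetical. Only the `?image=true` switch and the `images/` folder come from this README.

```python
import time
from pathlib import Path
from typing import Optional


def image_url(view_url: str) -> str:
    """Append image=true so the dashboard hides its navigation."""
    separator = "&" if "?" in view_url else "?"
    return f"{view_url}{separator}image=true"


def output_path(name: str, weekly: bool, out_dir: str = "images",
                timestamp: Optional[int] = None) -> Path:
    """Build a file name such as images/overall_summary_daily_1700000000.png.

    The pattern for non-overall views is an assumption.
    """
    period = "weekly" if weekly else "daily"
    ts = int(time.time()) if timestamp is None else timestamp
    return Path(out_dir) / f"{name}_summary_{period}_{ts}.png"


def capture(view_url: str, out_file: Path) -> Path:
    """Render the view in headless Chromium and save a full-page PNG."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    out_file.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            # A device scale factor above 1 yields a sharper image (assumed value).
            page = browser.new_page(device_scale_factor=2)
            page.goto(image_url(view_url), wait_until="networkidle")
            page.screenshot(path=str(out_file), full_page=True)
        finally:
            browser.close()
    return out_file
```

Calling `capture("http://127.0.0.1:5000/", output_path("overall", weekly=False))` would then write a PNG of a locally running dashboard's front page, provided `playwright install` has been run.
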
Note: the script currently uses `https://rstat.net` as its base URL.

```bash
# Overall daily image
python export_image.py --overall

# Subreddit daily image
python export_image.py --subreddit wallstreetbets

# Weekly view
python export_image.py --subreddit wallstreetbets --weekly
```

Output files are saved into the `images/` folder, e.g. `overall_summary_daily_1700000000.png`.

Tip: If you want to export from a local dashboard instead of rstat.net, edit `base_url` in `export_image.py`.

## Post images to Reddit (post_to_reddit.py)

One-time OAuth2 step to obtain a refresh token:

1) In your Reddit app settings, set the redirect URI to exactly `http://localhost:5000` (matches the script).

2) Run:

```bash
python get_refresh_token.py
```

Follow the on-screen steps: open the generated URL, allow, copy the redirected URL, paste back. Add the printed token to `.env` as `REDDIT_REFRESH_TOKEN`.

Now you can post:

```bash
# Post the most recent overall image to r/rstat
python post_to_reddit.py

# Post the most recent daily image for a subreddit
python post_to_reddit.py --subreddit wallstreetbets

# Post weekly image for a subreddit
python post_to_reddit.py --subreddit wallstreetbets --weekly

# Choose a target subreddit and (optionally) a flair ID
python post_to_reddit.py --subreddit wallstreetbets --target-subreddit rstat --flair-id <flair-id>
```

Need a flair ID? Use the helper:

```bash
rstat-flairs wallstreetbets
```

## Cleanup utilities (rstat-cleanup)

Remove blacklisted “ticker” rows and/or purge data for subreddits no longer in your config.

```bash
# Show help
rstat-cleanup --help

# Remove tickers that are in the internal COMMON_WORDS_BLACKLIST
rstat-cleanup --tickers

# Remove any subreddit data not in subreddits.json
rstat-cleanup --subreddits

# Use a custom config file
rstat-cleanup --subreddits my_subs.json

# Run both tasks
rstat-cleanup --all
```

## Automation (cron)

An example `run_daily_job.sh` is provided.
Update `BASE_DIR` and make it executable:

```bash
chmod +x run_daily_job.sh
```

Add a cron entry (example 22:00 daily):

```
0 22 * * * /absolute/path/to/reddit_stock_analyzer/run_daily_job.sh >> /absolute/path/to/reddit_stock_analyzer/cron.log 2>&1
```

## Docker

Builds a Tailwind CSS layer, then a Python runtime with gunicorn. The compose files include optional nginx and varnish.

Quick start for the dashboard only (uses your host `reddit_stocks.db`):

```bash
docker compose up -d rstat-dashboard
```

Notes:

- The `rstat-dashboard` container mounts `./reddit_stocks.db` read-only. Populate it by running `rstat` on the host (or add a separate CLI container).
- Prod compose includes nginx (and optional certbot/varnish) configs under `config/`.

## Data model (SQLite)

- `tickers(id, symbol UNIQUE, market_cap, closing_price, last_updated)`
- `subreddits(id, name UNIQUE)`
- `mentions(id, ticker_id, subreddit_id, post_id, comment_id NULLABLE, mention_type, mention_sentiment, mention_timestamp, UNIQUE(ticker_id, post_id, comment_id))`
- `posts(id, post_id UNIQUE, title, post_url, subreddit_id, post_timestamp, comment_count, avg_comment_sentiment)`

Uniqueness prevents duplicates across post/comment granularity. Cleanup helpers remove blacklisted “tickers” and stale subreddits.

## UI and Tailwind

The CSS (`static/css/style.css`) is generated from `static/css/input.css` using Tailwind 4 during Docker build. If you want to tweak UI locally:

```bash
npm install
npx tailwindcss -i ./static/css/input.css -o ./static/css/style.css --minify
```

## Troubleshooting

- Missing VADER: Run `python rstat_tool/setup_nltk.py` once (in your venv).
- Playwright errors: Run `playwright install` once; ensure lib dependencies are present on your OS.
- yfinance returns None: Retry later; some tickers or regions can be spotty. The app tolerates missing financials.
- Flair required: If posting fails with flair errors, fetch a valid flair ID and pass `--flair-id`.
- Empty dashboards: Make sure `rstat` ran recently and `.env` is set; check `rstat.log`.
- DB locked: If you edit while the dashboard is reading, wait or stop the server; SQLite locks are short-lived.

## Safety and notes

- Do not commit `.env` or your database if it contains sensitive data.
- This project is for research/entertainment. Not investment advice.

---

Made with Python, Flask, NLTK, Playwright, and Tailwind.
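
As a closing illustration, the mention-detection rules from the CLI section (prefer $-prefixed “golden” tickers, otherwise fall back to filtered ALL-CAPS words) can be sketched in a few lines. This is a minimal sketch, not the code in `rstat_tool/ticker_extractor.py`: the function name `extract_tickers`, the regular expressions, the 1–5 letter limit for golden tickers, and the tiny `BLACKLIST` are assumptions standing in for the project's real implementation and its much larger blacklist.

```python
import re

# Hypothetical stand-in for the project's much larger blacklist.
BLACKLIST = {"CEO", "USA", "YOLO", "THE", "FOR", "IMO"}

# "$TSLA"-style mentions; the 1-5 letter limit is an assumption.
GOLDEN_RE = re.compile(r"\$([A-Z]{1,5})\b")
# Fallback: standalone ALL-CAPS words of 2-5 letters.
CAPS_RE = re.compile(r"\b[A-Z]{2,5}\b")


def extract_tickers(text: str) -> set[str]:
    """Return candidate tickers using golden-only mode when possible."""
    golden = set(GOLDEN_RE.findall(text))
    if golden:
        # Any $-prefixed ticker switches off the ALL-CAPS fallback entirely.
        return golden
    return {word for word in CAPS_RE.findall(text) if word not in BLACKLIST}


if __name__ == "__main__":
    print(extract_tickers("Buying $TSLA, forget AAPL"))
    print(extract_tickers("AAPL and NVDA look strong IMO"))
```

In the first call only `TSLA` is returned because the `$` prefix triggers golden-only mode; in the second, `AAPL` and `NVDA` survive the blacklist filter while `IMO` is dropped.
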