
rstat - Reddit Stock Analyzer

A powerful, installable command-line tool and web dashboard to scan Reddit for stock ticker mentions, perform sentiment analysis, generate insightful reports, and create shareable summary images.

Key Features

  • Dual-Interface: Use a flexible command-line tool (rstat) for data collection and a simple web dashboard (rstat-dashboard) for data visualization.
  • Flexible Data Scraping:
    • Scan subreddits from a config file or target a single subreddit on the fly.
    • Configure the time window to scan posts from the last 24 hours (for daily cron jobs) or back-fill data from several past days (e.g., last 7 days).
    • Fetches from /new to capture the most recent discussions.
  • Deep Analysis & Storage:
    • Scans both post titles and comments, differentiating between the two.
    • Performs a "deep dive" analysis on posts to calculate the average sentiment of the entire comment section.
    • Persists all data in a local SQLite database (reddit_stocks.db) to track trends over time.
  • Rich Data Enrichment:
    • Calculates sentiment (Bullish, Bearish, Neutral) for every mention using NLTK.
    • Fetches and stores daily closing prices and market capitalization from Yahoo Finance.
  • Interactive Web Dashboard:
    • View Top 10 tickers across all subreddits or on a per-subreddit basis.
    • Click any ticker to get a "Deep Dive" page, showing every post it was mentioned in.
  • Shareable Summary Images:
    • Generate clean, dark-mode summary images for both daily and weekly sentiment for any subreddit, perfect for sharing.
  • High-Quality Data:
    • Uses a configurable blacklist and smart filtering to reduce false positives.
    • Automatically cleans the database of invalid tickers if the blacklist is updated.
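
As a rough illustration of the blacklist filtering described above, here is a minimal sketch. The regular expression, the `COMMON_WORDS_BLACKLIST` set, and the `extract_tickers` function are hypothetical names chosen for this example; they are not taken from rstat's source, and the real filtering may work differently.

```python
import re

# Hypothetical blacklist: uppercase words that look like tickers but usually are not.
COMMON_WORDS_BLACKLIST = {"A", "I", "YOLO", "CEO", "USA", "DD", "IMO", "ATH"}

# Candidate tickers: an optional "$" prefix followed by 1-5 uppercase letters.
# The lookarounds stop the pattern from matching inside longer words.
TICKER_PATTERN = re.compile(r"(?<![A-Za-z$])\$?([A-Z]{1,5})(?![A-Za-z])")


def extract_tickers(text, blacklist=COMMON_WORDS_BLACKLIST):
    """Return the set of candidate ticker symbols in text, minus blacklisted words."""
    candidates = set(TICKER_PATTERN.findall(text))
    return candidates - set(blacklist)


if __name__ == "__main__":
    print(sorted(extract_tickers("YOLO on $TSLA and AAPL, says the CEO")))
```

A set-based blacklist keeps the lookup cheap and makes it easy to load the words from a config file instead of hard-coding them.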

Project Structure

reddit_stock_analyzer/
├── .env                  # Your secret API keys
├── requirements.txt      # Project dependencies
├── setup.py              # Installation script for the tool
├── subreddits.json       # Default list of subreddits to scan
├── templates/            # HTML templates for the web dashboard
│   ├── base.html
│   ├── index.html
│   ├── subreddit.html
│   ├── deep_dive.html
│   ├── image_view.html
│   └── weekly_image_view.html
└── rstat_tool/           # The main source code package
    ├── __init__.py
    ├── main.py           # Scraper entry point and CLI logic
    ├── dashboard.py      # Web dashboard entry point (Flask app)
    ├── database.py       # All SQLite database functions
    └── ...

Setup and Installation

Follow these steps to set up the project on your local machine.

1. Prerequisites

  • Python 3.7+
  • Git

2. Clone the Repository

git clone <your-repository-url>
cd reddit_stock_analyzer

3. Set Up a Python Virtual Environment

It is highly recommended to use a virtual environment to manage dependencies.

On macOS / Linux:

python3 -m venv .venv
source .venv/bin/activate

On Windows:

python -m venv .venv
.\.venv\Scripts\activate

4. Install Dependencies

pip install -r requirements.txt

5. Configure Reddit API Credentials

  1. Go to the Reddit Apps preferences page (https://www.reddit.com/prefs/apps) and create a new "script" app.

  2. Create a file named .env in the root of the project directory.

  3. Add your credentials to the .env file like this:

    REDDIT_CLIENT_ID=your_client_id_from_reddit
    REDDIT_CLIENT_SECRET=your_client_secret_from_reddit
    REDDIT_USER_AGENT=A custom user agent string (e.g., python:rstat:v1.2)
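
Before the first scan, it can help to confirm that all three settings are visible to the process. The sketch below assumes the values from .env have already been loaded into the process environment (how rstat loads the file is not described here); `REQUIRED_KEYS` and `missing_credentials` are hypothetical names used only for this example.

```python
import os

# The three settings named in the .env example above.
REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def missing_credentials(env=None):
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All Reddit API settings are present.")
```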
    

6. Set Up NLTK

Run the included setup script once to download the required vader_lexicon for sentiment analysis.

python rstat_tool/setup_nltk.py
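
For context, NLTK's VADER analyzer returns a "compound" score between -1 and 1 for a piece of text. The sketch below shows one way such a score could be mapped to the Bullish, Bearish, and Neutral labels and averaged over a comment section. The 0.05 threshold is the cutoff commonly used with VADER and is an assumption here, not rstat's confirmed setting; `label_sentiment` and `average_sentiment` are hypothetical helper names.

```python
# With NLTK installed and vader_lexicon downloaded, a compound score can be
# obtained like this (not executed in this sketch):
#   from nltk.sentiment.vader import SentimentIntensityAnalyzer
#   compound = SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


def label_sentiment(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label (assumed thresholds)."""
    if compound >= threshold:
        return "Bullish"
    if compound <= -threshold:
        return "Bearish"
    return "Neutral"


def average_sentiment(scores):
    """Return the mean compound score of a comment section, or 0.0 if it is empty."""
    scores = list(scores)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    comment_scores = [0.5, -0.1, 0.2]
    mean = average_sentiment(comment_scores)
    print(label_sentiment(mean))
```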

7. Build and Install the Commands

Install the tool in "editable" mode. This creates the rstat and rstat-dashboard commands in your virtual environment and links them to your source code.

pip install -e .

The installation is now complete.


Usage

The tool is split into two commands: one for gathering data and one for viewing it.

1. The Scraper (rstat)

This is the command-line tool you will use to populate the database. It can scan every subreddit listed in a config file or a single subreddit, over a look-back window you choose.

Common Commands:

  • Run a daily scan (for cron jobs): Scans subreddits from subreddits.json for posts in the last 24 hours.

    rstat --config subreddits.json --days 1
    
  • Scan a single subreddit: Ignores the config file and scans just one subreddit.

    rstat --subreddit wallstreetbets --days 1
    
  • Back-fill data for the last week: Scans a specific subreddit for all new posts in the last 7 days.

    rstat --subreddit Tollbugatabets --days 7
    
  • Get help and see all options:

    rstat --help
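
The options used in the commands above could be parsed, and the `--days` window turned into a timestamp cutoff, roughly as follows. This is a sketch under stated assumptions: only the flag names `--config`, `--subreddit`, and `--days` come from this README; the defaults, the help texts, and the helper names `build_parser`, `cutoff_timestamp`, and `is_within_window` are hypothetical.

```python
import argparse
import time

SECONDS_PER_DAY = 86400


def build_parser():
    """Build a parser for the three flags shown in the README (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="rstat", description="Scan Reddit for stock ticker mentions.")
    parser.add_argument("--config", default="subreddits.json", help="JSON file listing subreddits to scan")
    parser.add_argument("--subreddit", help="scan only this subreddit, ignoring the config file")
    parser.add_argument("--days", type=int, default=1, help="how many days back to scan")
    return parser


def cutoff_timestamp(days, now=None):
    """Return the oldest Unix timestamp that still falls inside the scan window."""
    now = time.time() if now is None else now
    return now - days * SECONDS_PER_DAY


def is_within_window(created_utc, days, now=None):
    """Return True if a post created at created_utc falls inside the last `days` days."""
    return created_utc >= cutoff_timestamp(days, now)


if __name__ == "__main__":
    args = build_parser().parse_args(["--subreddit", "wallstreetbets", "--days", "7"])
    print(args.subreddit, args.days, cutoff_timestamp(args.days))
```

Passing `now` explicitly keeps the window logic deterministic, which is convenient when testing a back-fill.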
    

2. The Web Dashboard (rstat-dashboard)

This command starts a local web server to let you explore the data you've collected.

How to Run:

  1. Make sure you have run the rstat scraper at least once to populate the database.
  2. Start the web server:
    rstat-dashboard
    
  3. Open your web browser and navigate to http://127.0.0.1:5000.

Dashboard Features:

  • Main Page: Shows the Top 10 most mentioned tickers across all scanned subreddits.
  • Subreddit Pages: Click any subreddit in the navigation bar to see a dashboard specific to that community.
  • Deep Dive: In any table, click on a ticker's symbol to see a detailed breakdown of every post it was mentioned in.
  • Shareable Images: On a subreddit's page, click "(View Daily Image)" or "(View Weekly Image)" to generate a polished, shareable summary card.
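
A Top 10 view like the one on the main page and the subreddit pages can be produced with a single grouped query. The sketch below uses an in-memory SQLite database with a hypothetical `mentions` table; the real schema of reddit_stocks.db is not shown in this README, so the table, the column names, and the `top_tickers` and `create_demo_db` helpers are assumptions made for illustration.

```python
import sqlite3


def create_demo_db():
    """Create an in-memory database with a hypothetical mentions table and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mentions (ticker TEXT, subreddit TEXT)")
    conn.executemany(
        "INSERT INTO mentions (ticker, subreddit) VALUES (?, ?)",
        [
            ("GME", "wallstreetbets"),
            ("GME", "wallstreetbets"),
            ("GME", "stocks"),
            ("TSLA", "wallstreetbets"),
            ("TSLA", "wallstreetbets"),
            ("AAPL", "stocks"),
        ],
    )
    return conn


def top_tickers(conn, limit=10, subreddit=None):
    """Return (ticker, mention_count) rows, most mentioned first, optionally per subreddit."""
    query = "SELECT ticker, COUNT(*) AS mention_count FROM mentions"
    params = []
    if subreddit is not None:
        query += " WHERE subreddit = ?"
        params.append(subreddit)
    # Ties are broken alphabetically so the ordering is stable.
    query += " GROUP BY ticker ORDER BY mention_count DESC, ticker ASC LIMIT ?"
    params.append(limit)
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    demo = create_demo_db()
    print(top_tickers(demo))
    print(top_tickers(demo, subreddit="stocks"))
```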