rstat - Reddit Stock Analyzer

A powerful, installable command-line tool and web dashboard to scan Reddit for stock ticker mentions, perform sentiment analysis, generate insightful reports, and create shareable summary images.

Key Features

  • Dual-Interface: Use a flexible command-line tool (rstat) for data collection and a simple web dashboard (rstat-dashboard) for data visualization.
  • Flexible Data Scraping:
    • Scan subreddits from a config file or target a single subreddit on the fly.
    • Configure the time window to scan posts from the last 24 hours (for daily cron jobs) or back-fill data from several past days (e.g., last 7 days).
    • Fetches from /new to capture the most recent discussions.
  • Deep Analysis & Storage:
    • Scans both post titles and comments, differentiating between the two.
    • Performs a "deep dive" analysis on posts to calculate the average sentiment of the entire comment section.
    • Persists all data in a local SQLite database (reddit_stocks.db) to track trends over time.
  • Rich Data Enrichment:
    • Calculates sentiment (Bullish, Bearish, Neutral) for every mention using NLTK.
    • Fetches and stores daily closing prices and market capitalization from Yahoo Finance.
  • Interactive Web Dashboard:
    • View Top 10 tickers across all subreddits or on a per-subreddit basis.
    • Click any ticker to get a "Deep Dive" page, showing every post it was mentioned in.
  • Shareable Summary Images:
    • Generate clean, dark-mode summary images for both daily and weekly sentiment for any subreddit, perfect for sharing.
  • High-Quality Data:
    • Uses a configurable blacklist and smart filtering to reduce false positives.
    • Automatically cleans the database of invalid tickers if the blacklist is updated.
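
As a rough illustration of the blacklist filtering described above, the following sketch extracts ticker-like tokens from text and drops blacklisted words. The regular expression, the `BLACKLIST` contents, and the function name `extract_tickers` are assumptions made for this example; the project's actual filtering may work differently.

```python
import re

# Hypothetical blacklist: common uppercase words that look like tickers.
BLACKLIST = {"A", "I", "CEO", "DD", "THE", "USA", "YOLO"}

# One to five uppercase letters, optionally prefixed with "$" (e.g. "$GME").
TICKER_PATTERN = re.compile(r"\$?\b([A-Z]{1,5})\b")


def extract_tickers(text, blacklist=BLACKLIST):
    """Return unique ticker candidates in order of first appearance."""
    candidates = (match.group(1) for match in TICKER_PATTERN.finditer(text))
    kept = [symbol for symbol in candidates if symbol not in blacklist]
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(kept))


if __name__ == "__main__":
    print(extract_tickers("I think $GME and TSLA will moon, YOLO"))
```

A pattern like this still matches any short uppercase word, which is why a blacklist (and possibly validation against real market data) is needed to reduce false positives.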

Project Structure

reddit_stock_analyzer/
├── .env                  # Your secret API keys
├── requirements.txt      # Project dependencies
├── setup.py              # Installation script for the tool
├── subreddits.json       # Default list of subreddits to scan
├── templates/            # HTML templates for the web dashboard
│   ├── base.html
│   ├── index.html
│   ├── subreddit.html
│   ├── deep_dive.html
│   ├── image_view.html
│   └── weekly_image_view.html
└── rstat_tool/           # The main source code package
    ├── __init__.py
    ├── main.py           # Scraper entry point and CLI logic
    ├── dashboard.py      # Web dashboard entry point (Flask app)
    ├── database.py       # All SQLite database functions
    └── ...

Setup and Installation

Follow these steps to set up the project on your local machine.

1. Prerequisites

  • Python 3.7+
  • Git

2. Clone the Repository

git clone <your-repository-url>
cd reddit_stock_analyzer

3. Set Up a Python Virtual Environment

It is highly recommended to use a virtual environment to manage dependencies.

On macOS / Linux:

python3 -m venv .venv
source .venv/bin/activate

On Windows:

python -m venv .venv
.\.venv\Scripts\activate

4. Install Dependencies

pip install -r requirements.txt

5. Configure Reddit API Credentials

  1. Go to the Reddit Apps preferences page and create a new "script" app.

  2. Create a file named .env in the root of the project directory.

  3. Add your credentials to the .env file like this:

    REDDIT_CLIENT_ID=your_client_id_from_reddit
    REDDIT_CLIENT_SECRET=your_client_secret_from_reddit
    REDDIT_USER_AGENT=A custom user agent string (e.g., python:rstat:v1.2)
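
Before running the scraper, it can help to confirm that the .env file contains all three values. The sketch below parses KEY=VALUE lines and reports any that are missing; the helper names `parse_env_lines` and `missing_keys` are hypothetical and not part of the project.

```python
from pathlib import Path

# The three settings listed in the .env example above.
REQUIRED_KEYS = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")


def parse_env_lines(lines):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    print("Missing settings:", missing_keys(parse_env_lines(lines)))
```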
    

6. Set Up NLTK

Run the included setup script once to download the required vader_lexicon for sentiment analysis.

python rstat_tool/setup_nltk.py
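
For context, here is a minimal sketch of how the VADER lexicon is typically used once it has been downloaded. The 0.05 threshold is a common convention for VADER compound scores and is an assumption here, as are the function names; the project may use different cut-offs for its Bullish, Bearish, and Neutral labels.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score (-1.0 to 1.0) to a label.

    The threshold is an assumed value, not taken from the project.
    """
    if compound >= threshold:
        return "Bullish"
    if compound <= -threshold:
        return "Bearish"
    return "Neutral"


def score_text(text):
    """Return the VADER compound score; needs nltk and the vader_lexicon."""
    # Imported lazily so label_sentiment works without nltk installed.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


if __name__ == "__main__":
    print(label_sentiment(score_text("GME is going to the moon, great earnings!")))
```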

7. Build and Install the Commands

Install the tool in "editable" mode. This creates the rstat and rstat-dashboard commands in your virtual environment and links them to your source code.

pip install -e .

The installation is now complete.


Usage

The tool is split into two commands: one for gathering data and one for viewing it.

1. The Scraper (rstat)

This is the command-line tool you will use to populate the database. It can scan every subreddit listed in a config file or a single subreddit, over a configurable number of days.

Common Commands:

  • Run a daily scan (for cron jobs): Scans subreddits from subreddits.json for posts in the last 24 hours.

    rstat --config subreddits.json --days 1
    
  • Scan a single subreddit: Ignores the config file and scans just one subreddit.

    rstat --subreddit wallstreetbets --days 1
    
  • Back-fill data for the last week: Scans a specific subreddit for all new posts from the last 7 days.

    rstat --subreddit Tollbugatabets --days 7
    
  • Get help and see all options:

    rstat --help
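
The flags shown above could be parsed, and the --days value turned into a cut-off timestamp, roughly as follows. This is a sketch only: the default values and the helper names `build_parser`, `cutoff_utc`, and `in_window` are assumptions, and `rstat --help` remains the authoritative list of options.

```python
import argparse
from datetime import datetime, timedelta, timezone


def build_parser():
    """Mirror the three flags used in the examples (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="rstat")
    parser.add_argument("--config", default="subreddits.json",
                        help="JSON file listing subreddits to scan")
    parser.add_argument("--subreddit", help="scan only this subreddit")
    parser.add_argument("--days", type=int, default=1,
                        help="how many days back to scan")
    return parser


def cutoff_utc(days, now=None):
    """Return the Unix timestamp marking the start of the scan window."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days)).timestamp()


def in_window(created_utc, days, now=None):
    """True if a post's creation timestamp falls inside the window."""
    return created_utc >= cutoff_utc(days, now)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("Scanning posts newer than Unix time", cutoff_utc(args.days))
```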
    

2. The Web Dashboard (rstat-dashboard)

This command starts a local web server to let you explore the data you've collected.

How to Run:

  1. Make sure you have run the rstat scraper at least once to populate the database.
  2. Start the web server:
    rstat-dashboard
    
  3. Open your web browser and navigate to http://127.0.0.1:5000.

Dashboard Features:

  • Main Page: Shows the Top 10 most mentioned tickers across all scanned subreddits.
  • Subreddit Pages: Click any subreddit in the navigation bar to see a dashboard specific to that community.
  • Deep Dive: In any table, click on a ticker's symbol to see a detailed breakdown of every post it was mentioned in.
  • Shareable Images: On a subreddit's page, click "(View Daily Image)" or "(View Weekly Image)" to generate a polished, shareable summary card.
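
The Top 10 views boil down to a grouped count over stored mentions. The sketch below shows one way to express that with SQLite; the `mentions` table and its columns are a hypothetical schema invented for this example and may not match reddit_stocks.db.

```python
import sqlite3


def top_tickers(conn, subreddit=None, limit=10):
    """Return (ticker, mention_count) rows, most mentioned first."""
    query = "SELECT ticker, COUNT(*) AS mention_count FROM mentions"
    params = []
    if subreddit is not None:
        query += " WHERE subreddit = ?"
        params.append(subreddit)
    # Ties are broken alphabetically so the ordering is deterministic.
    query += " GROUP BY ticker ORDER BY mention_count DESC, ticker ASC LIMIT ?"
    params.append(limit)
    return conn.execute(query, params).fetchall()


def demo_connection():
    """Build an in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mentions (ticker TEXT, subreddit TEXT)")
    rows = [
        ("GME", "wallstreetbets"), ("GME", "wallstreetbets"), ("GME", "stocks"),
        ("TSLA", "wallstreetbets"), ("TSLA", "stocks"), ("AAPL", "stocks"),
    ]
    conn.executemany("INSERT INTO mentions VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(top_tickers(conn))
    print(top_tickers(conn, subreddit="stocks"))
```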

3. Exporting Shareable Images (.png)

In addition to viewing the dashboards in a browser, the project includes a script, export_image.py, that saves the image views as static .png files. This is useful for automation, scheduled tasks (cron jobs), or sharing the results on social platforms, for example in a subreddit such as r/rstat.

One-Time Setup

The image exporter uses the Playwright library to control a headless browser. Before using it for the first time, you must install the necessary browser runtimes with this command:

playwright install

Usage Workflow

The exporter works by taking a high-quality screenshot of the live web page. Therefore, the process requires two steps running in two separate terminals.

Step 1: Start the Web Dashboard

The web server must be running for the exporter to have a page to screenshot. Open a terminal and run:

rstat-dashboard

Leave this terminal running.

Step 2: Run the Export Script

Open a second terminal in the same project directory. You can now run the export_image.py script with the desired arguments.

Examples:

  • To export the daily summary image for r/wallstreetbets:

    python export_image.py wallstreetbets
    
  • To export the weekly summary image for r/wallstreetbets:

    python export_image.py wallstreetbets --weekly
    
  • To export the overall summary image (across all subreddits):

    python export_image.py --overall
    

Output

After running a command, a new .png file (e.g., wallstreetbets_daily_1690000000.png) will be saved in the images directory in the project root.
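
To make the workflow concrete, here is a minimal sketch of a screenshot exporter built on Playwright's synchronous API. It is not the project's export_image.py: the function names are hypothetical, the file name pattern is inferred from the example above, and the page URL must be supplied by the caller because the dashboard's image routes are not documented here.

```python
import time
from pathlib import Path


def build_image_path(name, period, images_dir="images", timestamp=None):
    """Build a path like images/wallstreetbets_daily_1690000000.png."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return Path(images_dir) / f"{name}_{period}_{ts}.png"


def capture(url, output_path):
    """Screenshot a page served by the running dashboard."""
    # Imported lazily; requires `pip install playwright` and `playwright install`.
    from playwright.sync_api import sync_playwright

    output_path.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=str(output_path), full_page=True)
        browser.close()


if __name__ == "__main__":
    print(build_image_path("wallstreetbets", "daily"))
```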

4. Full Automation: Posting to Reddit via Cron Job

The final piece of the project is a script that automates the entire process: scraping data, generating an image, and posting it to a target subreddit like r/rstat. This is designed to be run via a scheduled task or cron job.

Prerequisites for Posting

The posting script needs to log in to your Reddit account. You must add your Reddit username and password to your .env file.

Add these two lines to your .env file:

REDDIT_USERNAME=YourRedditUsername
REDDIT_PASSWORD=YourRedditPassword

(For security, it's recommended to use a dedicated bot account for this, not your personal account.)

The post_to_reddit.py Script

This is a standalone script located in the project's root directory that finds the most recently generated image and posts it to Reddit.
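
The two steps the script performs, finding the newest image and submitting it, might look roughly like this. It is a sketch under stated assumptions: picking the file by modification time, the use of PRAW, and the function names `latest_image` and `post_image` are all illustrative and not confirmed by the project.

```python
from pathlib import Path


def latest_image(images_dir, prefix):
    """Return the most recently modified PNG matching the prefix, or None."""
    candidates = list(Path(images_dir).glob(f"{prefix}_*.png"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)


def post_image(image_path, target_subreddit, title, credentials):
    """Submit an image post with PRAW (assumed here, not confirmed).

    `credentials` is a dict with client_id, client_secret, user_agent,
    username, and password, e.g. read from the .env file.
    """
    import praw  # imported lazily so latest_image works without PRAW

    reddit = praw.Reddit(**credentials)
    return reddit.subreddit(target_subreddit).submit_image(
        title=title, image_path=str(image_path)
    )


if __name__ == "__main__":
    print(latest_image("images", "wallstreetbets_daily"))
```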

Manual Usage:

You can run this script manually from your terminal. This is great for testing or one-off posts.

  • Post the latest OVERALL summary image to r/rstat:

    python post_to_reddit.py
    
  • Post the latest DAILY image for a specific subreddit:

    python post_to_reddit.py --subreddit wallstreetbets
    
  • Post the latest WEEKLY image for a specific subreddit:

    python post_to_reddit.py --subreddit wallstreetbets --weekly
    
  • Post to a different target subreddit (e.g., a test subreddit):

    python post_to_reddit.py --target-subreddit MyTestSub
    

Setting Up the Cron Job for Full Automation

To run the entire pipeline automatically every day, you can use a simple shell script controlled by cron.

Step 1: Create a Job Script

Create a file named run_daily_job.sh in the root of your project directory. This script will run all the necessary commands in the correct order.

run_daily_job.sh:

#!/bin/bash

# CRITICAL: Navigate to the project directory using an absolute path.
# Replace '/path/to/your/project/reddit_stock_analyzer' with your actual path.
cd /path/to/your/project/reddit_stock_analyzer || exit 1

# CRITICAL: Activate the virtual environment using an absolute path.
source /path/to/your/project/reddit_stock_analyzer/.venv/bin/activate

echo "--- Starting RSTAT Daily Job on $(date) ---"

# 1. Scrape data from the last 24 hours for all subreddits in the config.
echo "Step 1: Scraping new data..."
rstat --config subreddits.json --days 1

# 2. Start the dashboard in the background so the exporter can access it.
echo "Step 2: Starting dashboard in background..."
rstat-dashboard &
DASHBOARD_PID=$!

# Give the server a moment to start up.
sleep 10

# 3. Export the overall summary image.
echo "Step 3: Exporting overall summary image..."
python export_image.py --overall

# 4. Post the newly created overall summary image to r/rstat.
echo "Step 4: Posting image to Reddit..."
python post_to_reddit.py --target-subreddit rstat

# 5. Clean up by stopping the background dashboard server.
echo "Step 5: Stopping dashboard server..."
kill "$DASHBOARD_PID"

echo "--- RSTAT Daily Job Complete ---"

Before proceeding, you must edit the two absolute paths at the top of this script to match your system.

Step 2: Make the Script Executable

In your terminal, run the following command:

chmod +x run_daily_job.sh

Step 3: Schedule the Cron Job

  1. Open your crontab editor by running crontab -e.

  2. Add a new line to the file to schedule the job. For example, to run the script every day at 10:00 PM, add the following line:

    0 22 * * * /path/to/your/project/reddit_stock_analyzer/run_daily_job.sh >> /path/to/your/project/reddit_stock_analyzer/cron.log 2>&1
    
    • 0 22 * * * means at minute 0 of hour 22, every day, every month, every day of the week.
    • >> /path/to/your/.../cron.log 2>&1 is highly recommended. It redirects all output (both standard and error) from the script into a log file, so you can check if the job ran successfully.

Your project is now fully automated to scrape, analyze, visualize, and post data every day.
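
The job script waits a fixed 10 seconds for the dashboard to start. As an optional refinement, a small helper could poll the port until the server accepts connections. The sketch below assumes the dashboard listens on 127.0.0.1:5000, as in the usage section; the helper name `wait_for_port` is hypothetical.

```python
import socket
import sys
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Server not up yet; wait briefly and retry.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # Exit code 0 if the dashboard is reachable, 1 otherwise.
    sys.exit(0 if wait_for_port("127.0.0.1", 5000) else 1)
```

Saved under a name of your choice, such a helper could replace the sleep line in the job script, with the script exiting early if the helper returns a non-zero status.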
