Compare commits

...

2 Commits

11 changed files with 471 additions and 152 deletions

.gitignore (vendored): 1 addition

@@ -6,3 +6,4 @@ __pycache__/
*.db
*.log
reddit_stock_analyzer.egg-info/
images/

README.md: 164 additions

@@ -156,3 +156,167 @@ This command starts a local web server to let you explore the data you've collec
* **Subreddit Pages:** Click any subreddit in the navigation bar to see a dashboard specific to that community.
* **Deep Dive:** In any table, click on a ticker's symbol to see a detailed breakdown of every post it was mentioned in.
* **Shareable Images:** On a subreddit's page, click "(View Daily Image)" or "(View Weekly Image)" to generate a polished, shareable summary card.
## 3. Exporting Shareable Images (`.png`)
In addition to viewing the dashboards in a browser, the project includes a script that programmatically saves the image views as static `.png` files. This is ideal for automation, scheduled tasks (cron jobs), or sharing the results on social media, for example on your `r/rstat` subreddit.
### One-Time Setup
The image exporter uses the Playwright library to control a headless browser. Before using it for the first time, you must install the necessary browser runtimes with this command:
```bash
playwright install
```
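The export script only launches Chromium, so if you want to keep the download small you can install just that runtime instead:
```bash
playwright install chromium
```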
### Usage Workflow
The exporter works by taking a high-quality screenshot of the live web page, so the process involves two steps in two separate terminals.
**Step 1: Start the Web Dashboard**
The web server must be running for the exporter to have a page to screenshot. Open a terminal and run:
```bash
rstat-dashboard
```
Leave this terminal running.
**Step 2: Run the Export Script**
Open a **second terminal** in the same project directory. You can now run the `export_image.py` script with the desired arguments.
**Examples:**
* To export the **daily** summary image for `r/wallstreetbets`:
```bash
python export_image.py --subreddit wallstreetbets
```
* To export the **weekly** summary image for `r/wallstreetbets`:
```bash
python export_image.py --subreddit wallstreetbets --weekly
```
* To export the **overall** summary image (across all subreddits):
```bash
python export_image.py --overall
```
### Output
After running a command, a new `.png` file (e.g., `wallstreetbets_daily_1690000000.png`) will be saved in the `images/` directory in the project root.
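To confirm an export worked, list the newest files in that directory:
```bash
ls -t images/ | head -n 3
```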
## 4. Full Automation: Posting to Reddit via Cron Job
The final piece of the project is a script that automates the entire process: scraping data, generating an image, and posting it to a target subreddit like `r/rstat`. This is designed to be run via a scheduled task or cron job.
### Prerequisites for Posting
The posting script needs to log in to your Reddit account. You must add your Reddit username and password to your `.env` file.
**Add these two lines to your `.env` file:**
```
REDDIT_USERNAME=YourRedditUsername
REDDIT_PASSWORD=YourRedditPassword
```
*(For security, it's recommended to use a dedicated bot account for this, not your personal account.)*
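Before wiring this into automation, it can help to verify that the credentials actually log in. A minimal throwaway check (a hypothetical helper, not part of the project; it assumes the same `.env` keys listed above):
```python
# check_auth.py (hypothetical): attempt a script-app login with the .env credentials
import os
import praw
from dotenv import load_dotenv

load_dotenv()
reddit = praw.Reddit(
    client_id=os.getenv("REDDIT_CLIENT_ID"),
    client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
    user_agent=os.getenv("REDDIT_USER_AGENT"),
    username=os.getenv("REDDIT_USERNAME"),
    password=os.getenv("REDDIT_PASSWORD"),
)
print(f"Authenticated as: {reddit.user.me()}")  # raises an exception if login failed
```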
### The `post_to_reddit.py` Script
This is a standalone script located in the project's root directory that finds the most recently generated image and posts it to Reddit.
**Manual Usage:**
You can run this script manually from your terminal. This is great for testing or one-off posts.
* **Post the latest OVERALL summary image to `r/rstat`:**
```bash
python post_to_reddit.py
```
* **Post the latest DAILY image for a specific subreddit:**
```bash
python post_to_reddit.py --subreddit wallstreetbets
```
* **Post the latest WEEKLY image for a specific subreddit:**
```bash
python post_to_reddit.py --subreddit wallstreetbets --weekly
```
* **Post to a different target subreddit (e.g., a test subreddit):**
```bash
python post_to_reddit.py --target-subreddit MyTestSub
```
### Setting Up the Cron Job for Full Automation
To run the entire pipeline automatically every day, you can use a simple shell script controlled by `cron`.
**Step 1: Create a Job Script**
Create a file named `run_daily_job.sh` in the root of your project directory. This script will run all the necessary commands in the correct order.
**`run_daily_job.sh`:**
```bash
#!/bin/bash
# CRITICAL: Navigate to the project directory using an absolute path.
# Replace '/path/to/your/project/reddit_stock_analyzer' with your actual path.
cd /path/to/your/project/reddit_stock_analyzer
# CRITICAL: Activate the virtual environment using an absolute path.
source /path/to/your/project/reddit_stock_analyzer/.venv/bin/activate
echo "--- Starting RSTAT Daily Job on $(date) ---"
# 1. Scrape data from the last 24 hours for all subreddits in the config.
echo "Step 1: Scraping new data..."
rstat --config subreddits.json --days 1
# 2. Start the dashboard in the background so the exporter can access it.
echo "Step 2: Starting dashboard in background..."
rstat-dashboard &
DASHBOARD_PID=$!
# Give the server a moment to start up.
sleep 10
# 3. Export the overall summary image.
echo "Step 3: Exporting overall summary image..."
python export_image.py --overall
# 4. Post the newly created overall summary image to r/rstat.
echo "Step 4: Posting image to Reddit..."
python post_to_reddit.py --target-subreddit rstat
# 5. Clean up by stopping the background dashboard server.
echo "Step 5: Stopping dashboard server..."
kill $DASHBOARD_PID
echo "--- RSTAT Daily Job Complete ---"
```

**Before proceeding, you must edit the two absolute paths at the top of this script to match your system.**
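One fragile spot in this script: if the export or posting step fails, the final `kill` never runs and a stray `rstat-dashboard` process is left behind. A hedged sketch of a more defensive variant, using bash's `trap` to guarantee cleanup:
```bash
# Stop the background dashboard on ANY script exit, including failures.
rstat-dashboard &
DASHBOARD_PID=$!
trap 'kill $DASHBOARD_PID 2>/dev/null' EXIT

sleep 10
python export_image.py --overall
python post_to_reddit.py --target-subreddit rstat
# No explicit kill needed; the EXIT trap handles it.
```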
**Step 2: Make the Script Executable**
In your terminal, run the following command:
```bash
chmod +x run_daily_job.sh
```
**Step 3: Schedule the Cron Job**
1. Open your crontab editor by running `crontab -e`.
2. Add a new line to the file to schedule the job. For example, to run the script **every day at 10:00 PM**, add the following line:
```
0 22 * * * /path/to/your/project/reddit_stock_analyzer/run_daily_job.sh >> /path/to/your/project/reddit_stock_analyzer/cron.log 2>&1
```
* `0 22 * * *` means at minute 0 of hour 22, every day, every month, every day of the week.
* `>> /path/to/your/.../cron.log 2>&1` is highly recommended. It redirects all output (both standard and error) from the script into a log file, so you can check if the job ran successfully.
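Before trusting the schedule, run the job once by hand from the project root, mirroring the cron redirection, and inspect the log:
```bash
./run_daily_job.sh >> cron.log 2>&1
tail -n 20 cron.log
```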
Your project is now fully automated to scrape, analyze, visualize, and post data every day.

export_image.py (rewritten)

@@ -1,51 +1,73 @@
# export_image.py
import argparse
import os
import time
from playwright.sync_api import sync_playwright

# Define the output directory as a constant
OUTPUT_DIR = "images"

def export_image(url_path, filename_prefix):
    """
    Launches a headless browser, navigates to a URL path, and screenshots
    the .image-container element, saving it to the OUTPUT_DIR.
    """
    print(f"-> Preparing to export image for: {filename_prefix}")

    # 1. Ensure the output directory exists
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    # The URL our Flask app serves
    base_url = "http://127.0.0.1:5000"
    url = f"{base_url}/{url_path}"

    # 2. Construct the full output path including the new directory
    output_file = os.path.join(OUTPUT_DIR, f"{filename_prefix}_{int(time.time())}.png")

    with sync_playwright() as p:
        try:
            browser = p.chromium.launch()
            page = browser.new_page()
            page.set_viewport_size({"width": 1920, "height": 1080})

            print(f" Navigating to {url}...")
            page.goto(url, wait_until="networkidle")  # Wait for network to be idle

            # Target the specific element we want to screenshot
            element = page.locator(".image-container")
            print(f" Saving screenshot to {output_file}...")
            element.screenshot(path=output_file)
            browser.close()
            print(f"-> Export complete! Image saved to {output_file}")
        except Exception as e:
            print(f"\nAn error occurred during export: {e}")
            print("Please ensure the 'rstat-dashboard' server is running in another terminal.")

if __name__ == "__main__":
    # Use a mutually exclusive group to ensure only one mode is chosen
    parser = argparse.ArgumentParser(description="Export subreddit sentiment images.")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-s", "--subreddit", help="The name of the subreddit to export.")
    group.add_argument("-o", "--overall", action="store_true", help="Export the overall summary image.")
    parser.add_argument("-w", "--weekly", action="store_true", help="Export the weekly view instead of the daily view (only for --subreddit).")
    args = parser.parse_args()

    # Determine the correct URL path and filename based on arguments
    if args.subreddit:
        view_type = "weekly" if args.weekly else "daily"
        url_path_to_render = f"image/{view_type}/{args.subreddit}"
        filename_prefix_to_save = f"{args.subreddit}_{view_type}"
        export_image(url_path_to_render, filename_prefix_to_save)
    elif args.overall:
        if args.weekly:
            print("Warning: --weekly flag has no effect with --overall. Exporting overall summary.")
        url_path_to_render = "image/overall"
        filename_prefix_to_save = "overall_summary"
        export_image(url_path_to_render, filename_prefix_to_save)

post_to_reddit.py: new file (104 lines)

@@ -0,0 +1,104 @@
# post_to_reddit.py
import argparse
import os
import glob
from datetime import datetime, timezone

import praw
from dotenv import load_dotenv

# --- CONFIGURATION ---
IMAGE_DIR = "images"

def get_reddit_instance():
    """Initializes and returns a PRAW Reddit instance from .env credentials."""
    load_dotenv()
    client_id = os.getenv("REDDIT_CLIENT_ID")
    client_secret = os.getenv("REDDIT_CLIENT_SECRET")
    user_agent = os.getenv("REDDIT_USER_AGENT")
    username = os.getenv("REDDIT_USERNAME")  # <-- Add your Reddit username to .env
    password = os.getenv("REDDIT_PASSWORD")  # <-- Add your Reddit password to .env

    if not all([client_id, client_secret, user_agent, username, password]):
        print("Error: Reddit API credentials (including username/password) not found in .env file.")
        return None

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
        username=username,
        password=password
    )

def find_latest_image(pattern):
    """Finds the most recent file in the IMAGE_DIR that matches a given pattern."""
    try:
        search_path = os.path.join(IMAGE_DIR, pattern)
        list_of_files = glob.glob(search_path)
        if not list_of_files:
            return None
        # The latest file will be the one with the highest modification time
        latest_file = max(list_of_files, key=os.path.getmtime)
        return latest_file
    except Exception as e:
        print(f"Error finding image file: {e}")
        return None

def main():
    """Main function to find an image and post it to Reddit."""
    parser = argparse.ArgumentParser(description="Find the latest sentiment image and post it to a subreddit.")
    parser.add_argument("-s", "--subreddit", help="The source subreddit of the image to post. (Defaults to overall summary)")
    parser.add_argument("-w", "--weekly", action="store_true", help="Post the weekly summary instead of the daily one.")
    parser.add_argument("-t", "--target-subreddit", default="rstat", help="The subreddit to post the image to. (Default: rstat)")
    args = parser.parse_args()

    # --- 1. Determine filename pattern and post title ---
    current_date_str = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    if args.subreddit:
        view_type = "weekly" if args.weekly else "daily"
        filename_pattern = f"{args.subreddit.lower()}_{view_type}_*.png"
        post_title = f"{view_type.capitalize()} Ticker Sentiment for r/{args.subreddit} ({current_date_str})"
    else:
        # Default to the overall summary
        if args.weekly:
            print("Warning: --weekly flag has no effect for overall summary. Posting overall daily image.")
        filename_pattern = "overall_summary_*.png"
        post_title = f"Overall Top 10 Ticker Mentions Across Reddit ({current_date_str})"

    print(f"Searching for image pattern: {filename_pattern}")

    # --- 2. Find the latest image file ---
    image_to_post = find_latest_image(filename_pattern)
    if not image_to_post:
        print(f"Error: No image found matching the pattern '{filename_pattern}'. Please run the scraper and exporter first.")
        return

    print(f"Found image: {image_to_post}")

    # --- 3. Connect to Reddit and submit ---
    reddit = get_reddit_instance()
    if not reddit:
        return

    try:
        target_sub = reddit.subreddit(args.target_subreddit)
        print(f"Submitting '{post_title}' to r/{target_sub.display_name}...")
        submission = target_sub.submit_image(
            title=post_title,
            image_path=image_to_post,
            flair_id=None  # Optional: You can add a flair ID here if you want
        )
        print("\n--- Post Successful! ---")
        print(f"Post URL: {submission.shortlink}")
    except Exception as e:
        print(f"\nAn error occurred while posting to Reddit: {e}")

if __name__ == "__main__":
    main()

rstat_tool/cleanup.py

@@ -2,10 +2,13 @@
import argparse
from . import database
from .logger_setup import get_logger

# We can't reuse load_subreddits from main anymore if it's not in the same file
# So we will duplicate it here. It's small and keeps this script self-contained.
import json

log = get_logger()

def load_subreddits(filepath):
    """Loads a list of subreddits from a JSON file."""
    try:
@@ -13,7 +16,7 @@ def load_subreddits(filepath):
        data = json.load(f)
        return data.get("subreddits", [])
    except (FileNotFoundError, json.JSONDecodeError) as e:
        log.error(f"Error loading config file '{filepath}': {e}")
        return None

def run_cleanup():
@@ -52,17 +55,17 @@ def run_cleanup():
        run_any_task = True

        # If --all is used, default to 'subreddits.json' if --subreddits wasn't also specified
        config_file = args.subreddits or 'subreddits.json'
        log.info(f"\nCleaning subreddits based on active list in: {config_file}")
        active_subreddits = load_subreddits(config_file)
        if active_subreddits is not None:
            database.clean_stale_subreddits(active_subreddits)

    if not run_any_task:
        parser.print_help()
        log.error("\nError: Please provide at least one cleanup option (e.g., --tickers, --subreddits, --all).")
        return

    log.info("\nCleanup finished.")

if __name__ == "__main__":
    run_cleanup()

rstat_tool/dashboard.py

@@ -1,17 +1,19 @@
# rstat_tool/dashboard.py
from flask import Flask, render_template
from datetime import datetime, timedelta, timezone
from .logger_setup import get_logger
from .database import (
    get_overall_summary,
    get_subreddit_summary,
    get_all_scanned_subreddits,
    get_deep_dive_details,
    get_daily_summary_for_subreddit,
    get_weekly_summary_for_subreddit,
    get_overall_image_view_summary
)

log = get_logger()

app = Flask(__name__, template_folder='../templates')

@app.template_filter('format_mc')
@@ -53,13 +55,13 @@ def deep_dive(symbol):
    posts = get_deep_dive_details(symbol)
    return render_template("deep_dive.html", posts=posts, symbol=symbol)

@app.route("/image/daily/<name>")
def daily_image_view(name):
    """The handler for the image-style dashboard."""
    tickers = get_daily_summary_for_subreddit(name)
    current_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return render_template(
        "daily_image_view.html",
        tickers=tickers,
        subreddit_name=name,
        current_date=current_date
@@ -71,7 +73,7 @@ def weekly_image_view(name):
    tickers = get_weekly_summary_for_subreddit(name)

    # Create the date range string for the title
    end_date = datetime.now(timezone.utc)
    start_date = end_date - timedelta(days=7)
    date_range_str = f"{start_date.strftime('%b %d')} - {end_date.strftime('%b %d, %Y')}"
@@ -86,7 +88,7 @@ def weekly_image_view(name):
def overall_image_view():
    """The handler for the overall image-style dashboard."""
    tickers = get_overall_image_view_summary()
    current_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    return render_template(
        "overall_image_view.html",
        tickers=tickers,
@@ -95,9 +97,9 @@ def overall_image_view():
def start_dashboard():
    """The main function called by the 'rstat-dashboard' command."""
    log.info("Starting Flask server...")
    log.info("Open http://127.0.0.1:5000 in your browser.")
    log.info("Press CTRL+C to stop the server.")
    app.run(debug=True)

if __name__ == "__main__":
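While the dashboard is running, a quick smoke test confirms the renamed routes resolve (paths taken from the handlers above):
```bash
curl -s http://127.0.0.1:5000/image/daily/wallstreetbets | head -n 5
curl -s http://127.0.0.1:5000/image/weekly/wallstreetbets | head -n 5
curl -s http://127.0.0.1:5000/image/overall | head -n 5
```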

rstat_tool/database.py

@@ -3,9 +3,11 @@
import sqlite3
import time
from .ticker_extractor import COMMON_WORDS_BLACKLIST
from .logger_setup import get_logger
from datetime import datetime, timedelta, timezone

DB_FILE = "reddit_stocks.db"
log = get_logger()

def get_db_connection():
    """Establishes a connection to the SQLite database."""
@@ -71,14 +73,14 @@ def initialize_db():
    conn.commit()
    conn.close()
    log.info("Database initialized successfully.")

def clean_stale_tickers():
    """
    Removes tickers and their associated mentions from the database
    if the ticker symbol exists in the COMMON_WORDS_BLACKLIST.
    """
    log.info("\n--- Cleaning Stale Tickers from Database ---")
    conn = get_db_connection()
    cursor = conn.cursor()
@@ -89,27 +91,27 @@ def clean_stale_tickers():
    stale_tickers = cursor.fetchall()

    if not stale_tickers:
        log.info("No stale tickers to clean.")
        conn.close()
        return

    for ticker in stale_tickers:
        ticker_id = ticker['id']
        ticker_symbol = ticker['symbol']
        log.info(f"Removing stale ticker '{ticker_symbol}' (ID: {ticker_id})...")
        cursor.execute("DELETE FROM mentions WHERE ticker_id = ?", (ticker_id,))
        cursor.execute("DELETE FROM tickers WHERE id = ?", (ticker_id,))

    deleted_count = conn.total_changes
    conn.commit()
    conn.close()
    log.info(f"Cleanup complete. Removed {deleted_count} records.")

def clean_stale_subreddits(active_subreddits):
    """
    Removes all data associated with subreddits that are NOT in the active list.
    """
    log.info("\n--- Cleaning Stale Subreddits from Database ---")
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute("SELECT id, name FROM subreddits")
@@ -117,20 +119,20 @@ def clean_stale_subreddits(active_subreddits):
    stale_sub_ids = []
    for sub in db_subreddits:
        if sub['name'] not in active_subreddits:
            log.info(f"Found stale subreddit to remove: r/{sub['name']}")
            stale_sub_ids.append(sub['id'])

    if not stale_sub_ids:
        log.info("No stale subreddits to clean.")
        conn.close()
        return

    for sub_id in stale_sub_ids:
        log.info(f" -> Deleting associated data for subreddit ID: {sub_id}")
        cursor.execute("DELETE FROM mentions WHERE subreddit_id = ?", (sub_id,))
        cursor.execute("DELETE FROM posts WHERE subreddit_id = ?", (sub_id,))
        cursor.execute("DELETE FROM subreddits WHERE id = ?", (sub_id,))

    conn.commit()
    conn.close()
    log.info("Stale subreddit cleanup complete.")

def get_db_connection():
    conn = sqlite3.connect(DB_FILE)
@@ -184,7 +186,7 @@ def initialize_db():
    """)
    conn.commit()
    conn.close()
    log.info("Database initialized successfully.")

def add_mention(conn, ticker_id, subreddit_id, post_id, mention_type, timestamp, mention_sentiment, post_avg_sentiment=None):
    cursor = conn.cursor()
@@ -230,7 +232,7 @@ def get_ticker_info(conn, ticker_id):
def generate_summary_report(limit=20):
    """Queries the DB to generate a summary for the command-line tool."""
    log.info(f"\n--- Top {limit} Tickers by Mention Count ---")
    conn = get_db_connection()
    cursor = conn.cursor()
@@ -274,13 +276,6 @@ def generate_summary_report(limit=20):
    )
    conn.close()

def add_or_update_post_analysis(conn, post_data):
    """
    Inserts a new post analysis record or updates an existing one.
@@ -300,35 +295,15 @@ def add_or_update_post_analysis(conn, post_data):
    )
    conn.commit()

def get_overall_summary(limit=50):
    conn = get_db_connection()
    query = """
        SELECT t.symbol, t.market_cap, t.closing_price, COUNT(m.id) as mention_count,
            SUM(CASE WHEN m.mention_sentiment > 0.1 THEN 1 ELSE 0 END) as bullish_mentions,
            SUM(CASE WHEN m.mention_sentiment < -0.1 THEN 1 ELSE 0 END) as bearish_mentions,
            SUM(CASE WHEN m.mention_sentiment BETWEEN -0.1 AND 0.1 THEN 1 ELSE 0 END) as neutral_mentions
        FROM mentions m JOIN tickers t ON m.ticker_id = t.id
        GROUP BY t.symbol, t.market_cap, t.closing_price ORDER BY mention_count DESC LIMIT ?;
    """
    results = conn.execute(query, (limit,)).fetchall()
    conn.close()
@@ -337,86 +312,87 @@ def get_overall_summary(limit=50):
def get_subreddit_summary(subreddit_name, limit=50):
    conn = get_db_connection()
    query = """
        SELECT t.symbol, t.market_cap, t.closing_price, COUNT(m.id) as mention_count,
            SUM(CASE WHEN m.mention_sentiment > 0.1 THEN 1 ELSE 0 END) as bullish_mentions,
            SUM(CASE WHEN m.mention_sentiment < -0.1 THEN 1 ELSE 0 END) as bearish_mentions,
            SUM(CASE WHEN m.mention_sentiment BETWEEN -0.1 AND 0.1 THEN 1 ELSE 0 END) as neutral_mentions
        FROM mentions m JOIN tickers t ON m.ticker_id = t.id JOIN subreddits s ON m.subreddit_id = s.id
        WHERE LOWER(s.name) = LOWER(?) GROUP BY t.symbol, t.market_cap, t.closing_price ORDER BY mention_count DESC LIMIT ?;
    """
    results = conn.execute(query, (subreddit_name, limit)).fetchall()
    conn.close()
    return results

def get_daily_summary_for_subreddit(subreddit_name):
    """ Gets a summary for the DAILY image view (last 24 hours). """
    conn = get_db_connection()
    one_day_ago = datetime.now(timezone.utc) - timedelta(days=1)
    one_day_ago_timestamp = int(one_day_ago.timestamp())
    query = """
        SELECT t.symbol,
            COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) as post_mentions,
            COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) as comment_mentions,
            COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) as bullish_mentions,
            COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) as bearish_mentions
        FROM mentions m JOIN tickers t ON m.ticker_id = t.id JOIN subreddits s ON m.subreddit_id = s.id
        WHERE LOWER(s.name) = LOWER(?) AND m.mention_timestamp >= ?
        GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
    """
    results = conn.execute(query, (subreddit_name, one_day_ago_timestamp)).fetchall()
    conn.close()
    return results

def get_weekly_summary_for_subreddit(subreddit_name):
    """ Gets a summary for the WEEKLY image view (last 7 days). """
    conn = get_db_connection()
    seven_days_ago = datetime.now(timezone.utc) - timedelta(days=7)
    seven_days_ago_timestamp = int(seven_days_ago.timestamp())
    query = """
        SELECT t.symbol,
            COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) as post_mentions,
            COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) as comment_mentions,
            COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) as bullish_mentions,
            COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) as bearish_mentions
        FROM mentions m JOIN tickers t ON m.ticker_id = t.id JOIN subreddits s ON m.subreddit_id = s.id
        WHERE LOWER(s.name) = LOWER(?) AND m.mention_timestamp >= ?
        GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
    """
    results = conn.execute(query, (subreddit_name, seven_days_ago_timestamp)).fetchall()
    conn.close()
    return results

def get_overall_image_view_summary():
    """ Gets a summary of top tickers across ALL subreddits for the image view. """
    conn = get_db_connection()
    query = """
        SELECT t.symbol,
            COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) as post_mentions,
            COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) as comment_mentions,
            COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) as bullish_mentions,
            COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) as bearish_mentions
        FROM mentions m JOIN tickers t ON m.ticker_id = t.id
        GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
    """
    results = conn.execute(query).fetchall()
    conn.close()
    return results

def get_deep_dive_details(ticker_symbol):
    """ Gets all analyzed posts that mention a specific ticker. """
    conn = get_db_connection()
    query = """
        SELECT DISTINCT p.*, s.name as subreddit_name FROM posts p
        JOIN mentions m ON p.post_id = m.post_id JOIN tickers t ON m.ticker_id = t.id
        JOIN subreddits s ON p.subreddit_id = s.id
        WHERE LOWER(t.symbol) = LOWER(?) ORDER BY p.post_timestamp DESC;
    """
    results = conn.execute(query, (ticker_symbol,)).fetchall()
    conn.close()
    return results

def get_all_scanned_subreddits():
    """ Gets a unique list of all subreddits we have data for. """
    conn = get_db_connection()
    results = conn.execute("SELECT DISTINCT name FROM subreddits ORDER BY name ASC;").fetchall()
    conn.close()
    return [row['name'] for row in results]
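A hedged sketch of calling the new daily helper directly (it assumes the package is importable as `rstat_tool` and that `reddit_stocks.db` sits in the working directory):
```python
# Print last-24-hour mention totals for one subreddit
from rstat_tool import database

for row in database.get_daily_summary_for_subreddit("wallstreetbets"):
    total = row["post_mentions"] + row["comment_mentions"]
    print(row["symbol"], total, row["bullish_mentions"], row["bearish_mentions"])
```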

rstat_tool/logger_setup.py: new file (47 lines)

@@ -0,0 +1,47 @@
# rstat_tool/logger_setup.py
import logging
import sys

# Get the application's named logger
logger = logging.getLogger("rstat_app")
logger.setLevel(logging.INFO)  # Set the minimum level of messages to handle

# Prevent the logger from propagating messages to the parent (root) logger
logger.propagate = False

# Only add handlers if they haven't been added before.
# This prevents duplicate log messages if this module is loaded more than once.
if not logger.handlers:
    # --- Console Handler ---
    # Prints logs to standard output (your terminal)
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setLevel(logging.INFO)

    # A simple formatter for the console
    console_formatter = logging.Formatter('%(message)s')
    console_handler.setFormatter(console_formatter)
    logger.addHandler(console_handler)

    # --- File Handler ---
    # Writes logs to a file; 'a' stands for append mode
    file_handler = logging.FileHandler("rstat.log", mode='a')
    file_handler.setLevel(logging.INFO)

    # A more detailed formatter for the file, including timestamp and log level
    file_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s', datefmt='%Y-%m-%d %H:%M:%S')
    file_handler.setFormatter(file_formatter)
    logger.addHandler(file_handler)

# Route the yfinance library's logger (warnings and errors) through our
# handlers, so its messages reach our console and our log file too.
yfinance_logger = logging.getLogger("yfinance")
yfinance_logger.setLevel(logging.WARNING)
if not yfinance_logger.handlers:
    # Reuse whatever handlers are attached to our logger; this avoids
    # referencing console_handler/file_handler, which only exist on first setup.
    for handler in logger.handlers:
        yfinance_logger.addHandler(handler)

def get_logger():
    """A simple function to get our configured logger."""
    return logger
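A hedged usage sketch (the absolute import path is an assumption; the other modules in this changeset use the relative form `from .logger_setup import get_logger`):
```python
from rstat_tool.logger_setup import get_logger

log = get_logger()
log.info("Scan started")
# console output:  Scan started
# rstat.log entry: 2024-01-01 22:00:00 - INFO - Scan started  (illustrative timestamp)
```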

rstat_tool/main.py

@@ -12,17 +12,20 @@ from dotenv import load_dotenv
from . import database
from .ticker_extractor import extract_tickers
from .sentiment_analyzer import get_sentiment_score
from .logger_setup import get_logger

load_dotenv()

MARKET_CAP_REFRESH_INTERVAL = 86400
POST_AGE_LIMIT = 86400
log = get_logger()

def load_subreddits(filepath):
    try:
        with open(filepath, 'r') as f:
            return json.load(f).get("subreddits", [])
    except (FileNotFoundError, json.JSONDecodeError) as e:
        log.error(f"Error loading config file '{filepath}': {e}")
        return None

def get_financial_data(ticker_symbol):
@@ -52,7 +55,7 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
    post_age_limit = days_to_scan * 86400
    current_time = time.time()
    log.info(f"\nScanning {len(subreddits_list)} subreddit(s) for NEW posts in the last {days_to_scan} day(s)...")

    for subreddit_name in subreddits_list:
        try:
            # Always use the lowercase version of the name for consistency.
@@ -60,15 +63,13 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
            subreddit_id = database.get_or_create_entity(conn, 'subreddits', 'name', normalized_sub_name)
            subreddit = reddit.subreddit(normalized_sub_name)
            log.info(f"Scanning r/{normalized_sub_name}...")

            for submission in subreddit.new(limit=post_limit):
                if (current_time - submission.created_utc) > post_age_limit:
                    log.info(f" -> Reached posts older than the {days_to_scan}-day limit.")
                    break

                tickers_in_title = set(extract_tickers(submission.title))
                all_tickers_found_in_post = set(tickers_in_title)  # Start a set to track all tickers for financials
@@ -77,7 +78,7 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
                # --- CASE A: Tickers were found in the title ---
                if tickers_in_title:
                    log.info(f" -> Title Mention(s): {', '.join(tickers_in_title)}. Attributing all comments.")
                    post_sentiment = get_sentiment_score(submission.title)

                    # Add one 'post' mention for each title ticker
@@ -109,7 +110,7 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
                    ticker_id = database.get_or_create_entity(conn, 'tickers', 'symbol', ticker_symbol)
                    ticker_info = database.get_ticker_info(conn, ticker_id)
                    if not ticker_info['last_updated'] or (current_time - ticker_info['last_updated'] > MARKET_CAP_REFRESH_INTERVAL):
                        log.info(f" -> Fetching financial data for {ticker_symbol}...")
                        financials = get_financial_data(ticker_symbol)
                        database.update_ticker_financials(
                            conn, ticker_id,
@@ -129,10 +130,10 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
            database.add_or_update_post_analysis(conn, post_analysis_data)

        except Exception as e:
            log.error(f"Could not scan r/{subreddit_name}. Error: {e}")

    conn.close()
    log.info("\n--- Scan Complete ---")

def main():
@@ -147,19 +148,18 @@ def main():
    parser.add_argument("-l", "--limit", type=int, default=20, help="Number of tickers to show in the CLI report.\n(Default: 20)")
    args = parser.parse_args()

    if args.subreddit:
        # If --subreddit is used, create a list with just that one.
        subreddits_to_scan = [args.subreddit]
        log.info(f"Targeted Scan Mode: Focusing on r/{args.subreddit}")
    else:
        # Otherwise, load from the config file.
        log.info(f"Config Scan Mode: Loading subreddits from {args.config}")
        subreddits_to_scan = load_subreddits(args.config)

    if not subreddits_to_scan:
        log.error("Error: No subreddits to scan. Please check your config file or --subreddit argument.")
        return

    # --- Initialize and Run ---

templates/subreddit.html

@@ -5,7 +5,7 @@
{% block content %}
<h1>
    Top 10 Tickers in r/{{ subreddit_name }}
    <a href="/image/daily/{{ subreddit_name }}" target="_blank" style="font-size: 0.8rem; margin-left: 1rem; font-weight: normal;">(View Daily Image)</a>
    <!-- ADD THIS NEW LINK -->
    <a href="/image/weekly/{{ subreddit_name }}" target="_blank" style="font-size: 0.8rem; margin-left: 1rem; font-weight: normal;">(View Weekly Image)</a>
</h1>