Compare commits: afe3cecb4f...f6ea86fa91 (2 commits)

Commits: f6ea86fa91, d4ed76e153
1
.gitignore
vendored
@@ -6,3 +6,4 @@ __pycache__/
*.db
*.log
reddit_stock_analyzer.egg-info/
images/
164
README.md
@@ -156,3 +156,167 @@ This command starts a local web server to let you explore the data you've collec
* **Subreddit Pages:** Click any subreddit in the navigation bar to see a dashboard specific to that community.
* **Deep Dive:** In any table, click on a ticker's symbol to see a detailed breakdown of every post it was mentioned in.
* **Shareable Images:** On a subreddit's page, click "(View Daily Image)" or "(View Weekly Image)" to generate a polished, shareable summary card.

### 3. Exporting Shareable Images (`.png`)

In addition to viewing the dashboards in a browser, the project includes a script that saves the image views as static `.png` files. This is useful for automation, scheduled tasks (cron jobs), or sharing the results on a subreddit such as `r/rstat`.

#### One-Time Setup

The image exporter uses the Playwright library to control a headless browser. Before using it for the first time, install the required browser runtimes with this command:

```bash
playwright install
```

#### Usage Workflow

The exporter works by taking a high-quality screenshot of the live web page, so the process takes two steps that run in two separate terminals.

**Step 1: Start the Web Dashboard**

The web server must be running for the exporter to have a page to screenshot. Open a terminal and run:

```bash
rstat-dashboard
```
Leave this terminal running.

**Step 2: Run the Export Script**

Open a **second terminal** in the same project directory. You can now run the `export_image.py` script with the desired arguments.

**Examples:**

* To export the **daily** summary image for `r/wallstreetbets`:
  ```bash
  python export_image.py --subreddit wallstreetbets
  ```

* To export the **weekly** summary image for `r/wallstreetbets`:
  ```bash
  python export_image.py --subreddit wallstreetbets --weekly
  ```

* To export the **overall** summary image (across all subreddits):
  ```bash
  python export_image.py --overall
  ```

#### Output

After running a command, a new `.png` file (e.g., `wallstreetbets_daily_1690000000.png`) is saved in the `images/` directory at the root of the project.
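
The naming scheme can be sketched in a few lines of Python. This mirrors the `os.path.join(OUTPUT_DIR, f"{filename_prefix}_{int(time.time())}.png")` expression in `export_image.py`; the helper names `build_output_path` and `parse_output_name` are hypothetical and exist only for illustration.

```python
import os
import re
import time

OUTPUT_DIR = "images"  # same directory name the exporter uses


def build_output_path(filename_prefix, now=None):
    """Return images/<prefix>_<unix timestamp>.png (hypothetical helper)."""
    timestamp = int(time.time() if now is None else now)
    return os.path.join(OUTPUT_DIR, f"{filename_prefix}_{timestamp}.png")


def parse_output_name(path):
    """Split an exported file name back into (prefix, timestamp)."""
    match = re.fullmatch(r"(.+)_(\d+)\.png", os.path.basename(path))
    if match is None:
        raise ValueError(f"not an exporter file name: {path}")
    return match.group(1), int(match.group(2))
```

Because the suffix is a Unix timestamp, sorting files with the same prefix by that number also sorts them by creation time.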

## 4. Full Automation: Posting to Reddit via Cron Job

The final piece of the project is a script that automates the entire process: scraping data, generating an image, and posting it to a target subreddit like `r/rstat`. This is designed to be run via a scheduled task or cron job.

### Prerequisites for Posting

The posting script needs to log in to your Reddit account. You must add your Reddit username and password to your `.env` file.

**Add these two lines to your `.env` file:**
```
REDDIT_USERNAME=YourRedditUsername
REDDIT_PASSWORD=YourRedditPassword
```
*(For security, it's recommended to use a dedicated bot account for this, not your personal account.)*
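
A quick way to confirm the `.env` file is complete is to check each setting by name. The sketch below is an illustration, not part of the project: the variable names are the ones `post_to_reddit.py` reads in this change, while `missing_credentials` is a hypothetical helper. It assumes the `.env` values have already been loaded into the process environment (for example by `load_dotenv()`).

```python
import os

# Settings read by post_to_reddit.py; all five must be present to log in.
REQUIRED_KEYS = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USER_AGENT",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
)


def missing_credentials(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Printing the returned list tells you exactly which entry to add, which is easier to act on than a single "credentials not found" message.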

### The `post_to_reddit.py` Script

This is a standalone script located in the project's root directory that finds the most recently generated image and posts it to Reddit.

**Manual Usage:**

You can run this script manually from your terminal. This is useful for testing or one-off posts.

* **Post the latest OVERALL summary image to `r/rstat`:**
  ```bash
  python post_to_reddit.py
  ```

* **Post the latest DAILY image for a specific subreddit:**
  ```bash
  python post_to_reddit.py --subreddit wallstreetbets
  ```

* **Post the latest WEEKLY image for a specific subreddit:**
  ```bash
  python post_to_reddit.py --subreddit wallstreetbets --weekly
  ```

* **Post to a different target subreddit (e.g., a test subreddit):**
  ```bash
  python post_to_reddit.py --target-subreddit MyTestSub
  ```

### Setting Up the Cron Job for Full Automation

To run the entire pipeline automatically every day, you can use a simple shell script controlled by `cron`.

**Step 1: Create a Job Script**

Create a file named `run_daily_job.sh` in the root of your project directory. This script will run all the necessary commands in the correct order.

**`run_daily_job.sh`:**
```bash
#!/bin/bash

# CRITICAL: Navigate to the project directory using an absolute path.
# Replace '/path/to/your/project/reddit_stock_analyzer' with your actual path.
# Stop immediately if the directory does not exist.
cd /path/to/your/project/reddit_stock_analyzer || exit 1

# CRITICAL: Activate the virtual environment using an absolute path.
source /path/to/your/project/reddit_stock_analyzer/.venv/bin/activate

echo "--- Starting RSTAT Daily Job on $(date) ---"

# 1. Scrape data from the last 24 hours for all subreddits in the config.
echo "Step 1: Scraping new data..."
rstat --config subreddits.json --days 1

# 2. Start the dashboard in the background so the exporter can access it.
echo "Step 2: Starting dashboard in background..."
rstat-dashboard &
DASHBOARD_PID=$!

# Give the server a moment to start up.
sleep 10

# 3. Export the overall summary image.
echo "Step 3: Exporting overall summary image..."
python export_image.py --overall

# 4. Post the newly created overall summary image to r/rstat.
echo "Step 4: Posting image to Reddit..."
python post_to_reddit.py --target-subreddit rstat

# 5. Clean up by stopping the background dashboard server.
echo "Step 5: Stopping dashboard server..."
kill "$DASHBOARD_PID"

echo "--- RSTAT Daily Job Complete ---"
```

**Before proceeding, you must edit the two absolute paths at the top of this script to match your system.**
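
The fixed `sleep 10` in the job script is a guess at how long the dashboard needs to start. A more robust option is to poll the server until it answers. The sketch below is an assumption-laden illustration, not part of the project: `wait_for_server` is a hypothetical helper, and the default URL is the address the dashboard logs at startup.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://127.0.0.1:5000", timeout=30.0, interval=0.5):
    """Poll `url` until it answers or `timeout` seconds pass.

    Returns True as soon as the server responds (any HTTP status counts),
    False if nothing answered in time.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            return True  # server is up, even though this path returned an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # not listening yet; retry
    return False
```

A job could call this before running the exporter and abort with a clear message when it returns `False`, instead of letting the screenshot step fail.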

**Step 2: Make the Script Executable**

In your terminal, run the following command:
```bash
chmod +x run_daily_job.sh
```

**Step 3: Schedule the Cron Job**

1. Open your crontab editor by running `crontab -e`.
2. Add a new line to the file to schedule the job. For example, to run the script **every day at 10:00 PM**, add the following line:

    ```
    0 22 * * * /path/to/your/project/reddit_stock_analyzer/run_daily_job.sh >> /path/to/your/project/reddit_stock_analyzer/cron.log 2>&1
    ```
    * `0 22 * * *` means minute 0 of hour 22, on every day of the month, in every month, on every day of the week.
    * `>> /path/to/your/.../cron.log 2>&1` is highly recommended. It appends all output (both standard output and standard error) from the script to a log file, so you can check whether the job ran successfully.
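
To make the field order concrete, here is a small Python sketch that labels the five schedule fields of a crontab line. It is only an illustration: `describe_cron` is a hypothetical helper, and it handles the plain five-field form, not shortcuts such as `@daily`.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Label the five time fields of a standard crontab schedule."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))
```

Calling `describe_cron("0 22 * * *")` pairs `minute` with `0` and `hour` with `22`, and leaves the remaining three fields as the `*` wildcard.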

Your project is now fully automated to scrape, analyze, visualize, and post data every day.
@@ -1,36 +1,38 @@
|
||||
# export_image.py
|
||||
|
||||
import argparse
|
||||
from playwright.sync_api import sync_playwright
|
||||
import os
|
||||
import time
|
||||
from playwright.sync_api import sync_playwright
|
||||
|
||||
def export_subreddit_image(subreddit_name, weekly=False):
|
||||
"""
|
||||
Launches a headless browser to take a screenshot of a subreddit's image view.
|
||||
"""
|
||||
view_type = "weekly" if weekly else "daily"
|
||||
print(f"Exporting {view_type} image for r/{subreddit_name}...")
|
||||
# Define the output directory as a constant
|
||||
OUTPUT_DIR = "images"
|
||||
|
||||
def export_image(url_path, filename_prefix):
|
||||
"""
|
||||
Launches a headless browser, navigates to a URL path, and screenshots
|
||||
the .image-container element, saving it to the OUTPUT_DIR.
|
||||
"""
|
||||
print(f"-> Preparing to export image for: {filename_prefix}")
|
||||
|
||||
# 1. Ensure the output directory exists
|
||||
os.makedirs(OUTPUT_DIR, exist_ok=True)
|
||||
|
||||
# The URL our Flask app serves
|
||||
base_url = "http://127.0.0.1:5000"
|
||||
path = f"image/weekly/{subreddit_name}" if weekly else f"image/{subreddit_name}"
|
||||
url = f"{base_url}/{path}"
|
||||
url = f"{base_url}/{url_path}"
|
||||
|
||||
# Define the output filename
|
||||
output_file = f"{subreddit_name}_{'weekly' if weekly else 'daily'}_{int(time.time())}.png"
|
||||
# 2. Construct the full output path including the new directory
|
||||
output_file = os.path.join(OUTPUT_DIR, f"{filename_prefix}_{int(time.time())}.png")
|
||||
|
||||
with sync_playwright() as p:
|
||||
try:
|
||||
browser = p.chromium.launch()
|
||||
page = browser.new_page()
|
||||
|
||||
# Set a large viewport for high-quality screenshots
|
||||
page.set_viewport_size({"width": 1920, "height": 1080})
|
||||
|
||||
print(f" Navigating to {url}...")
|
||||
page.goto(url)
|
||||
|
||||
# Important: Give the page a second to ensure all styles and fonts have loaded
|
||||
page.wait_for_timeout(1000)
|
||||
page.goto(url, wait_until="networkidle") # Wait for network to be idle
|
||||
|
||||
# Target the specific element we want to screenshot
|
||||
element = page.locator(".image-container")
|
||||
@@ -39,13 +41,33 @@ def export_subreddit_image(subreddit_name, weekly=False):
|
||||
element.screenshot(path=output_file)
|
||||
|
||||
browser.close()
|
||||
print("Export complete!")
|
||||
print(f"-> Export complete! Image saved to {output_file}")
|
||||
|
||||
except Exception as e:
|
||||
print(f"\nAn error occurred during export: {e}")
|
||||
print("Please ensure the 'rstat-dashboard' server is running in another terminal.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
# Use a mutually exclusive group to ensure only one mode is chosen
|
||||
parser = argparse.ArgumentParser(description="Export subreddit sentiment images.")
|
||||
parser.add_argument("subreddit", help="The name of the subreddit to export.")
|
||||
parser.add_argument("--weekly", action="store_true", help="Export the weekly view instead of the daily view.")
|
||||
group = parser.add_mutually_exclusive_group(required=True)
|
||||
group.add_argument("-s", "--subreddit", help="The name of the subreddit to export.")
|
||||
group.add_argument("-o", "--overall", action="store_true", help="Export the overall summary image.")
|
||||
|
||||
parser.add_argument("-w", "--weekly", action="store_true", help="Export the weekly view instead of the daily view (only for --subreddit).")
|
||||
args = parser.parse_args()
|
||||
|
||||
# NOTE: This script assumes your 'rstat-dashboard' server is already running in another terminal.
|
||||
export_subreddit_image(args.subreddit, args.weekly)
|
||||
# Determine the correct URL path and filename based on arguments
|
||||
if args.subreddit:
|
||||
view_type = "weekly" if args.weekly else "daily"
|
||||
url_path_to_render = f"image/{view_type}/{args.subreddit}"
|
||||
filename_prefix_to_save = f"{args.subreddit}_{view_type}"
|
||||
export_image(url_path_to_render, filename_prefix_to_save)
|
||||
|
||||
elif args.overall:
|
||||
if args.weekly:
|
||||
print("Warning: --weekly flag has no effect with --overall. Exporting overall summary.")
|
||||
url_path_to_render = "image/overall"
|
||||
filename_prefix_to_save = "overall_summary"
|
||||
export_image(url_path_to_render, filename_prefix_to_save)
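
The argument handling in the new `__main__` block can be exercised without launching a browser. The sketch below rebuilds the same mutually exclusive group and returns the URL path and filename prefix that would be passed to `export_image`. The names `build_parser` and `resolve_target` are hypothetical; the flags and path strings are taken from the diff above.

```python
import argparse


def build_parser():
    """Recreate the exporter's CLI: exactly one of --subreddit / --overall."""
    parser = argparse.ArgumentParser(description="Export subreddit sentiment images.")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-s", "--subreddit")
    group.add_argument("-o", "--overall", action="store_true")
    parser.add_argument("-w", "--weekly", action="store_true")
    return parser


def resolve_target(argv):
    """Map CLI arguments to (url_path, filename_prefix)."""
    args = build_parser().parse_args(argv)
    if args.subreddit:
        view_type = "weekly" if args.weekly else "daily"
        return f"image/{view_type}/{args.subreddit}", f"{args.subreddit}_{view_type}"
    # --overall was chosen; --weekly has no effect in this mode
    return "image/overall", "overall_summary"
```

Note that with this parser a bare positional name (for example `export_image.py wallstreetbets`) is rejected; the subreddit must be given with `-s` or `--subreddit`.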
|
104
post_to_reddit.py
Normal file
@@ -0,0 +1,104 @@
# post_to_reddit.py

import argparse
import os
import glob
from datetime import datetime, timezone
import praw
from dotenv import load_dotenv

# --- CONFIGURATION ---
IMAGE_DIR = "images"

def get_reddit_instance():
    """Initializes and returns a PRAW Reddit instance from .env credentials."""
    load_dotenv()
    client_id = os.getenv("REDDIT_CLIENT_ID")
    client_secret = os.getenv("REDDIT_CLIENT_SECRET")
    user_agent = os.getenv("REDDIT_USER_AGENT")
    username = os.getenv("REDDIT_USERNAME") # <-- Add your Reddit username to .env
    password = os.getenv("REDDIT_PASSWORD") # <-- Add your Reddit password to .env

    if not all([client_id, client_secret, user_agent, username, password]):
        print("Error: Reddit API credentials (including username/password) not found in .env file.")
        return None

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
        username=username,
        password=password
    )

def find_latest_image(pattern):
    """Finds the most recent file in the IMAGE_DIR that matches a given pattern."""
    try:
        search_path = os.path.join(IMAGE_DIR, pattern)
        list_of_files = glob.glob(search_path)
        if not list_of_files:
            return None
        # The latest file will be the one with the highest modification time
        latest_file = max(list_of_files, key=os.path.getmtime)
        return latest_file
    except Exception as e:
        print(f"Error finding image file: {e}")
        return None

def main():
    """Main function to find an image and post it to Reddit."""
    parser = argparse.ArgumentParser(description="Find the latest sentiment image and post it to a subreddit.")
    parser.add_argument("-s", "--subreddit", help="The source subreddit of the image to post. (Defaults to overall summary)")
    parser.add_argument("-w", "--weekly", action="store_true", help="Post the weekly summary instead of the daily one.")
    parser.add_argument("-t", "--target-subreddit", default="rstat", help="The subreddit to post the image to. (Default: rstat)")
    args = parser.parse_args()

    # --- 1. Determine filename pattern and post title ---
    current_date_str = datetime.now(timezone.utc).strftime("%Y-%m-%d")

    if args.subreddit:
        view_type = "weekly" if args.weekly else "daily"
        filename_pattern = f"{args.subreddit.lower()}_{view_type}_*.png"
        post_title = f"{view_type.capitalize()} Ticker Sentiment for r/{args.subreddit} ({current_date_str})"
    else:
        # Default to the overall summary
        if args.weekly:
            print("Warning: --weekly flag has no effect for overall summary. Posting overall daily image.")
        filename_pattern = "overall_summary_*.png"
        post_title = f"Overall Top 10 Ticker Mentions Across Reddit ({current_date_str})"

    print(f"Searching for image pattern: {filename_pattern}")

    # --- 2. Find the latest image file ---
    image_to_post = find_latest_image(filename_pattern)

    if not image_to_post:
        print(f"Error: No image found matching the pattern '{filename_pattern}'. Please run the scraper and exporter first.")
        return

    print(f"Found image: {image_to_post}")

    # --- 3. Connect to Reddit and submit ---
    reddit = get_reddit_instance()
    if not reddit:
        return

    try:
        target_sub = reddit.subreddit(args.target_subreddit)
        print(f"Submitting '{post_title}' to r/{target_sub.display_name}...")

        submission = target_sub.submit_image(
            title=post_title,
            image_path=image_to_post,
            flair_id=None # Optional: You can add a flair ID here if you want
        )

        print("\n--- Post Successful! ---")
        print(f"Post URL: {submission.shortlink}")

    except Exception as e:
        print(f"\nAn error occurred while posting to Reddit: {e}")


if __name__ == "__main__":
    main()
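
`find_latest_image` picks the newest file by modification time. Since the exporter already embeds a Unix timestamp in each file name, an alternative is to choose by that number, which keeps working if files are copied or restored and their modification times change. This is a sketch of that variation, not the project's code; `find_latest_by_name` is a hypothetical name.

```python
import glob
import os
import re


def find_latest_by_name(image_dir, pattern):
    """Pick the matching file whose name carries the largest Unix timestamp."""
    def embedded_timestamp(path):
        # Exporter names end in _<digits>.png; anything else sorts last.
        match = re.search(r"_(\d+)\.png$", os.path.basename(path))
        return int(match.group(1)) if match else -1

    candidates = glob.glob(os.path.join(image_dir, pattern))
    return max(candidates, key=embedded_timestamp, default=None)
```

One thing to watch with either approach: the posting script lowercases the subreddit name when building its pattern, while the exporter writes the name as typed. On a case-sensitive filesystem, an image exported with mixed-case input may therefore not be found.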
|
@@ -2,10 +2,13 @@
|
||||
|
||||
import argparse
|
||||
from . import database
|
||||
from .logger_setup import get_logger
|
||||
# We can't reuse load_subreddits from main anymore if it's not in the same file
|
||||
# So we will duplicate it here. It's small and keeps this script self-contained.
|
||||
import json
|
||||
|
||||
log = get_logger()
|
||||
|
||||
def load_subreddits(filepath):
|
||||
"""Loads a list of subreddits from a JSON file."""
|
||||
try:
|
||||
@@ -13,7 +16,7 @@ def load_subreddits(filepath):
|
||||
data = json.load(f)
|
||||
return data.get("subreddits", [])
|
||||
except (FileNotFoundError, json.JSONDecodeError) as e:
|
||||
print(f"Error loading config file '{filepath}': {e}")
|
||||
log.error(f"Error loading config file '{filepath}': {e}")
|
||||
return None
|
||||
|
||||
def run_cleanup():
|
||||
@@ -52,17 +55,17 @@ def run_cleanup():
|
||||
run_any_task = True
|
||||
# If --all is used, default to 'subreddits.json' if --subreddits wasn't also specified
|
||||
config_file = args.subreddits or 'subreddits.json'
|
||||
print(f"\nCleaning subreddits based on active list in: {config_file}")
|
||||
log.info(f"\nCleaning subreddits based on active list in: {config_file}")
|
||||
active_subreddits = load_subreddits(config_file)
|
||||
if active_subreddits is not None:
|
||||
database.clean_stale_subreddits(active_subreddits)
|
||||
|
||||
if not run_any_task:
|
||||
parser.print_help()
|
||||
print("\nError: Please provide at least one cleanup option (e.g., --tickers, --subreddits, --all).")
|
||||
log.error("\nError: Please provide at least one cleanup option (e.g., --tickers, --subreddits, --all).")
|
||||
return
|
||||
|
||||
print("\nCleanup finished.")
|
||||
log.info("\nCleanup finished.")
|
||||
|
||||
if __name__ == "__main__":
|
||||
run_cleanup()
|
@@ -1,17 +1,19 @@
|
||||
# rstat_tool/dashboard.py
|
||||
|
||||
from flask import Flask, render_template
|
||||
from datetime import datetime, timedelta
|
||||
from datetime import datetime, timedelta, timezone
|
||||
from .logger_setup import get_logger
|
||||
from .database import (
|
||||
get_overall_summary,
|
||||
get_subreddit_summary,
|
||||
get_all_scanned_subreddits,
|
||||
get_deep_dive_details,
|
||||
get_image_view_summary,
|
||||
get_daily_summary_for_subreddit,
|
||||
get_weekly_summary_for_subreddit,
|
||||
get_overall_image_view_summary
|
||||
)
|
||||
|
||||
log = get_logger()
|
||||
app = Flask(__name__, template_folder='../templates')
|
||||
|
||||
@app.template_filter('format_mc')
|
||||
@@ -53,13 +55,13 @@ def deep_dive(symbol):
|
||||
posts = get_deep_dive_details(symbol)
|
||||
return render_template("deep_dive.html", posts=posts, symbol=symbol)
|
||||
|
||||
@app.route("/image/<name>")
|
||||
def image_view(name):
|
||||
@app.route("/image/daily/<name>")
|
||||
def daily_image_view(name):
|
||||
"""The handler for the image-style dashboard."""
|
||||
tickers = get_image_view_summary(name)
|
||||
current_date = datetime.utcnow().strftime("%Y-%m-%d")
|
||||
tickers = get_daily_summary_for_subreddit(name)
|
||||
current_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
|
||||
return render_template(
|
||||
"image_view.html",
|
||||
"daily_image_view.html",
|
||||
tickers=tickers,
|
||||
subreddit_name=name,
|
||||
current_date=current_date
|
||||
@@ -71,7 +73,7 @@ def weekly_image_view(name):
|
||||
tickers = get_weekly_summary_for_subreddit(name)
|
||||
|
||||
# Create the date range string for the title
|
||||
end_date = datetime.utcnow()
|
||||
end_date = datetime.now(timezone.utc)
|
||||
start_date = end_date - timedelta(days=7)
|
||||
date_range_str = f"{start_date.strftime('%b %d')} - {end_date.strftime('%b %d, %Y')}"
|
||||
|
||||
@@ -86,7 +88,7 @@ def weekly_image_view(name):
|
||||
def overall_image_view():
|
||||
"""The handler for the overall image-style dashboard."""
|
||||
tickers = get_overall_image_view_summary()
|
||||
current_date = datetime.utcnow().strftime("%Y-%m-%d")
|
||||
current_date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
|
||||
return render_template(
|
||||
"overall_image_view.html",
|
||||
tickers=tickers,
|
||||
@@ -95,9 +97,9 @@ def overall_image_view():
|
||||
|
||||
def start_dashboard():
|
||||
"""The main function called by the 'rstat-dashboard' command."""
|
||||
print("Starting Flask server...")
|
||||
print("Open http://127.0.0.1:5000 in your browser.")
|
||||
print("Press CTRL+C to stop the server.")
|
||||
log.info("Starting Flask server...")
|
||||
log.info("Open http://127.0.0.1:5000 in your browser.")
|
||||
log.info("Press CTRL+C to stop the server.")
|
||||
app.run(debug=True)
|
||||
|
||||
if __name__ == "__main__":
|
||||
|
@@ -3,9 +3,11 @@
|
||||
import sqlite3
|
||||
import time
|
||||
from .ticker_extractor import COMMON_WORDS_BLACKLIST
|
||||
from datetime import datetime, timedelta
|
||||
from .logger_setup import get_logger
|
||||
from datetime import datetime, timedelta, timezone
|
||||
|
||||
DB_FILE = "reddit_stocks.db"
|
||||
log = get_logger()
|
||||
|
||||
def get_db_connection():
|
||||
"""Establishes a connection to the SQLite database."""
|
||||
@@ -71,14 +73,14 @@ def initialize_db():
|
||||
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("Database initialized successfully.")
|
||||
log.info("Database initialized successfully.")
|
||||
|
||||
def clean_stale_tickers():
|
||||
"""
|
||||
Removes tickers and their associated mentions from the database
|
||||
if the ticker symbol exists in the COMMON_WORDS_BLACKLIST.
|
||||
"""
|
||||
print("\n--- Cleaning Stale Tickers from Database ---")
|
||||
log.info("\n--- Cleaning Stale Tickers from Database ---")
|
||||
conn = get_db_connection()
|
||||
cursor = conn.cursor()
|
||||
|
||||
@@ -89,27 +91,27 @@ def clean_stale_tickers():
|
||||
stale_tickers = cursor.fetchall()
|
||||
|
||||
if not stale_tickers:
|
||||
print("No stale tickers to clean.")
|
||||
log.info("No stale tickers to clean.")
|
||||
conn.close()
|
||||
return
|
||||
|
||||
for ticker in stale_tickers:
|
||||
ticker_id = ticker['id']
|
||||
ticker_symbol = ticker['symbol']
|
||||
print(f"Removing stale ticker '{ticker_symbol}' (ID: {ticker_id})...")
|
||||
log.info(f"Removing stale ticker '{ticker_symbol}' (ID: {ticker_id})...")
|
||||
cursor.execute("DELETE FROM mentions WHERE ticker_id = ?", (ticker_id,))
|
||||
cursor.execute("DELETE FROM tickers WHERE id = ?", (ticker_id,))
|
||||
|
||||
deleted_count = conn.total_changes
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print(f"Cleanup complete. Removed {deleted_count} records.")
|
||||
log.info(f"Cleanup complete. Removed {deleted_count} records.")
|
||||
|
||||
def clean_stale_subreddits(active_subreddits):
|
||||
"""
|
||||
Removes all data associated with subreddits that are NOT in the active list.
|
||||
"""
|
||||
print("\n--- Cleaning Stale Subreddits from Database ---")
|
||||
log.info("\n--- Cleaning Stale Subreddits from Database ---")
|
||||
conn = get_db_connection()
|
||||
cursor = conn.cursor()
|
||||
cursor.execute("SELECT id, name FROM subreddits")
|
||||
@@ -117,20 +119,20 @@ def clean_stale_subreddits(active_subreddits):
|
||||
stale_sub_ids = []
|
||||
for sub in db_subreddits:
|
||||
if sub['name'] not in active_subreddits:
|
||||
print(f"Found stale subreddit to remove: r/{sub['name']}")
|
||||
log.info(f"Found stale subreddit to remove: r/{sub['name']}")
|
||||
stale_sub_ids.append(sub['id'])
|
||||
if not stale_sub_ids:
|
||||
print("No stale subreddits to clean.")
|
||||
log.info("No stale subreddits to clean.")
|
||||
conn.close()
|
||||
return
|
||||
for sub_id in stale_sub_ids:
|
||||
print(f" -> Deleting associated data for subreddit ID: {sub_id}")
|
||||
log.info(f" -> Deleting associated data for subreddit ID: {sub_id}")
|
||||
cursor.execute("DELETE FROM mentions WHERE subreddit_id = ?", (sub_id,))
|
||||
cursor.execute("DELETE FROM posts WHERE subreddit_id = ?", (sub_id,))
|
||||
cursor.execute("DELETE FROM subreddits WHERE id = ?", (sub_id,))
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("Stale subreddit cleanup complete.")
|
||||
log.info("Stale subreddit cleanup complete.")
|
||||
|
||||
def get_db_connection():
|
||||
conn = sqlite3.connect(DB_FILE)
|
||||
@@ -184,7 +186,7 @@ def initialize_db():
|
||||
""")
|
||||
conn.commit()
|
||||
conn.close()
|
||||
print("Database initialized successfully.")
|
||||
log.info("Database initialized successfully.")
|
||||
|
||||
def add_mention(conn, ticker_id, subreddit_id, post_id, mention_type, timestamp, mention_sentiment, post_avg_sentiment=None):
|
||||
cursor = conn.cursor()
|
||||
@@ -230,7 +232,7 @@ def get_ticker_info(conn, ticker_id):
|
||||
|
||||
def generate_summary_report(limit=20):
|
||||
"""Queries the DB to generate a summary for the command-line tool."""
|
||||
print(f"\n--- Top {limit} Tickers by Mention Count ---")
|
||||
log.info(f"\n--- Top {limit} Tickers by Mention Count ---")
|
||||
conn = get_db_connection()
|
||||
cursor = conn.cursor()
|
||||
|
||||
@@ -274,13 +276,6 @@ def generate_summary_report(limit=20):
|
||||
)
|
||||
conn.close()
|
||||
|
||||
def get_all_scanned_subreddits():
|
||||
"""Gets a unique list of all subreddits we have data for."""
|
||||
conn = get_db_connection()
|
||||
results = conn.execute("SELECT DISTINCT name FROM subreddits ORDER BY name ASC;").fetchall()
|
||||
conn.close()
|
||||
return [row['name'] for row in results]
|
||||
|
||||
def add_or_update_post_analysis(conn, post_data):
|
||||
"""
|
||||
Inserts a new post analysis record or updates an existing one.
|
||||
@@ -300,35 +295,15 @@ def add_or_update_post_analysis(conn, post_data):
|
||||
)
|
||||
conn.commit()
|
||||
|
||||
def get_deep_dive_details(ticker_symbol):
|
||||
"""
|
||||
Gets all analyzed posts that mention a specific ticker.
|
||||
"""
|
||||
conn = get_db_connection()
|
||||
query = """
|
||||
SELECT DISTINCT p.*, s.name as subreddit_name FROM posts p
|
||||
JOIN mentions m ON p.post_id = m.post_id
|
||||
JOIN tickers t ON m.ticker_id = t.id
|
||||
JOIN subreddits s ON p.subreddit_id = s.id
|
||||
WHERE LOWER(t.symbol) = LOWER(?)
|
||||
ORDER BY p.post_timestamp DESC;
|
||||
"""
|
||||
results = conn.execute(query, (ticker_symbol,)).fetchall()
|
||||
conn.close()
|
||||
return results
|
||||
|
||||
def get_overall_summary(limit=50):
|
||||
conn = get_db_connection()
|
||||
query = """
|
||||
SELECT
|
||||
t.symbol, t.market_cap, t.closing_price,
|
||||
COUNT(m.id) as mention_count,
|
||||
SELECT t.symbol, t.market_cap, t.closing_price, COUNT(m.id) as mention_count,
|
||||
SUM(CASE WHEN m.mention_sentiment > 0.1 THEN 1 ELSE 0 END) as bullish_mentions,
|
||||
SUM(CASE WHEN m.mention_sentiment < -0.1 THEN 1 ELSE 0 END) as bearish_mentions,
|
||||
SUM(CASE WHEN m.mention_sentiment BETWEEN -0.1 AND 0.1 THEN 1 ELSE 0 END) as neutral_mentions
|
||||
FROM mentions m JOIN tickers t ON m.ticker_id = t.id
|
||||
GROUP BY t.symbol, t.market_cap, t.closing_price
|
||||
ORDER BY mention_count DESC LIMIT ?;
|
||||
GROUP BY t.symbol, t.market_cap, t.closing_price ORDER BY mention_count DESC LIMIT ?;
|
||||
"""
|
||||
results = conn.execute(query, (limit,)).fetchall()
|
||||
conn.close()
|
||||
@@ -337,86 +312,87 @@ def get_overall_summary(limit=50):
|
||||
def get_subreddit_summary(subreddit_name, limit=50):
|
||||
conn = get_db_connection()
|
||||
query = """
|
||||
SELECT
|
||||
t.symbol, t.market_cap, t.closing_price,
|
||||
COUNT(m.id) as mention_count,
|
||||
SELECT t.symbol, t.market_cap, t.closing_price, COUNT(m.id) as mention_count,
|
||||
SUM(CASE WHEN m.mention_sentiment > 0.1 THEN 1 ELSE 0 END) as bullish_mentions,
|
||||
SUM(CASE WHEN m.mention_sentiment < -0.1 THEN 1 ELSE 0 END) as bearish_mentions,
|
||||
SUM(CASE WHEN m.mention_sentiment BETWEEN -0.1 AND 0.1 THEN 1 ELSE 0 END) as neutral_mentions
|
||||
FROM mentions m
|
||||
JOIN tickers t ON m.ticker_id = t.id
|
||||
JOIN subreddits s ON m.subreddit_id = s.id
|
||||
WHERE LOWER(s.name) = LOWER(?)
|
||||
GROUP BY t.symbol, t.market_cap, t.closing_price
|
||||
ORDER BY mention_count DESC LIMIT ?;
|
||||
FROM mentions m JOIN tickers t ON m.ticker_id = t.id JOIN subreddits s ON m.subreddit_id = s.id
|
||||
WHERE LOWER(s.name) = LOWER(?) GROUP BY t.symbol, t.market_cap, t.closing_price ORDER BY mention_count DESC LIMIT ?;
|
||||
"""
|
||||
results = conn.execute(query, (subreddit_name, limit)).fetchall()
|
||||
conn.close()
|
||||
return results
|
||||
|
||||
def get_image_view_summary(subreddit_name):
|
||||
def get_daily_summary_for_subreddit(subreddit_name):
|
||||
""" Gets a summary for the DAILY image view (last 24 hours). """
|
||||
conn = get_db_connection()
|
||||
one_day_ago = datetime.now(timezone.utc) - timedelta(days=1)
|
||||
one_day_ago_timestamp = int(one_day_ago.timestamp())
|
||||
query = """
|
||||
SELECT
|
||||
t.symbol,
|
||||
SELECT t.symbol,
|
||||
COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) as post_mentions,
|
||||
COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) as comment_mentions,
|
||||
COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) as bullish_mentions,
|
||||
COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) as bearish_mentions
|
||||
FROM mentions m
|
||||
JOIN tickers t ON m.ticker_id = t.id
|
||||
JOIN subreddits s ON m.subreddit_id = s.id
|
||||
WHERE LOWER(s.name) = LOWER(?)
|
||||
GROUP BY t.symbol
|
||||
ORDER BY (post_mentions + comment_mentions) DESC
|
||||
LIMIT 10;
|
||||
FROM mentions m JOIN tickers t ON m.ticker_id = t.id JOIN subreddits s ON m.subreddit_id = s.id
|
||||
WHERE LOWER(s.name) = LOWER(?) AND m.mention_timestamp >= ?
|
||||
GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
|
||||
"""
|
||||
results = conn.execute(query, (subreddit_name,)).fetchall()
|
||||
results = conn.execute(query, (subreddit_name, one_day_ago_timestamp)).fetchall()
|
||||
conn.close()
|
||||
return results
|
||||
|
||||
def get_weekly_summary_for_subreddit(subreddit_name):
|
||||
""" Gets a summary for the WEEKLY image view (last 7 days). """
|
||||
conn = get_db_connection()
|
||||
seven_days_ago = datetime.utcnow() - timedelta(days=7)
|
||||
seven_days_ago = datetime.now(timezone.utc) - timedelta(days=7)
|
||||
seven_days_ago_timestamp = int(seven_days_ago.timestamp())
|
||||
query = """
|
||||
SELECT
|
||||
t.symbol,
|
||||
SELECT t.symbol,
|
||||
COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) as post_mentions,
|
||||
COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) as comment_mentions,
|
||||
COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) as bullish_mentions,
|
||||
COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) as bearish_mentions
|
||||
FROM mentions m
|
||||
JOIN tickers t ON m.ticker_id = t.id
|
||||
JOIN subreddits s ON m.subreddit_id = s.id
|
||||
FROM mentions m JOIN tickers t ON m.ticker_id = t.id JOIN subreddits s ON m.subreddit_id = s.id
|
||||
WHERE LOWER(s.name) = LOWER(?) AND m.mention_timestamp >= ?
|
||||
GROUP BY t.symbol
|
||||
ORDER BY (post_mentions + comment_mentions) DESC
|
||||
LIMIT 10;
|
||||
GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
|
||||
"""
|
||||
results = conn.execute(query, (subreddit_name, seven_days_ago_timestamp)).fetchall()
|
||||
conn.close()
|
||||
return results
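
The daily and weekly queries share one pattern: compute a cutoff timestamp, then count mentions per type with `COUNT(CASE WHEN ... THEN 1 END)`. The sketch below shows that pattern against a simplified single-table schema, which is an assumption made for brevity (the real schema joins `mentions`, `tickers`, and `subreddits`). `window_start` and `count_mentions` are hypothetical names. The cutoff uses a timezone-aware `datetime`, because calling `.timestamp()` on a naive value from `datetime.utcnow()` interprets it as local time.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def window_start(days, now=None):
    """Unix timestamp marking the start of a look-back window of `days` days."""
    now = datetime.now(timezone.utc) if now is None else now
    return int((now - timedelta(days=days)).timestamp())


def count_mentions(conn, cutoff):
    """Per-symbol post/comment counts for mentions at or after `cutoff`."""
    query = """
        SELECT symbol,
               COUNT(CASE WHEN mention_type = 'post' THEN 1 END) AS post_mentions,
               COUNT(CASE WHEN mention_type = 'comment' THEN 1 END) AS comment_mentions
        FROM mentions
        WHERE mention_timestamp >= ?
        GROUP BY symbol
        ORDER BY (post_mentions + comment_mentions) DESC;
    """
    return conn.execute(query, (cutoff,)).fetchall()
```

`COUNT` skips `NULL`, and a `CASE` with no `ELSE` yields `NULL` for non-matching rows, so each column counts only the rows of its own type.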
|
||||
|
||||
 def get_overall_image_view_summary():
-    """
-    Gets a summary of top tickers across ALL subreddits for the image view.
-    """
+    """ Gets a summary of top tickers across ALL subreddits for the image view. """
     conn = get_db_connection()
     query = """
-        SELECT
-            t.symbol,
+        SELECT t.symbol,
             COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) as post_mentions,
             COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) as comment_mentions,
             COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) as bullish_mentions,
             COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) as bearish_mentions
-        FROM mentions m
-        JOIN tickers t ON m.ticker_id = t.id
-        -- No JOIN or WHERE for subreddit, as we want all of them
-        GROUP BY t.symbol
-        ORDER BY (post_mentions + comment_mentions) DESC
-        LIMIT 10;
+        FROM mentions m JOIN tickers t ON m.ticker_id = t.id
+        GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
     """
     results = conn.execute(query).fetchall()
     conn.close()
     return results

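Both summary queries rely on conditional aggregation: `COUNT(CASE WHEN ... THEN 1 END)` counts only rows where the condition holds, because `CASE` yields `NULL` otherwise and `COUNT` skips `NULL`. The following is a minimal sketch against an in-memory SQLite database. The column names come from the queries above, but the table definitions and sample rows are assumptions made for this example, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name

# Assumed, simplified schema and made-up sample data for illustration only.
conn.executescript("""
    CREATE TABLE tickers (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE mentions (ticker_id INTEGER, mention_type TEXT, mention_sentiment REAL);
    INSERT INTO tickers VALUES (1, 'AAA'), (2, 'BBB');
    INSERT INTO mentions VALUES
        (1, 'post', 0.5), (1, 'comment', -0.4), (1, 'comment', 0.0),
        (2, 'comment', 0.2);
""")

query = """
    SELECT t.symbol,
        COUNT(CASE WHEN m.mention_type = 'post' THEN 1 END) AS post_mentions,
        COUNT(CASE WHEN m.mention_type = 'comment' THEN 1 END) AS comment_mentions,
        COUNT(CASE WHEN m.mention_sentiment > 0.1 THEN 1 END) AS bullish_mentions,
        COUNT(CASE WHEN m.mention_sentiment < -0.1 THEN 1 END) AS bearish_mentions
    FROM mentions m JOIN tickers t ON m.ticker_id = t.id
    GROUP BY t.symbol ORDER BY (post_mentions + comment_mentions) DESC LIMIT 10;
"""
summary = [dict(row) for row in conn.execute(query).fetchall()]
conn.close()
```

A mention with sentiment between -0.1 and 0.1 is counted as a post or comment but as neither bullish nor bearish, which is why the two sentiment columns need not add up to the total.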
 def get_deep_dive_details(ticker_symbol):
     """ Gets all analyzed posts that mention a specific ticker. """
     conn = get_db_connection()
     query = """
         SELECT DISTINCT p.*, s.name as subreddit_name FROM posts p
         JOIN mentions m ON p.post_id = m.post_id JOIN tickers t ON m.ticker_id = t.id
         JOIN subreddits s ON p.subreddit_id = s.id
         WHERE LOWER(t.symbol) = LOWER(?) ORDER BY p.post_timestamp DESC;
     """
     results = conn.execute(query, (ticker_symbol,)).fetchall()
     conn.close()
     return results

 def get_all_scanned_subreddits():
     """ Gets a unique list of all subreddits we have data for. """
     conn = get_db_connection()
     results = conn.execute("SELECT DISTINCT name FROM subreddits ORDER BY name ASC;").fetchall()
     conn.close()
     return [row['name'] for row in results]
47
rstat_tool/logger_setup.py
Normal file
@@ -0,0 +1,47 @@
+# rstat_tool/logger_setup.py
+
+import logging
+import sys
+
+# Get the application's named logger ("rstat_app"), not the root logger
+logger = logging.getLogger("rstat_app")
+logger.setLevel(logging.INFO)  # Set the minimum level of messages to handle
+
+# Prevent the logger from propagating messages to the parent (root) logger
+logger.propagate = False
+
+# Only add handlers if they haven't been added before.
+# This prevents duplicate log messages if this setup code runs more than once.
+if not logger.handlers:
+    # --- Console Handler ---
+    # This handler prints logs to the standard output (your terminal)
+    console_handler = logging.StreamHandler(sys.stdout)
+    console_handler.setLevel(logging.INFO)
+    # A simple formatter for the console
+    console_formatter = logging.Formatter('%(message)s')
+    console_handler.setFormatter(console_formatter)
+    logger.addHandler(console_handler)
+
+    # --- File Handler ---
+    # This handler writes logs to a file
+    # 'a' stands for append mode
+    file_handler = logging.FileHandler("rstat.log", mode='a')
+    file_handler.setLevel(logging.INFO)
+    # A more detailed formatter for the file, including timestamp and log level
+    file_formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s', datefmt='%Y-%m-%d %H:%M:%S')
+    file_handler.setFormatter(file_formatter)
+    logger.addHandler(file_handler)
+
+# Get the logger used by the yfinance library
+yfinance_logger = logging.getLogger("yfinance")
+# Set its level to capture warnings and errors
+yfinance_logger.setLevel(logging.WARNING)
+# Add our existing handlers to it. This tells yfinance's logger
+# to send its messages to our console and our log file.
+if not yfinance_logger.handlers:
+    yfinance_logger.addHandler(console_handler)
+    yfinance_logger.addHandler(file_handler)
+
+def get_logger():
+    """A simple function to get our configured logger."""
+    return logger
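The `if not logger.handlers:` guard in `logger_setup.py` is what keeps each message from being written twice when the setup code runs again. A small sketch of the same idea follows, writing to an in-memory buffer instead of `rstat.log`; `configure_logger` and the logger name `rstat_demo` are hypothetical and exist only for this example.

```python
import io
import logging


def configure_logger(name, stream):
    """Return a logger with exactly one stream handler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages away from the root logger
    # Attach a handler only once, even if this function is called repeatedly.
    if not logger.handlers:
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
        logger.addHandler(handler)
    return logger


buffer = io.StringIO()
first = configure_logger("rstat_demo", buffer)
second = configure_logger("rstat_demo", buffer)  # reuses the existing handler
first.info("Scanning r/example...")
output = buffer.getvalue()
```

Because `logging.getLogger` returns the same object for the same name, `first` and `second` are one logger, and the message appears once in `output`.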
@@ -12,17 +12,20 @@ from dotenv import load_dotenv
 from . import database
 from .ticker_extractor import extract_tickers
 from .sentiment_analyzer import get_sentiment_score
+from .logger_setup import get_logger

 load_dotenv()
 MARKET_CAP_REFRESH_INTERVAL = 86400
 POST_AGE_LIMIT = 86400

+log = get_logger()
+
 def load_subreddits(filepath):
     try:
         with open(filepath, 'r') as f:
             return json.load(f).get("subreddits", [])
     except (FileNotFoundError, json.JSONDecodeError) as e:
-        print(f"Error loading config file '{filepath}': {e}")
+        log.error(f"Error loading config file '{filepath}': {e}")
         return None

 def get_financial_data(ticker_symbol):
@@ -52,7 +55,7 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
     post_age_limit = days_to_scan * 86400
     current_time = time.time()

-    print(f"\nScanning {len(subreddits_list)} subreddit(s) for NEW posts in the last {days_to_scan} day(s)...")
+    log.info(f"\nScanning {len(subreddits_list)} subreddit(s) for NEW posts in the last {days_to_scan} day(s)...")
     for subreddit_name in subreddits_list:
         try:
             # Always use the lowercase version of the name for consistency.
@@ -60,15 +63,13 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,

             subreddit_id = database.get_or_create_entity(conn, 'subreddits', 'name', normalized_sub_name)
             subreddit = reddit.subreddit(normalized_sub_name)
-            print(f"Scanning r/{normalized_sub_name}...")
+            log.info(f"Scanning r/{normalized_sub_name}...")

             for submission in subreddit.new(limit=post_limit):
                 if (current_time - submission.created_utc) > post_age_limit:
-                    print(f" -> Reached posts older than the {days_to_scan}-day limit.")
+                    log.info(f" -> Reached posts older than the {days_to_scan}-day limit.")
                     break

-                # --- NEW HYBRID LOGIC ---
-
                 tickers_in_title = set(extract_tickers(submission.title))
                 all_tickers_found_in_post = set(tickers_in_title) # Start a set to track all tickers for financials

@@ -77,7 +78,7 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,

                 # --- CASE A: Tickers were found in the title ---
                 if tickers_in_title:
-                    print(f" -> Title Mention(s): {', '.join(tickers_in_title)}. Attributing all comments.")
+                    log.info(f" -> Title Mention(s): {', '.join(tickers_in_title)}. Attributing all comments.")
                     post_sentiment = get_sentiment_score(submission.title)

                     # Add one 'post' mention for each title ticker
@@ -109,7 +110,7 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
                     ticker_id = database.get_or_create_entity(conn, 'tickers', 'symbol', ticker_symbol)
                     ticker_info = database.get_ticker_info(conn, ticker_id)
                     if not ticker_info['last_updated'] or (current_time - ticker_info['last_updated'] > MARKET_CAP_REFRESH_INTERVAL):
-                        print(f" -> Fetching financial data for {ticker_symbol}...")
+                        log.info(f" -> Fetching financial data for {ticker_symbol}...")
                         financials = get_financial_data(ticker_symbol)
                         database.update_ticker_financials(
                             conn, ticker_id,
@@ -129,10 +130,10 @@ def scan_subreddits(reddit, subreddits_list, post_limit=100, comment_limit=100,
                 database.add_or_update_post_analysis(conn, post_analysis_data)

         except Exception as e:
-            print(f"Could not scan r/{subreddit_name}. Error: {e}")
+            log.error(f"Could not scan r/{subreddit_name}. Error: {e}")

     conn.close()
-    print("\n--- Scan Complete ---")
+    log.info("\n--- Scan Complete ---")


 def main():
@@ -147,19 +148,18 @@ def main():
     parser.add_argument("-l", "--limit", type=int, default=20, help="Number of tickers to show in the CLI report.\n(Default: 20)")
     args = parser.parse_args()

-    # --- THIS IS THE CORRECTED LOGIC BLOCK ---
     if args.subreddit:
         # If --subreddit is used, create a list with just that one.
         subreddits_to_scan = [args.subreddit]
-        print(f"Targeted Scan Mode: Focusing on r/{args.subreddit}")
+        log.info(f"Targeted Scan Mode: Focusing on r/{args.subreddit}")
     else:
         # Otherwise, load from the config file.
-        print(f"Config Scan Mode: Loading subreddits from {args.config}")
+        log.info(f"Config Scan Mode: Loading subreddits from {args.config}")
         # Use the correct argument name: args.config
         subreddits_to_scan = load_subreddits(args.config)

     if not subreddits_to_scan:
-        print("Error: No subreddits to scan. Please check your config file or --subreddit argument.")
+        log.error("Error: No subreddits to scan. Please check your config file or --subreddit argument.")
         return

     # --- Initialize and Run ---
@@ -5,7 +5,7 @@
 {% block content %}
 <h1>
     Top 10 Tickers in r/{{ subreddit_name }}
-    <a href="/image/{{ subreddit_name }}" target="_blank" style="font-size: 0.8rem; margin-left: 1rem; font-weight: normal;">(View Daily Image)</a>
+    <a href="/image/daily/{{ subreddit_name }}" target="_blank" style="font-size: 0.8rem; margin-left: 1rem; font-weight: normal;">(View Daily Image)</a>
     <!-- ADD THIS NEW LINK -->
     <a href="/image/weekly/{{ subreddit_name }}" target="_blank" style="font-size: 0.8rem; margin-left: 1rem; font-weight: normal;">(View Weekly Image)</a>
 </h1>