StockVisionz — How It Works

The Big Picture

From Ticker Search to Predictions

StockVisionz follows a clear pipeline. Each step feeds the next — nothing is skipped, nothing is guessed.

🔍

Search

User types a ticker

📥

Ingest

Pull price & news data

📊

Compute

Calculate indicators

🧠

Train

Run ML models

🧪

Validate

Walk-forward testing

📈

Visualize

Dashboard results

Interactive Demo

Watch a Ticker Get Analyzed

The walkthrough below simulates what happens when you search "AAPL", step by step.

🔎 Simulating: AAPL

User searches "AAPL" on Dashboard

The frontend sends a request to the backend API

Check if AAPL exists in the database

If new → full backfill from 2016. If existing → fetch only missing days.

Pull price data from Yahoo Finance

Open, High, Low, Close, Volume — stored in time-series database

Compute technical indicators

RSI, MACD, Bollinger Bands, ATR, moving averages — all saved

Fetch news & run sentiment analysis

The Alpaca News API supplies headlines; FinBERT scores each one as positive or negative

Build ML feature matrix

Combine price + indicators + sentiment into one table for ML models

User triggers ML training

Job queued → Worker picks it up → Walk-forward validation runs

Results appear on dashboard

Accuracy, Sharpe ratio, equity curve — compare all models side by side

Under the Hood

How Each Stage Works

Let's unpack what's happening at every step of the pipeline.

📥

Data Ingestion

The Foundation of Everything

When you search a ticker, the system pulls daily price data from Yahoo Finance, going back to January 2016, and stores it in our time-series database.

New tickers: backfill from January 2016
Existing tickers: only fetch missing days
Time-series database optimized for fast queries
News articles fetched and scored for sentiment
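
The backfill rule above can be sketched as a small date calculation. This is a minimal illustration, not the project's actual code: the function name `fetch_range` and the idea of passing in the last stored date are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# "Backfill from January 2016" per the text; the exact day is an assumption.
BACKFILL_START = date(2016, 1, 1)


def fetch_range(last_stored: Optional[date], today: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) dates to request, or None if nothing is missing.

    last_stored is the most recent bar already in the database for the ticker,
    or None when the ticker has never been ingested (hypothetical interface).
    """
    if last_stored is None:
        start = BACKFILL_START                      # new ticker: full backfill
    else:
        start = last_stored + timedelta(days=1)     # existing ticker: only the gap
    if start > today:
        return None                                 # already up to date
    return start, today
```

A new ticker gets the full range from 2016, while a ticker last updated three days ago only requests those three days. Weekends and market holidays would simply return no rows from the data provider.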

📊

Technical Indicators

Turning Prices Into Signals

Raw prices aren't enough. We compute mathematical indicators that traders use to spot momentum, trends, and volatility patterns.

RSI (14-day): Overbought or oversold?
MACD: Is momentum shifting?
Bollinger Bands: Price volatility
ATR: Average true range, the typical daily move including overnight gaps
Volume Ratio: Unusual activity?
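
To make two of these indicators concrete, here is a rough sketch of RSI and ATR in plain Python. It assumes Wilder's smoothing for RSI and a simple mean of true ranges for ATR; the source does not say which variants StockVisionz uses, and the function names are illustrative only.

```python
from typing import List, Sequence


def rsi(closes: Sequence[float], period: int = 14) -> float:
    """RSI of the latest bar, using Wilder's smoothing (an assumed variant)."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    # Seed with simple averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    # Then smooth each later change into the running averages.
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window; one common convention
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def true_ranges(highs: Sequence[float], lows: Sequence[float],
                closes: Sequence[float]) -> List[float]:
    """True range per bar from the second bar on (it needs the previous close)."""
    return [
        max(h - l, abs(h - pc), abs(l - pc))
        for h, l, pc in zip(highs[1:], lows[1:], closes[:-1])
    ]


def atr(highs: Sequence[float], lows: Sequence[float],
        closes: Sequence[float], period: int = 14) -> float:
    """Simple mean of the last `period` true ranges (Wilder smoothing is also common)."""
    trs = true_ranges(highs, lows, closes)
    if len(trs) < period:
        raise ValueError("need at least period + 1 bars")
    return sum(trs[-period:]) / period
```

The true range uses the previous close, so a stock that gaps up overnight registers a larger range than its intraday high minus low would suggest.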

🧠

ML Feature Matrix

What the Models Actually See

All indicators and price data combine into a single table. Each row = one trading day. Each column = a signal the model learns from.

Lagged returns: 1, 2, 5, and 10 days ago
All features shifted by 1 day — no future peeking
Target: will the stock go UP or DOWN tomorrow?
This is the critical "no cheating" step
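
The sketch below shows one way the lagging could work. It is an assumption-laden illustration: it reads "lagged returns" as the daily return from k days before the row's date, and labels each row by whether the next close is higher than the current one. The column names and the `build_rows` helper are hypothetical.

```python
from typing import Dict, List

LAGS = (1, 2, 5, 10)  # lag lengths named in the text


def build_rows(closes: List[float]) -> List[Dict[str, float]]:
    """Build one feature row per day t.

    Features for day t use only returns up to day t-1, so the close of day t
    itself never enters the row. The label compares close[t+1] with close[t].
    """
    # returns[i] is the daily return realised on day i; day 0 has none.
    returns = [None] + [b / a - 1.0 for a, b in zip(closes, closes[1:])]
    rows = []
    first = max(LAGS) + 1           # earliest day with every lag available
    last = len(closes) - 1          # the final day has no "tomorrow" to label
    for t in range(first, last):
        row = {f"ret_lag_{k}": returns[t - k] for k in LAGS}
        row["day"] = t
        row["target_up"] = 1 if closes[t + 1] > closes[t] else 0
        rows.append(row)
    return rows
```

A quick sanity check for leakage is to alter the close on day t and confirm that the features in row t do not change.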

⚙️

Job Queue System

How Training Gets Managed

Click "Train" and a job enters the queue. A background worker picks it up and trains the model. You see progress in real time.

Queued → Picked Up → Training → Completed
Multiple workers run safely (no conflicts)
Live progress via Server-Sent Events
Failed jobs automatically retried
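
The job lifecycle above can be modelled as a small state machine. This is an in-memory sketch only: the status strings, the `MAX_ATTEMPTS` limit, and the helper names are assumptions, and a real multi-worker queue would need an atomic claim step in the database (for example, Postgres row locking) rather than a Python loop.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_ATTEMPTS = 3  # assumption: the source does not state a retry limit


@dataclass
class Job:
    job_id: int
    status: str = "queued"
    attempts: int = 0
    history: List[str] = field(default_factory=lambda: ["queued"])

    def move(self, status: str) -> None:
        """Record a status transition."""
        self.status = status
        self.history.append(status)


def claim_next(jobs: List[Job]) -> Optional[Job]:
    """Hand the first queued job to a worker, or None if the queue is empty."""
    for job in jobs:
        if job.status == "queued":
            job.move("picked_up")
            return job
    return None


def run_job(job: Job, train: Callable[[], None]) -> None:
    """Run training once; requeue on failure until MAX_ATTEMPTS is reached."""
    job.attempts += 1
    job.move("training")
    try:
        train()
    except Exception:
        job.move("queued" if job.attempts < MAX_ATTEMPTS else "failed")
    else:
        job.move("completed")
```

Each call to `move` is a natural place to emit a progress event, which is roughly what a Server-Sent Events stream would relay to the dashboard.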

🧪

Walk-Forward Validation

Testing Like the Real World

We test exactly how a model would perform in real life: train on the past, predict the future. The window slides through time.

Train on 252 days, test on 21 days
Slide forward 21 days, repeat
1-day purge gap prevents data leakage
Scaler only learns from training data
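
The window schedule above (252 training days, 21 test days, 21-day step, 1-day purge) can be generated with a few lines of index arithmetic. The `Fold` type and `walk_forward` function are illustrative names, and the sketch assumes days are numbered 0 to n-1 with exclusive end indices.

```python
from typing import Iterator, NamedTuple


class Fold(NamedTuple):
    train_start: int
    train_end: int   # exclusive
    test_start: int
    test_end: int    # exclusive


def walk_forward(n_days: int, train: int = 252, test: int = 21,
                 step: int = 21, purge: int = 1) -> Iterator[Fold]:
    """Yield sliding train/test windows over day indices 0..n_days-1."""
    start = 0
    # Stop as soon as a full train + purge + test span no longer fits.
    while start + train + purge + test <= n_days:
        train_end = start + train
        test_start = train_end + purge   # skip the purge gap
        yield Fold(start, train_end, test_start, test_start + test)
        start += step
```

Because the step equals the test length, consecutive test windows sit back to back: each day is tested at most once, and the model is refit before every new 21-day block.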

📈

Dashboard & Results

Making Sense of It All

Interactive charts, ML experiments, backtesting, news sentiment, and drift monitoring — all unified in one dashboard.

Compare all 9 model types side by side
Equity curves show strategy performance
Drift monitor catches market changes
AI Insight Bot answers your questions

System Architecture

How Data Flows Through the System

Trace a request from your search to live results.

👤 You Search "AAPL" → Next.js Dashboard → API Routes

1 Backend Processing

GET /api/ohlcv → Ingestion Pipeline → Yahoo Finance API

OHLCV + Indicators → TimescaleDB → Feature Matrix View

2 ML Training (on demand)

You Click "Train" → POST /api/jobs → Job Queue

Worker Polls → Model Pipeline → Walk-Forward + Leak Check

Metrics Computed → Results Saved → Dashboard Updates Live ✨

The Brain

9 ML Models, One Question

"Will this stock go up or down tomorrow?" — each model takes a different approach to answering it.

📏

Ridge Regression

Simple linear baseline

Baseline

📐

Logistic Regression

Classic up/down classifier

Baseline

🎯

SVM

Finds the best boundary

Traditional

🌲

Random Forest

Votes from many trees

Traditional

🚀

XGBoost

GPU-powered boosting

GPU Required

🔮

LSTM

Deep learning sequences

Deep Learning

🕹️

DQN

RL agent learns to trade

Reinforcement

🎮

PPO

Policy gradient agent

Reinforcement

🤖

A2C

Actor-Critic agent

Reinforcement

🏆 Fair Comparison Guaranteed

All 9 models train on the exact same data splits — same dates, same windows. They share a comparison group ID so you can view them side by side.

Accuracy

Sharpe Ratio

CAGR

Max Drawdown
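
For reference, the four comparison metrics can be computed roughly as follows. These are the standard textbook definitions, assuming 252 trading days per year and a zero risk-free rate by default; the source does not specify the exact conventions StockVisionz applies.

```python
import math
from typing import List, Sequence

TRADING_DAYS = 252  # assumed annualisation factor


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Share of days where the predicted direction matched the real one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def sharpe_ratio(daily_returns: Sequence[float], risk_free_daily: float = 0.0) -> float:
    """Annualised mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free_daily for r in daily_returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(TRADING_DAYS)


def equity_curve(daily_returns: Sequence[float], start: float = 1.0) -> List[float]:
    """Compound daily strategy returns into an account-value series."""
    curve = [start]
    for r in daily_returns:
        curve.append(curve[-1] * (1.0 + r))
    return curve


def cagr(curve: Sequence[float]) -> float:
    """Compound annual growth rate of an equity curve."""
    years = (len(curve) - 1) / TRADING_DAYS
    return (curve[-1] / curve[0]) ** (1.0 / years) - 1.0


def max_drawdown(curve: Sequence[float]) -> float:
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak, worst = curve[0], 0.0
    for value in curve:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst
```

Accuracy measures the prediction alone, while the other three are computed on the returns of a trading strategy driven by those predictions, which is why a model can score well on one and poorly on another.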

Safety First

6 Guardrails Against Cheating

The #1 risk in stock ML is "data leakage" — accidentally letting the model see the future. These 6 automated checks run on every single training job.

🛡️ 1

Window Integrity

Training always ends before testing starts, with a purge gap in between.

⏳ 2

Feature Lag Verification

All features shifted by 1 day — the model never sees "today's" data when predicting tomorrow.

🎯 3

Target Alignment

The prediction target is confirmed to be the next day's actual result — no future information leaks in.

⚖️ 4

Scaler Isolation

Data normalization fit only on training data. Test data is never used to compute the scaler.

🚫 5

No Future Data in Training

Database queries validated — no row from the future sneaks into the training set.

🔒 6

Cross-Window Contamination

Test sets across different folds never overlap — no data point is tested twice.
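
Three of these guardrails lend themselves to simple programmatic checks. The sketch below is one possible shape for them, not the project's actual implementation: the fold tuple layout, the function names, and the use of standardization as the scaler are all assumptions.

```python
from typing import Iterable, List, Sequence, Tuple

# (train_start, train_end, test_start, test_end) as day indices, ends exclusive.
Fold = Tuple[int, int, int, int]


def check_window_integrity(fold: Fold, purge: int = 1) -> None:
    """Guardrail 1: training ends before testing starts, with a purge gap."""
    train_start, train_end, test_start, test_end = fold
    if not (train_start < train_end and test_start < test_end):
        raise ValueError("empty train or test window")
    if test_start - train_end < purge:
        raise ValueError("training must end at least `purge` days before testing")


def check_no_test_overlap(folds: Iterable[Fold]) -> None:
    """Guardrail 6: no day appears in more than one test window."""
    seen = set()
    for _, _, test_start, test_end in folds:
        days = set(range(test_start, test_end))
        if seen & days:
            raise ValueError("test windows overlap")
        seen |= days


def fit_scaler(train_values: Sequence[float]) -> Tuple[float, float]:
    """Guardrail 4: compute mean and std from training data only."""
    mean = sum(train_values) / len(train_values)
    var = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = var ** 0.5 or 1.0   # avoid dividing by zero on constant features
    return mean, std


def apply_scaler(values: Sequence[float], mean: float, std: float) -> List[float]:
    """Transform any split with statistics learned from the training split."""
    return [(v - mean) / std for v in values]
```

The key point for scaler isolation is that `fit_scaler` never receives test values; the test split is only ever passed to `apply_scaler` with statistics already fixed.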

Built With

Technology Stack

The tools and frameworks powering StockVisionz.

Frontend

Next.js: Framework

React + Recharts: UI & Charts

Tailwind CSS: Styling

Framer Motion: Animations

Clerk: Authentication

Backend & ML

Python: Core Language

XGBoost + CUDA: GPU Training

PyTorch: LSTM Models

Stable-Baselines3: RL Agents

FinBERT: Sentiment AI

Infrastructure

Neon Postgres: Cloud Database

TimescaleDB: Time-Series

Sentry: Error Tracking

Gemini 2.5: AI Insights Bot

NVIDIA CUDA: GPU Compute
