Future Work Roadmap - Ideas & Discussions

Core Philosophy: Keep It Simple, Stupid

Key Principles:
Each model adds ONE complexity layer
Train models independently to measure alpha contribution
Different models for different customer segments
Clear upsell path: V7 → V8 → V9 → V10 → V11
All models share core architecture (signal generation + deterministic portfolio)

Strategic Opportunities: Beyond US Markets

SEDAR Canadian Market 🇨🇦 HIGH PRIORITY

Status: Blue Ocean Opportunity - Zero Competition

Quick Summary:

Apply V7 architecture to Canadian market (SEDAR filings)
Zero competition - No LLM signals for TSX/TSXV
Already have SEDAR data infrastructure
Unique value: Mining/resource event intelligence (NI 43-101, drill results)
Start with dual-listed companies (use US stock prices we already have)
$3M ARR potential (50 customers @ $5K/month)

Key Insights:

✅ All SEDAR filings have English versions (no translation needed)
✅ Many TSX companies dual-listed on US exchanges (~500-800 stocks, 80% of market cap)
✅ 95% code reuse from V7
✅ First mover advantage in untapped market

Model Version Roadmap

V7 - Event Signals ✅ CURRENT

Status: In Training

Features:

Past events with tiered selection (0-14 days: all events; 15-60 days: confidence ≥ 7; 61-180 days: confidence ≥ 8 or IDF ≥ 10)
Recency-weighted event sorting
Basic market regime (bull/bear/neutral)
Filing metadata (CIK, date, type)
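
A minimal sketch of the tiered selection rule listed above, written in Python. The Event record, its field names, the 0-10 confidence scale, and the plain newest-first sort are assumptions for illustration; the actual V7 recency weighting is not specified in this roadmap.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    """Hypothetical event record; field names are illustrative only."""
    event_date: date
    confidence: int  # assumed 0-10 scale
    idf: float       # assumed rarity score for the event type


def select_events(events, filing_date):
    """Apply the three age tiers, then return the survivors newest first."""
    kept = []
    for ev in events:
        age = (filing_date - ev.event_date).days
        if age < 0 or age > 180:
            continue  # drop future events and anything older than 180 days
        if age <= 14:
            kept.append(ev)  # 0-14 days: keep everything
        elif age <= 60:
            if ev.confidence >= 7:  # 15-60 days: confidence >= 7
                kept.append(ev)
        elif ev.confidence >= 8 or ev.idf >= 10:  # 61-180 days
            kept.append(ev)
    # Newest-first ordering is a stand-in for the real recency weighting.
    return sorted(kept, key=lambda ev: ev.event_date, reverse=True)
```

Keeping the thresholds in one function makes it easy to vary them later and compare the resulting signal quality.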

Output: BUY/SELL/SKIP + conviction (0.0-1.0)

Customer Segment: Event-driven investors, catalyst traders
Pricing Strategy: Base tier - "Event-Driven Signals"

V8 - Event + Market Context Signals PLANNED

Status: Next Priority (if insider/agreement not ready)

New Features Added:

VIX (volatility regime at filing date)
Sector performance vs SPY (relative strength)
Market breadth (% stocks above 200-day MA)
Credit spreads (high-yield vs treasury spread)
Interest rate environment (10-year yield, Fed funds rate)
Market momentum (SPY 30/60/90 day returns)
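
Three of the features above (market momentum, relative strength, and market breadth) reduce to simple arithmetic on closing prices. The sketch below is one possible reading, assuming plain Python lists of daily closes ordered oldest to newest and interpreting "30/60/90 day" as trading sessions; the function names are hypothetical.

```python
def trailing_return(closes, days):
    """Simple return over the last `days` sessions of a close series."""
    if len(closes) <= days:
        raise ValueError("not enough price history")
    return closes[-1] / closes[-1 - days] - 1.0


def relative_strength(sector_closes, spy_closes, days=60):
    """Sector return minus SPY return over the same lookback."""
    return trailing_return(sector_closes, days) - trailing_return(spy_closes, days)


def pct_above_ma(price_history, window=200):
    """Breadth: share of tickers whose last close is above their own moving average.

    `price_history` maps ticker -> list of closes. Tickers with fewer than
    `window` closes are skipped rather than counted as below.
    """
    eligible = above = 0
    for closes in price_history.values():
        if len(closes) < window:
            continue
        eligible += 1
        moving_avg = sum(closes[-window:]) / window
        if closes[-1] > moving_avg:
            above += 1
    return above / eligible if eligible else float("nan")
```

All three must be computed as of the filing date only, so that no later prices leak into the training features.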

Why Separate Model:

Some clients want pure event-driven (no market timing dependency)
Market context requires different data infrastructure
Adds cost (market data feeds)
Different customer need: macro-aware vs event-pure

Customer Segment: Macro-aware fundamental investors, multi-strategy funds
Pricing Strategy: Mid tier - "Event + Market Context Signals" (2x base price)

Data Sources to Build:

VIX historical data (Yahoo Finance, CBOE API)
SPY and sector ETF prices (already have infrastructure)
Treasury yields (FRED API - Federal Reserve Economic Data)
Credit spreads (HYG, TLT ETF data)
Market breadth (NYSE advance/decline, need new source)

V9 - Event + Fundamental Signals PLANNED

New Features Added:

Valuation ratios: P/E, P/B, EV/EBITDA, P/S
Leverage metrics: Debt/Equity, Interest Coverage, Net Debt/EBITDA
Growth rates: Revenue growth YoY, Earnings growth YoY
Profitability: Gross margin, Operating margin, ROE, ROA
Quality metrics: Free cash flow yield, Asset turnover
Relative valuation: vs sector median, vs historical average

Why Separate Model:

Value investors want this; growth investors often do not
Fundamentals require parsing financial statements (complexity)
Some funds use only qualitative catalysts (events) not quantitative
Can charge premium for fundamental analysis layer

Customer Segment: Value investors, fundamental analysts, quantitative value funds
Pricing Strategy: Premium tier - "Event + Fundamental Signals" (3x base price)

Integration Challenges:

XBRL parsing is complex (standardization issues)
Different fiscal year ends (calendar vs non-calendar)
TTM calculations require previous quarters
Some companies don't report all metrics
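
The TTM and missing-metric challenges above can be illustrated with a short sketch. It assumes quarterly values arrive as a list ordered oldest to newest, with None marking a quarter the company did not report; the helper names are hypothetical and no XBRL parsing is shown.

```python
def ttm(quarterly_values):
    """Trailing twelve months: sum of the four most recent quarters.

    Returns None when fewer than four quarters exist or any is missing.
    """
    last_four = quarterly_values[-4:]
    if len(last_four) < 4 or any(v is None for v in last_four):
        return None
    return sum(last_four)


def ttm_yoy_growth(quarterly_values):
    """Growth of the current TTM figure over the TTM figure four quarters earlier."""
    current = ttm(quarterly_values)
    prior = ttm(quarterly_values[:-4])
    if current is None or prior is None or prior <= 0:
        return None  # growth is undefined or misleading for a non-positive base
    return current / prior - 1.0
```

Returning None instead of a guessed value keeps gaps explicit, so the model can be trained to treat a missing ratio as missing.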

V10 - Event + Technical Signals PLANNED

New Features Added:

Momentum indicators: RSI (14), MACD, Rate of Change
Trend indicators: Price vs 50/200 day MA, moving average crossovers
Volume indicators: Volume trend, OBV, accumulation/distribution
Volatility indicators: Historical volatility, Bollinger Bands, ATR
Price action: Support/resistance breaks, 52-week highs/lows
Chart patterns: Breakouts, consolidations (if detectable)
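
As an example of the momentum indicators above, here is a sketch of RSI (14) using Wilder's smoothing, which is the conventional definition. It assumes a list of daily closes ordered oldest to newest; returning 100 when there are no losses in the window is one common convention, not something this roadmap specifies.

```python
def rsi(closes, period=14):
    """Relative Strength Index with Wilder's smoothing."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]

    # Seed with a simple average over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    # Wilder's smoothing for the remaining changes.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period

    if avg_loss == 0:
        return 100.0  # convention when the window contains no losses
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```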

Why Separate Model:

Technical traders are a specific customer segment
Requires extensive price/volume history
Many fundamental investors dismiss technical analysis (TA) as voodoo
Adds significant data infrastructure cost

Customer Segment: Technical traders, momentum investors, quantitative TA funds
Pricing Strategy: Premium tier - "Event + Technical Signals" (3x base price)

V11 - Elite All-Features Model FUTURE

Status: After V7-V10 proven

Features (The "Kitchen Sink"):

✅ Events (V7)
✅ Market context (V8)
✅ Fundamentals (V9)
✅ Technicals (V10)
✅ Insider intelligence (V7a/V12)
✅ Agreement intelligence (V7b/V13)
New: Alternative data, news sentiment, social media

Customer Segment: Hedge funds, institutional investors, enterprise clients
Pricing Strategy: Enterprise tier - "Complete Alpha Engine" (5x base price)

Trade-offs:

Higher accuracy (hopefully!)
Harder to explain which signals drove decision
More expensive to run (data costs, compute)
Requires all data pipelines working

Special Purpose Models

V7a - Event + Enhanced Insider Intelligence

Status: Depends on insider system completion

New Features Added:

Clustering detection: Multiple insiders buying same timeframe
Magnitude analysis: Trade size vs typical holdings
Role hierarchy: CEO/CFO trades weighted higher than directors
Timing patterns: Buys before catalysts, post-blackout buying
Filing velocity: Form 4 filing delay as urgency signal
10% owner tracking: Activist and institutional position changes
Repeated buying: Same insider accumulating over time
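
Clustering detection, the first feature above, could be sketched as follows. The InsiderTrade record, the 30-day window, and the three-insider threshold are all assumptions chosen for illustration; real inputs would come from parsed Form 4 filings.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class InsiderTrade:
    """Hypothetical, simplified view of one Form 4 transaction."""
    insider: str
    trade_date: date
    is_buy: bool


def has_buy_cluster(trades, window_days=30, min_insiders=3):
    """True if at least `min_insiders` distinct insiders bought within any window."""
    buys = sorted((t for t in trades if t.is_buy), key=lambda t: t.trade_date)
    for i, first in enumerate(buys):
        window_end = first.trade_date + timedelta(days=window_days)
        # Count distinct buyers from this trade up to the end of its window.
        buyers = {t.insider for t in buys[i:] if t.trade_date <= window_end}
        if len(buyers) >= min_insiders:
            return True
    return False
```

Counting distinct insiders rather than trades keeps this signal separate from the "repeated buying" feature, where one insider accumulates over time.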

Why Valuable:

Insiders know more than we do; trading on their public Form 4 disclosures is legal
Clustering = high conviction across management
Large unusual trades = material information
Timing relative to events = predictive

Customer Segment: Insider trading specialists, event-driven funds
Pricing: Premium add-on (+$Y/month to base)

V7b - Event + Agreement Intelligence

Status: Depends on agreement extraction completion

New Features Added:

Agreement type classification: Licensing, supply, distribution, JV, M&A, employment
Strategic importance scoring: Exclusive rights, duration, scope
Revenue signals: Minimum commitments, milestone payments, royalty rates
Competitive moat indicators: IP licensing, long-term locks, exclusivity
Risk factors: Termination clauses, contingencies, penalty provisions
Counterparty analysis: Size/quality of partner (Fortune 500 vs startup)

Why Valuable:

Agreements create future revenue visibility
Exclusive agreements = competitive moats
Large commitments = material revenue impact
Strategic partnerships = validation

Customer Segment: Corporate development analysts, M&A-focused funds, contract intelligence users
Pricing: Premium add-on (+$Z/month to base)

Architecture Decision: Modular Add-Ons (Recommended)

The Approach:

Base: V7 - Event Signals

Add-ons customers can choose:

+Insider Intelligence (+$Y/month)
+Agreement Intelligence (+$Z/month)
+Market Context (+$W/month)
+Fundamentals (+$X/month)

Train model variants:

V7
V7 + Insider
V7 + Agreements
V7 + Insider + Agreements
V7 + Insider + Agreements + Market
etc...
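
To make the training cost of this approach concrete, the sketch below enumerates every variant implied by the four add-ons listed above. The short add-on labels are illustrative, and whether every combination is actually trained is an open decision in this roadmap.

```python
from itertools import combinations

# Short labels for the four add-ons named in this section (illustrative).
ADD_ONS = ["Insider", "Agreements", "Market", "Fundamentals"]


def model_variants(add_ons=ADD_ONS):
    """Return every base-plus-add-ons combination, from bare V7 to all add-ons."""
    variants = []
    for k in range(len(add_ons) + 1):
        for combo in combinations(add_ons, k):
            variants.append(" + ".join(["V7", *combo]))
    return variants


if __name__ == "__main__":
    for name in model_variants():
        print(name)
```

With n add-ons there are 2^n variants, so four add-ons already imply 16 models. Each further add-on doubles that count.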

Pros:

✅ Maximum customer flexibility
✅ Can charge for each feature
✅ A/B test feature value
✅ Customers pay for what they need

Cons:

❌ Need to train multiple variants
❌ More complex infrastructure
❌ Version management complexity

Sales & Pricing Strategy

Tier 1: Base Event Signals ($X/month)

Target: 1000 customers
Pitch: "Pure event-driven alpha from SEC filings"

Tier 2: Event + Specialized Feature ($2X/month)

Options:

Event + Insider ($2X)
Event + Agreements ($2X)
Event + Market Context ($2X)

Target: 300 customers per variant
Pitch: "Enhanced with [insider intelligence / agreement analysis / macro awareness]"

Tier 3: Event + Multiple Features ($3X/month)

Popular combos:

Event + Insider + Agreements
Event + Market + Fundamentals
Event + Market + Technical

Target: 100 customers
Pitch: "Multi-factor model combining [X, Y, Z]"

Tier 4: Elite Everything ($5X/month)

Features: All of the above
Target: 20 enterprise customers
Pitch: "Complete Alpha Engine - our best model"
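
The tier targets above can be rolled up into a single figure expressed in multiples of the base price X, which this roadmap leaves unspecified. The sketch assumes every target is hit, that Tier 2's 300 customers apply to each of its three variants, and that Tier 3's 100 customers is a total across combos.

```python
# (tier, price as a multiple of X, target customers)
TIERS = [
    ("Tier 1: Base Event Signals", 1, 1000),
    ("Tier 2: Event + Specialized Feature", 2, 300 * 3),  # 300 per variant, 3 variants
    ("Tier 3: Event + Multiple Features", 3, 100),
    ("Tier 4: Elite Everything", 5, 20),
]


def monthly_revenue_in_x(tiers=TIERS):
    """Total monthly revenue at full target, in multiples of the base price X."""
    return sum(multiple * customers for _, multiple, customers in tiers)


if __name__ == "__main__":
    for name, multiple, customers in TIERS:
        print(f"{name}: {multiple * customers}X per month")
    print(f"Total: {monthly_revenue_in_x()}X per month")
```

Under these assumptions the total is 1000X + 1800X + 300X + 100X = 3200X per month, with Tier 2 contributing more than Tier 1.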

Implementation Prioritization

Phase 1: Validate V7 (Current)

✅ Complete V7 training
✅ Build inference script
✅ Measure signal quality on test set (2023-2024)
✅ Build deterministic portfolio manager
✅ Run backtest, measure alpha

Success criteria: V7 generates alpha above benchmark

Phase 2: Choose Next Feature (Decision Tree)

If insider system is ready:

→ Build V7a (Events + Insider)
→ Measure incremental alpha vs V7
→ If alpha improves significantly: Keep as add-on feature

If agreement extraction is ready:

→ Build V7b (Events + Agreements)
→ Measure incremental alpha vs V7
→ If alpha improves significantly: Keep as add-on feature

If neither ready:

→ Build V8 (Events + Market Context)
→ Easiest to implement (just API calls)
→ Validates modular architecture
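
The Phase 2 decision tree above is small enough to write down directly. In this sketch the two readiness flags are hypothetical inputs. Returning both V7a and V7b when both systems are ready is an assumption, made so that it leads into Phase 3.

```python
def next_builds(insider_ready, agreements_ready):
    """Return the model(s) to build next according to the Phase 2 rules."""
    builds = []
    if insider_ready:
        builds.append("V7a")  # Events + Insider
    if agreements_ready:
        builds.append("V7b")  # Events + Agreements
    if not builds:
        builds.append("V8")   # fallback: Events + Market Context
    return builds
```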

Phase 3: Combine Winners

If both insider and agreements add alpha:

→ Build V7ab (Events + Insider + Agreements)
→ Test for interaction effects
→ Measure combined alpha

Phase 4: Build Out Tiers

After proving modular approach works:

Build V8 (Market Context)
Build V9 (Fundamentals)
Build V10 (Technicals)
Build V11 (Everything)

Each time:

Measure incremental alpha
Validate on test set
Price based on value added

Success Metrics

V7 (Events Only):

Signal accuracy > 60%
Sharpe ratio > 1.5
Positive alpha vs SPY
Conviction calibration (e.g. signals with conviction 0.9 succeed about 90% of the time)
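
Two of these metrics can be computed as sketched below. The Sharpe helper assumes daily returns already net of the risk-free rate and 252 trading days per year. The calibration helper assumes each signal is a (conviction, succeeded) pair. Both input shapes and the five-bucket split are illustrative choices.

```python
import math
from statistics import mean, stdev


def annualized_sharpe(daily_excess_returns, periods_per_year=252):
    """Mean over standard deviation of daily excess returns, annualized."""
    sd = stdev(daily_excess_returns)
    if sd == 0:
        return float("nan")
    return mean(daily_excess_returns) / sd * math.sqrt(periods_per_year)


def calibration_table(signals, n_buckets=5):
    """Group signals by conviction and compare conviction with realized hit rate.

    Returns (average conviction, hit rate, count) for each non-empty bucket.
    """
    buckets = [[] for _ in range(n_buckets)]
    for conviction, succeeded in signals:
        idx = min(int(conviction * n_buckets), n_buckets - 1)
        buckets[idx].append((conviction, succeeded))
    rows = []
    for items in buckets:
        if not items:
            continue
        avg_conviction = mean(c for c, _ in items)
        hit_rate = mean(1.0 if s else 0.0 for _, s in items)
        rows.append((avg_conviction, hit_rate, len(items)))
    return rows
```

A well-calibrated model shows hit rates close to the average conviction in every bucket, not only in the top one.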

V7+ with Additional Features:

Incremental accuracy improvement > +5%
Incremental Sharpe improvement > +0.3
Incremental alpha > +2% annualized
Feature actually used by model (ablation study)
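
The numeric thresholds above can be checked mechanically once a base and a candidate backtest exist. This sketch assumes "+5%" means five percentage points of accuracy and that all three thresholds must be cleared together; neither point is stated in the roadmap. The BacktestResult record is hypothetical, and the ablation check is not covered.

```python
from dataclasses import dataclass


@dataclass
class BacktestResult:
    """Hypothetical summary of one model's test-set backtest."""
    accuracy: float      # fraction of correct signals, e.g. 0.62
    sharpe: float
    annual_alpha: float  # annualized alpha as a fraction, e.g. 0.04


def clears_incremental_bar(base, candidate):
    """True if the candidate beats the base on accuracy, Sharpe, and alpha."""
    return (
        candidate.accuracy - base.accuracy > 0.05
        and candidate.sharpe - base.sharpe > 0.3
        and candidate.annual_alpha - base.annual_alpha > 0.02
    )
```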

Product Success:

Customer acquisition rate
Customer retention (monthly churn)
Upsell rate (base → premium tiers)
Net revenue retention
Customer satisfaction (NPS score)

Next Actions

Complete V7 training and evaluation (in progress)
Measure V7 baseline performance on test set
Decide which feature to add next based on:
- What's ready (insider vs agreements vs market)
- What's easiest (market context = easiest)
- What adds most value (TBD from testing)
Build data pipeline for chosen feature
Train V7+feature model and measure improvement
Iterate based on results
