Core Philosophy: Keep It Simple, Stupid
Key Principles:
- Each model adds ONE complexity layer
- Train models independently to measure alpha contribution
- Different models for different customer segments
- Clear upsell path: V7 → V8 → V9 → V10 → V11
- All models share core architecture (signal generation + deterministic portfolio)
Strategic Opportunities: Beyond US Markets
SEDAR Canadian Market 🇨🇦 HIGH PRIORITY
Status: Blue Ocean Opportunity - Zero Competition
Quick Summary:
- Apply V7 architecture to Canadian market (SEDAR filings)
- No known competition: we are not aware of any LLM-derived signals for TSX/TSXV
- Already have SEDAR data infrastructure
- Unique value: Mining/resource event intelligence (NI 43-101, drill results)
- Start with dual-listed companies (use US stock prices we already have)
- $3M ARR potential (50 customers @ $5K/month)
Key Insights:
- All SEDAR filings have English versions (no translation needed)
- Many TSX companies are dual-listed on US exchanges (~500-800 stocks, 80% of market cap)
- 95% code reuse from V7
- First-mover advantage in an untapped market
Model Version Roadmap
V7 - Event Signals CURRENT
Status: In Training
Features:
- Past events with tiered selection (0-14d: ALL; 15-60d: conf ≥ 7; 61-180d: conf ≥ 8 or IDF ≥ 10)
- Recency-weighted event sorting
- Basic market regime (bull/bear/neutral)
- Filing metadata (CIK, date, type)
Output: BUY/SELL/SKIP + conviction (0.0-1.0)
Pricing Strategy: Base tier - "Event-Driven Signals"
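The tiered selection and recency-weighted sorting listed under Features could be sketched roughly as follows. Only the tier thresholds come from this document. The `Event` record layout, the 0-10 confidence scale, the exponential decay, and the 30-day half-life are assumptions made for illustration, not the actual V7 implementation.

```python
from dataclasses import dataclass


@dataclass
class Event:
    # Hypothetical record layout; field names are assumptions for illustration.
    description: str
    age_days: int      # days between the event and the filing date
    confidence: float  # extraction confidence (assumed 0-10 scale)
    idf: float         # rarity score of the event type


def passes_tier(event: Event) -> bool:
    """Tier thresholds: 0-14d all, 15-60d conf >= 7, 61-180d conf >= 8 or IDF >= 10."""
    if 0 <= event.age_days <= 14:
        return True
    if 15 <= event.age_days <= 60:
        return event.confidence >= 7
    if 61 <= event.age_days <= 180:
        return event.confidence >= 8 or event.idf >= 10
    return False  # older than 180 days, or dated after the filing


def recency_weight(event: Event, half_life_days: float = 30.0) -> float:
    """Assumed weighting: confidence decayed exponentially with event age."""
    return event.confidence * 0.5 ** (event.age_days / half_life_days)


def select_events(events: list[Event]) -> list[Event]:
    """Keep events that pass their tier, highest recency weight first."""
    kept = [e for e in events if passes_tier(e)]
    return sorted(kept, key=recency_weight, reverse=True)


events = [
    Event("CEO departure", age_days=3, confidence=5, idf=2),
    Event("Supply agreement", age_days=40, confidence=6, idf=4),
    Event("Going-concern warning", age_days=120, confidence=6, idf=12),
    Event("Routine amendment", age_days=200, confidence=9, idf=11),
]
selected = select_events(events)
for e in selected:
    print(e.description, round(recency_weight(e), 3))
```

In this example the supply agreement is dropped for low confidence in the 15-60 day tier, and the routine amendment is dropped for age despite its high scores.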
V8 - Event + Market Context Signals PLANNED
Status: Next priority (if the insider and agreement systems are not ready)
New Features Added:
- VIX (volatility regime at filing date)
- Sector performance vs SPY (relative strength)
- Market breadth (% stocks above 200-day MA)
- Credit spreads (high-yield vs treasury spread)
- Interest rate environment (10-year yield, Fed funds rate)
- Market momentum (SPY 30/60/90 day returns)
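Two of the features above, market momentum and market breadth, reduce to simple price arithmetic. The sketch below assumes daily closing prices ordered oldest to newest and interprets "30/60/90 day" as trading-day rows. The document does not say whether calendar or trading days are intended, and the function names are hypothetical.

```python
from statistics import mean
from typing import Optional


def trailing_return(closes: list[float], lookback: int) -> Optional[float]:
    """Simple return over the last `lookback` rows, or None if history is too short."""
    if len(closes) <= lookback:
        return None
    return closes[-1] / closes[-1 - lookback] - 1.0


def above_moving_average(closes: list[float], window: int = 200) -> Optional[bool]:
    """True if the latest close is above the `window`-day simple moving average."""
    if len(closes) < window:
        return None
    return closes[-1] > mean(closes[-window:])


def market_breadth(universe: dict[str, list[float]], window: int = 200) -> Optional[float]:
    """Percent of stocks (with enough history) trading above their moving average."""
    flags = [above_moving_average(c, window) for c in universe.values()]
    valid = [f for f in flags if f is not None]
    if not valid:
        return None
    return 100.0 * sum(valid) / len(valid)


# Toy data: a steadily rising SPY series and a three-stock universe.
spy = [100.0 + i for i in range(91)]
momentum = {n: trailing_return(spy, n) for n in (30, 60, 90)}

universe = {"AAA": [1.0, 2.0, 3.0, 4.0], "BBB": [4.0, 3.0, 2.0, 1.0], "CCC": [1.0, 1.0]}
breadth = market_breadth(universe, window=3)  # CCC is skipped: too little history
```

Stocks without enough history are excluded from the breadth denominator rather than counted as "below", which is one possible convention among several.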
Why Separate Model:
- Some clients want pure event-driven (no market timing dependency)
- Market context requires different data infrastructure
- Adds cost (market data feeds)
- Different customer need: macro-aware vs event-pure
Pricing Strategy: Mid tier - "Event + Market Context Signals" (2x base price)
Data Sources to Build:
- VIX historical data (Yahoo Finance, CBOE API)
- SPY and sector ETF prices (already have infrastructure)
- Treasury yields (FRED API - Federal Reserve Economic Data)
- Credit spreads (HYG, TLT ETF data)
- Market breadth (NYSE advance/decline, need new source)
V9 - Event + Fundamental Signals PLANNED
New Features Added:
- Valuation ratios: P/E, P/B, EV/EBITDA, P/S
- Leverage metrics: Debt/Equity, Interest Coverage, Net Debt/EBITDA
- Growth rates: Revenue growth YoY, Earnings growth YoY
- Profitability: Gross margin, Operating margin, ROE, ROA
- Quality metrics: Free cash flow yield, Asset turnover
- Relative valuation: vs sector median, vs historical average
Why Separate Model:
- Value investors want this, growth investors don't care
- Fundamentals require parsing financial statements (complexity)
- Some funds use only qualitative catalysts (events) not quantitative
- Can charge premium for fundamental analysis layer
Pricing Strategy: Premium tier - "Event + Fundamental Signals" (3x base price)
Integration Challenges:
- XBRL parsing is complex (standardization issues)
- Different fiscal year ends (calendar vs non-calendar)
- TTM calculations require previous quarters
- Some companies don't report all metrics
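The TTM and missing-metric challenges above can be illustrated with a small sketch. It assumes quarterly values have already been extracted and ordered oldest to newest. The function names are hypothetical, and returning `None` for unavailable metrics is one possible convention, not a settled design.

```python
from statistics import median
from typing import Optional


def ttm(quarterly_values: list[Optional[float]]) -> Optional[float]:
    """Trailing-twelve-month total: sum of the four most recent quarters.

    Returns None when fewer than four quarters exist or any of the last
    four is missing, so downstream ratios are not built on partial data.
    """
    last_four = quarterly_values[-4:]
    if len(last_four) < 4 or any(v is None for v in last_four):
        return None
    return sum(last_four)


def pe_ratio(price: float, ttm_eps: Optional[float]) -> Optional[float]:
    """P/E is treated as undefined for missing or non-positive earnings."""
    if ttm_eps is None or ttm_eps <= 0:
        return None
    return price / ttm_eps


def relative_to_sector(value: Optional[float],
                       sector_values: list[Optional[float]]) -> Optional[float]:
    """Ratio of a company's metric to the sector median, ignoring missing peers."""
    peers = [v for v in sector_values if v is not None]
    if value is None or not peers:
        return None
    return value / median(peers)


eps_quarters = [0.5, 0.75, 1.0, 1.25, 1.0]   # oldest to newest
ttm_eps = ttm(eps_quarters)                  # sums the last four quarters
pe = pe_ratio(60.0, ttm_eps)
relative_pe = relative_to_sector(pe, [10.0, None, 20.0, 30.0])
```

Because the sum is taken over the four most recent reported quarters, the same function works for calendar and non-calendar fiscal years as long as the quarters are ordered by period end.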
V10 - Event + Technical Signals PLANNED
New Features Added:
- Momentum indicators: RSI (14), MACD, Rate of Change
- Trend indicators: Price vs 50/200 day MA, moving average crossovers
- Volume indicators: Volume trend, OBV, accumulation/distribution
- Volatility indicators: Historical volatility, Bollinger Bands, ATR
- Price action: Support/resistance breaks, 52-week highs/lows
- Chart patterns: Breakouts, consolidations (if detectable)
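As an example of the momentum indicators listed above, RSI (14) can be computed from closing prices alone. This is a minimal sketch of the standard RSI with Wilder's smoothing. The function name and the handling of a completely flat series (returned as 50) are assumptions.

```python
from typing import Optional


def rsi(closes: list[float], period: int = 14) -> Optional[float]:
    """Relative Strength Index using Wilder's smoothing.

    Returns None when there are fewer than `period` price changes.
    """
    if len(closes) < period + 1:
        return None
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    # Seed with simple averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    # Wilder's smoothing over the remaining changes.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period

    if avg_loss == 0:
        # No losses at all: 100 if there were gains, 50 for a flat series (assumed convention).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


rising = [float(i) for i in range(1, 31)]
falling = list(reversed(rising))
choppy = [10.0 + (i % 2) for i in range(15)]  # alternates 10, 11, 10, 11, ...

print(rsi(rising), rsi(falling), rsi(choppy))
```

The `period + 1` minimum history requirement is a small instance of the "requires extensive price/volume history" point: every indicator carries its own warm-up window.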
Why Separate Model:
- Technical traders are a specific customer segment
- Requires extensive price/volume history
- Many fundamental investors think TA is voodoo
- Adds significant data infrastructure cost
Pricing Strategy: Premium tier - "Event + Technical Signals" (3x base price)
V11 - Elite All-Features Model FUTURE
Status: After V7-V10 proven
Features (The "Kitchen Sink"):
- Events (V7)
- Market context (V8)
- Fundamentals (V9)
- Technicals (V10)
- Insider intelligence (V7a/V12)
- Agreement intelligence (V7b/V13)
- New: Alternative data, news sentiment, social media
Pricing Strategy: Enterprise tier - "Complete Alpha Engine" (5x base price)
Trade-offs:
- Higher accuracy (hopefully!)
- Harder to explain which signals drove decision
- More expensive to run (data costs, compute)
- Requires all data pipelines working
Special Purpose Models
V7a - Event + Enhanced Insider Intelligence
Status: Depends on insider system completion
New Features Added:
- Clustering detection: Multiple insiders buying same timeframe
- Magnitude analysis: Trade size vs typical holdings
- Role hierarchy: CEO/CFO trades weighted higher than directors
- Timing patterns: Buys before catalysts, post-blackout buying
- Filing velocity: Form 4 filing delay as urgency signal
- 10% owner tracking: Activist and institutional position changes
- Repeated buying: Same insider accumulating over time
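Clustering detection, the first feature above, could look something like the sketch below. The `InsiderTrade` record and its field names are hypothetical stand-ins for parsed Form 4 data, and the 30-day default window is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class InsiderTrade:
    # Hypothetical record derived from Form 4 data; field names are assumptions.
    insider: str
    role: str
    trade_date: date
    is_buy: bool


def max_buy_cluster(trades: list[InsiderTrade], window_days: int = 30) -> int:
    """Largest number of distinct insiders buying within any `window_days` span."""
    buys = sorted((t for t in trades if t.is_buy), key=lambda t: t.trade_date)
    best = 0
    for i, first in enumerate(buys):
        window_end = first.trade_date + timedelta(days=window_days)
        # Count distinct insiders so repeated buys by one person are not a "cluster".
        insiders = {t.insider for t in buys[i:] if t.trade_date <= window_end}
        best = max(best, len(insiders))
    return best


trades = [
    InsiderTrade("A. Smith", "CEO", date(2024, 1, 5), True),
    InsiderTrade("B. Jones", "CFO", date(2024, 1, 20), True),
    InsiderTrade("A. Smith", "CEO", date(2024, 1, 25), True),
    InsiderTrade("C. Lee", "Director", date(2024, 1, 10), False),  # a sale, ignored
    InsiderTrade("C. Lee", "Director", date(2024, 3, 15), True),
]
cluster_30d = max_buy_cluster(trades, window_days=30)
cluster_90d = max_buy_cluster(trades, window_days=90)
```

Counting distinct insiders keeps this feature separate from "repeated buying", which tracks one insider accumulating over time.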
Why Valuable:
- Insiders know more about the company than outside investors, and following their publicly filed trades is legal
- Clustering = high conviction across management
- Large unusual trades = material information
- Timing relative to events = predictive
Pricing: Premium add-on (+$Y/month to base)
V7b - Event + Agreement Intelligence
Status: Depends on agreement extraction completion
New Features Added:
- Agreement type classification: Licensing, supply, distribution, JV, M&A, employment
- Strategic importance scoring: Exclusive rights, duration, scope
- Revenue signals: Minimum commitments, milestone payments, royalty rates
- Competitive moat indicators: IP licensing, long-term locks, exclusivity
- Risk factors: Termination clauses, contingencies, penalty provisions
- Counterparty analysis: Size/quality of partner (Fortune 500 vs startup)
Why Valuable:
- Agreements create future revenue visibility
- Exclusive agreements = competitive moats
- Large commitments = material revenue impact
- Strategic partnerships = validation
Pricing: Premium add-on (+$Z/month to base)
Architecture Decision: Modular Add-Ons (Recommended)
The Approach:
Base: V7 - Event Signals
Add-ons customers can choose:
- +Insider Intelligence (+$Y/month)
- +Agreement Intelligence (+$Z/month)
- +Market Context (+$W/month)
- +Fundamentals (+$X/month)
Train model variants:
- V7
- V7 + Insider
- V7 + Agreements
- V7 + Insider + Agreements
- V7 + Insider + Agreements + Market
- etc...
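To make the size of the variant list concrete, the sketch below enumerates every combination of the four add-ons named above. With four independent add-ons, full coverage means 2^4 = 16 model variants, including plain V7. The naming scheme is an assumption for illustration.

```python
from itertools import combinations

ADD_ONS = ["Insider", "Agreements", "Market", "Fundamentals"]


def model_variants(add_ons: list[str], base: str = "V7") -> list[str]:
    """Every subset of add-ons on top of the base model, smallest subsets first."""
    variants = []
    for size in range(len(add_ons) + 1):
        for combo in combinations(add_ons, size):
            variants.append(" + ".join((base,) + combo))
    return variants


variants = model_variants(ADD_ONS)
print(len(variants))
for name in variants:
    print(name)
```

Each additional add-on doubles the count, so training only the combinations customers actually request may be more practical than training all of them.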
Pros:
- Maximum customer flexibility
- Can charge for each feature
- A/B test feature value
- Customers pay for what they need
Cons:
- Need to train multiple variants
- More complex infrastructure
- Version management complexity
Sales & Pricing Strategy
Tier 1: Base Event Signals ($X/month)
Target: 1000 customers
Pitch: "Pure event-driven alpha from SEC filings"
Tier 2: Event + Specialized Feature ($2X/month)
Options:
- Event + Insider ($2X)
- Event + Agreements ($2X)
- Event + Market Context ($2X)
Target: 300 customers per variant
Pitch: "Enhanced with [insider intelligence / agreement analysis / macro awareness]"
Tier 3: Event + Multiple Features ($3X/month)
Popular combos:
- Event + Insider + Agreements
- Event + Market + Fundamentals
- Event + Market + Technical
Target: 100 customers
Pitch: "Multi-factor model combining [X, Y, Z]"
Tier 4: Elite Everything ($5X/month)
Features: All of the above
Target: 20 enterprise customers
Pitch: "Complete Alpha Engine - our best model"
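The customer targets and price multiples above imply a revenue mix that can be worked out in units of the base price X. The sketch assumes all targets are met at the same time, that Tier 2 has the three variants listed at 300 customers each, and that no customer is counted twice.

```python
# (tier name, target customers, price as a multiple of the base price X)
TIERS = [
    ("Tier 1: Base Event Signals", 1000, 1),
    ("Tier 2: Event + Specialized Feature", 3 * 300, 2),  # three variants, 300 each
    ("Tier 3: Event + Multiple Features", 100, 3),
    ("Tier 4: Elite Everything", 20, 5),
]


def monthly_revenue_in_x(tiers):
    """Monthly revenue per tier, expressed as a multiple of X."""
    return {name: customers * multiple for name, customers, multiple in tiers}


revenue = monthly_revenue_in_x(TIERS)
total_in_x = sum(revenue.values())
tier2_share = revenue["Tier 2: Event + Specialized Feature"] / total_in_x

for name, amount in revenue.items():
    print(f"{name}: {amount}X per month")
print(f"Total: {total_in_x}X per month")
```

Under these assumptions the total is 3200X per month, and Tier 2 would contribute 1800X of it, more than the base tier.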
Implementation Prioritization
Phase 1: Validate V7 (Current)
- Complete V7 training
- Build inference script
- Measure signal quality on test set (2023-2024)
- Build deterministic portfolio manager
- Run backtest, measure alpha
Success criteria: V7 generates alpha above benchmark
Phase 2: Choose Next Feature (Decision Tree)
If insider system is ready:
- Build V7a (Events + Insider)
- Measure incremental alpha vs V7
- If alpha improves significantly: Keep as add-on feature
If agreement extraction is ready:
- Build V7b (Events + Agreements)
- Measure incremental alpha vs V7
- If alpha improves significantly: Keep as add-on feature
If neither ready:
- Build V8 (Events + Market Context)
- Easiest to implement (just API calls)
- Validates modular architecture
Phase 3: Combine Winners
If both insider and agreements add alpha:
- Build V7ab (Events + Insider + Agreements)
- Test for interaction effects
- Measure combined alpha
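The Phase 2 and Phase 3 decisions can be written down as two small functions. This is only a restatement of the branches above in code. The function names are hypothetical, and whether a feature "adds alpha" is taken as an input decided elsewhere.

```python
from typing import Optional


def next_builds(insider_ready: bool, agreements_ready: bool) -> list[str]:
    """Phase 2: build whichever specialized models have their data ready, else fall back to V8."""
    builds = []
    if insider_ready:
        builds.append("V7a")  # Events + Insider
    if agreements_ready:
        builds.append("V7b")  # Events + Agreements
    if not builds:
        builds.append("V8")   # Events + Market Context
    return builds


def combined_build(insider_adds_alpha: bool, agreements_add_alpha: bool) -> Optional[str]:
    """Phase 3: build the combined model only if both features proved useful on their own."""
    if insider_adds_alpha and agreements_add_alpha:
        return "V7ab"
    return None


print(next_builds(insider_ready=False, agreements_ready=False))
print(next_builds(insider_ready=True, agreements_ready=True))
print(combined_build(insider_adds_alpha=True, agreements_add_alpha=False))
```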
Phase 4: Build Out Tiers
After proving modular approach works:
- Build V8 (Market Context)
- Build V9 (Fundamentals)
- Build V10 (Technicals)
- Build V11 (Everything)
Each time:
- Measure incremental alpha
- Validate on test set
- Price based on value added
Success Metrics
V7 (Events Only):
- Signal accuracy > 60%
- Sharpe ratio > 1.5
- Positive alpha vs SPY
- Conviction calibration (0.9 conv → 90% success)
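Conviction calibration can be checked by grouping signals into conviction bins and comparing each bin's average conviction with its realized success rate. The sketch below assumes each signal has been reduced to a `(conviction, was_correct)` pair. The equal-width binning and the count-weighted average gap used as a summary number are common choices, not something this roadmap specifies.

```python
from collections import defaultdict


def calibration_table(signals: list[tuple[float, bool]], n_bins: int = 10):
    """Group (conviction, was_correct) pairs into equal-width conviction bins.

    Returns {bin_index: (mean_conviction, hit_rate, count)}.
    """
    bins = defaultdict(list)
    for conviction, correct in signals:
        index = min(int(conviction * n_bins), n_bins - 1)  # conviction 1.0 goes in the top bin
        bins[index].append((conviction, correct))
    table = {}
    for index, rows in sorted(bins.items()):
        mean_conviction = sum(c for c, _ in rows) / len(rows)
        hit_rate = sum(1 for _, ok in rows if ok) / len(rows)
        table[index] = (mean_conviction, hit_rate, len(rows))
    return table


def calibration_error(table) -> float:
    """Count-weighted average gap between stated conviction and realized hit rate."""
    total = sum(count for _, _, count in table.values())
    return sum(count * abs(conv - hit) for conv, hit, count in table.values()) / total


# Toy data: the 0.9 signals are well calibrated, the 0.6 signals are overconfident.
signals = [(0.9, True)] * 9 + [(0.9, False)] + [(0.6, True)] * 3 + [(0.6, False)] * 7
table = calibration_table(signals)
error = calibration_error(table)
for mean_conviction, hit_rate, count in table.values():
    print(f"conviction {mean_conviction:.2f} -> hit rate {hit_rate:.2f} (n={count})")
```

A well-calibrated model would show hit rates close to the mean conviction in every bin, so the error would be near zero.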
V7+ with Additional Features:
- Incremental accuracy improvement > +5%
- Incremental Sharpe improvement > +0.3
- Incremental alpha > +2% annualized
- Feature actually used by model (ablation study)
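The three numeric thresholds above can be applied mechanically once a baseline and a variant have been backtested. In this sketch the `BacktestMetrics` container is hypothetical, and "+5%" accuracy is read as five percentage points, which is an assumption since the text does not say whether the improvement is absolute or relative. The ablation check is not covered here.

```python
from dataclasses import dataclass


@dataclass
class BacktestMetrics:
    # Hypothetical container for backtest results.
    accuracy: float      # fraction of correct signals, e.g. 0.61
    sharpe: float        # annualized Sharpe ratio
    annual_alpha: float  # annualized alpha vs benchmark, e.g. 0.03 for 3%


def feature_gate(base: BacktestMetrics, variant: BacktestMetrics) -> dict[str, bool]:
    """Check each incremental-improvement threshold for a V7+feature variant."""
    return {
        "accuracy": variant.accuracy - base.accuracy > 0.05,      # > +5 points (assumed absolute)
        "sharpe": variant.sharpe - base.sharpe > 0.3,             # > +0.3
        "alpha": variant.annual_alpha - base.annual_alpha > 0.02, # > +2% annualized
    }


v7 = BacktestMetrics(accuracy=0.61, sharpe=1.6, annual_alpha=0.03)
v7_plus = BacktestMetrics(accuracy=0.68, sharpe=2.0, annual_alpha=0.04)

gate = feature_gate(v7, v7_plus)
passes_all = all(gate.values())
print(gate, passes_all)
```

In this made-up example the variant clears the accuracy and Sharpe thresholds but not the alpha threshold, so it would not pass if all three are required.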
Product Success:
- Customer acquisition rate
- Customer retention (monthly churn)
- Upsell rate (base → premium tiers)
- Net revenue retention
- Customer satisfaction (NPS score)
Next Actions
- Complete V7 training and evaluation (in progress)
- Measure V7 baseline performance on test set
- Decide which feature to add next based on:
  - What's ready (insider vs agreements vs market)
  - What's easiest (market context = easiest)
  - What adds most value (TBD from testing)
- Build data pipeline for chosen feature
- Train V7+feature model and measure improvement
- Iterate based on results