
Experiments

Learning from both successes and failures

Why Document Experiments?

In ML/AI research, failed experiments are often more valuable than successes. They reveal edge cases, expose flawed assumptions, and guide future work. This is a collection of real experiments—what worked, what didn't, and what we learned.

Philosophy: If an experiment "fails" but teaches you something important about the problem space, it's actually a success. The only true failure is not learning from the attempt.

"I have not failed. I've just found 10,000 ways that won't work." — Thomas Edison
Failed (Insightful)
v1: Q-Learning for Trading
November 4, 2025
Trained a Q-learning agent to decide WHEN to trade based on transformer return predictions. The agent learned to HOLD everything (100% HOLD, 0% return), but the failure was insightful...
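For context, a minimal sketch of the kind of Q-learning loop involved, assuming a tabular agent over bucketed transformer predictions with BUY/HOLD actions; the bucket count, hyperparameters, and reward definition below are illustrative assumptions, not the project's exact setup:

    import numpy as np

    # Tabular Q-learning over a bucketed predicted-return state (hypothetical sizes).
    # Actions: 0 = HOLD, 1 = BUY.
    N_BUCKETS, N_ACTIONS = 10, 2
    Q = np.zeros((N_BUCKETS, N_ACTIONS))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1
    rng = np.random.default_rng(0)

    def act(state):
        """Epsilon-greedy action selection over the Q-table."""
        if rng.random() < epsilon:
            return int(rng.integers(N_ACTIONS))
        return int(Q[state].argmax())

    def update(state, action, reward, next_state):
        """Standard one-step Q-learning TD update."""
        td_target = reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (td_target - Q[state, action])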
Key Insight:

Q-learning worked correctly, but the state representation was flawed: 99.8% of predictions fell into one bucket, because the transformer only produced positive predictions (+1.68% to +10.74%) while the bucketing thresholds assumed a -10% to +10% range. Lesson: state representation is critical in RL.
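A quick way to see the failure mode is to bucket an all-positive prediction distribution with fixed edges spanning -10% to +10% and look at how the states are actually occupied. The edge count and the prediction distribution here are illustrative assumptions, not the real v1 configuration:

    import numpy as np

    # Hypothetical diagnostic for the v1 state bug: fixed-range bucket edges
    # applied to predictions that never go negative.
    edges = np.linspace(-0.10, 0.10, 11)                  # fixed -10%..+10% buckets
    preds = np.random.uniform(0.0168, 0.1074, 100_000)    # all-positive predictions
    states = np.clip(np.digitize(preds, edges) - 1, 0, len(edges) - 2)

    occupancy = np.bincount(states, minlength=len(edges) - 1) / len(states)
    print(occupancy)   # the entire negative-return half of the state space stays empty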

🎯 Reinforcement Learning
📊 129K Episodes
⏱️ ~8 hours
Partial Success
v2: Percentile Bucketing Fix
November 4, 2025 (same day!)
Fixed the state representation from v1 using percentile-based bucketing. The agent learned selective trading: 30.5% BUY on high-confidence predictions, 69.5% HOLD on the rest. Achieved a +2.12% test return (vs. the +3.65% always-buy baseline). Success: Q-learning works! Problem revealed: transformer calibration needs improvement.
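The fix, roughly, is to derive bucket edges from the empirical distribution of training-set predictions instead of a fixed range, so every state is actually populated. A sketch under that assumption (the 10-bucket choice and the synthetic predictions are illustrative only):

    import numpy as np

    def fit_percentile_edges(train_preds, n_buckets=10):
        """Interior percentile cut points so each bucket holds roughly equal mass."""
        qs = np.linspace(0, 100, n_buckets + 1)[1:-1]
        return np.percentile(train_preds, qs)

    def to_state(pred, edges):
        return int(np.digitize(pred, edges))      # 0 .. n_buckets-1

    train_preds = np.random.uniform(0.0168, 0.1074, 50_000)   # illustrative only
    edges = fit_percentile_edges(train_preds)
    print(to_state(0.03, edges), to_state(0.09, edges))        # distinct, usable states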
Key Insight:

Q-learning worked as a diagnostic tool: it proved the algorithm works correctly, but it also revealed that the transformer's confidence levels don't correlate with actual returns. High-confidence predictions don't actually outperform average predictions.
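One way to check that calibration claim is to group realized returns by predicted-return decile and see whether the means increase. A rough sketch with made-up column names and synthetic data; the real check would use the actual predictions and realized returns:

    import numpy as np
    import pandas as pd

    # If the transformer were well calibrated, higher predicted deciles would
    # show higher mean realized returns; in v2 they did not.
    df = pd.DataFrame({
        "predicted": np.random.uniform(0.0168, 0.1074, 10_000),
        "realized": np.random.normal(0.0005, 0.02, 10_000),   # placeholder returns
    })
    df["decile"] = pd.qcut(df["predicted"], 10, labels=False)
    print(df.groupby("decile")["realized"].mean())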

Q-Learning Validated
⚠️ Transformer Calibration Issue
⏱️ ~2 hours
Success
v3: Real Portfolio Backtest
November 4, 2025 (same day!)
Built a real backtesting system to simulate actual portfolio trading. The Q-learning agent achieved +11.20% returns over 87 days by being extremely selective (only 20 trades out of 24,436 opportunities), beating the always-buy baseline (+3.65%) by 7.55 percentage points. Proves Q-learning works in practice with real portfolio mechanics!
Key Insight:

Per-trade metrics (v2's +2.12%) don't tell the full story. A real portfolio simulation with capital concentration and compounding turned the same Q-learning algorithm into +11.20% returns. The agent learned that 20 well-timed trades beat 7,440 average trades.
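The difference between the two metrics is easiest to see side by side: a per-trade average versus sequentially concentrating capital in each executed trade and compounding. The trade list and sizing rule below are illustrative, not the real backtester:

    def per_trade_average(trade_returns):
        """v2-style metric: mean fractional return per executed trade."""
        return sum(trade_returns) / len(trade_returns)

    def portfolio_return(trade_returns, starting_capital=100_000.0):
        """v3-style metric: put capital into each trade in sequence and compound."""
        capital = starting_capital
        for r in trade_returns:
            capital *= (1.0 + r)
        return capital / starting_capital - 1.0

    trades = [0.031, -0.004, 0.052, 0.018, 0.027]   # illustrative executed trades
    print(per_trade_average(trades), portfolio_return(trades))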

🎯 +11.20% Returns
📊 20 Trades / 24,436
⏱️ 87-day backtest
Failed (Valuable!)
v4: Event-Based Q-Learning
November 5, 2025
Tested whether Q-learning could learn directly from event counts, without transformers. The agent achieved +0.02% (essentially random), far below v3's +11.20%. This validates that transformers extract valuable patterns (sequences, combinations, temporal dynamics) that simple count features cannot capture. Transformers do real feature engineering, not just compression.
Key Insight:

The +11.18% gap between transformer-based Q-learning (v3) and event-count Q-learning (v4) quantifies the value of the transformer. This valuable negative result proves that transformers are necessary for extracting trading signals from SEC filings; they're not optional complexity.
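For reference, the v4 state was built from event counts alone, which throws away ordering and co-occurrence. A sketch of that kind of state construction, with made-up event names and an arbitrary count cap:

    from collections import Counter

    EVENT_TYPES = ["insider_buy", "insider_sell", "8k_filed", "guidance_change"]

    def count_state(events, cap=3):
        """Collapse a filing's event stream into a tuple of capped per-type counts."""
        counts = Counter(events)
        return tuple(min(counts.get(t, 0), cap) for t in EVENT_TYPES)

    print(count_state(["insider_buy", "insider_buy", "8k_filed"]))
    # (2, 0, 1, 0) -- sequence, combinations, and timing are all gone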

🔬 Controlled Experiment
📊 267K Filings
Hypothesis Tested
Complete
v6: Nanochat Portfolio Manager
November 2025
Trained a custom 561M-parameter LLM to generate portfolio decisions from SEC filing events. Training was successful (val_loss 0.096), and the model generates valid decisions with reasoning. However, a 100% position-size override rate revealed an architecture flaw: LLMs should generate signals (conviction), not portfolio decisions (dollar amounts).
Key Insight:

Separation of concerns is needed: the LLM handles pattern recognition and conviction scoring (0.0-1.0), while deterministic code handles portfolio construction and risk management. Also achieved a 10x backtesting speedup via pre-computed contexts (59 contexts/sec).
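A sketch of what that separation could look like: the LLM emits only a conviction score in [0, 1], and deterministic code maps it to a dollar position under explicit limits. The threshold, cap, and linear scaling are assumptions, not the project's actual sizing rules:

    def position_size(conviction: float, portfolio_value: float,
                      max_position_pct: float = 0.05, min_conviction: float = 0.6) -> float:
        """Deterministic sizing: zero below a conviction floor, capped above it."""
        if not 0.0 <= conviction <= 1.0:
            raise ValueError("conviction must be in [0, 1]")
        if conviction < min_conviction:
            return 0.0
        # Scale linearly between the conviction floor and full conviction.
        scale = (conviction - min_conviction) / (1.0 - min_conviction)
        return portfolio_value * max_position_pct * scale

    print(position_size(0.85, 1_000_000))   # $31,250 at a 5% cap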

🧠 Custom LLM (561M)
10x Speedup
🎯 Architecture Insights