Competitive Analysis: Event-Based Architecture vs Fintool

Two fundamentally different architectural philosophies for processing SEC filings, and why knowledge compression creates unique competitive advantages and a defensible moat.
To: Raul (CEO)
From: Technical Team
Date: October 31, 2025
Status: Strategic Analysis

Executive Summary

Fintool and our system both process SEC filings, but represent fundamentally different architectural philosophies:

Fintool: RAG (Retrieval-Augmented Generation) - Store everything, retrieve on demand, let LLMs figure it out

Our System: Semantic Event Extraction - Compress knowledge upfront, build temporal narratives, enable predictive modeling

Key Insight: By extracting structured events rather than searching raw text, we achieve massive knowledge compression (747K filings → 11.9M events → 388-3,558 event types) while building a complete temporal narrative of each company's strategic evolution.

Architecture Comparison

Fintool's Approach

"Smart Search Over Everything"
SEC Filings (3K daily)
  → Apache Spark ingestion (5TB/decade)
  → Chunking (400 tokens, 10% overlap)
  → Vector embeddings (70M chunks)
  → Elasticsearch storage
  → Query-time retrieval (BM25 + semantic)
  → LLM processing (50B tokens/week)
  → Answer generation with citations

Key Characteristics:

  • Data volume: 5TB/decade, 70M chunks, 2M documents
  • Processing: 50 billion tokens/week through OpenAI
  • Model: GPT-4o for complex, Llama-3 8B for simple
  • Use case: Question answering for institutional investors
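The chunking step in Fintool's pipeline (400-token windows with 10% overlap) can be sketched as follows. This is an illustrative reconstruction, not Fintool's code: token counts here are list elements standing in for real tokenizer output.

```python
# Sketch of fixed-size chunking with 10% (40-token) overlap.
def chunk(tokens, size=400, overlap=40):
    """Yield overlapping windows over a token list."""
    step = size - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + size]

filing = ["tok"] * 1000
chunks = list(chunk(filing))
# A 1,000-token filing yields 3 chunks, starting at offsets 0, 360, 720;
# the final chunk is a 280-token remainder.
```

At 70M chunks, this fixed-window approach stores every filing many times over in overlapping fragments, which is exactly the redundancy our event extraction avoids.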

Our Approach

"Extract Once, Use Forever"
SEC Filings (747K processed)
  → Pattern matching (30 event types in C)
  → vLLM batch extraction (GPU-accelerated)
  → Structured events (11.9M events)
  → Event compression (388-3,558 types)
  → Temporal sequences per company
  → Transformer/Q-learning models
  → Return predictions + trading signals

Key Characteristics:

  • Data volume: 11.9M structured events from 747K filings
  • Processing: One-time extraction, cached forever
  • Model: Custom transformers + Q-learning agents
  • Use case: Predictive trading, alpha generation, company scoring
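A structured event in this pipeline carries the "who did what, when, how important" breakdown described below. The field names here are illustrative stand-ins, not our production schema:

```python
from dataclasses import dataclass

# Illustrative event record; fields mirror the subject + verb + object,
# temporal certainty, strategic scoring, and sentiment dimensions.
@dataclass
class Event:
    cik: str            # SEC company identifier
    date: str           # event date (ISO 8601)
    subject: str        # who
    verb: str           # canonical event type, e.g. "declared_dividend"
    obj: str            # what the action was applied to
    certainty: float    # temporal certainty, 0-1
    importance: float   # strategic score
    sentiment: float    # market-impact sentiment, -1 to 1

evt = Event("0000320193", "2023-05-04", "Apple", "declared_dividend",
            "quarterly cash dividend", 0.95, 0.4, 0.2)
```

One such record replaces the dozens of overlapping text chunks a RAG system would store for the same disclosure.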

The Knowledge Compression Advantage

"We get huge knowledge compression" - here's the math:

| | Fintool | Our System |
|---|---|---|
| Storage | 5TB per decade | 3GB per decade |
| Knowledge units | 70M text chunks | 11.9M events (236K sequences) |
| Ongoing cost | 50B tokens/week (~$1M+/week) | $5K-10K one-time |

~1,666x Compression Ratio (5TB → 3GB)

Knowledge Preserved:

  • Who did what: subject + verb + object
  • When it happened: temporal certainty
  • How important: strategic scoring
  • Market impact: sentiment, materiality

Data Efficiency Comparison

| Metric | Fintool | Our System | Advantage |
|---|---|---|---|
| Storage | 5TB/decade | 5GB/decade | 1000x smaller |
| Processing | 50B tokens/week | One-time extraction | Infinite reuse |
| Query cost | $1M+/week | ~$0 (cached) | ~100% savings |
| Latency | 2-5s per query | <10ms (lookup) | 200-500x faster |

Use Case Comparison

| Use Case | Fintool | Our System |
|---|---|---|
| Question answering | Excellent | Better for "what did they do?" |
| Return prediction | Not possible | Primary use case |
| Alpha generation | No predictive model | Optimized for pre-price signals |
| Q-learning trading | Can't learn from text | Clear reward signals |
| Company scoring | ⚠️ Manual analysis | Automated multi-factor |
| Timeline analysis | ⚠️ Requires multiple queries | Native temporal sequences |
| Real-time decisions | 2-5s latency | Pre-computed |

Token Cost Analysis

Fintool

$1M+
per week operational cost
  • Every query triggers new LLM calls
  • Must reprocess same information repeatedly
  • Scales linearly with usage
  • 50 billion tokens/week through OpenAI

Our System

$5K-10K
one-time extraction cost
  • Extract once, use forever
  • Zero marginal cost per prediction
  • Scales with model inference only
  • Fixed cost regardless of usage
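The cost asymmetry above makes break-even trivially fast. A rough amortization sketch using the figures quoted in this section:

```python
# Back-of-envelope break-even between Fintool's ongoing weekly cost and our
# one-time extraction cost, using the figures quoted above.
fintool_weekly = 1_000_000   # $1M+/week operational
ours_one_time = 10_000       # upper end of the $5K-10K extraction estimate

weeks_to_break_even = ours_one_time / fintool_weekly
# 0.01 weeks: the entire one-time cost is recouped in under two hours
# of Fintool-scale operation, and every week after that is pure savings.
```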

🔮 Event Oracle: Our Q&A Advantage

An LLM-powered Q&A system that queries our structured event database instead of raw text chunks. Key difference: Fintool searches text to find "what they said"; Event Oracle queries events to answer "what they did."

Example: Timeline Questions

Fintool's Approach

1. Search 50+ filing chunks
2. Send 10,000+ tokens to GPT-4
3. Parse unstructured text
4. Synthesize answer

Cost: $0.20
Latency: 3-5s

Event Oracle Approach

1. SQL query on events table
2. Return structured events:
- expanded_cloud_services
- partnered_enterprise
- declared_dividend
3. Format as timeline

Cost: $0.001
Latency: <100ms
Result: 30-50x faster, 200x cheaper, more accurate
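The timeline lookup above reduces to a plain SQL query, with no LLM call on the hot path. A minimal sketch using SQLite and an illustrative schema (the production table and column names may differ):

```python
import sqlite3

# Build a toy events table and run the timeline query directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (cik TEXT, date TEXT, verb TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("0000789019", "2023-01-18", "expanded_cloud_services"),
    ("0000789019", "2023-04-25", "partnered_enterprise"),
    ("0000789019", "2023-06-13", "declared_dividend"),
])

timeline = conn.execute(
    "SELECT date, verb FROM events WHERE cik = ? ORDER BY date",
    ("0000789019",),
).fetchall()
# [('2023-01-18', 'expanded_cloud_services'),
#  ('2023-04-25', 'partnered_enterprise'),
#  ('2023-06-13', 'declared_dividend')]
```

An LLM is only needed afterward, if at all, to phrase the already-structured timeline as prose.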

Pattern Questions - Unique Capability

Question: "Show me all companies that had workforce reductions followed by debt refinancing in 2023"

Fintool:

Cannot answer: it would require manually querying every company, with no temporal pattern matching.

Event Oracle:

SELECT a.cik, a.date AS reduction_date, b.date AS refinance_date
FROM events a
JOIN events b ON a.cik = b.cik
WHERE a.verb = 'workforce_reduction'
  AND b.verb = 'refinanced'
  AND a.date >= '2023-01-01' AND a.date < '2024-01-01'
  AND b.date BETWEEN a.date AND a.date + INTERVAL '90 days'

Returns: 87 companies with this distress pattern + subsequent returns
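The distress-pattern self-join above can be demonstrated end to end with synthetic data. This sketch uses SQLite, whose julianday() function handles the 90-day window; the production schema presumably differs:

```python
import sqlite3

# Two companies: A refinances 42 days after a workforce reduction (a match),
# B refinances 203 days after (outside the 90-day window).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (cik TEXT, date TEXT, verb TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("A", "2023-02-01", "workforce_reduction"),
    ("A", "2023-03-15", "refinanced"),
    ("B", "2023-01-10", "workforce_reduction"),
    ("B", "2023-08-01", "refinanced"),
])

matches = conn.execute("""
    SELECT a.cik, a.date, b.date
    FROM events a
    JOIN events b ON a.cik = b.cik
    WHERE a.verb = 'workforce_reduction'
      AND b.verb = 'refinanced'
      AND a.date LIKE '2023%'
      AND julianday(b.date) BETWEEN julianday(a.date)
                                AND julianday(a.date) + 90
""").fetchall()
# [('A', '2023-02-01', '2023-03-15')]
```

The same join over the full 11.9M-event table is what surfaces the 87-company cohort cited above.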

Our Competitive Moat

Fintool Can Be Replicated

  • RAG is commodity (OpenAI docs are public)
  • Vector databases are off-the-shelf
  • Chunking strategies are known
  • No proprietary training data
  • Easily competed away by OpenAI or Bloomberg

Our System is Defensible

  • 30 event types = years of domain expertise
  • 11.9M events = proprietary training data
  • Compression strategies = novel research
  • Trained models = IP
  • Deep moat: data + models + domain knowledge

Multi-Model Architecture Advantage

Fintool: One architecture (RAG)

Our System: Multiple specialized models from same event foundation

Event Stream
    ├→ Transformer: Return prediction (R² = 0.23+)
    ├→ Q-learning: Trading decisions (Sharpe optimization)
    ├→ GradientBoosting: Fast baseline (9% correlation)
    └→ Event Oracle: Question answering (when needed)

Positioning Statement

Fintool: "AI that helps you understand what companies are saying" (descriptive, backward-looking)

vs.

Ours: "AI that predicts what companies will do next" (predictive, forward-looking)

The Tagline

From "what did they say?"
to "what did they do?"
to "what will they do next?"

Bottom Line for Leadership

We're not building a better search engine for filings.

We're building a semantic abstraction layer that compresses years of corporate history into predictive temporal sequences, enabling quantitative models to learn what hedge fund analysts do manually: read between the lines and predict what happens next.

That's knowledge compression. That's our moat. That's the product.