Competitive Analysis: Event-Based Architecture vs Fintool

Two fundamentally different architectural philosophies for processing SEC filings, and why knowledge compression creates unique competitive advantages and a defensible moat.
To: Raul (CEO)
From: Technical Team
Date: October 31, 2025
Status: Strategic Analysis

Executive Summary

Fintool and our system both process SEC filings, but represent fundamentally different architectural philosophies:

Fintool: RAG (Retrieval-Augmented Generation) - Store everything, retrieve on demand, let LLMs figure it out

Our System: Semantic Event Extraction - Compress knowledge upfront, build temporal narratives, enable predictive modeling

Key Insight: By extracting structured events rather than searching raw text, we achieve massive knowledge compression (747K filings → 11.9M events → 388-3,558 event types) while building a complete temporal narrative of each company's strategic evolution.

Architecture Comparison

Fintool's Approach

"Smart Search Over Everything"
SEC Filings (3K daily)
  → Apache Spark ingestion (5TB/decade)
  → Chunking (400 tokens, 10% overlap)
  → Vector embeddings (70M chunks)
  → Elasticsearch storage
  → Query-time retrieval (BM25 + semantic)
  → LLM processing (50B tokens/week)
  → Answer generation with citations

Key Characteristics:

  • Data volume: 5TB/decade, 70M chunks, 2M documents
  • Processing: 50 billion tokens/week through OpenAI
  • Model: GPT-4o for complex, Llama-3 8B for simple
  • Use case: Question answering for institutional investors
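The chunking step in Fintool's pipeline (400-token windows with 10% overlap) can be sketched as follows. This is an illustrative reconstruction, not Fintool's code: token counts here are list elements standing in for real tokenizer output.

```python
# Sketch of fixed-size chunking with 10% (40-token) overlap.
def chunk(tokens, size=400, overlap=40):
    """Yield overlapping windows over a token list."""
    step = size - overlap
    for start in range(0, max(len(tokens) - overlap, 1), step):
        yield tokens[start:start + size]

filing = ["tok"] * 1000
chunks = list(chunk(filing))
# A 1,000-token filing yields 3 chunks, starting at offsets 0, 360, 720;
# the final chunk is a 280-token remainder.
```

At 70M chunks, this fixed-window approach stores every filing many times over in overlapping fragments, which is exactly the redundancy our event extraction avoids.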

Our Approach

"Extract Once, Use Forever"
SEC Filings (747K processed)
  → Pattern matching (30 event types in C)
  → vLLM batch extraction (GPU-accelerated)
  → Structured events (11.9M events)
  → Event compression (388-3,558 types)
  → Temporal sequences per company
  → Transformer/Q-learning models
  → Return predictions + trading signals

Key Characteristics:

  • Data volume: 11.9M structured events from 747K filings
  • Processing: One-time extraction, cached forever
  • Model: Custom transformers + Q-learning agents
  • Use case: Predictive trading, alpha generation, company scoring
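A structured event in this pipeline carries the "who did what, when, how important" breakdown described below. The field names here are illustrative stand-ins, not our production schema:

```python
from dataclasses import dataclass

# Illustrative event record; fields mirror the subject + verb + object,
# temporal certainty, strategic scoring, and sentiment dimensions.
@dataclass
class Event:
    cik: str            # SEC company identifier
    date: str           # event date (ISO 8601)
    subject: str        # who
    verb: str           # canonical event type, e.g. "declared_dividend"
    obj: str            # what the action was applied to
    certainty: float    # temporal certainty, 0-1
    importance: float   # strategic score
    sentiment: float    # market-impact sentiment, -1 to 1

evt = Event("0000320193", "2023-05-04", "Apple", "declared_dividend",
            "quarterly cash dividend", 0.95, 0.4, 0.2)
```

One such record replaces the dozens of overlapping text chunks a RAG system would store for the same disclosure.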

The Knowledge Compression Advantage

"We get huge knowledge compression" - here's the math:

| | Fintool | Our System |
|---|---|---|
| Storage | 5TB per decade | 3GB per decade |
| Knowledge units | 70M text chunks | 11.9M events (236K sequences) |
| Ongoing cost | 50B tokens/week (~$1M+/week) | $5K-10K one-time |

~1,666x Compression Ratio (5TB → 3GB)

Knowledge Preserved:

  • Who did what: subject + verb + object
  • When it happened: temporal certainty
  • How important: strategic scoring
  • Market impact: sentiment, materiality

Data Efficiency Comparison

| Metric | Fintool | Our System | Advantage |
|---|---|---|---|
| Storage | 5TB/decade | 5GB/decade | 1000x smaller |
| Processing | 50B tokens/week | One-time extraction | Infinite reuse |
| Query cost | $1M+/week | ~$0 (cached) | ~100% savings |
| Latency | 2-5s per query | <10ms (lookup) | 200-500x faster |

Use Case Comparison

| Use Case | Fintool | Our System |
|---|---|---|
| Question answering | Excellent | Better for "what did they do?" |
| Return prediction | Not possible | Primary use case |
| Alpha generation | No predictive model | Optimized for pre-price signals |
| Q-learning trading | Can't learn from text | Clear reward signals |
| Company scoring | ⚠️ Manual analysis | Automated multi-factor |
| Timeline analysis | ⚠️ Requires multiple queries | Native temporal sequences |
| Real-time decisions | 2-5s latency | Pre-computed |

Token Cost Analysis

Fintool

$1M+
per week operational cost
  • Every query triggers new LLM calls
  • Must reprocess same information repeatedly
  • Scales linearly with usage
  • 50 billion tokens/week through OpenAI

Our System

$5K-10K
one-time extraction cost
  • Extract once, use forever
  • Zero marginal cost per prediction
  • Scales with model inference only
  • Fixed cost regardless of usage
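The cost asymmetry above makes break-even trivially fast. A rough amortization sketch using the figures quoted in this section:

```python
# Back-of-envelope break-even between Fintool's ongoing weekly cost and our
# one-time extraction cost, using the figures quoted above.
fintool_weekly = 1_000_000   # $1M+/week operational
ours_one_time = 10_000       # upper end of the $5K-10K extraction estimate

weeks_to_break_even = ours_one_time / fintool_weekly
# 0.01 weeks: the entire one-time cost is recouped in under two hours
# of Fintool-scale operation, and every week after that is pure savings.
```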

🔮 Event Oracle: Our Q&A Advantage

An LLM-powered Q&A system that queries our structured event database instead of raw text chunks. Key difference: Fintool searches text to find "what they said"; Event Oracle queries events to answer "what they did."

Example: Timeline Questions

Fintool's Approach

1. Search 50+ filing chunks
2. Send 10,000+ tokens to GPT-4
3. Parse unstructured text
4. Synthesize answer

Cost: $0.20
Latency: 3-5s

Event Oracle Approach

1. SQL query on events table
2. Return structured events:
- expanded_cloud_services
- partnered_enterprise
- declared_dividend
3. Format as timeline

Cost: $0.001
Latency: <100ms
Result: 30-50x faster, 200x cheaper, more accurate
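The timeline lookup above reduces to a plain SQL query, with no LLM call on the hot path. A minimal sketch using SQLite and an illustrative schema (the production table and column names may differ):

```python
import sqlite3

# Build a toy events table and run the timeline query directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (cik TEXT, date TEXT, verb TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("0000789019", "2023-01-18", "expanded_cloud_services"),
    ("0000789019", "2023-04-25", "partnered_enterprise"),
    ("0000789019", "2023-06-13", "declared_dividend"),
])

timeline = conn.execute(
    "SELECT date, verb FROM events WHERE cik = ? ORDER BY date",
    ("0000789019",),
).fetchall()
# [('2023-01-18', 'expanded_cloud_services'),
#  ('2023-04-25', 'partnered_enterprise'),
#  ('2023-06-13', 'declared_dividend')]
```

An LLM is only needed afterward, if at all, to phrase the already-structured timeline as prose.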

Pattern Questions - Unique Capability

Question: "Show me all companies that had workforce reductions followed by debt refinancing in 2023"

Fintool:

Cannot answer: it would require manually querying every company, with no temporal pattern matching.

Event Oracle:

SELECT a.cik, a.date AS reduction_date, b.date AS refinance_date
FROM events a
JOIN events b ON a.cik = b.cik
WHERE a.verb = 'workforce_reduction'
  AND b.verb = 'refinanced'
  AND a.date >= '2023-01-01' AND a.date < '2024-01-01'
  AND b.date BETWEEN a.date AND a.date + INTERVAL '90 days'

Returns: 87 companies with this distress pattern + subsequent returns
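The distress-pattern self-join above can be demonstrated end to end with synthetic data. This sketch uses SQLite, whose julianday() function handles the 90-day window; the production schema presumably differs:

```python
import sqlite3

# Two companies: A refinances 42 days after a workforce reduction (a match),
# B refinances 203 days after (outside the 90-day window).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (cik TEXT, date TEXT, verb TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("A", "2023-02-01", "workforce_reduction"),
    ("A", "2023-03-15", "refinanced"),
    ("B", "2023-01-10", "workforce_reduction"),
    ("B", "2023-08-01", "refinanced"),
])

matches = conn.execute("""
    SELECT a.cik, a.date, b.date
    FROM events a
    JOIN events b ON a.cik = b.cik
    WHERE a.verb = 'workforce_reduction'
      AND b.verb = 'refinanced'
      AND a.date LIKE '2023%'
      AND julianday(b.date) BETWEEN julianday(a.date)
                                AND julianday(a.date) + 90
""").fetchall()
# [('A', '2023-02-01', '2023-03-15')]
```

The same join over the full 11.9M-event table is what surfaces the 87-company cohort cited above.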

Our Competitive Moat

Fintool Can Be Replicated

  • RAG is commodity (OpenAI docs are public)
  • Vector databases are off-the-shelf
  • Chunking strategies are known
  • No proprietary training data
  • Easily competed away by OpenAI or Bloomberg

Our System is Defensible

  • 30 event types = years of domain expertise
  • 11.9M events = proprietary training data
  • Compression strategies = novel research
  • Trained models = IP
  • Deep moat: data + models + domain knowledge

Multi-Model Architecture Advantage

Fintool: One architecture (RAG)

Our System: Multiple specialized models from same event foundation

Event Stream
    ├→ Transformer: Return prediction (R² = 0.23+)
    ├→ Q-learning: Trading decisions (Sharpe optimization)
    ├→ GradientBoosting: Fast baseline (9% correlation)
    └→ Event Oracle: Question answering (when needed)

Positioning Statement

Fintool: "AI that helps you understand what companies are saying" (descriptive, backward-looking)

vs.

Ours: "AI that predicts what companies will do next" (predictive, forward-looking)

The Tagline

From "what did they say?"
to "what did they do?"
to "what will they do next?"

Bottom Line for Leadership

We're not building a better search engine for filings.

We're building a semantic abstraction layer that compresses years of corporate history into predictive temporal sequences, enabling quantitative models to learn what hedge fund analysts do manually: read between the lines and predict what happens next.

That's knowledge compression. That's our moat. That's the product.