Back to Event Feed API

Query Language Options

Designing a user-friendly query interface

SQL is too complex. Let's build something better.

The Core Problem

Users want to ask questions like:

  • • "Show me all red flag events for tech companies in Q3 2024 with >$50M amounts"
  • • "Find companies with 3+ material negative events in the last 6 months"
  • • "Show me all upgrades followed by expansions within 90 days"

But they don't want to write SQL queries.

Proposed Solution: Multi-Tier Query System

TIER 1

Query Templates

⭐⭐⭐⭐⭐ User-Friendly

Pre-built queries for common use cases. No coding required.

Example:

GET /api/templates/red-flag-scanner
{
  "cik": [789019, 1018724],
  "date_range": "2024-Q3"
}

Suggested Templates:

📊 red-flag-scanner
📈 growth-signals
💎 alpha-opportunities
📅 company-timeline
📉 sentiment-shift
🔗 event-correlation
💰 large-transactions
👔 insider-signals
🎯 competitive-intel
⚖️ compliance-tracker
✓ Instant to use ✓ No learning curve ⚠ Limited flexibility
TIER 2

Query Builder DSL

⭐⭐⭐⭐ User-Friendly

Flexible JSON query language without SQL. Covers 80% of use cases.

Example:

POST /api/query
{
  "filters": {
    "and": [
      {"event_type": {"like": "upgraded%"}},
      {"sentiment": "positive"},
      {"idf_score": {"gte": 7.0}},
      {"filing_date": {"between": ["2024-01-01", "2024-12-31"]}}
    ]
  },
  "sort": [{"filing_date": "desc"}],
  "limit": 100
}

Supported Operators:

eq, ne
gt, gte, lt, lte
like, ilike
in, not_in
between
and, or, not
✓ Highly flexible ✓ Programmatic access → Recommended for Phase 1
TIER 3

Advanced Queries

⭐⭐ Power Users

Complex pattern detection, event sequences, and aggregations for data scientists.

Pattern Detection Example:

POST /api/query/advanced
{
  "type": "pattern_detection",
  "pattern": {
    "events": [
      {"event_type": "dismissed_auditor", "name": "red_flag"},
      {"event_type": "material_weakness", "name": "control_issue"}
    ],
    "within_days": 180,
    "sequence": "any"
  },
  "groupBy": "cik"
}

Capabilities:

  • • Pattern detection: "Event A followed by Event B within N days"
  • • Event sequences: "Upgrade → Expansion → Partnership"
  • • Aggregations: "Companies with 3+ negative events in 6 months"
  • • Time-series analysis
✓ Maximum flexibility ⚠ Steeper learning curve → Phase 2
TIER 4

Natural Language Queries

⭐⭐⭐⭐⭐ User-Friendly

LLM-powered queries. Ask questions in plain English.

Example:

POST /api/query/natural
{
  "query": "Show me companies that got credit upgrades in 2024 and then expanded into new markets",
  "max_results": 50
}

How it works:

  1. 1. LLM parses natural language intent
  2. 2. Converts to Query Builder DSL or SQL
  3. 3. Executes query and returns results
  4. 4. Explains what it found
✓ Zero learning curve ⚠ LLM API costs → Phase 3 (optional)

Alternative Approaches

Option A: GraphQL Endpoint

Pros:

  • • Industry standard
  • • Flexible nested queries
  • • Great tooling (GraphiQL)

Cons:

  • • Learning curve for non-developers
  • • May be overkill for simple queries

Option B: Saved Queries / Bookmarks

Build queries once, save them, reuse them.

• Save queries with names

• Share queries with team

• Schedule queries (alerts)

• Export results to CSV/JSON

Option C: Visual Query Builder UI

Drag-and-drop interface that generates queries.

• Select fields to filter with dropdowns

• Add conditions visually

• Preview results in real-time

• Export as JSON/SQL

Decision Matrix

Approach User-Friendly Flexibility Dev Effort Best Use Case
Query Templates ⭐⭐⭐⭐⭐ ⭐⭐ Low Common queries, non-technical users
Query Builder DSL ⭐⭐⭐⭐ ⭐⭐⭐⭐⭐ Medium Power users, programmatic access
Advanced Queries ⭐⭐ ⭐⭐⭐⭐⭐ High Complex analytics, data scientists
Natural Language ⭐⭐⭐⭐⭐ ⭐⭐⭐ Medium Non-technical users, exploration
GraphQL ⭐⭐⭐ ⭐⭐⭐⭐⭐ Medium Frontend developers
Visual Builder ⭐⭐⭐⭐⭐ ⭐⭐⭐⭐ High All users, reduces support burden

Recommended Approach

Hybrid Strategy

Start simple, add complexity as needed based on real user feedback.

PHASE 1

Foundation (Week 1-2)

  • • Enhanced REST API with Query Builder DSL
  • • 5-10 pre-built query templates
  • • Comprehensive documentation with examples
PHASE 2

Power Features (Week 3-4)

  • • Pattern detection for event sequences
  • • Basic aggregations
  • • Saved queries
PHASE 3

Advanced (Month 2)

  • • Natural language queries (if budget allows)
  • • Visual query builder UI
  • • Scheduled queries / alerts

Why this order?

✓ Query Builder DSL covers 80% of use cases

✓ Templates make common queries instant

✓ Can add NL queries later if needed

✓ Avoids over-engineering before feedback

Next Steps

Questions to Answer:

  1. 1. Which tier(s) should we prioritize?
  2. 2. What are the top 5 query use cases your users need?
  3. 3. Who are the primary users and their technical level?
  4. 4. Timeline and resources available?

Then we can:

  • ✓ Design the chosen query system
  • ✓ Build examples and documentation
  • ✓ Implement the API endpoints
  • ✓ Create test coverage