Financial AI

AI systems for investment research, deal intelligence, and document-heavy workflows in private markets—where speed, traceability, and retrieval quality matter as much as model novelty.

Where we focus

We work with structured and unstructured financial data, confidential information memoranda (CIMs), CRM lineage, and precedent deals. The goal is not a one-off demo but repeatable pipelines that ingest, embed, index, and rank, then expose results through APIs and analyst tools that teams can adopt.

Flagship open build

Equity Deal Research System

A modular similarity-search system that ingests CRM records and CIM PDFs, builds multi-modal embeddings, stores vectors alongside metadata, and ranks historical deals that resemble a new opportunity. Results arrive in seconds rather than through manual desk research.

RAG · Finance · Private equity

Source code and configuration are public on GitHub—suitable as a reference architecture for teams building their own deal-memory or precedent-search layer.

Diagram: CRM and CIM ingestion through embeddings to FAISS retrieval and FastAPI or Streamlit

How it works

The implementation follows a modular layout: ingestion normalises CRM feeds and extracts PDF text; embedding layers fuse structured financial encodings with language embeddings; hybrid storage keeps vectors and deal metadata aligned; retrieval applies similarity scoring and ranking before responses are exposed through FastAPI and an analyst-facing Streamlit application.

Capabilities

Document intelligence

PDF extraction pipelines prepare CIM narratives for embedding alongside structured deal features.
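
As a rough illustration of the preparation step, the sketch below cleans text that has already been pulled out of a PDF and splits it into overlapping word windows ready for embedding. It is not taken from the repository: the function names `clean_extracted_text` and `chunk_words`, and the default window sizes, are hypothetical choices made for this example.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Normalise text as it commonly comes out of a PDF extractor."""
    # Rejoin words that were hyphenated across a line break ("reve-\nnue").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse newlines and runs of spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    cim_page = "Reve-\nnue grew\n\n 12%  in FY23."
    print(clean_extracted_text(cim_page))
    print(chunk_words("a b c d e f g", size=4, overlap=1))
```

Overlapping windows are one common way to avoid cutting a sentence's context at a chunk boundary; the right window size depends on the embedding model in use.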

Multi-modal fusion

Separate encoders for tabular signals and text with a fusion stage so both modalities influence retrieval.
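
One simple way to let both modalities influence retrieval is weighted concatenation, sketched below in plain Python. This is an assumption for illustration, not necessarily the fusion stage used in the project: each encoder output is L2-normalised, scaled by the square root of its weight, and concatenated. The names `l2_normalise`, `fuse`, and `text_weight` are hypothetical.

```python
import math


def l2_normalise(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(tabular_vec: list[float], text_vec: list[float],
         text_weight: float = 0.5) -> list[float]:
    """Concatenate two unit-normalised modality vectors with square-root weights."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    tab = l2_normalise(tabular_vec)
    txt = l2_normalise(text_vec)
    w_tab = math.sqrt(1.0 - text_weight)
    w_txt = math.sqrt(text_weight)
    return [w_tab * x for x in tab] + [w_txt * x for x in txt]


if __name__ == "__main__":
    deal_a = fuse([1.0, 0.0], [1.0, 0.0])
    deal_b = fuse([0.0, 1.0], [1.0, 0.0])
    # Inner product of the fused vectors blends the per-modality cosines.
    print(sum(a * b for a, b in zip(deal_a, deal_b)))
```

With square-root weights, the inner product of two fused vectors equals `(1 - text_weight)` times the tabular cosine similarity plus `text_weight` times the text cosine similarity, provided both modality vectors are non-zero. That makes the weight directly interpretable when tuning how much the narrative should count against the financials.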

Scalable retrieval

FAISS-backed vector search with metadata-aware ranking to surface the most relevant precedents.
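
The sketch below shows the general shape of similarity scoring followed by a metadata-aware re-rank. To keep it dependency-free, it uses a brute-force inner-product scan in plain Python in place of a FAISS index; a flat inner-product index computes the same scores, only faster. The `Deal` record, the `rank_precedents` function, and the sector-match bonus are hypothetical, since the page does not say which metadata signals the project's ranking uses.

```python
from dataclasses import dataclass


@dataclass
class Deal:
    deal_id: str
    sector: str
    vector: list[float]  # assumed to be unit-normalised


def rank_precedents(query_vec: list[float], query_sector: str,
                    deals: list[Deal], k: int = 3,
                    sector_boost: float = 0.1) -> list[tuple[float, str]]:
    """Score deals by inner product, add a bonus for a matching sector, keep top k."""
    scored = []
    for deal in deals:
        similarity = sum(q * d for q, d in zip(query_vec, deal.vector))
        bonus = sector_boost if deal.sector == query_sector else 0.0
        scored.append((similarity + bonus, deal.deal_id))
    # Highest combined score first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    history = [
        Deal("A", "software", [1.0, 0.0]),
        Deal("B", "healthcare", [0.8, 0.6]),
        Deal("C", "software", [0.6, 0.8]),
    ]
    for score, deal_id in rank_precedents([1.0, 0.0], "software", history,
                                          sector_boost=0.25):
        print(deal_id, round(score, 2))
```

In a larger deployment the vector scan would typically return a candidate set somewhat bigger than `k`, with the metadata adjustment applied only to those candidates.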

Analyst-ready interfaces

REST API for integration plus Streamlit UI for exploratory search and validation workflows.

Architecture at a glance

Diagram: layered modules and FastAPI or Streamlit interfaces
Layered modules and dual interfaces for API-driven and interactive use. The CRM-to-embedding-to-retrieval overview is shown in the first diagram above.

Technology stack

  • Python
  • FastAPI
  • Streamlit
  • FAISS
  • PDF extraction
  • YAML configuration