RAG Finance

Equity Deal Research System

A modular similarity-search system that ingests CRM records and confidential information memorandum (CIM) PDFs, builds multi-modal embeddings, stores vectors alongside metadata, and ranks historical deals that resemble a new opportunity, in seconds rather than through manual desk research.

Diagram: ingestion to embeddings to retrieval

Overview

The implementation follows a modular layout: ingestion normalises CRM feeds and extracts PDF text; embedding layers fuse structured financial encodings with language embeddings; hybrid storage keeps vectors and deal metadata aligned; retrieval scores candidates by similarity and ranks them. Results are exposed through a FastAPI service and an analyst-facing Streamlit application.
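To make "hybrid storage keeps vectors and deal metadata aligned" concrete, here is a minimal sketch, assuming alignment is done by row position: the i-th vector and the i-th metadata record always describe the same deal. The `HybridStore` class, its methods, and the `deal_id` field are hypothetical names for illustration, not the system's actual interface.

```python
from dataclasses import dataclass, field


@dataclass
class HybridStore:
    """Hypothetical store: vectors and metadata share one row index."""

    vectors: list[list[float]] = field(default_factory=list)
    metadata: list[dict] = field(default_factory=list)

    def add(self, vector: list[float], meta: dict) -> int:
        """Append a vector with its metadata and return the shared row id."""
        if self.vectors and len(vector) != len(self.vectors[0]):
            raise ValueError("vector dimension does not match the store")
        # Both lists grow together, so they can never drift out of step.
        self.vectors.append(list(vector))
        self.metadata.append(dict(meta))
        return len(self.vectors) - 1

    def get(self, row: int) -> dict:
        """Look up the metadata for a row id returned by a vector search."""
        return self.metadata[row]
```

A vector index typically returns row positions, so a single `get(row)` call is enough to turn a search hit back into a deal record.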

Capabilities

Document intelligence

PDF extraction pipelines prepare CIM narratives for embedding alongside structured deal features.
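As an illustration of what preparing extracted text for embedding might involve, the sketch below cleans raw page text and splits it into overlapping word windows. It assumes the PDF text has already been extracted to a string; the function names, window size, and overlap are hypothetical choices, not the pipeline's actual settings.

```python
import re


def clean_text(raw: str) -> str:
    """Rejoin words hyphenated across line breaks, then collapse whitespace."""
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    return re.sub(r"\s+", " ", joined).strip()


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

Overlapping windows reduce the chance that a sentence describing, say, revenue mix is cut in half at a chunk boundary and lost to retrieval.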

Multi-modal fusion

Separate encoders handle tabular signals and text, and a fusion stage combines their outputs so that both modalities influence retrieval.
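One simple way a fusion stage could work is weighted concatenation of normalised vectors. The sketch below is an assumption for illustration, not the system's actual fusion method; `fuse` and `text_weight` are hypothetical names. With square-root weights, the inner product of two fused vectors equals `(1 - text_weight)` times the tabular cosine similarity plus `text_weight` times the text cosine similarity.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; an all-zero vector is returned as-is."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(tabular: list[float], text: list[float], text_weight: float = 0.5) -> list[float]:
    """Concatenate the two normalised modality vectors with sqrt weights."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    w_tab = math.sqrt(1.0 - text_weight)
    w_text = math.sqrt(text_weight)
    return ([w_tab * x for x in l2_normalize(tabular)]
            + [w_text * x for x in l2_normalize(text)])
```

Normalising each modality first keeps a high-dimensional text embedding from swamping a handful of financial features, and `text_weight` gives a single knob for tuning their balance.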

Scalable retrieval

FAISS-backed vector search with metadata-aware ranking to surface the most relevant precedents.

Analyst-ready interfaces

REST API for integration plus Streamlit UI for exploratory search and validation workflows.

Visual summary

Diagram: layered modules and FastAPI or Streamlit interfaces
Layered modules and dual interfaces for API-driven and interactive use. The CRM-to-embedding-to-retrieval overview is shown in the hero above.

Technology stack

  • Python
  • FastAPI
  • Streamlit
  • FAISS
  • PDF extraction
  • YAML configuration
