pgvector on InsForge: Vector Search and RAG in Postgres

08 Mar 2026 • 3 minutes

Tony Chang

CTO & Co-Founder

InsForge now supports pgvector natively. Store vector embeddings, run similarity search, and build RAG pipelines directly inside your Postgres database -- no external vector store required.

Embeddings Built In

InsForge provides an embeddings method through the AI gateway. Generate vectors with a single SDK call:

```typescript
const response = await insforge.ai.embeddings.create({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
});
```

Models available through the gateway include openai/text-embedding-3-small, openai/text-embedding-3-large, openai/text-embedding-ada-002, and google/gemini-embedding-001. Your application code needs no external API keys.

Store and Search

Enable pgvector, create a table with a vector column, and insert embeddings from the SDK. Search uses Postgres distance operators -- cosine, L2, or inner product -- so retrieval runs server-side in a single query.
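As a rough sketch of these steps, the snippet below keeps the SQL in TypeScript constants so it can be run from a migration or SQL editor. The `documents` table, its columns, and the helper names are illustrative assumptions, not an InsForge schema. The 1536 dimension matches the default output size of openai/text-embedding-3-small, and `<=>`, `<->`, and `<#>` are pgvector's operators for cosine distance, L2 distance, and negative inner product.

```typescript
// Hypothetical schema: table and column names are illustrative only.
const SETUP_SQL = `
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
  id        bigserial PRIMARY KEY,
  content   text NOT NULL,
  embedding vector(1536)  -- text-embedding-3-small returns 1536 dimensions by default
);
`;

// pgvector distance operators; a smaller result means a closer match.
const DISTANCE_OPERATORS = {
  cosine: '<=>',       // cosine distance
  l2: '<->',           // Euclidean (L2) distance
  innerProduct: '<#>', // negative inner product
} as const;

type Metric = keyof typeof DISTANCE_OPERATORS;

// Serialize a number[] into pgvector's text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !Number.isFinite(x))) {
    throw new Error('embedding must be a non-empty array of finite numbers');
  }
  return `[${embedding.join(',')}]`;
}

// Build a parameterized top-k query: $1 is the query vector literal, $2 the row limit.
function buildSearchSql(metric: Metric): string {
  const op = DISTANCE_OPERATORS[metric];
  return `SELECT id, content, embedding ${op} $1::vector AS distance
FROM documents
ORDER BY embedding ${op} $1::vector
LIMIT $2;`;
}
```

Ordering by the distance expression ascending returns the nearest rows first for all three operators, since `<#>` yields the negated inner product.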

For production, wrap your search logic in a Postgres function such as match_documents and call it via insforge.database.rpc(). This keeps similarity search next to the data: ranking and filtering run inside Postgres, and the client gets the results in a single round trip.
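One way such a function and its call site might look is sketched below. The function body, the `documents` table, the argument names, and the `{ data, error }` return shape of `rpc()` are all assumptions made for illustration; check the SDK reference for the real signature, and note that some clients expect the vector as a string literal rather than a number array.

```typescript
// Hypothetical function definition; `documents` and its columns are assumed names.
const MATCH_DOCUMENTS_SQL = `
CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(1536),
  match_count int DEFAULT 5
)
RETURNS TABLE (id bigint, content text, similarity float)
LANGUAGE sql STABLE
AS $$
  SELECT d.id, d.content, 1 - (d.embedding <=> query_embedding) AS similarity
  FROM documents d
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
$$;
`;

// Assumed minimal shape of the RPC client (not taken from the SDK).
interface RpcClient {
  rpc(fn: string, args: Record<string, unknown>): Promise<{ data: unknown; error: unknown }>;
}

interface Match {
  id: number;
  content: string;
  similarity: number;
}

// Argument names must match the SQL function's parameter names.
function matchDocumentsArgs(queryEmbedding: number[], matchCount = 5) {
  return { query_embedding: queryEmbedding, match_count: matchCount };
}

// Call the function through the client and return typed rows.
async function searchDocuments(
  db: RpcClient,
  queryEmbedding: number[],
  matchCount = 5,
): Promise<Match[]> {
  const { data, error } = await db.rpc(
    'match_documents',
    matchDocumentsArgs(queryEmbedding, matchCount),
  );
  if (error) throw new Error(`match_documents failed: ${JSON.stringify(error)}`);
  return (data ?? []) as Match[];
}
```

Under these assumptions, a call would read `await searchDocuments(insforge.database, queryVector, 5)`. The `similarity` column is one minus the cosine distance, so higher values mean closer matches.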

See the full pgvector documentation for table schemas, indexing strategies (HNSW and IVFFlat), distance operators, and code examples.
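One detail worth keeping in mind when picking an index: pgvector indexes are built for a specific operator class, and the class has to match the distance operator used in the query (`vector_cosine_ops` for `<=>`, `vector_l2_ops` for `<->`, `vector_ip_ops` for `<#>`). The helper below is a small sketch that generates the DDL; the function name, the default `lists` value, and the table and column names in the usage note are placeholders, not recommendations.

```typescript
type IndexMethod = 'hnsw' | 'ivfflat';
type DistanceMetric = 'cosine' | 'l2' | 'innerProduct';

// pgvector operator classes, one per distance operator.
const OPERATOR_CLASSES: Record<DistanceMetric, string> = {
  cosine: 'vector_cosine_ops',   // pairs with <=>
  l2: 'vector_l2_ops',           // pairs with <->
  innerProduct: 'vector_ip_ops', // pairs with <#>
};

// Build a CREATE INDEX statement for a vector column.
// `lists` only applies to IVFFlat; 100 is a placeholder, not a tuned value.
function buildIndexSql(
  table: string,
  column: string,
  method: IndexMethod,
  metric: DistanceMetric,
  lists = 100,
): string {
  // Allow only plain lowercase identifiers, since they are interpolated into SQL.
  const identifier = /^[a-z_][a-z0-9_]*$/;
  if (!identifier.test(table) || !identifier.test(column)) {
    throw new Error('table and column must be plain lowercase identifiers');
  }
  if (!Number.isInteger(lists) || lists <= 0) {
    throw new Error('lists must be a positive integer');
  }
  const options = method === 'ivfflat' ? ` WITH (lists = ${lists})` : '';
  return `CREATE INDEX ON ${table} USING ${method} (${column} ${OPERATOR_CLASSES[metric]})${options};`;
}
```

For example, `buildIndexSql('documents', 'embedding', 'hnsw', 'cosine')` produces an HNSW index suited to queries that order by `<=>`.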

Building RAG

InsForge gives you the primitives: embeddings, vector storage, similarity search, and chat completions. That covers the basic retrieve-and-generate loop.
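The basic loop can be sketched in a few lines. To avoid guessing at SDK response shapes, the three steps are injected as plain functions; `RagDeps`, `buildPrompt`, and `answer` are hypothetical names, and the prompt wording is only one possible choice.

```typescript
interface RetrievedChunk {
  content: string;
  similarity: number;
}

// Injected dependencies, so the sketch does not assume SDK response shapes.
interface RagDeps {
  embed(text: string): Promise<number[]>;                          // e.g. wraps insforge.ai.embeddings.create()
  search(vector: number[], k: number): Promise<RetrievedChunk[]>;  // e.g. wraps a similarity query
  complete(prompt: string): Promise<string>;                       // e.g. wraps insforge.ai.chat.completions.create()
}

// Assemble the retrieved chunks and the question into a single prompt.
function buildPrompt(question: string, chunks: RetrievedChunk[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.content}`).join('\n');
  return [
    'Answer the question using only the context below.',
    'If the context is insufficient, say so.',
    '',
    'Context:',
    context,
    '',
    `Question: ${question}`,
  ].join('\n');
}

// Retrieve-and-generate: embed the question, fetch the top-k chunks, generate an answer.
async function answer(question: string, deps: RagDeps, k = 5): Promise<string> {
  const queryVector = await deps.embed(question);
  const chunks = await deps.search(queryVector, k);
  return deps.complete(buildPrompt(question, chunks));
}
```

Keeping the steps behind an interface also makes the loop easy to test with fakes and to swap for a framework later.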

For production RAG, raw retrieval is not enough. You need structured chunking, query rewriting, re-ranking, and evaluation to get reliable results. We recommend pairing InsForge with an orchestration framework:

Framework	Language	Best For
LangChain	Python / TypeScript	Full pipeline orchestration
LlamaIndex	Python / TypeScript	Document indexing and query engines
Haystack	Python	Modular pipelines and evaluation
Vercel AI SDK	TypeScript	Streaming UI and Next.js integration

Point any of these at your InsForge database. Use insforge.ai.embeddings.create() for vectors and insforge.ai.chat.completions.create() for generation. InsForge serves as the data layer; the orchestration framework handles chunking, query rewriting, re-ranking, and evaluation on top.

Get Started

pgvector is available on all InsForge plans. Enable the extension and start building.