pgvector on InsForge: Vector Search and RAG in Postgres

08 Mar 2026 · 3 minutes
Tony Chang

CTO & Co-Founder

InsForge now supports pgvector natively. Store vector embeddings, run similarity search, and build RAG pipelines directly inside your Postgres database -- no external vector store required.

Embeddings Built In

InsForge provides an embeddings method through the AI gateway. Generate vectors with a single SDK call:

```typescript
const response = await insforge.ai.embeddings.create({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
});
```

Models available through the gateway include openai/text-embedding-3-small, openai/text-embedding-3-large, openai/text-embedding-ada-002, and google/gemini-embedding-001. The gateway handles provider credentials, so no external API keys are needed in your application code.

Enable pgvector, create a table with a vector column, and insert embeddings from the SDK. Search uses Postgres distance operators -- cosine, L2, or inner product -- so retrieval runs server-side in a single query.
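The steps above can be sketched roughly as follows. This is a minimal illustration, not the schema from the InsForge docs: the `documents` table, its columns, and the helpers `toVectorLiteral` and `buildSearchSql` are hypothetical names. The 1536 dimension assumes the default output size of openai/text-embedding-3-small. The SQL itself uses standard pgvector syntax.

```typescript
// Hypothetical names: the `documents` table and its columns are illustrative only.
// 1536 is the default output size of text-embedding-3-small; adjust for other models.
const EMBEDDING_DIM = 1536;

// DDL: enable the extension and create a table with a vector column.
const setupSql = `
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
  id        bigserial PRIMARY KEY,
  content   text NOT NULL,
  embedding vector(${EMBEDDING_DIM})
);`;

// pgvector distance operators, keyed by the metric they implement.
const distanceOperators = {
  cosine: '<=>',       // cosine distance
  l2: '<->',           // Euclidean (L2) distance
  innerProduct: '<#>', // negative inner product
} as const;

type Metric = keyof typeof distanceOperators;

// pgvector accepts vectors as text literals of the form '[0.1,0.2,0.3]'.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((x) => !Number.isFinite(x))) {
    throw new Error('embedding contains a non-finite value');
  }
  return `[${embedding.join(',')}]`;
}

// Builds a parameterized nearest-neighbor query:
// $1 is the query vector literal, $2 is the maximum number of rows.
function buildSearchSql(metric: Metric): string {
  const op = distanceOperators[metric];
  return `SELECT id, content, embedding ${op} $1::vector AS distance
FROM documents
ORDER BY embedding ${op} $1::vector
LIMIT $2;`;
}
```

Smaller distance means a closer match for all three operators, which is why the query orders ascending. Note that `<#>` returns the negative inner product for exactly that reason.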

For production, wrap your search logic in a Postgres RPC function such as match_documents and call it via insforge.database.rpc(). This keeps similarity search next to the data: the client sends one request and receives only the top matches, with no extra round trips.
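One possible shape for such a function is sketched below. The post names match_documents and insforge.database.rpc() but does not show their signatures, so everything here is an assumption: the parameter names (`query_embedding`, `match_count`), the `documents` table, and the `RpcClient` interface, which stands in for the real client so the sketch stays self-contained.

```typescript
// Hypothetical SQL for match_documents. Assumes a `documents` table with
// `content text` and `embedding vector(1536)` columns.
const matchDocumentsSql = `
CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(1536),
  match_count int DEFAULT 5
)
RETURNS TABLE (id bigint, content text, similarity float)
LANGUAGE sql STABLE
AS $$
  SELECT d.id, d.content, 1 - (d.embedding <=> query_embedding) AS similarity
  FROM documents d
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
$$;`;

// Minimal shape assumed for the client. The real insforge.database.rpc()
// signature and return type may differ; check the SDK reference.
interface RpcClient {
  rpc(fn: string, args: Record<string, unknown>): Promise<unknown>;
}

// Calls the function with the query vector as a pgvector text literal.
// The result is returned untouched because its shape is not specified here.
async function matchDocuments(
  db: RpcClient,
  queryEmbedding: number[],
  matchCount = 5,
): Promise<unknown> {
  return db.rpc('match_documents', {
    query_embedding: `[${queryEmbedding.join(',')}]`,
    match_count: matchCount,
  });
}
```

In application code you would pass the InsForge database client as `db`, for example `matchDocuments(insforge.database, embedding, 5)`, provided its `rpc` method is compatible with the assumed interface.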

See the full pgvector documentation for table schemas, indexing strategies (HNSW and IVFFlat), distance operators, and code examples.
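As a rough preview of the indexing options mentioned above, the sketch below builds the index DDL for either method. The `documents` table, the `embedding` column, and the `buildIndexSql` helper are hypothetical. The `USING hnsw` / `USING ivfflat` clauses, the operator class names, and the `lists` option are standard pgvector syntax.

```typescript
type IndexMethod = 'hnsw' | 'ivfflat';

// Operator classes pair with distance operators:
// vector_cosine_ops with <=>, vector_l2_ops with <->, vector_ip_ops with <#>.
type OpClass = 'vector_cosine_ops' | 'vector_l2_ops' | 'vector_ip_ops';

// Builds a CREATE INDEX statement for the hypothetical documents.embedding column.
// `lists` applies only to IVFFlat and is ignored for HNSW.
function buildIndexSql(method: IndexMethod, opClass: OpClass, lists = 100): string {
  const base = `CREATE INDEX ON documents USING ${method} (embedding ${opClass})`;
  return method === 'ivfflat' ? `${base} WITH (lists = ${lists});` : `${base};`;
}
```

The operator class should match the operator your queries use, otherwise the planner cannot use the index for that query. IVFFlat indexes are generally best created after the table already holds data, since the lists are derived from existing rows.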

Building RAG

InsForge gives you the primitives: embeddings, vector storage, similarity search, and chat completions. That covers the basic retrieve-and-generate loop.

For production RAG, raw retrieval is not enough. You need structured chunking, query rewriting, re-ranking, and evaluation to get reliable results. We recommend pairing InsForge with an orchestration framework:

| Framework | Language | Best For |
| --- | --- | --- |
| LangChain | Python / TypeScript | Full pipeline orchestration |
| LlamaIndex | Python / TypeScript | Document indexing and query engines |
| Haystack | Python | Modular pipelines and evaluation |
| Vercel AI SDK | TypeScript | Streaming UI and Next.js integration |

Point any of these at your InsForge database. Use insforge.ai.embeddings.create() for vectors and insforge.ai.chat.completions.create() for generation. The data layer runs on InsForge. The orchestration framework handles the intelligence on top.
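The basic retrieve-and-generate loop can be sketched without any framework. Because this post does not show the response shapes of insforge.ai.embeddings.create() or insforge.ai.chat.completions.create(), the sketch takes the three steps as injected functions. `RagDeps`, `buildPrompt`, and `answerQuestion` are hypothetical names, and the prompt wording is only an example.

```typescript
// A retrieved piece of text; real rows would likely carry ids and scores too.
interface RetrievedChunk {
  content: string;
}

// The three primitives, injected so the loop does not depend on response shapes.
// In an app these would wrap insforge.ai.embeddings.create(), a similarity
// search (for example an RPC call), and insforge.ai.chat.completions.create().
interface RagDeps {
  embed(text: string): Promise<number[]>;
  search(queryEmbedding: number[], limit: number): Promise<RetrievedChunk[]>;
  generate(prompt: string): Promise<string>;
}

// Numbers each chunk so the model can refer to its sources.
function buildPrompt(question: string, chunks: RetrievedChunk[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.content}`).join('\n');
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

// Embed the question, retrieve the nearest chunks, then generate an answer.
async function answerQuestion(
  deps: RagDeps,
  question: string,
  limit = 3,
): Promise<string> {
  const queryEmbedding = await deps.embed(question);
  const chunks = await deps.search(queryEmbedding, limit);
  return deps.generate(buildPrompt(question, chunks));
}
```

An orchestration framework replaces or extends pieces of this loop, for instance by rewriting the question before `embed` or re-ranking the output of `search`, while the storage and model calls stay on InsForge.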

Get Started

pgvector is available on all InsForge plans. Enable the extension and start building.