InsForge now supports pgvector natively. Store vector embeddings, run similarity search, and build RAG pipelines directly inside your Postgres database -- no external vector store required.
Embeddings Built In
InsForge provides an embeddings method through the AI gateway. Generate vectors with a single SDK call:
```typescript
const response = await insforge.ai.embeddings.create({
  model: 'openai/text-embedding-3-small',
  input: 'Your text here',
});
```
Models available through the gateway include openai/text-embedding-3-small, openai/text-embedding-3-large, openai/text-embedding-ada-002, and google/gemini-embedding-001. No external API keys needed in your application code.
Store and Search
Enable pgvector, create a table with a vector column, and insert embeddings from the SDK. Search uses pgvector's distance operators (cosine, L2, or inner product), so ranking happens server-side in a single query.
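To make the three distance options concrete, here is a minimal sketch of what each pgvector operator computes, written in plain TypeScript. It is for intuition only: in practice the database evaluates these. The function names are illustrative and not part of any SDK.

```typescript
type Vector = number[];

// Dot product of two equal-length vectors.
function dot(a: Vector, b: Vector): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// pgvector `<->`: Euclidean (L2) distance.
function l2Distance(a: Vector, b: Vector): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// pgvector `<#>`: NEGATIVE inner product, so that smaller still means closer.
function negativeInnerProduct(a: Vector, b: Vector): number {
  return -dot(a, b);
}

// pgvector `<=>`: cosine distance, i.e. 1 minus cosine similarity.
function cosineDistance(a: Vector, b: Vector): number {
  return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}
```

All three are distances, so an `ORDER BY` on any of them returns the nearest rows first. Cosine distance ignores vector length, which is why it is a common default for text embeddings.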
For production, wrap your search logic in a Postgres function such as match_documents and call it via insforge.database.rpc(). This keeps similarity search next to the data: the client sends one request and receives only the matching rows.
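A rough sketch of the client side of that call is shown below. Only `insforge.database.rpc()` and the `match_documents` name come from the text above. The `RpcClient` interface, the `{ data, error }` return shape, and the `query_embedding` / `match_count` parameter names are assumptions made for illustration, so check them against your own function definition and the SDK types.

```typescript
// Hypothetical minimal shape of the client; the real SDK types may differ.
interface RpcClient {
  database: {
    rpc(
      fn: string,
      args: Record<string, unknown>
    ): Promise<{ data: unknown; error: unknown }>;
  };
}

// Serialize a numeric array into pgvector's text format, e.g. "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  if (embedding.some((x) => !Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}

// Call the (assumed) match_documents function with a query embedding.
async function matchDocuments(
  client: RpcClient,
  queryEmbedding: number[],
  matchCount = 5
): Promise<unknown> {
  const { data, error } = await client.database.rpc("match_documents", {
    query_embedding: toVectorLiteral(queryEmbedding), // assumed parameter name
    match_count: matchCount, // assumed parameter name
  });
  if (error) {
    throw new Error(`match_documents failed: ${String(error)}`);
  }
  return data;
}
```

Accepting the client as a parameter keeps the helper easy to test with a stub and avoids tying it to a particular SDK version.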
See the full pgvector documentation for table schemas, indexing strategies (HNSW and IVFFlat), distance operators, and code examples.
Building RAG
InsForge gives you the primitives: embeddings, vector storage, similarity search, and chat completions. That covers the basic retrieve-and-generate loop.
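The basic loop can be sketched as three steps: embed the question, retrieve the nearest chunks, and generate an answer from them. The version below takes those steps as injected functions, so it makes no claim about SDK response shapes. `RagDeps`, `RetrievedChunk`, `buildPrompt`, and `answer` are hypothetical names introduced for this example.

```typescript
interface RetrievedChunk {
  content: string;
}

// The three primitives, supplied by the caller (for example, thin wrappers
// around the embeddings, RPC, and chat completion calls).
interface RagDeps {
  embed(text: string): Promise<number[]>;
  search(embedding: number[], limit: number): Promise<RetrievedChunk[]>;
  generate(prompt: string): Promise<string>;
}

// Assemble a prompt that numbers each retrieved chunk as context.
function buildPrompt(question: string, chunks: RetrievedChunk[]): string {
  const context = chunks
    .map((chunk, i) => `[${i + 1}] ${chunk.content}`)
    .join("\n");
  return (
    "Answer the question using only the context below.\n\n" +
    `Context:\n${context}\n\n` +
    `Question: ${question}`
  );
}

// Retrieve-and-generate: embed, search, then generate from the results.
async function answer(
  question: string,
  deps: RagDeps,
  limit = 4
): Promise<string> {
  const embedding = await deps.embed(question);
  const chunks = await deps.search(embedding, limit);
  return deps.generate(buildPrompt(question, chunks));
}
```

Keeping prompt assembly in its own pure function makes it straightforward to unit test and to swap out once re-ranking or query rewriting is added.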
For production RAG, raw retrieval is not enough. You need structured chunking, query rewriting, re-ranking, and evaluation to get reliable results. We recommend pairing InsForge with an orchestration framework:
| Framework | Language | Best For |
|---|---|---|
| LangChain | Python / TypeScript | Full pipeline orchestration |
| LlamaIndex | Python / TypeScript | Document indexing and query engines |
| Haystack | Python | Modular pipelines and evaluation |
| Vercel AI SDK | TypeScript | Streaming UI and Next.js integration |
Point any of these at your InsForge database. Use insforge.ai.embeddings.create() for vectors and insforge.ai.chat.completions.create() for generation. InsForge runs the data layer; the orchestration framework handles chunking, query rewriting, re-ranking, and evaluation on top.
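For a sense of what the frameworks' chunkers improve on, here is a deliberately naive baseline: fixed-size character windows with overlap. It is not the structured chunking recommended above, and the default sizes are arbitrary placeholders. `chunkText` is a hypothetical helper, not an InsForge or framework API.

```typescript
// Split text into fixed-size character windows that overlap by `overlap`
// characters, so a sentence cut at one boundary still appears whole in a
// neighboring chunk.
function chunkText(text: string, chunkSize = 800, overlap = 100): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in the range [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each chunk would then be embedded and stored as its own row. Structure-aware splitters (by heading, paragraph, or sentence) usually retrieve better than this baseline, which is one reason to reach for a framework.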
Get Started
pgvector is available on all InsForge plans. Enable the extension and start building.

