
feat(knowledge): add Ollama embedding provider support#3714

Open
teedonk wants to merge 59 commits into simstudioai:staging from teedonk:feat/ollama-embedding-support

Conversation


@teedonk teedonk commented Mar 22, 2026

Summary

  • Adds Ollama as an embedding provider for knowledge bases, enabling fully offline/private RAG workflows
  • Implements per-KB dynamic pgvector tables to handle variable embedding dimensions across Ollama models
  • Auto-detects embedding dimension and context length from Ollama's /api/show endpoint
  • Adds provider selection UI with preset models (nomic-embed-text, mxbai-embed-large, all-minilm, etc.)
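The dimension/context auto-detection above can be sketched as a small parser over an /api/show response. The exact response shape is an assumption here (Ollama nests model metadata under `model_info` with architecture-prefixed keys such as `nomic-bert.embedding_length`), and the function name is illustrative rather than the PR's actual helper:

```typescript
interface DetectedModelInfo {
  embeddingDimension: number
  contextLength: number
}

// Scan model_info for the architecture-prefixed metadata keys and pull out
// the embedding dimension and context length.
function parseModelInfo(showResponse: {
  model_info?: Record<string, unknown>
}): DetectedModelInfo {
  const info = showResponse.model_info ?? {}
  let embeddingDimension = 0
  let contextLength = 0
  for (const [key, value] of Object.entries(info)) {
    if (typeof value !== 'number') continue
    if (key.endsWith('.embedding_length')) embeddingDimension = value
    if (key.endsWith('.context_length')) contextLength = value
  }
  if (embeddingDimension <= 0 || contextLength <= 0) {
    throw new Error('Model does not report embedding_length/context_length')
  }
  return { embeddingDimension, contextLength }
}
```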

Fixes #3037

  • Per-KB embedding tables solve the variable dimension problem
  • Context-aware chunking and smart batching prevent context length overflow
  • Cross-provider search normalizes distance scores across OpenAI and Ollama KBs
  • All existing OpenAI/Azure paths remain untouched

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Other: ___________

Testing

Tested with Ollama (nomic-embed-text) on local setup:

  • Created KB with Ollama provider — dimension auto-detected
  • Uploaded small and large documents (40+ pages) — all chunks processed successfully
  • Search returns relevant results across both OpenAI and Ollama KBs
  • All 4267 existing tests pass, zero TypeScript errors, Biome lint clean

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the [Contributor License Agreement (CLA)](./CONTRIBUTING.md#contributor-license-agreement-cla)

Video.Project.1.mp4


vercel bot commented Mar 22, 2026

@teedonk is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.


cursor bot commented Mar 22, 2026

PR Summary

High Risk
High risk because it adds a new embedding provider path that changes how embeddings are generated, stored (dynamic per-KB tables), and searched, plus introduces new network calls and validation for Ollama URLs/models that could impact reliability and security.

Overview
Adds Ollama (local) embedding provider support for knowledge bases, including UI selection, API validation, and end-to-end document ingestion/search for non-OpenAI embeddings.

Knowledge base creation now accepts ollama/* models and optional ollamaBaseUrl, validates the Ollama model via /api/show, auto-corrects embedding dimensions, and creates a dedicated per-KB kb_embeddings_{id} pgvector table; failures roll back via a new hardDeleteKnowledgeBase. Document processing and embedding generation are updated to route to OpenAI vs Ollama, cap chunk sizes based on Ollama context length, and write embeddings to either the shared embedding table (OpenAI) or the per-KB table (Ollama).

Search is updated to route queries per KB provider, generating embeddings per provider (and per unique Ollama model+URL), querying per-KB tables for Ollama, normalizing distances only when mixing providers, and tracking cost only for OpenAI calls. Tests and types are updated accordingly, and deletion clarifies that per-KB tables are not dropped on soft delete.
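The "one embedding per unique Ollama model+URL" behavior described above can be sketched as a grouping step before any embedding calls are made. Type and function names here are illustrative, not the PR's actual helpers:

```typescript
interface KBRef {
  id: string
  embeddingModel: string // e.g. 'ollama/nomic-embed-text'
  ollamaBaseUrl?: string
}

// Group knowledge bases by (model, baseUrl) so the search path generates one
// query embedding per unique pair instead of once per KB.
function groupByEmbeddingSource(kbs: KBRef[]): Map<string, KBRef[]> {
  const groups = new Map<string, KBRef[]>()
  for (const kb of kbs) {
    const key = `${kb.embeddingModel}@${kb.ollamaBaseUrl ?? 'http://localhost:11434'}`
    const bucket = groups.get(key)
    if (bucket) bucket.push(kb)
    else groups.set(key, [kb])
  }
  return groups
}
```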

Written by Cursor Bugbot for commit dbabedd. This will update automatically on new commits.


greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR adds Ollama as a fully offline/private embedding provider for knowledge bases. It introduces per-KB dynamic pgvector tables (kb_embeddings_{uuid}) to handle variable embedding dimensions across Ollama models, auto-detects model dimension and context length via Ollama's /api/show endpoint, adds provider selection UI, and wires cross-provider search with score normalization when mixing OpenAI and Ollama KBs.

Key design decisions:

  • Per-KB tables: Each Ollama KB gets its own kb_embeddings_{uuid} table with the exact vector dimension, avoiding dimension conflicts in a shared table.
  • ollamaBaseUrl is stored inside the chunkingConfig JSONB field to avoid a schema migration — pragmatic but couples embedding config to chunking config.
  • Chunk size is capped at 30% of the Ollama model's context length (using a conservative 3 chars/token estimate) to stay safely within context limits.
  • Score normalization (min-max to [0, 1]) is applied only when mixing OpenAI and Ollama results, leaving single-provider similarity scores unchanged.
  • Soft-delete preserves the per-KB table for restore compatibility; hardDeleteKnowledgeBase is scoped to creation-rollback only.
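The normalization decision above can be sketched as a per-provider min-max rescale that only kicks in when results from more than one provider are mixed. Field and function names are illustrative; whether the PR normalizes per provider group or across the full result set is not reproduced here:

```typescript
interface ScoredChunk {
  provider: 'openai' | 'ollama'
  distance: number
}

// Rescale each provider's distances to [0, 1] only when providers are mixed;
// single-provider results pass through untouched.
function normalizeMixedScores(results: ScoredChunk[]): number[] {
  const providers = new Set(results.map((r) => r.provider))
  if (providers.size <= 1) return results.map((r) => r.distance)
  const out = results.map((r) => r.distance)
  for (const p of providers) {
    const indexed = results
      .map((r, i) => [r, i] as const)
      .filter(([r]) => r.provider === p)
    const vals = indexed.map(([r]) => r.distance)
    const min = Math.min(...vals)
    const range = Math.max(...vals) - min || 1 // guard: all-equal scores
    for (const [r, i] of indexed) out[i] = (r.distance - min) / range
  }
  return out
}
```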

Issues from previous reviews that are now addressed:

  • ✅ Non-atomic delete + insert for Ollama embeddings — now wrapped in db.transaction
  • ✅ Orphaned KB row on table creation failure — hard-deleted in catch block
  • ✅ sql.raw for queryVector — replaced with parameterized `sql`${queryVector}::vector``
  • ✅ Document status update moved back inside the transaction
  • ✅ Native <select> replaced with the Select component from the UI library

Remaining findings:

  • The ollamaBaseUrl SSRF allowlist accepts any *.internal hostname, broader than needed for Docker use cases.
  • Tag definitions are fetched twice per KB per search request — a minor redundancy worth caching across the two passes.
  • The sql.raw usage in createKBEmbeddingTable is safe (dimension is validated) but lacks an explicit comment documenting why.
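The SSRF-allowlist finding above suggests an exact-match check rather than a blanket `*.internal` suffix. This sketch shows the shape of such a tighter check; it does not reproduce the PR's actual validation rules, and the function name is illustrative:

```typescript
// Validate a user-supplied Ollama base URL against an exact-match allowlist
// instead of accepting any *.internal hostname.
function isAllowedOllamaHost(rawUrl: string): boolean {
  let url: URL
  try {
    url = new URL(rawUrl)
  } catch {
    return false // not a parseable URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false
  // Exact hostnames only: covers localhost and the Docker-for-desktop alias
  // without opening up every *.internal name.
  const allowed = new Set(['localhost', '127.0.0.1', 'host.docker.internal'])
  return allowed.has(url.hostname)
}
```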

Confidence Score: 5/5

PR is safe to merge; all previously raised P0/P1 issues have been addressed and the remaining findings are minor style/hardening suggestions.

All critical issues from prior review rounds (non-atomic delete+insert, orphaned KB row, sql.raw injection, status-update regression, native select element) have been resolved in this iteration. The remaining findings are P2: an overly broad SSRF allowlist suffix, a double tag-definition fetch, and a missing safety comment on sql.raw DDL. None of these affect correctness or data integrity in the current implementation.

Flagged files: apps/sim/app/api/knowledge/route.ts (SSRF allowlist), apps/sim/app/api/knowledge/search/route.ts (duplicate tag definition fetch)

Important Files Changed

Filename Overview
apps/sim/lib/knowledge/dynamic-tables.ts New file: per-KB pgvector table lifecycle (create/drop/insert/search) with safe UUID-to-table-name conversion and validated dimension; sql.raw usage is safe given upfront validation but could benefit from an explicit safety comment.
apps/sim/app/api/knowledge/route.ts POST creates per-KB table after inserting the KB row, with a cleanup block that hard-deletes the orphaned row on table-creation failure; ollamaBaseUrl SSRF validation is solid but *.internal allowance is broader than needed.
apps/sim/app/api/knowledge/search/route.ts Routes search to per-KB tables for Ollama and shared embedding table for OpenAI; generates one query embedding per unique (model, URL) pair; normalizes distances only when mixing providers; tag definitions fetched twice per KB.
apps/sim/lib/knowledge/embeddings.ts Adds Ollama embedding generation via /api/embed with token-aware batching, context-length caching, and chunk truncation; module-level cache is a no-op in serverless (acknowledged in previous review).
apps/sim/lib/knowledge/documents/service.ts Embedding generation and insertion wrapped in a single db.transaction for both Ollama (per-KB table) and OpenAI paths; document status update moved inside the transaction; chunk size capped at 30% of Ollama context length.
apps/sim/lib/knowledge/service.ts Adds hardDeleteKnowledgeBase (creation-rollback only) and preserves ollamaBaseUrl in chunkingConfig JSONB during updates by reading the existing value before writing.
apps/sim/app/workspace/[workspaceId]/knowledge/components/create-base-modal/create-base-modal.tsx Adds provider toggle and Ollama-specific fields using the existing component library (Select, Input, Label); previous native select concern was addressed.
apps/sim/lib/chunkers/text-chunker.ts Adds Ollama-aware token estimation ratio (3 chars/token vs 4 for OpenAI) to produce smaller chunks that fit within model context limits.
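The context-aware chunk cap described in the table above (30% of context length at a conservative 3 chars/token) can be sketched as a small sizing function. The constants mirror the PR's description; the function name is an assumption:

```typescript
// Constants from the PR description: a conservative 3 characters per token
// for Ollama models, and a 30% budget of the model's context window.
const OLLAMA_CHARS_PER_TOKEN = 3
const CONTEXT_BUDGET_FRACTION = 0.3

// Cap the requested chunk size (in characters) so a chunk stays safely
// within the model's context length.
function maxChunkChars(contextLength: number, requestedChunkChars: number): number {
  const tokenBudget = Math.floor(contextLength * CONTEXT_BUDGET_FRACTION)
  const charBudget = tokenBudget * OLLAMA_CHARS_PER_TOKEN
  return Math.min(requestedChunkChars, charBudget)
}
```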

Sequence Diagram

sequenceDiagram
    participant UI as Create KB Modal
    participant API as POST /api/knowledge
    participant Ollama as Ollama /api/show
    participant DB as Database
    participant PG as PostgreSQL

    UI->>API: POST embeddingModel ollama/nomic-embed-text
    API->>Ollama: POST /api/show validate model and detect dimension
    Ollama-->>API: contextLength and embeddingLength
    API->>DB: INSERT knowledgeBase row
    API->>PG: CREATE TABLE kb_embeddings_uuid vector dimension
    alt Table creation fails
        API->>PG: DROP TABLE kb_embeddings_uuid
        API->>DB: DELETE knowledgeBase row
        API-->>UI: 500 error
    else Success
        API-->>UI: 201 with id and embeddingDimension
    end

    participant DocSvc as Document Service
    participant Embed as Ollama /api/embed

    DocSvc->>Ollama: GET context length cached 5min
    Ollama-->>DocSvc: contextLength
    DocSvc->>DocSvc: Cap chunkSize to 30pct of contextLength
    DocSvc->>Embed: POST model and chunks batch of 100
    Embed-->>DocSvc: embeddings array
    DocSvc->>DB: transaction DELETE old then INSERT into kb_embeddings_uuid then UPDATE status

Reviews (2): Last reviewed commit: "fix: batch Ollama embeddings by item cou..."

Comment on lines +611 to +614
if (embeddingProvider === 'ollama') {
  // Per-KB table: delete old chunks then bulk-insert new ones
  await deleteKBDocumentEmbeddings(knowledgeBaseId, documentId)
  await insertKBEmbeddings(knowledgeBaseId, embeddingRecords, kb[0].embeddingDimension)

P1 Non-atomic delete + insert risks data loss

The Ollama path deletes all existing embeddings for the document and then inserts new ones without a wrapping transaction. If insertKBEmbeddings fails mid-way (e.g., after a partial batch insert), the old embeddings have already been deleted but the new ones are only partially written — leaving the document with fewer embeddings or none at all, with no way to recover automatically.

The OpenAI path below correctly wraps both operations in a db.transaction. Consider wrapping the Ollama path similarly. Since deleteKBDocumentEmbeddings and insertKBEmbeddings both use db.execute / db.insert, they should participate in a transaction too.
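The rollback semantics the reviewer is asking for can be illustrated with an in-memory stand-in: a delete followed by a failed insert leaves the original rows intact. In the PR itself the same shape would use Drizzle's `db.transaction` with the transaction handle passed to both helpers; the class and method names here are purely illustrative:

```typescript
// Toy store mimicking transactional delete + insert: a failure anywhere in
// the callback restores the pre-transaction state.
class EmbeddingStore {
  rows: Map<string, number[]> = new Map()

  transaction(fn: (tx: Map<string, number[]>) => void): void {
    const snapshot = new Map(this.rows) // snapshot before any mutation
    try {
      fn(this.rows)
    } catch (err) {
      this.rows = snapshot // roll back: old embeddings survive the failure
      throw err
    }
  }
}
```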

Comment on lines 136 to +147
const newKnowledgeBase = await createKnowledgeBase(createData, requestId)

if (provider === 'ollama') {
  try {
    await createKBEmbeddingTable(newKnowledgeBase.id, effectiveDimension)
  } catch (tableError) {
    logger.error(
      `[${requestId}] Failed to create embedding table for KB ${newKnowledgeBase.id}`,
      tableError
    )
    throw tableError
  }

P1 Orphaned KB row when embedding table creation fails

createKnowledgeBase persists the KB row to the database at line 136 before createKBEmbeddingTable is called. If createKBEmbeddingTable throws (e.g. the pgvector extension isn't enabled or a naming collision occurs), the error is re-thrown and the request returns 500 — but the KB record is left behind in the database without a corresponding embedding table. Any subsequent document upload to this KB will fail with a table-not-found error and the user has no way to fix it from the UI.

A safe fix is to delete the orphaned KB row in the catch block before re-throwing:

} catch (tableError) {
  logger.error(...)
  // Clean up orphaned KB row
  await deleteKnowledgeBase(newKnowledgeBase.id, requestId).catch(() => {})
  throw tableError
}

Comment on lines +238 to +246
  // Use drizzle's insert API with dynamic table schema
  await db.insert(dynamicTable).values(batch)
} catch (err: unknown) {
  const pg = err as { code?: string; detail?: string; message?: string; cause?: unknown }
  logger.error(`insertKBEmbeddings failed for table ${table}`, {
    code: pg.code,
    detail: pg.detail,
    message: pg.message,
    cause: pg.cause,

P1 Vector literal injected via sql.raw — fragile and risky pattern

queryVector is constructed by the caller as JSON.stringify(emb) where emb is a number[], so in practice it contains no single quotes. However, interpolating it directly into a sql.raw string template is fragile: a future refactor that passes a different value could silently introduce a SQL injection or at minimum a syntax error.

Drizzle's parameterized approach is safer and just as performant:

// Instead of:
const vecLiteral = sql.raw(`'${queryVector}'::vector`)

// Prefer:
const vecLiteral = sql`${queryVector}::vector`

Using a Drizzle placeholder ensures the value is always parameterized correctly, regardless of its content.

Comment on lines +628 to +638
await db
.update(document)
.set({
chunkCount: processed.metadata.chunkCount,
tokenCount: processed.metadata.tokenCount,
characterCount: processed.metadata.characterCount,
processingStatus: 'completed',
processingCompletedAt: now,
processingError: null,
})
.where(eq(document.id, documentId))

P1 Document status update moved outside the transaction (regression)

In the original code, both the embedding inserts and the processingStatus: 'completed' update were inside a single db.transaction, so they were atomic. This PR extracts the status update to a separate await db.update(document)... call that runs after the transaction.

For the OpenAI path, if the transaction (embedding inserts) succeeds but the subsequent status update fails, the document stays permanently in 'processing' state — even though its embeddings are fully in place. There's no retry that would recover this (the catch block at line 656 sets status to 'failed', which is also incorrect since the embeddings are already there).

This is a regression for the OpenAI path that was introduced by the refactor to support the Ollama code path. The processingStatus update should remain inside the db.transaction for the OpenAI path.

Comment on lines +477 to +495
onChange={(e) => {
  const preset = OLLAMA_PRESET_MODELS.find(
    (m) => m.value === e.target.value
  )
  setOllamaPreset(e.target.value)
  if (preset && preset.dimension > 0) {
    setOllamaDimension(preset.dimension)
  }
}}
className='rounded-[4px] border border-[var(--border-1)] bg-[var(--surface-1)] px-[10px] py-[6px] text-[12px] text-[var(--text-primary)]'
>
  {OLLAMA_PRESET_MODELS.map((m) => (
    <option key={m.value} value={m.value}>
      {m.label}
    </option>
  ))}
</select>
</div>


P2 Native <select> instead of the component library's Select

The rest of the form uses components from the existing @/components/ui library (e.g. Input, Label, Button). This <select> is a raw HTML element with hand-written Tailwind classes for the border/background/text, which won't automatically respect dark mode tokens or any future design-system updates.

Consider replacing it with the project's Select / SelectContent / SelectItem / SelectTrigger pattern used elsewhere in the codebase.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  baseUrl = 'http://localhost:11434'
): Promise<OllamaModelInfo> {
  const cacheKey = `${modelName}@${baseUrl}`
  const cached = ollamaModelInfoCache.get(cacheKey)

P2 Module-level cache ineffective in serverless/edge deployments

ollamaModelInfoCache is a module-level Map. In serverless environments (Next.js API routes deployed to Vercel, AWS Lambda, etc.) each invocation may spawn a fresh process where the module is re-initialized, so the cache is effectively a no-op. Additionally, multiple concurrent invocations will bypass the cache and issue redundant /api/show calls.

If the project already uses a shared cache layer (e.g. Redis via lib/core/storage), consider using it here. Otherwise, this is worth documenting as a known limitation so future maintainers aren't surprised.
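A minimal TTL cache of the kind used for the model info can be sketched as below. As the reviewer notes, any in-process cache only helps within a warm process; this sketch just makes the expiry behavior concrete. Names are illustrative, not the PR's code:

```typescript
// In-process cache with per-entry time-to-live; expired entries are evicted
// lazily on read. `now` is injectable for testability.
class TTLCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>()
  constructor(private ttlMs: number) {}

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.store.get(key)
    if (!entry) return undefined
    if (now >= entry.expiresAt) {
      this.store.delete(key) // expired: drop and miss
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + this.ttlMs })
  }
}
```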


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).


@waleedlatif1 waleedlatif1 deleted the branch simstudioai:staging April 3, 2026 23:01
@waleedlatif1 waleedlatif1 reopened this Apr 3, 2026