
Embedding Models

Configure and customize embedding models, including BGE-M3 and the other supported models.

Updated 2026-01-17


Configure the embedding model used for dense vector search.

Available Models

Model             Dimensions  Languages  Speed
BGE-M3 (default)  1024        100+       Fast
E5-Large          1024        100+       Medium
OpenAI Ada-002    1536        50+        Fast
Custom            Variable    Variable   Variable

Changing the Model

python
# Per-request
results = client.search(
    query="...",
    options={"embedding_model": "e5-large"}
)

# Organization default
client.settings.update({
    "default_embedding_model": "e5-large"
})

Using Custom Models

Bring your own embedding model:

python
client = LakehouseClient(
    api_key="...",
    embedding_endpoint="https://your-model.com/embed"
)

Your endpoint must accept:

json
{
  "texts": ["text1", "text2"],
  "model": "your-model-name"
}

And return:

json
{
  "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]]
}
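As a minimal sketch of what a conforming endpoint has to do, the handler below parses a request in the format above and builds the expected response. The function names (`handle_embed_request`, `toy_embed`) and the toy two-dimensional embedding are illustrative assumptions, not part of the product API; a real deployment would call an actual model behind this interface.

python
import json

def handle_embed_request(body: str, embed_fn) -> str:
    # Parse the request body: {"texts": [...], "model": "..."}.
    payload = json.loads(body)
    texts = payload["texts"]
    # Embed each text and wrap the vectors in the expected response shape.
    vectors = [embed_fn(t) for t in texts]
    return json.dumps({"embeddings": vectors})

def toy_embed(text: str) -> list[float]:
    # Hypothetical stand-in embedding: text length and vowel count,
    # scaled down. Replace with a call to your model in practice.
    return [len(text) / 10.0, sum(c in "aeiou" for c in text) / 10.0]

The same logic can sit behind any HTTP framework; the only contract that matters is the JSON request and response shape shown above.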

Model Comparison

BGE-M3 (Recommended)

  • Best overall quality
  • Excellent multi-language support
  • Balanced speed/quality

E5-Large

  • Strong English performance
  • Good for domain-specific fine-tuning

OpenAI Ada-002

  • Easy integration
  • Good general performance
  • Higher latency

Re-indexing

When changing models, re-index your documents:

python
client.documents.reindex(
    model="new-model",
    batch_size=100
)