ai/embeddinggemma-vllm

Verified Publisher

By Docker

Updated 6 months ago

Embedding Gemma is a state-of-the-art text embedding model from Google DeepMind

Embedding Gemma

Embedding Gemma is a state-of-the-art text embedding model from Google DeepMind, designed to create high-quality vector representations of text. Built on the Gemma architecture, this model converts text into dense vector embeddings that capture semantic meaning, making it ideal for retrieval-augmented generation (RAG), semantic search, and similarity tasks. With open weights and efficient design, Embedding Gemma provides a powerful foundation for embedding-based applications.

Intended uses

Embedding Gemma is designed for applications requiring high-quality text embeddings:

  • Semantic search and retrieval: Excellent for building search systems, document retrieval, and RAG applications that need to find semantically relevant content.
  • Text similarity and clustering: Generate embeddings for measuring text similarity, document clustering, and content deduplication tasks.
  • Classification and downstream tasks: Use embeddings as input features for various NLP classification tasks and machine learning pipelines.
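As a sketch of how such embeddings drive semantic search, the snippet below ranks a toy corpus by cosine similarity to a query vector. The vectors here are hypothetical stand-ins for real model output (actual Embedding Gemma embeddings are 768-dimensional).

```python
from math import sqrt

# Toy 4-dimensional "embeddings" standing in for real model output.
corpus = {
    "doc_a": [0.9, 0.1, 0.0, 0.1],
    "doc_b": [0.0, 0.8, 0.6, 0.0],
    "doc_c": [0.7, 0.2, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0, 0.0]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Rank documents by similarity to the query vector.
ranked = sorted(corpus, key=lambda d: cosine(query, corpus[d]), reverse=True)
print(ranked)  # most similar document first
```

In a real retrieval pipeline, both the query and the documents would be embedded with the model, and the same dot-product ranking would apply directly since the embeddings are normalized.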

Characteristics

Attribute            Details
Provider             Google DeepMind
Architecture         Gemma Embedding
Cutoff date          -
Languages            English
Tool calling         -
Input modalities     Text
Output modalities    Embedding vectors
License              Gemma Terms

Use this AI model with Docker Model Runner

First, pull the model:

docker model pull ai/embeddinggemma-vllm

To generate embeddings using the API:

curl --location 'http://localhost:12434/engines/llama.cpp/v1/embeddings' \
--header 'Content-Type: application/json' \
--data '{
    "model": "ai/embeddinggemma-vllm",
    "input": "Your text to embed here"
  }'
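The endpoint returns embeddings in an OpenAI-compatible response format. A minimal sketch of extracting the vector, using an illustrative hard-coded payload (the short vector is a stand-in for the real 768-dimensional output):

```python
import json

# Illustrative response in the OpenAI-compatible shape returned by
# the /v1/embeddings endpoint; the vector is a 3-element stand-in.
raw = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03]}
  ],
  "model": "ai/embeddinggemma-vllm"
}
"""

response = json.loads(raw)
# Each input string yields one entry in "data"; grab its vector.
vector = response["data"][0]["embedding"]
print(len(vector), vector[0])
```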

For more information on Docker Model Runner, explore the documentation.

Considerations

  • Context length: The model supports up to 2K tokens. Longer texts may need to be chunked for optimal performance.
  • Language support: Primarily trained on English text; performance on other languages may vary.
  • Embedding dimension: The model produces 768-dimensional embeddings suitable for most downstream tasks.
  • Normalization: Embeddings are normalized by default, making them suitable for cosine similarity calculations.
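For texts that exceed the context window, a minimal chunking sketch follows. It uses word counts as a rough proxy for tokens (an assumption; exact budgets require the model's tokenizer), with overlapping windows so that sentences spanning a boundary appear in both chunks.

```python
def chunk_words(text, max_words=400, overlap=50):
    """Split text into overlapping word-window chunks.

    Word counts only approximate the model's 2K-token context limit,
    so choose max_words conservatively or use the model's tokenizer
    for exact budgets.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

# A 1000-word text yields three overlapping chunks.
long_text = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_words(long_text)
print(len(chunks))
```

Each chunk would then be embedded separately, and retrieval scores aggregated (for example, by taking the best-scoring chunk per document).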

Benchmark performance

Task Category        Embedding Gemma
Retrieval            54.87
STS                  78.53
Classification       73.26
Clustering           44.72
Pair Classification  85.94
Reranking            59.36

Tag summary

Content type    Model
Digest          sha256:907318686
Size            1.2 GB
Last updated    6 months ago

docker model pull ai/embeddinggemma-vllm:300M

This week's pulls: 108