Embeddings API Guide

Overview

New model release

text-embedding-3-small

text-embedding-3-large

Features: lower cost, better multilingual performance, controllable dimensions

Main use cases

🔍 Search (relevance ranking)

📊 Clustering (similarity grouping)

👍 Recommendation systems

⚠️ Anomaly detection

📈 Diversity analysis

🏷️ Text classification
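Most of these use cases reduce to comparing vectors: two texts are treated as related when their embeddings point in similar directions. The block below is a minimal sketch of relevance ranking by cosine similarity. The helper names (`cosine_similarity`, `rank_by_similarity`) and the 3-dimensional toy vectors are illustrative assumptions, not part of the API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings (made up).
docs = {"a": [1.0, 0.0, 0.0], "b": [0.7, 0.7, 0.0], "c": [0.0, 0.0, 1.0]}
ranking = rank_by_similarity([1.0, 0.1, 0.0], docs)
```

The same score can drive the other use cases: clustering groups vectors with high mutual similarity, and anomaly detection flags vectors with low similarity to everything else.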

Basic usage

Get embedding vectors
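As a rough sketch, a request can be assembled with the Python standard library alone. This assumes the OpenAI-style endpoint `https://api.openai.com/v1/embeddings` and an API key in the `OPENAI_API_KEY` environment variable; if you reach the models through another provider, substitute its base URL. The function names are illustrative.

```python
import json
import os
import urllib.request

# Assumed endpoint; replace with your provider's URL if it differs.
EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"

def build_request(texts, model="text-embedding-3-small", api_key=None):
    """Build (but do not send) a POST request for one or more input strings."""
    api_key = api_key or os.environ.get("OPENAI_API_KEY", "")
    payload = {"input": texts, "model": model}
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def get_embeddings(texts, **kwargs):
    """Send the request and return the decoded JSON response.

    Needs network access and a valid API key.
    """
    with urllib.request.urlopen(build_request(texts, **kwargs)) as resp:
        return json.load(resp)

# Building the request is side-effect free, so it can be inspected offline.
req = build_request(["Your text string goes here"], api_key="sk-placeholder")
```

Passing a list of strings as `input` returns one embedding per string, which keeps the number of round trips down when embedding many texts.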

Response format

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        // ... more values
      ]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}
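A response of this shape can be unpacked as follows. This is a small sketch: `extract_vectors` is a hypothetical helper, and the sample is the response above with the elided values dropped so that it parses as strict JSON.

```python
import json

def extract_vectors(response):
    """Return the embedding vectors ordered by their "index" field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

sample = json.loads("""
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.006929283495992422, -0.005336422007530928]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 5, "total_tokens": 5}
}
""")

vectors = extract_vectors(sample)
tokens_billed = sample["usage"]["total_tokens"]
```

Sorting by `index` keeps each vector aligned with the position of its input string when several inputs are sent in one request.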

Model comparison

Model	Pages per dollar	MTEB average score	Maximum input (tokens)
text-embedding-3-small	62,500	62.3%	8191
text-embedding-3-large	9,615	64.6%	8191
text-embedding-ada-002	12,500	61.0%	8191

Practical application example

Processing review data
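One plausible pipeline for review data is to merge each review's fields into a single string, embed the strings in batches, and store the vectors next to the original records. The sketch below assumes reviews are dictionaries with hypothetical `title` and `body` keys, and takes the embedding call as a parameter (`embed_fn`) so that it runs offline. The sample reviews and the `fake_embed` stand-in are made up for illustration.

```python
def combine_fields(review):
    """Merge a review's title and body into the single string that gets embedded."""
    return f"Title: {review['title'].strip()}; Content: {review['body'].strip()}"

def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def embed_reviews(reviews, embed_fn, batch_size=100):
    """Return copies of the reviews with an "embedding" key, calling embed_fn once per batch."""
    texts = [combine_fields(r) for r in reviews]
    vectors = []
    for chunk in batched(texts, batch_size):
        vectors.extend(embed_fn(chunk))
    return [{**r, "embedding": v} for r, v in zip(reviews, vectors)]

# Stand-in embedder (NOT a real model): [character count, word count].
def fake_embed(texts):
    return [[float(len(t)), float(len(t.split()))] for t in texts]

# Made-up sample records.
reviews = [
    {"title": "Great coffee", "body": "Rich flavor, would buy again."},
    {"title": "Stale", "body": "Arrived past its best-before date."},
    {"title": "Okay", "body": "Nothing special."},
]
embedded = embed_reviews(reviews, fake_embed, batch_size=2)
```

In real use, `embed_fn` would wrap a call to the embeddings endpoint and return one vector per input text, in input order.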

Technical details

Dimension description

text-embedding-3-small: default 1536 dimensions

text-embedding-3-large: default 3072 dimensions

The number of dimensions can be reduced from the default with the dimensions request parameter
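A short sketch of both ways to get a smaller vector: asking the API for fewer dimensions, or truncating a full-size vector locally and rescaling it to unit length. The helper names are illustrative, and the range check against the default size is an assumption based on the defaults listed above.

```python
import math

# Default output sizes listed above.
DEFAULT_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def build_payload(texts, model, dimensions=None):
    """Request body for the embeddings endpoint, optionally with a reduced size."""
    payload = {"input": texts, "model": model}
    if dimensions is not None:
        # Assumption: valid sizes run from 1 up to the model's default.
        if not 1 <= dimensions <= DEFAULT_DIMENSIONS[model]:
            raise ValueError(f"dimensions must be between 1 and {DEFAULT_DIMENSIONS[model]}")
        payload["dimensions"] = dimensions
    return payload

def shorten(vector, dimensions):
    """Truncate an existing embedding locally and rescale it to unit length."""
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

payload = build_payload(["hello"], "text-embedding-3-large", dimensions=256)
short = shorten([3.0, 4.0, 12.0], 2)
```

Requesting the smaller size up front is usually simpler. Local truncation is mainly useful when full-size vectors are already stored, and the rescaling step keeps cosine and dot-product comparisons consistent.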

Notes

Billing is based on the number of input tokens

Pages-per-dollar figures assume about 800 tokens per page

The maximum input for all models is 8191 tokens
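The pages-per-dollar figures in the comparison table can be turned into a rough cost estimate using the 800-tokens-per-page approximation. This is back-of-the-envelope arithmetic derived from the table, not an official price list, and the function names are illustrative.

```python
TOKENS_PER_PAGE = 800  # approximation from the notes above

# Values copied from the model comparison table.
PAGES_PER_DOLLAR = {
    "text-embedding-3-small": 62_500,
    "text-embedding-3-large": 9_615,
    "text-embedding-ada-002": 12_500,
}

def usd_per_million_tokens(model):
    """Price per one million input tokens implied by the table."""
    tokens_per_dollar = PAGES_PER_DOLLAR[model] * TOKENS_PER_PAGE
    return 1_000_000 / tokens_per_dollar

def estimate_cost_usd(model, total_tokens):
    """Approximate cost of embedding `total_tokens` input tokens."""
    return total_tokens / (PAGES_PER_DOLLAR[model] * TOKENS_PER_PAGE)

for name in PAGES_PER_DOLLAR:
    print(f"{name}: ~${usd_per_million_tokens(name):.2f} per 1M tokens")
```

The `usage.total_tokens` field of each response gives the exact token count to plug into `estimate_cost_usd`.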
