# Using Embeddings for Vectorization: Embeddings API Guide

## Overview

### New Model Releases

text-embedding-3-large features: lower cost, better multilingual performance, and controllable output dimensions.
## Key Use Cases

- 🔍 Search (relevance ranking)
- 📊 Clustering (similarity grouping)
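Both use cases reduce to comparing embedding vectors. A minimal sketch of cosine-similarity ranking over precomputed vectors (the toy 3-dimensional vectors and the `rank_by_similarity` helper are illustrative stand-ins, not part of the API):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    # Return document indices ordered from most to least similar.
    scores = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    return [i for _, i in sorted(scores, reverse=True)]

# Toy vectors standing in for real 1536-dimensional embeddings.
query = [1.0, 0.0, 0.0]
docs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
print(rank_by_similarity(query, docs))  # doc 0 ranks closest to the query
```

The same similarity function also serves clustering: group documents whose pairwise similarity exceeds a threshold, or feed the vectors to an off-the-shelf clustering algorithm.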
## Basic Usage

### Getting Embeddings

A sample response (embedding values truncated):

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928
      ]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}
```

## Model Comparison

| Model | Pages per Dollar | MTEB Performance | Max Input |
|---|---|---|---|
| text-embedding-3-small | 62,500 | 62.3% | 8191 tokens |
| text-embedding-3-large | 9,615 | 64.6% | 8191 tokens |
| text-embedding-ada-002 | 12,500 | 61.0% | 8191 tokens |
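The embedding itself is the `data[0].embedding` field of the response shown above, and `usage.total_tokens` is what gets billed. A minimal extraction sketch using only the standard library (the HTTP request itself is elided; in practice it is typically made via the official `openai` Python SDK's `client.embeddings.create(model=..., input=...)`):

```python
import json

# Truncated sample response in the shape shown above.
raw = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0,
     "embedding": [-0.006929283495992422, -0.005336422007530928]}
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 5, "total_tokens": 5}
}
"""

response = json.loads(raw)
vector = response["data"][0]["embedding"]   # the embedding vector itself
billed = response["usage"]["total_tokens"]  # tokens billed for this request
print(len(vector), billed)  # 2 5
```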
## Practical Usage Examples

### Processing Review Data

## Technical Details

### Dimensions

- text-embedding-3-small: 1536 dimensions by default
- text-embedding-3-large: 3072 dimensions by default
- Dimensions can be adjusted via the `dimensions` parameter
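When a smaller vector is needed and the `dimensions` parameter was not set at request time, a full-size embedding can be shortened after the fact by truncating and re-normalizing. A sketch (the toy 4-dimensional vector stands in for a real 1536- or 3072-dimensional embedding):

```python
import math

def shorten_embedding(vec, dim):
    # Keep the first `dim` components, then L2-normalize so cosine
    # similarities stay on a comparable scale after truncation.
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.6, 0.8, 0.0, 0.0]  # toy stand-in for a full-size embedding
short = shorten_embedding(full, 2)
print(short)  # a unit-length 2-dimensional vector
```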
## Notes

- Billing is based on the number of input tokens
- Approximately 800 tokens per page
- The maximum input for all models is 8191 tokens
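These notes support two quick estimates: cost, derived from the table's pages-per-dollar figures via the ~800 tokens-per-page rule of thumb, and chunk count for documents over the 8191-token cap. A sketch (the 4-characters-per-token heuristic and helper names are illustrative; a real pipeline would count tokens exactly with a tokenizer such as `tiktoken`):

```python
TOKENS_PER_PAGE = 800      # rough estimate from the notes above
MAX_INPUT_TOKENS = 8191    # per-request cap for all three models
CHARS_PER_TOKEN = 4        # crude heuristic, not an exact tokenizer

def estimate_cost_usd(num_tokens, pages_per_dollar):
    # Billing is per input token; derive a token price from pages-per-dollar.
    tokens_per_dollar = pages_per_dollar * TOKENS_PER_PAGE
    return num_tokens / tokens_per_dollar

def chunk_text(text, max_tokens=MAX_INPUT_TOKENS):
    # Split text so each piece should fit under the token cap.
    max_chars = max_tokens * CHARS_PER_TOKEN
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

doc = "x" * 100_000
print(len(chunk_text(doc)))                             # number of chunks needed
print(round(estimate_cost_usd(50_000_000, 62_500), 2))  # 50M tokens on 3-small: $1.00
```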
Modified at 2026-03-08 05:42:17