How to Build an AI-Powered Recommendation Engine (2026)
Recommendation engines reportedly drive 35% of Amazon's revenue and 80% of Netflix viewing. In 2026, you can build one in a weekend using embeddings and vector search — no ML PhD required. Here's how.
The Three Approaches
1. Collaborative Filtering: "Users like you also liked..."
Based on: User behavior patterns
Example: Spotify's "Discover Weekly"
2. Content-Based: "Because you liked X, here's similar Y..."
Based on: Item attributes/features
Example: Netflix genre matching
3. Embedding-Based (2026 approach): "These items are close in meaning..."
Based on: AI-generated semantic understanding
Example: "You bought a camping tent" → recommends hiking boots (not because others bought both, but because the AI understands camping and hiking are related activities)
The Modern Stack
2020 approach:
TensorFlow → train model for weeks → deploy on GPU servers → $$$
2026 approach:
OpenAI/Voyage embeddings → store in vector DB → query similar items
Build in a weekend. Deploy on serverless. Costs $5-50/month.
Architecture
┌─────────────┐ ┌──────────────┐ ┌────────────┐
│ Your App │────▶│ Embeddings │────▶│ Vector DB │
│ │ │ (OpenAI) │ │ (Pinecone) │
│ User views │ │ │ │ │
│ product │ │ Convert to │ │ Find │
│ │◀────│ vectors │◀────│ nearest │
│ Show recs │ │ │ │ neighbors │
└─────────────┘ └──────────────┘ └────────────┘
Step 1: Generate Embeddings
Convert your items into vectors (numbers that capture meaning):
import OpenAI from 'openai';
const openai = new OpenAI();
async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}
// Create embedding for a product
const product = {
  name: 'Lightweight Hiking Backpack',
  description:
    '40L ultralight backpack for multi-day hikes. Waterproof, ventilated back panel, trekking pole attachments.',
  category: 'Outdoor Gear',
  tags: ['hiking', 'backpacking', 'ultralight'],
};
const embedding = await getEmbedding(
  `${product.name}. ${product.description}. ${product.tags.join(', ')}`
);
// Returns: [0.023, -0.041, 0.087, ...] (1536 numbers)
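Under the hood, "similar" means the vectors point in nearly the same direction. Here is a minimal sketch of the cosine-similarity math the vector database runs for you, using toy 3-dimensional vectors (real embeddings have 1536 dimensions):

```typescript
// Cosine similarity: dot product divided by the product of the vector
// lengths. 1 = same direction, 0 = unrelated, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors, hand-picked for illustration only
const tent = [0.9, 0.1, 0.2];
const boots = [0.8, 0.2, 0.1];
const laptop = [0.1, 0.9, 0.7];

console.log(cosineSimilarity(tent, boots) > cosineSimilarity(tent, laptop)); // true
```

You never compute this yourself in production — the vector database does it across millions of vectors with approximate nearest-neighbor indexes — but it's the entire "magic" behind the `score` field you'll see below.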
Step 2: Store in Vector Database
import { Pinecone } from '@pinecone-database/pinecone';
const pinecone = new Pinecone();
const index = pinecone.index('products');
// Store all products with their embeddings
async function indexProducts(products: Product[]) {
  const vectors = await Promise.all(
    products.map(async (product) => ({
      id: product.id,
      values: await getEmbedding(
        `${product.name}. ${product.description}. ${product.tags.join(', ')}`
      ),
      metadata: {
        name: product.name,
        category: product.category,
        price: product.price,
      },
    }))
  );
  await index.upsert(vectors);
}
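One practical caveat: embedding and upserting an entire catalog in a single `Promise.all` and a single `upsert` call can hit OpenAI rate limits and Pinecone's request-size cap. A hedged sketch of a generic batching helper (the batch size of 100 is an assumption; check your plan's limits):

```typescript
// Split an array into fixed-size batches
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage sketch, reusing the indexProducts function above:
// for (const batch of chunk(products, 100)) {
//   await indexProducts(batch);
// }
```

Batching also gives you a natural place to add retry logic and progress logging when indexing a large catalog.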
Vector DB Options
| Database | Type | Free Tier | Price |
|---|---|---|---|
| Pinecone | Managed | 100K vectors | $25/mo+ |
| Qdrant | Self-host/cloud | Self-host free | $25/mo+ |
| Weaviate | Self-host/cloud | Self-host free | $25/mo+ |
| Turso | SQLite + vectors | 9GB free | $29/mo |
| Supabase pgvector | Postgres extension | 500MB free | $25/mo+ |
Step 3: Query for Recommendations
// "Similar items" recommendations
async function getRecommendations(productId: string, limit = 5) {
  // Get the product's embedding
  const product = await getProduct(productId);
  const embedding = await getEmbedding(
    `${product.name}. ${product.description}`
  );
  // Find similar products. Fetch one extra match so we can drop the
  // product itself (Pinecone metadata filters can't match on the vector ID)
  const results = await index.query({
    vector: embedding,
    topK: limit + 1,
    includeMetadata: true,
  });
  return results.matches
    .filter((match) => match.id !== productId) // Exclude the product itself
    .slice(0, limit)
    .map((match) => ({
      id: match.id,
      name: match.metadata.name,
      score: match.score, // Higher = more similar (cosine similarity)
    }));
}
// Usage
const recs = await getRecommendations('hiking-backpack-001');
// Returns:
// [
// { name: "Trail Running Vest", score: 0.92 },
// { name: "Trekking Poles", score: 0.89 },
// { name: "Hiking Boots", score: 0.87 },
// { name: "Water Filter", score: 0.85 },
// { name: "Camping Hammock", score: 0.82 },
// ]
Step 4: Add User Personalization
User Profile Embeddings
// Build a user preference vector from their behavior
async function getUserPreferenceEmbedding(userId: string) {
const history = await getUserHistory(userId);
// Get items they've viewed, purchased, or liked
// Combine recent items into a preference description
const preferenceText = history
.slice(-10) // Last 10 interactions
.map((item) => `${item.name}: ${item.description}`)
.join('. ');
return getEmbedding(
`User interested in: ${preferenceText}`
);
}
// Personalized recommendations
async function getPersonalizedRecs(userId: string, limit = 10) {
  const userEmbedding = await getUserPreferenceEmbedding(userId);
  const purchasedIds = new Set(await getUserPurchasedIds(userId));
  // Over-fetch so enough results survive after excluding purchases
  // (the vector ID isn't a metadata field, so it can't be filtered server-side)
  const results = await index.query({
    vector: userEmbedding,
    topK: limit + purchasedIds.size,
    includeMetadata: true,
  });
  return results.matches
    .filter((match) => !purchasedIds.has(match.id)) // Exclude already purchased
    .slice(0, limit);
}
Step 5: Hybrid Approach
Combine content similarity with collaborative signals:
async function getHybridRecommendations(userId: string, productId: string) {
  // Content-based: similar to this product
  const contentRecs = await getRecommendations(productId, 10);
  // Personalized: based on user history
  const personalRecs = await getPersonalizedRecs(userId, 10);
  // Collaborative: what similar users bought
  // (getCollaborativeRecs is your own purchase co-occurrence query, not shown here)
  const collabRecs = await getCollaborativeRecs(userId, 10);
  // Merge and rank (weighted scoring)
  const scores = new Map<string, number>();
  contentRecs.forEach((r) => {
    scores.set(r.id, (scores.get(r.id) || 0) + r.score * 0.4);
  });
  personalRecs.forEach((r) => {
    scores.set(r.id, (scores.get(r.id) || 0) + r.score * 0.35);
  });
  collabRecs.forEach((r) => {
    scores.set(r.id, (scores.get(r.id) || 0) + r.score * 0.25);
  });
  return [...scores.entries()]
    .sort(([, a], [, b]) => b - a)
    .slice(0, 8);
}
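One caveat with this merge: the three sources may score on different scales (cosine similarity vs. whatever your collaborative query returns), so it's safer to normalize each list to the 0-1 range before applying the weights. A minimal min-max sketch:

```typescript
interface Scored {
  id: string;
  score: number;
}

// Rescale a result list so its scores span 0-1 (min-max normalization)
function normalizeScores(recs: Scored[]): Scored[] {
  if (recs.length === 0) return recs;
  const scores = recs.map((r) => r.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  const range = max - min || 1; // avoid dividing by zero when all scores tie
  return recs.map((r) => ({ ...r, score: (r.score - min) / range }));
}
```

Call `normalizeScores` on each of the three lists before the weighted `forEach` loops, and the 0.4 / 0.35 / 0.25 weights become meaningful relative to each other.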
Cost Estimate
For a store with 10,000 products and 5,000 daily users:
Embeddings (one-time for catalog):
10,000 products × $0.00002/embedding = $0.20
Embeddings (user queries):
5,000 queries/day × $0.00002 = $0.10/day = $3/mo
Vector database:
Pinecone starter: $0 (under 100K vectors)
Or Supabase: $25/mo
Total: $3-28/month for a production-quality recommendation engine
FAQ
How is this different from "customers also bought"?
"Customers also bought" uses purchase correlation (collaborative filtering). Embedding-based recommendations understand semantic meaning — they can recommend hiking boots for a tent buyer even if no one has bought both yet.
How many items do I need for this to work?
Embedding-based recommendations work with as few as 50 items. Collaborative filtering needs thousands of user interactions. That's why embeddings are better for smaller catalogs.
How do I handle cold start (new users)?
Start with content-based recommendations (similar to what they're viewing). As they interact more, blend in personalization. After 5-10 interactions, personalized recommendations become useful.
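That ramp can be made concrete with a simple blend weight: 0 for a brand-new user, rising to 1 as interactions accumulate. The cutoff of 10 interactions is an assumption to tune against your own data:

```typescript
// How much to trust personalization: 0 for a new user, 1 at 10+ interactions
function personalizationWeight(interactionCount: number): number {
  return Math.min(interactionCount / 10, 1);
}

// Blend content-based and personalized scores for one candidate item
function blendScore(
  contentScore: number,
  personalScore: number,
  interactionCount: number
): number {
  const w = personalizationWeight(interactionCount);
  return (1 - w) * contentScore + w * personalScore;
}
```

A linear ramp is the simplest choice; some teams prefer a step function (pure content-based until N interactions, then hybrid) because it's easier to A/B test.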
Should I use OpenAI or train my own model?
Use OpenAI/Voyage embeddings to start — they work surprisingly well for most use cases. Only train custom models if you have domain-specific needs and 100K+ training examples.
Bottom Line
Use OpenAI embeddings + Pinecone or Supabase pgvector for a recommendation engine that works in a weekend. Start with content-based similarity, add user personalization as you collect data, then blend with collaborative filtering for the best results.
The recommendation engines of 2026 aren't about complex ML pipelines — they're about embeddings, vector search, and smart scoring.