How to Add AI Search to Any Website with Vector Embeddings
Introduction: Beyond Keyword Search
For decades, website search has been dominated by simple keyword matching. If a user doesn't type the exact term you used, they get zero results. In the age of AI, this is no longer acceptable. Users expect search to understand *intent* and *meaning*, not just words. This is the power of **semantic search**.
In this tutorial, we will build a complete, end-to-end AI-powered semantic search API for a blog. We'll convert our articles into **vector embeddings**, store them in a specialized database, and then use AI to find the articles most conceptually similar to a user's query. Best of all, we'll run the whole thing on the serverless edge using Cloudflare's AI stack.
The Architecture: How it Works
Our system has two main parts: an **Indexing Pipeline** (which we do once to prepare our data) and a **Query Pipeline** (which runs every time a user searches).
```
AI Search Architecture

Indexing (offline):
Blog Content → [AI Model] → Vector Embeddings → [Vector DB]

Query (real-time):
User Query → [AI Model] → Query Vector → [Vector DB Search] → Similar Content
```
- **Vector embeddings**: numerical representations of text. An AI model reads a piece of text and converts its meaning into a list of numbers (a vector). Texts with similar meanings have mathematically similar vectors.
- **Vector database**: a specialized database designed to store these vectors and efficiently find the "nearest neighbors" to a query vector.
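"Mathematically similar" has a concrete meaning: with the cosine metric we'll configure below, similarity is the cosine of the angle between two vectors. Here's a toy sketch of that comparison — three-dimensional vectors and the example values stand in for real 768-dimension embeddings:

```javascript
// Cosine similarity: 1.0 means same direction (same meaning),
// 0 means unrelated, -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (real ones have 768 dimensions)
const catVec = [0.9, 0.1, 0.0];
const kittenVec = [0.85, 0.15, 0.05];
const carVec = [0.0, 0.2, 0.95];

console.log(cosineSimilarity(catVec, kittenVec)); // close to 1: similar meaning
console.log(cosineSimilarity(catVec, carVec));    // much lower: unrelated
```

A vector database does exactly this comparison, but across millions of vectors with indexing tricks that avoid checking every one.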
For our stack, we'll use Cloudflare Workers AI to generate embeddings, and Cloudflare Vectorize as our edge vector database.
Step 1: The Indexing Pipeline - Turning Content into Vectors
First, we need to process our existing blog posts. This is a one-time step, run as a separate Worker (the Workers AI binding it uses is only available inside a Worker). The goal is to take each article, generate an embedding for its content, and store it in Vectorize.
1. Set up Your Vectorize Index
Using the Wrangler CLI, create a new Vectorize index. The 768 dimensions match the output of `@cf/baai/bge-base-en-v1.5`, the Workers AI embedding model we'll use, and cosine is the standard similarity metric for text embeddings.

```shell
npx wrangler vectorize create ai-search-index --dimensions=768 --metric=cosine
```
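The Workers in this tutorial reach Workers AI and this index through bindings. Assuming the binding names `AI` and `AI_SEARCH_INDEX` used in the code below, the `wrangler.toml` entries look roughly like this (a sketch; adjust the name, entry point, and date to your project):

```toml
name = "ai-search"
main = "src/index.js"
compatibility_date = "2024-01-01"

# Workers AI binding, exposed to the Worker as env.AI
[ai]
binding = "AI"

# Vectorize binding, exposed to the Worker as env.AI_SEARCH_INDEX
[[vectorize]]
binding = "AI_SEARCH_INDEX"
index_name = "ai-search-index"
```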
2. Create an Indexing Script
This script reads a JSON file of your blog posts, generates an embedding for each with Workers AI, and prepares the vectors to insert into the Vectorize index.
```javascript
// indexer.js — a one-off Worker that embeds each post and logs
// the vectors to insert into Vectorize
import { Ai } from '@cloudflare/ai';
import posts from './posts.json'; // assumed: your blog articles as JSON

export default {
  async fetch(request, env) {
    const ai = new Ai(env.AI);

    for (const post of posts) {
      // Embed the title and description together; for long articles,
      // chunk the body and embed each chunk for better results
      const textToEmbed = `${post.title}. ${post.description}`;
      const embedding = await ai.run('@cf/baai/bge-base-en-v1.5', {
        text: [textToEmbed],
      });

      const vector = {
        id: post.id.toString(), // Vectorize IDs must be strings
        values: embedding.data[0], // the 768-dimension embedding
        metadata: {
          url: post.url,
          title: post.title,
        },
      };

      // Use Wrangler to insert into Vectorize:
      // npx wrangler vectorize insert ai-search-index --file=vectors.ndjson
      console.log('Vector to insert:', JSON.stringify(vector));
    }

    return new Response('Vectors logged — insert them with Wrangler');
  },
};
```
After generating and inserting vectors for all your articles, your Vectorize index holds the semantic "fingerprint" of your content.
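To load the logged vectors in bulk, `wrangler vectorize insert` accepts a file of newline-delimited JSON (NDJSON), one vector per line. A small sketch of producing that file locally — the vector shape matches the `{ id, values, metadata }` objects above, and the two-element `values` arrays are placeholder data:

```javascript
import { writeFileSync } from 'node:fs';

// Serialize vectors as NDJSON: one JSON object per line,
// the format `wrangler vectorize insert --file=` expects.
function toNdjson(vectors) {
  return vectors.map((v) => JSON.stringify(v)).join('\n') + '\n';
}

const vectors = [
  { id: '1', values: [0.1, 0.2], metadata: { title: 'Post 1', url: '/post-1' } },
  { id: '2', values: [0.3, 0.4], metadata: { title: 'Post 2', url: '/post-2' } },
];

writeFileSync('vectors.ndjson', toNdjson(vectors));
// Then: npx wrangler vectorize insert ai-search-index --file=vectors.ndjson
```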
Step 2: The Query Pipeline - Building the Search API
Now we'll create the Cloudflare Worker that will power our live search. This Worker will receive a user's query, convert it into a vector, and search the Vectorize index for the most similar articles.
```javascript
// Your main Cloudflare Worker (e.g., /api/search)
import { Ai } from '@cloudflare/ai';

export default {
  async fetch(request, env) {
    const { query } = await request.json();
    if (!query) {
      return new Response('Query is required', { status: 400 });
    }

    const ai = new Ai(env.AI);

    // 1. Convert the user's query into a vector embedding
    const queryEmbedding = await ai.run('@cf/baai/bge-base-en-v1.5', {
      text: [query],
    });
    const queryVector = queryEmbedding.data[0];

    // 2. Query the Vectorize index
    const searchResults = await env.AI_SEARCH_INDEX.query(queryVector, {
      topK: 5, // return the five most similar articles
      returnMetadata: true, // include the stored title and url
    });

    // 3. Format and return the results
    const matches = searchResults.matches.map((match) => ({
      score: match.score,
      title: match.metadata.title,
      url: match.metadata.url,
    }));

    return new Response(JSON.stringify(matches), {
      headers: { 'Content-Type': 'application/json' },
    });
  },
};
```
You now have a fully functional semantic search API endpoint! A frontend can call this API and display the results, which will be sorted by conceptual relevance (the `score`), not just keyword matches.
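On the frontend, calling the endpoint is a single `fetch`. A minimal sketch — the `#results` element and the rendering helper are illustrative, not part of the API:

```javascript
// Render matches returned by the search API as a simple HTML list.
function renderResults(matches) {
  if (matches.length === 0) return '<p>No results found.</p>';
  const items = matches.map(
    (m) => `<li><a href="${m.url}">${m.title}</a> (score: ${m.score.toFixed(2)})</li>`
  );
  return `<ul>${items.join('')}</ul>`;
}

// In the browser: send the query and render whatever comes back.
async function search(query) {
  const response = await fetch('/api/search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  });
  const matches = await response.json();
  document.querySelector('#results').innerHTML = renderResults(matches);
}
```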
Conclusion: The Future of Search is Semantic
By moving beyond simple keyword matching, you create a vastly superior user experience. Users can now search for "how to secure my app" and find your article on JWTs, even if the exact words don't match. This is the power of understanding intent.
With tools like Cloudflare Workers AI and Vectorize, building this once-complex technology is now accessible to all developers. By integrating semantic search, you're not just upgrading a feature; you're building a smarter, more intuitive website for the AI-driven future.