How to Add AI Search to Any Website with Vector Embeddings
Introduction: Beyond Keyword Search
For decades, website search has been dominated by simple keyword matching. If a user doesn't type the exact term you used, they get zero results. In the age of AI, this is no longer acceptable. Users expect search to understand *intent* and *meaning*, not just words. This is the power of **semantic search**.
In this tutorial, we will build a complete, end-to-end AI-powered semantic search API for a blog. We'll convert our articles into **vector embeddings**, store them in a specialized database, and then use AI to find the articles most conceptually similar to a user's query. Best of all, we'll run the whole thing on the serverless edge using Cloudflare's AI stack.
The Architecture: How it Works
Our system has two main parts: an **Indexing Pipeline** (which we do once to prepare our data) and a **Query Pipeline** (which runs every time a user searches).
```
AI Search Architecture

Indexing (offline):
Blog Content → [AI Model] → Vector Embeddings → [Vector DB]

Query (real-time):
User Query → [AI Model] → Query Vector → [Vector DB Search] → Similar Content
```
- **Vector embeddings**: numerical representations of text. An AI model reads a piece of text and converts its meaning into a list of numbers (a vector). Texts with similar meanings have mathematically similar vectors.
- **Vector database**: a specialized database designed to store these vectors and efficiently find the "nearest neighbors" to a query vector.
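"Mathematically similar" has a concrete meaning: with the cosine metric we'll configure below, similarity is the cosine of the angle between two vectors. Here's a toy sketch of that comparison — three-dimensional vectors and the example values stand in for real 768-dimension embeddings:

```javascript
// Cosine similarity: 1.0 means same direction (same meaning),
// 0 means unrelated, -1 means opposite.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (real ones have 768 dimensions)
const catVec = [0.9, 0.1, 0.0];
const kittenVec = [0.85, 0.15, 0.05];
const carVec = [0.0, 0.2, 0.95];

console.log(cosineSimilarity(catVec, kittenVec)); // close to 1: similar meaning
console.log(cosineSimilarity(catVec, carVec));    // much lower: unrelated
```

A vector database does exactly this comparison, but across millions of vectors with indexing tricks that avoid checking every one.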
For our stack, we'll use Cloudflare Workers AI to generate embeddings, and Cloudflare Vectorize as our edge vector database.
Step 1: The Indexing Pipeline - Turning Content into Vectors
First, we need to process our existing blog posts. This is a one-time step, run as a separate Worker (the Workers AI binding it uses is only available inside a Worker). The goal is to take each article, generate an embedding for its content, and store it in Vectorize.
1. Set up Your Vectorize Index
Using the Wrangler CLI, create a new Vectorize index. The 768 dimensions match the output of `@cf/baai/bge-base-en-v1.5`, the Workers AI embedding model we'll use, and cosine is the standard similarity metric for text embeddings.

```shell
npx wrangler vectorize create ai-search-index --dimensions=768 --metric=cosine
```
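The Workers in this tutorial reach Workers AI and this index through bindings. Assuming the binding names `AI` and `AI_SEARCH_INDEX` used in the code below, the `wrangler.toml` entries look roughly like this (a sketch; adjust the name, entry point, and date to your project):

```toml
name = "ai-search"
main = "src/index.js"
compatibility_date = "2024-01-01"

# Workers AI binding, exposed to the Worker as env.AI
[ai]
binding = "AI"

# Vectorize binding, exposed to the Worker as env.AI_SEARCH_INDEX
[[vectorize]]
binding = "AI_SEARCH_INDEX"
index_name = "ai-search-index"
```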
2. Create an Indexing Script
This script reads a JSON file of your blog posts, generates an embedding for each with Workers AI, and prepares the vectors to insert into the Vectorize index.
```javascript
// indexer.js — a one-off Worker that embeds each post and logs
// the vectors to insert into Vectorize
import { Ai } from '@cloudflare/ai';
import posts from './posts.json'; // assumed: your blog articles as JSON

export default {
  async fetch(request, env) {
    const ai = new Ai(env.AI);

    for (const post of posts) {
      // Embed the title and description together; for long articles,
      // chunk the body and embed each chunk for better results
      const textToEmbed = `${post.title}. ${post.description}`;
      const embedding = await ai.run('@cf/baai/bge-base-en-v1.5', {
        text: [textToEmbed],
      });

      const vector = {
        id: post.id.toString(), // Vectorize IDs must be strings
        values: embedding.data[0], // the 768-dimension embedding
        metadata: {
          url: post.url,
          title: post.title,
        },
      };

      // Use Wrangler to insert into Vectorize:
      // npx wrangler vectorize insert ai-search-index --file=vectors.ndjson
      console.log('Vector to insert:', JSON.stringify(vector));
    }

    return new Response('Vectors logged — insert them with Wrangler');
  },
};
```
After generating and inserting vectors for all your articles, your Vectorize index holds the semantic "fingerprint" of your content.
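To load the logged vectors in bulk, `wrangler vectorize insert` accepts a file of newline-delimited JSON (NDJSON), one vector per line. A small sketch of producing that file locally — the vector shape matches the `{ id, values, metadata }` objects above, and the two-element `values` arrays are placeholder data:

```javascript
import { writeFileSync } from 'node:fs';

// Serialize vectors as NDJSON: one JSON object per line,
// the format `wrangler vectorize insert --file=` expects.
function toNdjson(vectors) {
  return vectors.map((v) => JSON.stringify(v)).join('\n') + '\n';
}

const vectors = [
  { id: '1', values: [0.1, 0.2], metadata: { title: 'Post 1', url: '/post-1' } },
  { id: '2', values: [0.3, 0.4], metadata: { title: 'Post 2', url: '/post-2' } },
];

writeFileSync('vectors.ndjson', toNdjson(vectors));
// Then: npx wrangler vectorize insert ai-search-index --file=vectors.ndjson
```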
Step 2: The Query Pipeline - Building the Search API
Now we'll create the Cloudflare Worker that will power our live search. This Worker will receive a user's query, convert it into a vector, and search the Vectorize index for the most similar articles.
```javascript
// Your main Cloudflare Worker (e.g., /api/search)
import { Ai } from '@cloudflare/ai';

export default {
  async fetch(request, env) {
    const { query } = await request.json();
    if (!query) {
      return new Response('Query is required', { status: 400 });
    }

    const ai = new Ai(env.AI);

    // 1. Convert the user's query into a vector embedding
    const queryEmbedding = await ai.run('@cf/baai/bge-base-en-v1.5', {
      text: [query],
    });
    const queryVector = queryEmbedding.data[0];

    // 2. Query the Vectorize index
    const searchResults = await env.AI_SEARCH_INDEX.query(queryVector, {
      topK: 5, // return the five most similar articles
      returnMetadata: true, // include the stored title and url
    });

    // 3. Format and return the results
    const matches = searchResults.matches.map((match) => ({
      score: match.score,
      title: match.metadata.title,
      url: match.metadata.url,
    }));

    return new Response(JSON.stringify(matches), {
      headers: { 'Content-Type': 'application/json' },
    });
  },
};
```
You now have a fully functional semantic search API endpoint! A frontend can call this API and display the results, which will be sorted by conceptual relevance (the `score`), not just keyword matches.
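On the frontend, calling the endpoint is a single `fetch`. A minimal sketch — the `#results` element and the rendering helper are illustrative, not part of the API:

```javascript
// Render matches returned by the search API as a simple HTML list.
function renderResults(matches) {
  if (matches.length === 0) return '<p>No results found.</p>';
  const items = matches.map(
    (m) => `<li><a href="${m.url}">${m.title}</a> (score: ${m.score.toFixed(2)})</li>`
  );
  return `<ul>${items.join('')}</ul>`;
}

// In the browser: send the query and render whatever comes back.
async function search(query) {
  const response = await fetch('/api/search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  });
  const matches = await response.json();
  document.querySelector('#results').innerHTML = renderResults(matches);
}
```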
Conclusion: The Future of Search is Semantic
By moving beyond simple keyword matching, you create a vastly superior user experience. Users can now search for "how to secure my app" and find your article on JWTs, even if the exact words don't match. This is the power of understanding intent.
With tools like Cloudflare Workers AI and Vectorize, building this once-complex technology is now accessible to all developers. By integrating semantic search, you're not just upgrading a feature; you're building a smarter, more intuitive website for the AI-driven future.