
Implementing Native Search with Headless CMS Architecture

TL;DR: Too often, when site search is a business requirement, the default reaction is to reach for a third-party search provider (e.g. Algolia). However, most headless CMS platforms expose a search endpoint within their Content API, which makes for a much simpler implementation.

Sitewide search is a common request on most projects we work on. We can’t help but smile when the requirement comes up and someone immediately suggests a trending third-party search provider (Algolia, Elasticsearch, Meilisearch, etc.). From a composability perspective, this makes sense and fits the paradigm.

However, our perspective is that more isn’t always more. What most clients don’t realize is that modern headless CMSs typically (but quietly) ship search capability directly within their Content Delivery API. And for 90% of the marketing, content, and product sites we work on, it's powerful enough. This post is the case for pausing before you add another vendor to the stack.

Most Headless CMSs Offer Search

For the purpose of this article, we’ll focus on Storyblok, as it’s fairly common and growing in popularity. Storyblok’s Content Delivery API exposes a search_term query parameter on the stories endpoint.

Drop a string in, get matching stories back. 

That's the entire integration.

// app/api/search/route.ts
import { storyblokInit, apiPlugin } from '@storyblok/js'

// Initialise the Storyblok client once, when the module loads.
const { storyblokApi } = storyblokInit({
  accessToken: process.env.STORYBLOK_TOKEN,
  use: [apiPlugin],
})

export async function GET(request: Request) {
  const { searchParams } = new URL(request.url)
  const q = (searchParams.get('q') ?? '').trim()
  const locale = searchParams.get('locale') ?? 'en-ca'

  // Skip the CMS call entirely when there is nothing to search for.
  if (!q) {
    return Response.json({ results: [] })
  }

  // Full-text search, scoped to the locale's folder.
  const { data } = await storyblokApi.get('cdn/stories', {
    search_term: q,
    starts_with: `${locale}/`,
    per_page: 20,
  })

  return Response.json({ results: data.stories })
}

That's a working search endpoint. Add a debounced input on the client, render the results, and you have implemented sitewide search.
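As a rough illustration of the client side, here is a minimal sketch of a debounced request against the /api/search route above. The helper names (buildSearchUrl, debounce, searchStories) are our own for illustration and are not part of any Storyblok SDK; the delay you pass to debounce is a judgment call.

```typescript
// Hypothetical client helpers for the /api/search route shown above.

// Build the request URL; URLSearchParams takes care of the encoding.
function buildSearchUrl(q: string, locale: string): string {
  const params = new URLSearchParams({ q: q.trim(), locale })
  return `/api/search?${params.toString()}`
}

// Generic debounce: only the last call within `delayMs` actually runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer)
    timer = setTimeout(() => fn(...args), delayMs)
  }
}

// Fetch results from the route; the caller decides how to render them.
async function searchStories(
  q: string,
  locale: string,
  signal?: AbortSignal,
): Promise<unknown[]> {
  const res = await fetch(buildSearchUrl(q, locale), { signal })
  if (!res.ok) throw new Error(`Search failed with status ${res.status}`)
  const { results } = (await res.json()) as { results: unknown[] }
  return results
}
```

In a component, you would wrap the input's change handler in debounce (a few hundred milliseconds is a common starting point) and pass an AbortSignal so that a stale request can be cancelled when a newer one starts.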

This is common to most enterprise-grade CMS platforms. Contentful gives you a full-text query parameter on its Content Delivery API. Sanity has GROQ with the match operator. Prismic exposes fulltext predicates. We could go on!

The shape of the request changes, but the principle doesn't. Your CMS already houses and indexes all of your content, so adding another provider just to find things within that content is largely duplicative.
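To make the comparison concrete, here is a hedged sketch of what the equivalent requests might look like on two of those platforms. The function names, the "post" document type, the 20-result limit, and the pinned Sanity API version are assumptions for illustration; check each vendor's current documentation before relying on the exact URL shape.

```typescript
// Sketch only: space IDs, project IDs, datasets and tokens are placeholders.

// Contentful: the Content Delivery API accepts a full-text `query` parameter.
function contentfulSearchUrl(
  spaceId: string,
  environment: string,
  token: string,
  q: string,
): string {
  const params = new URLSearchParams({ access_token: token, query: q, limit: '20' })
  return `https://cdn.contentful.com/spaces/${spaceId}/environments/${environment}/entries?${params.toString()}`
}

// Sanity: GROQ's `match` operator, with the term passed as a JSON-encoded `$q` parameter.
// The `post` type and the title-only match are assumptions about the schema.
function sanitySearchUrl(projectId: string, dataset: string, q: string): string {
  const groq = '*[_type == "post" && title match $q][0...20]'
  const params = new URLSearchParams({ query: groq, $q: JSON.stringify(`${q}*`) })
  return `https://${projectId}.api.sanity.io/v2021-10-21/data/query/${dataset}?${params.toString()}`
}
```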

The key decision: rendering results

The two common patterns are search-as-you-type (results update on each keystroke) and submit-on-enter (results render after the user completes the query). We've built both, and the "right" one depends on the nature of the content being indexed and searched.

For content-heavy sites (blogs, knowledge bases, docs), search-as-you-type typically yields the most user-friendly experience. Since results are usually limited to the one or two most relevant pages or articles, scanning is fast and users find what they need quickly. For sites with heavier result cards (e.g. an e-commerce site with a large catalogue), submit-on-enter is the better approach; otherwise, re-rendering numerous product tiles on every keystroke is resource-expensive and cognitively taxing.

Ultimately, the infrastructure underneath is identical: debounce the user’s input, fetch results from the CMS, then decide how to render the returned results (accounting for loading state, empty state, and "no results" copy). 
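One way to keep the rendering decision separate from the fetching is to reduce the search state to a single "view" value and let the UI switch on it. The sketch below is framework-agnostic; the type names and the optional maxItems cap (useful for a compact search-as-you-type dropdown) are our own assumptions.

```typescript
// Hypothetical state and view types; adapt the shapes to your UI framework.
type SearchState<T> = { query: string; loading: boolean; results: T[] }

type SearchView<T> =
  | { kind: 'idle' }                        // nothing typed yet
  | { kind: 'loading' }                     // request in flight
  | { kind: 'no-results'; query: string }   // request finished, nothing matched
  | { kind: 'results'; items: T[] }

// Decide what to render from the current state.
// `maxItems` caps the list, e.g. for a search-as-you-type dropdown.
function resolveView<T>(state: SearchState<T>, maxItems = Infinity): SearchView<T> {
  const query = state.query.trim()
  if (query === '') return { kind: 'idle' }
  if (state.loading) return { kind: 'loading' }
  if (state.results.length === 0) return { kind: 'no-results', query }
  return { kind: 'results', items: state.results.slice(0, maxItems) }
}
```

Both patterns can share this function: search-as-you-type calls it on every debounced response with a small cap, while submit-on-enter calls it once per submission without one.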

This is a simple yet effective way to implement search natively. There's no new library to learn, no index to sync, and no additional API keys to rotate beyond the CMS token you already manage.

Per-locale search isn't a special case

On a recent Next.js + Storyblok build that ships in 14 locales, search results needed to be scoped per locale. This may seem obvious, but it’s an important technical prerequisite. A user on the French-Canadian version of the site shouldn't be fed results in German or for the German market. 

The way to achieve this is relatively simple. We use the same starts_with filter that routes the rest of the site: we pass the locale prefix, and the CMS only returns stories that match the query from that folder.

storyblokApi.get('cdn/stories', {
  search_term: q,
  starts_with: `${locale}/`,
})

As you can see, it’s almost too simple. With a single parameter, we’ve achieved locale-scoped search. There’s no separate index per locale to maintain and no Algolia replicas to configure. If your CMS is set up with folder-level translation (which it should be, for a build of any reasonable size), per-locale search comes with the architecture at no additional cost.
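Because the locale usually arrives from a URL or query string, it is worth checking it against the locales the site actually ships before using it as a folder prefix. The sketch below shows one way to do that; the locale list, the default, and the function names are placeholders rather than anything from the build described above.

```typescript
// Hypothetical list; in a real build this would come from the site's i18n config.
const SUPPORTED_LOCALES = ['en-ca', 'fr-ca', 'de-de'] as const
type Locale = (typeof SUPPORTED_LOCALES)[number]
const DEFAULT_LOCALE: Locale = 'en-ca'

// Fall back to the default when the requested locale is missing or unknown.
function resolveLocale(raw: string | null): Locale {
  const candidate = (raw ?? '').toLowerCase()
  return (SUPPORTED_LOCALES as readonly string[]).includes(candidate)
    ? (candidate as Locale)
    : DEFAULT_LOCALE
}

// Build the two parameters that scope a search to one locale folder.
function localeSearchParams(q: string, rawLocale: string | null) {
  return {
    search_term: q.trim(),
    starts_with: `${resolveLocale(rawLocale)}/`,
  }
}
```

The returned object can be passed straight to storyblokApi.get('cdn/stories', ...), so an unexpected value in the locale parameter can never widen the search beyond a known folder.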

The limitations of this approach

We don’t want to over-hype this approach, as there are downsides. Fundamentally, CMS-native search is keyword-based, which means a few things won't work without some additional tooling:

  • Typo tolerance: A mistyped search for "wigdet" won't match "widget". If your audience or your content is typo-prone, users will feel it.

  • Relevance tuning: Results come back in a default order, so weighting matches (e.g. ranking a title match above a body-copy match) is on you.

  • Autocomplete suggestions: If you want a "did you mean…" or query suggestion layer, the CMS isn't where you build it.
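If relevance is the only gap, a small amount of re-ranking in your own route can go a long way before a dedicated engine is needed. The sketch below scores each hit by counting query occurrences and weighting title matches above body matches. The SearchHit shape and the weights are assumptions; you would map your CMS fields onto it and tune the numbers against real queries.

```typescript
// Simplified, hypothetical hit shape: map your CMS fields onto it.
interface SearchHit {
  title: string
  body: string
  slug: string
}

// Assumed weights: a title match counts three times as much as a body match.
const TITLE_WEIGHT = 3
const BODY_WEIGHT = 1

// Case-insensitive count of non-overlapping occurrences of `needle`.
function countOccurrences(haystack: string, needle: string): number {
  if (needle === '') return 0
  return haystack.toLowerCase().split(needle.toLowerCase()).length - 1
}

function scoreHit(hit: SearchHit, q: string): number {
  return (
    TITLE_WEIGHT * countOccurrences(hit.title, q) +
    BODY_WEIGHT * countOccurrences(hit.body, q)
  )
}

// Return a new array, highest score first.
// Array.prototype.sort is stable, so ties keep the CMS's original order.
function rankHits(hits: SearchHit[], q: string): SearchHit[] {
  const query = q.trim()
  return [...hits].sort((a, b) => scoreHit(b, query) - scoreHit(a, query))
}
```

This only reorders the page of results the CMS already returned; it does not help with typos or suggestions, which remain the territory of a dedicated search engine.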

For the vast majority of sites we build, none of those are material issues. For a 50K+ SKU catalogue with a typo-prone audience and merchandising rules that need certain products surfaced first, they become real barriers to a good user experience. That's when an Algolia-level search engine is worth investing in.

We always recommend using the best tool for the job. But ensure that you understand the job first.

The time-to-ship argument

The reason we often recommend this pattern is sheer delivery speed. Algolia and Elasticsearch are fine products, and they excel when justified. But in our experience, getting them set up properly (index sync, schema mapping, filters and facets configured, custom ranking tuned, replica indexes for sort orders) takes significantly longer, sometimes weeks, than just using the CMS's own endpoint.

When a client's base requirement is "let users use search to find a blog post or product", spending multiple sprint cycles on infrastructure is hard to justify. Our CMS-native approach takes just a few days, adds nothing extra to monthly licensing fees, and uses fresh data that your editors are already maintaining. If your users’ search behaviours become more complex or demanding over time, you can layer in a dedicated engine then, against real query data.
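Collecting that query data does not need much machinery. As a minimal sketch, the class below tallies searches and zero-result searches per query so you can see where CMS-native search falls short. It is in-memory only and entirely hypothetical; a production version would write to whatever analytics or logging store the project already uses.

```typescript
interface QueryStat {
  query: string
  searches: number
  zeroResultSearches: number
}

// In-memory tally of search queries, for illustration only.
class QueryLog {
  private stats = new Map<string, QueryStat>()

  // Call once per completed search with the number of results returned.
  record(rawQuery: string, resultCount: number): void {
    const query = rawQuery.trim().toLowerCase()
    if (query === '') return
    const stat = this.stats.get(query) ?? { query, searches: 0, zeroResultSearches: 0 }
    stat.searches += 1
    if (resultCount === 0) stat.zeroResultSearches += 1
    this.stats.set(query, stat)
  }

  // Queries that most often returned nothing, most frequent first.
  topZeroResultQueries(limit = 10): QueryStat[] {
    return Array.from(this.stats.values())
      .filter((s) => s.zeroResultSearches > 0)
      .sort((a, b) => b.zeroResultSearches - a.zeroResultSearches)
      .slice(0, limit)
  }
}
```

A long list of misspelled queries with zero results is a concrete signal that typo tolerance is worth paying for; a short one suggests the native endpoint is still doing its job.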

Agencies that specialize in headless can ship features like this fast, because we've built them before. We're not waiting on a third-party SDK to catch up to a new framework release, or a sales conversation about volume pricing.

The main takeaway

Before you add a search vendor to a headless project, look at what your CMS already provides. You'll usually find that the API you're paying for solves the problem sufficiently. Reach for the dedicated engine when there's a concrete business case to justify it (typo tolerance, scale, complex ranking, autocomplete).

You're already paying for the content platform. Use it to its full capabilities.
