This is the story of a mid-sized digital agency that turned a frustrating support problem into a working semantic search, told as a sequence of decisions rather than a list of features. The names and exact numbers are illustrative, drawn from patterns common to teams making this move, but the arc, the tensions, the wrong turns, the eventual payoff, mirrors what actually happens when a team adopts a vector database for the first time.
The point of a narrative is to show the reasoning in context. You will see where the team hesitated, what they got wrong on the first pass, and which choice quietly determined the outcome. Read it as a rehearsal for your own version of the same project.
We begin where most of these projects begin: with a metric that would not improve.
The Situation
A Support Queue That Would Not Shrink
The agency ran client services across dozens of accounts, and its internal documentation had grown into a sprawling wiki of process docs, runbooks, and policy notes. Account managers were filing internal questions constantly, "what is our refund policy for paused retainers," "how do we handle a client's DNS change", because finding the answer in the wiki was harder than asking a colleague. The support burden on senior staff kept climbing.
Why Keyword Search Had Failed
The wiki had a search box, but it matched keywords. People phrased questions in their own words, which rarely matched the documentation's headings. A search for "client wants to pause and resume later" returned nothing useful, even though a "retainer suspension" runbook covered exactly that. The vocabulary gap was the whole problem.
The Decision
Choosing Semantic Search
A small internal team proposed embedding the wiki and replacing keyword search with similarity search, so questions phrased naturally would surface the right runbook. The skeptics worried it would be expensive and fragile. The deciding argument was scope: they would start with one well-bounded corpus, the internal wiki, and measure whether senior-staff interruptions dropped.
Starting Small on Purpose
Rather than build a grand platform, they scoped a two-week prototype over the existing wiki. They chose a vector extension on the database they already ran, avoiding new infrastructure, an approach echoed in Pinecone, Weaviate, pgvector: Matching Engines to Workloads. The narrow scope kept the experiment cheap to abandon if it failed.
The Execution
The First Pass That Disappointed
The first version embedded entire wiki pages as single vectors. Results were mediocre: queries returned long pages that technically related to the topic but buried the answer. The team had hit a classic failure, oversized chunks, documented in Seven Ways a Vector Store Quietly Returns Junk. It looked like the idea might not work.
The Fix That Changed Everything
They re-chunked the wiki into sections of a few hundred words with light overlap and re-embedded. Suddenly queries matched the specific passage that answered them. The same questions that had returned vague pages now returned the exact paragraph. That single change, from page-level to section-level chunks, was the inflection point of the entire project.
What made the lesson stick was how undramatic the fix was. No new database, no fancier model, no clever algorithm, just a different decision about what a record should be. The team had spent the first week suspecting the whole approach was flawed, when the real problem was a setting they had chosen almost without thinking. It reset their instincts: when results disappoint, look first at how the content is being prepared, not at the technology underneath.
Adding Filters and Freshness
They stored metadata, account type, document category, last-updated date, so results could be filtered and stale docs deprioritized. Then they wired ingestion into the wiki's edit flow so new and changed pages re-embedded automatically, keeping the index fresh. These choices followed the discipline laid out in Opinionated Rules for Running Embeddings in Production.
The Outcome
What Actually Changed
Over the following weeks, internal questions routed to senior staff dropped noticeably as account managers found answers themselves. The team tracked a clear before-and-after: the volume of "how do we handle" messages in the internal channel fell substantially, and the time to find a documented answer shrank from minutes of hunting to a single query. The senior staff reclaimed hours each week.
Just as important, the change was self-reinforcing. Because answers were now easy to find, account managers actually used the documentation instead of routing around it, which surfaced gaps and errors that had gone unnoticed for years. The team began fixing those, and each fix made the search more useful, which drove more usage. A tool that merely saved time had quietly become a feedback loop improving the underlying knowledge base.
The Honest Caveats
It was not magic. Exact-lookup questions, finding a specific client ID or invoice number, still belonged to structured search, and the team kept those separate. And the wiki's quality became the ceiling: where docs were thin, no amount of clever retrieval invented an answer. The system surfaced what existed, no more.
The Lessons
What They Would Do Differently
Chunk first, page-level embedding cost them a week of doubt that section-level chunking would have prevented. They also wished they had stored metadata from the very first prototype rather than retrofitting it. Both lessons appear, generalized, in Twelve Items to Verify Before You Trust a Vector Index.
The Cost That Surprised Them
One expense they had not budgeted for was the human review. Validating that results were genuinely relevant, rather than merely plausible, took real hours from someone who knew the documentation well. They had assumed the system would be self-evidently good or bad, but quality lived in the gray zone where only a knowledgeable reader could judge. In hindsight, they would have scheduled that review time from the start and treated it as part of the build, not an afterthought, because it was the only thing that caught the oversized-chunk problem.
What Transferred to Other Teams
The pattern, scope a bounded corpus, prototype on existing infrastructure, measure a real metric, chunk deliberately, then automate freshness, transferred cleanly to other internal search projects at the agency. The vector database was never the hard part; the surrounding discipline was.
Within a quarter, the same playbook was applied to a second corpus, the agency's library of past proposals, so account teams could find precedent for new pitches. Because the team already knew where the pitfalls lived, the second project skipped the week of doubt entirely and reached useful results in days. That acceleration is the real return on a first project done carefully: the methodology compounds, and each subsequent search costs less to build than the last because the hard-won lessons travel with the team rather than having to be relearned.
Frequently Asked Questions
Why did the first version with whole-page embeddings fail?
Embedding a whole page compresses many topics into one vector, so queries match the page's general subject rather than the specific passage that answers them. Results felt related but unhelpful. Splitting pages into sections let queries match the exact relevant paragraph.
How did the team measure success?
They tracked a concrete behavioral metric: the volume of internal questions routed to senior staff and the time to find a documented answer. Tying the project to an existing pain metric, rather than a vague sense of improvement, made the outcome defensible.
Why start with a vector extension instead of a dedicated database?
The wiki was small enough that an extension on their existing database handled it easily, avoiding new infrastructure and keeping the prototype cheap to abandon. Dedicated databases earn their place at larger scale or when managed operations justify the cost.
Did semantic search replace keyword search entirely?
No. Exact lookups, client IDs, invoice numbers, stayed with structured search, where precision matters more than meaning. The agency ran both, routing each query type to the tool that handled it best. The two complement rather than replace each other.
What was the biggest non-technical lesson?
That retrieval quality is capped by content quality. Where documentation was thin or outdated, no retrieval technique invented good answers. The project quietly motivated the team to improve the underlying docs, which compounded the benefit.
Could this approach work for client-facing support too?
Yes, with care. The same pattern applies to public help centers, but access controls and content accuracy matter more when customers are the audience. The agency planned that as a later phase once the internal version proved the approach.
Key Takeaways
- A bounded corpus, the internal wiki, and a real pain metric made the project measurable and cheap to attempt.
- Whole-page embeddings disappointed; re-chunking into sections was the single change that made retrieval useful.
- Storing metadata and automating freshness turned a prototype into a maintainable system.
- The measurable outcome was fewer interruptions to senior staff and faster answer-finding for account managers.
- Retrieval quality is capped by content quality; thin docs cannot be rescued by clever search.
- The transferable lesson is that the vector database is the easy part; chunking, metadata, and freshness discipline carry the result.