CodeJourney Dev

Searching across vast, federated enterprise data is a significant challenge. In an environment like Netflix's, with hundreds of UIs and diverse domains, users faced a steep learning curve, needing to master a complex Filter Domain Specific Language (DSL) to find the information they needed. To break down this barrier, Netflix's engineering team leveraged AI, specifically Large Language Models (LLMs), to build a system that converts natural language questions into structured Graph Search filter statements. This post is based on the source material detailing their approach and key insights.

AI and Large Language Model concept illustration IT Technology Image

The Core Challenge: Syntactic, Semantic, and Pragmatic Correctness

When an LLM converts natural language to Graph Search Filter DSL, it must guarantee correctness on three levels:

Syntactic Correctness: Does the generated statement follow the DSL's grammar?
Semantic Correctness: Does the statement use only fields that exist in the index and values that are permitted? (Preventing hallucination)
Pragmatic Correctness: Does the generated filter accurately capture the user's true intent?

The key strategy to ensure semantic correctness was applying a RAG (Retrieval-Augmented Generation) pattern separately to both fields and controlled vocabularies.

Server rack and cloud infrastructure representing backend systems Dev Environment Setup

Context Optimization Strategy Using RAG

Providing all hundreds of fields and thousands of controlled vocabulary values to the LLM is inefficient and reduces accuracy. Netflix refined the context using a two-stage RAG approach.

Target	Method	Purpose
Field RAG	Embed field names, descriptions, and types, indexing them in a vector store. Perform similarity search between question chunks and embeddings to select Top-K fields.	Include only question-relevant fields in the context, reducing noise and improving accuracy.
Controlled Vocabulary RAG	Embed controlled vocabulary values (display names, descriptions, synonyms), indexing them. Search for values similar to the question to provide as allowed values for relevant fields.	Prevent LLM value hallucination and infer additional related fields.

This process resolves each question into an optimized combination of relevant fields and values.

Data flow and analysis diagram on a digital screen System Abstract Visual

Building User Trust and Pragmatic Strategies

To tackle the hardest problem—pragmatic correctness—Netflix introduced two practical UI/UX strategies.

Showing the Work: Instead of displaying the complex raw DSL query, the system parses it into an AST (Abstract Syntax Tree) and visualizes it as 'Chips' or 'Facets' in the user interface. Users can intuitively see how their question was interpreted into filter conditions and make adjustments.
Explicit Entity Selection (@Mentions): When a user mentions a specific entity like @dark, the system bypasses the RAG inference step and hardcodes that controlled vocabulary into the context. This removes ambiguity and simultaneously boosts accuracy and user trust.

Conclusion: This system is evolving beyond a simple translator into an extensible platform. It represents a best-practice case study combining the flexibility of LLMs, the robustness of existing Graph Search infrastructure, the accuracy gains from RAG patterns, and user-centric feedback mechanisms. It's an essential reference for any team looking to integrate AI into complex enterprise search systems.

The AI Evolution of Graph Search How Netflix Enables Natural Language Queries

The Core Challenge: Syntactic, Semantic, and Pragmatic Correctness

Context Optimization Strategy Using RAG

Building User Trust and Pragmatic Strategies

Share this post

Did you find this post helpful?
It helps the author a lot!

Comments 0

More to Explore

Critical React Server Components RCE Vulnerability (CVE-2025-55182) Immediate Action Guide

Context Engineering for Background Coding Agents A Deep Dive from Spotifys Trenches

Redesigning the Internets Most-Seen UI A Deep Dive into Cloudflares Turnstile

The AI Evolution of Graph Search How Netflix Enables Natural Language Queries

The Core Challenge: Syntactic, Semantic, and Pragmatic Correctness

Context Optimization Strategy Using RAG

Building User Trust and Pragmatic Strategies

Share this post

Did you find this post helpful?It helps the author a lot!

Comments 0

More to Explore

Critical React Server Components RCE Vulnerability (CVE-2025-55182) Immediate Action Guide

Context Engineering for Background Coding Agents A Deep Dive from Spotifys Trenches

Redesigning the Internets Most-Seen UI A Deep Dive into Cloudflares Turnstile

Did you find this post helpful?
It helps the author a lot!