When Partial Context Creates Dangerous Answers
A user asked, 'Can I expense a new coffee machine for my home office?' The retriever fetched a permissive policy document but completely missed the exclusions list that explicitly forbids kitchen appliances. The LLM answered 'Yes,' faithfully reflecting the incomplete context it received.
Impact:
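Partially correct but subtly wrong answers are more dangerous than obviously wrong ones because they sound authoritative and are less likely to be double-checked. Here, the user could have filed a non-reimbursable expense, leading to policy violations and financial disputes.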
Enhanced documents with policy type, section, and relationship metadata
# LlamaIndex parsing with enhanced metadata
from llama_index.core.node_parser import SemanticSplitterNodeParser

parser = SemanticSplitterNodeParser.from_defaults(buffer_size=3)
nodes = parser.get_nodes_from_documents(docs)

# extract_section and find_related_policies are project-specific helpers
for node in nodes:
    node.metadata.update({
        "policy_type": "reimbursement",
        "section": extract_section(node.text),
        "related_docs": find_related_policies(node.text),
    })
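The two helpers used above are not shown in the original. As a rough sketch only, they could be simple pattern matchers. The heading format ("Section 4.2") and the policy ID format ("POL-EXCL-001") are assumptions made up for illustration, as is the sample chunk; a real policy corpus would need its own patterns.

```python
import re

# Hypothetical heading pattern, e.g. "Section 4.2: Home Office Equipment"
SECTION_RE = re.compile(r"^Section\s+(\d+(?:\.\d+)*)", re.MULTILINE)
# Hypothetical cross-reference pattern, e.g. "POL-EXCL-001"
POLICY_REF_RE = re.compile(r"\b(POL-[A-Z]+-\d{3})\b")


def extract_section(text: str) -> str:
    """Return the first section number found in the chunk, or "unknown"."""
    match = SECTION_RE.search(text)
    return match.group(1) if match else "unknown"


def find_related_policies(text: str) -> list[str]:
    """Return the distinct policy IDs referenced in the chunk, in order of appearance."""
    seen: dict[str, None] = {}
    for policy_id in POLICY_REF_RE.findall(text):
        seen.setdefault(policy_id, None)
    return list(seen)


# Made-up chunk used only to exercise the helpers
chunk = (
    "Section 4.2: Home Office Equipment\n"
    "Employees may expense approved equipment. "
    "Exclusions are listed in POL-EXCL-001; see also POL-EXCL-001 and POL-TRVL-007."
)
print(extract_section(chunk))
print(find_related_policies(chunk))
```

Whatever the extraction method, the point is that each permissive chunk carries an explicit pointer to the documents that restrict it.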
Implemented hybrid retrieval with reranking to surface related exclusion documents
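# Retrieval with a metadata filter (hybrid search also needs vector store support)
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

# `filters` expects a MetadataFilters object, not a plain dict
policy_filter = MetadataFilters(
    filters=[ExactMatchFilter(key="policy_type", value="reimbursement")]
)
retriever = index.as_retriever(similarity_top_k=10, filters=policy_filter)
retrieved_nodes = retriever.retrieve(query)

# Rerank the retrieved candidates to boost exclusion documents.
# `reranker` is assumed to be a project-specific object exposing `rerank`.
reranked_nodes = reranker.rerank(retrieved_nodes, query, top_k=5)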
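Reranking can only reorder what retrieval already returned, so one complementary option is to follow the `related_docs` metadata attached during parsing and pull in any linked document that similarity search missed. The sketch below is library-agnostic plain Python, not the author's implementation: the `Chunk` class, the `store` lookup, and the document IDs are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Stand-in for a retrieved node: an ID plus the metadata attached at parse time."""
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


def expand_with_related(retrieved: list[Chunk], store: dict[str, Chunk]) -> list[Chunk]:
    """Append every document named in `related_docs` that retrieval did not return."""
    result = list(retrieved)
    present = {chunk.doc_id for chunk in retrieved}
    for chunk in retrieved:
        for related_id in chunk.metadata.get("related_docs", []):
            if related_id in store and related_id not in present:
                result.append(store[related_id])
                present.add(related_id)
    return result


# Hypothetical two-document corpus mirroring the coffee machine scenario
store = {
    "home-office-policy": Chunk(
        "home-office-policy",
        "Home office equipment may be expensed.",
        {"related_docs": ["exclusions-list"]},
    ),
    "exclusions-list": Chunk("exclusions-list", "Kitchen appliances are not reimbursable."),
}

# Similarity search alone returned only the permissive policy.
retrieved = [store["home-office-policy"]]
context = expand_with_related(retrieved, store)
print([chunk.doc_id for chunk in context])
```

This version follows links one hop only, which keeps the context window bounded; following links transitively would need a depth limit.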
Added a CI/CD gate that fails when required documents are missing from the retrieved context
# Automated gate catches missing context
def contextual_recall_gate(test_case, retrieved_nodes):
    required_docs = set(test_case["expected_context_sources"])
    found_docs = {n.metadata.get("doc_id") for n in retrieved_nodes}
    missing = sorted(required_docs - found_docs)
    # Recall = required docs retrieved / required docs expected (never above 1.0)
    recall = 1.0 if not required_docs else len(required_docs & found_docs) / len(required_docs)
    if missing:
        raise AssertionError(f"Missing required docs: {missing} (recall={recall:.2f})")
    return recall
The CI/CD gate now fails with an actionable error: 'Retriever not finding exclusion list for reimbursement queries.'
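To show how such a gate might run over a suite of test cases, here is a self-contained sketch that collects every failure instead of stopping at the first one. The test case, the document IDs, and the stub retriever are hypothetical; the stub simply reproduces the original failure by returning only the permissive policy.

```python
from types import SimpleNamespace


def contextual_recall(required_docs, retrieved_nodes):
    """Return (recall, missing) for one test case."""
    required = set(required_docs)
    found = {node.metadata.get("doc_id") for node in retrieved_nodes}
    missing = sorted(required - found)
    recall = 1.0 if not required else (len(required) - len(missing)) / len(required)
    return recall, missing


def run_gate(test_cases, retrieve):
    """Collect a failure message per test case whose required docs are missing."""
    failures = []
    for case in test_cases:
        nodes = retrieve(case["query"])
        recall, missing = contextual_recall(case["expected_context_sources"], nodes)
        if missing:
            failures.append(f"{case['query']!r}: recall={recall:.2f}, missing={missing}")
    return failures


# Hypothetical test case based on the coffee machine question
test_cases = [
    {
        "query": "Can I expense a new coffee machine for my home office?",
        "expected_context_sources": ["home-office-policy", "exclusions-list"],
    }
]


def stub_retrieve(query):
    """Stub retriever that misses the exclusions list, as in the incident."""
    return [SimpleNamespace(metadata={"doc_id": "home-office-policy"})]


failures = run_gate(test_cases, stub_retrieve)
for failure in failures:
    print(failure)
# In CI, a non-empty `failures` list would translate into a non-zero exit code.
```

Reporting all failing queries in one run, with the missing document IDs named, is what makes the error actionable rather than a bare red build.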