Lexicon Fundamentals: Building a Communications Surveillance Lexicon


Overview

This is Part 1 of our three-part series on mastering lexicon-based communications surveillance. In this blog, we explore what lexicons are, the levels of lexicon sophistication (from basic matching to AI-enhanced approaches), and how to set up an effective lexicon-surveillance framework from day one.

In Part 2, we will look at how to calibrate your surveillance lexicon to optimise performance and minimise keyword fatigue. In Part 3, we will explore how Artificial Intelligence (AI) can be applied alongside lexicon-based surveillance frameworks to improve risk detection, reduce false positives, and enhance efficiencies.


What is Lexicon-Based Communications Surveillance?

Communications surveillance lexicons are rule-based models that scan communications against keyword lists, flagging specific words or phrases that could indicate risky or prohibited behaviour, such as market manipulation, insider dealing, or non-financial misconduct.

These systems excel at detecting specific, predictable, well-defined risks like references to sanctioned countries and offensive language, making them critical components of firms' wider communications surveillance programs. Historically, they have been the standard for surveillance, and many institutions still use them today.
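At its core, this style of matching is simple: scan each message against a list of phrases mapped to risk scenarios. Below is a minimal, illustrative sketch in Python; the lexicon contents and names (RISK_LEXICON, flag_message) are hypothetical, not any vendor's actual rule engine.

```python
# A minimal sketch of lexicon-based flagging (illustrative only).
RISK_LEXICON = {
    "market_manipulation": ["push this rate higher", "lift the price"],
    "information_sharing": ["rumour has it", "about to announce"],
}

def flag_message(text: str) -> list[tuple[str, str]]:
    """Return (risk_scenario, matched_phrase) pairs found in one message."""
    hits = []
    lowered = text.lower()
    for scenario, phrases in RISK_LEXICON.items():
        for phrase in phrases:
            if phrase in lowered:
                hits.append((scenario, phrase))
    return hits

print(flag_message("We need to lift the price before the close."))
# -> [('market_manipulation', 'lift the price')]
```

Everything more sophisticated, from regular expressions to AI scoring, builds on this basic scan-and-flag loop.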

However, while financial firms spend millions annually on compliance technology, many still struggle with efficiently monitoring employee communications for potential misconduct, not because the technology is lacking, but because of how they set up and manage their lexicons. Getting communications surveillance right starts with well-designed scenarios.


Traditional Challenges with Lexicons

Lexicons are often viewed as a necessary but frustrating part of any electronic and voice communications surveillance suite. Most solutions don't come with any suggested terms, leaving compliance analysts unsure of how to create a surveillance lexicon from scratch. Others come pre-loaded with lexicon packages that are difficult to adjust and insensitive to context, generating high volumes of false positives and significant keyword fatigue.

The fundamental issue is that lexicon-based searches are inherently narrow. They capture only a fragment of the underlying intent, are often based on historical transgressions, and look only at prescribed words without understanding their context. While this can trigger valid alerts, it can also create several problems downstream:

  • High false positive rates that waste analyst time and cause genuine risks to be overlooked

  • False sense of security that gives the illusion of comprehensive coverage while missing critical threats

  • Operational burden from maintaining inflexible rule sets that deliver limited compliance value

 

Regulatory Expectations

Under regulations like the Market Abuse Regulation (MAR) in the UK and Europe, US rules from FINRA and the SEC, and requirements from regulators in Australia (ASIC), Singapore (MAS), and Hong Kong (HKMA), firms must demonstrate that their surveillance systems are both effective and proportionate.

A well-built and calibrated lexicon is essential for meeting these regulatory expectations, and that is what we will explore in this guide, covering best practices for design and implementation.


Setting Realistic Expectations

Today, a wide range of tools is at your disposal, from reactive keyword searches and random sampling to proactive lexicons and AI-driven analysis. Understanding what lexicon-based surveillance can and cannot achieve is crucial for getting communications surveillance right:

Lexicons Excel At

Detecting high-certainty scenarios:

  • High-risk scenarios like "I need to push this rate higher", i.e., phrases you never want to miss

  • Sanctions monitoring

  • Offensive language detection

Extensible use cases: The beauty of lexicons is that you can easily add new terms or monitor a new department for a different use case simply by setting up a new rule.

Where Lexicons Fall Down

  • Context and nuance: Lexicons cannot distinguish between innocent and harmful uses of the same phrase.

  • Future-proofing: They can only catch risks you have told the program to look for, creating blind spots for emerging threats.

  • Coded language: They are unable to detect individuals deliberately using alternative terms to evade detection.

  • Sarcasm and implied meanings: Lexicons struggle to detect subtle communication that humans would recognise as problematic.

  • Language evolution: They require constant updates as communication styles and terminology change.

  • Multi-language complexity: They require separate rule sets for each language, as cultural nuances don't translate easily.

 


Understanding Precision Rates

A certain volume of false positives is expected. For example, the phrase "I just heard that" could indicate insider trading or rumour spreading, but it might also be used in the innocent context of a colleague leaving the company. While this creates a false positive, you may still want to retain this phrase in your lexicon because removing it could mean missing genuine insider trading risks, where this exact language is used in a harmful context.

The key is managing this ratio through ongoing calibration to prevent keyword fatigue.

What is Keyword Fatigue?

Keyword fatigue is a significant operational challenge that occurs when firms generate so many lexicon alerts that compliance teams cannot handle investigating them properly. When teams face overwhelming alert volumes, several dangerous patterns emerge:

  • Cursory reviews: Analysts spend insufficient time on each alert, potentially missing genuine risks

  • Bulk closures: Teams close multiple alerts without proper investigation to manage workload and meet internal SLAs

  • Investigation shortcuts: Reduced scrutiny leads to inconsistent and inadequate risk assessment

Keyword fatigue undermines the entire surveillance programme, creating regulatory risk and potentially allowing genuine misconduct to go undetected.


Setting Up Your Risk Scenarios and Keyword Lists


The foundation of any effective lexicon is understanding what risks you are trying to detect. Each term or phrase should map to a specific risk scenario.

Here are common categories and examples:

 

Market Abuse Detection

  • Collusion/Inducement: "ping whatsapp", "between us", "mustn't find out"

  • Market Manipulation: "big order coming", "heads up", "lift the price"

  • Information Sharing: "rumour has it", "about to announce"

Conduct Risk Monitoring

  • Breach Awareness: "being flagged", "keep it low key", "under the radar"

  • General Conduct Risk: "big favour", "bit cheeky but", "you owe me a"

  • Harassment and Bullying: Inappropriate language, intimidation tactics, threats

Operational Risk

  • Unauthorised Disclosure: Terms related to sharing client information or forward guidance

  • Conflicts and Complaints: "id faith in you", "you didn't follow my", "aggressive"

 

Linguistic Diversity

Multinational firms should consider local languages, regional slang, and cultural differences in communication styles. What seems innocent in one culture might carry different implications in another. This requires separate sets of terms per language, and it is often not a like-for-like translation. Any adaptation to another language should preserve the spirit of what the rule is intended to detect.
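In practice, this means maintaining one term set per language rather than translating a single list word for word. A hypothetical sketch, assuming messages arrive with a language tag (all terms and names here are invented for illustration):

```python
# Hypothetical per-language lexicons; terms are illustrative.
# Note the Spanish entries are not literal translations of the English ones:
# they target the same behaviour (secrecy), not the same words.
LEXICONS_BY_LANGUAGE = {
    "en": ["keep it between us", "mustn't find out"],
    "es": ["que quede entre nosotros", "que no se enteren"],
}

def terms_for(language_code: str) -> list[str]:
    # Fall back to English if a language is not yet covered.
    return LEXICONS_BY_LANGUAGE.get(language_code, LEXICONS_BY_LANGUAGE["en"])
```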



Levels of Lexicon Sophistication

To set up an effective and proportionate communications surveillance program, firms can apply different levels of sophistication to their lexicons. Understanding these levels helps you choose the right approach for different risk scenarios:

+ Basic Keyword Matching (Like-for-Like)

Example: "fixing the price"

This approach searches for precise words or phrases exactly as specified. Basic matching offers clear benefits; you will always catch high-risk words and phrases when they appear precisely as programmed. However, this approach has significant limitations.

If you are too broad, you risk high volumes of false positives. For example, searching for "fixing" will generate alerts every time this word is used, including innocent conversations about "fixing the coffee machine" or "fixing the printer".

Conversely, if you are too specific with basic keyword matching, you risk missing variations of concerning phrases. A search for "fixing the price" will only flag this exact phrase, missing variations like "fix the price" or "fixed the price".

++ Advanced Matching Using Permutations/Regular Expressions, Stemming, Inclusions/Exclusions, Fuzzy Matching

Example: "do not/don't/dnt" followed by "disclose/disclosure", NOT followed by "my age"

This approach recognises the limitations of basic matching and incorporates several advanced techniques:

  • Permutations and regular expressions capture different ways of expressing the same concept ("do not/don't/dnt")

  • Stemming identifies word variations ("disclose/disclosure")

  • Fuzzy matching allows for typos and other undefined variations ("dont/dnt")

  • Inclusions and exclusions help eliminate known false-positive patterns (followed by/not followed by/preceded by)

     

This approach helps capture key risks while minimising false positives by adding a degree of context and language variations.
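As a concrete illustration, here is how the example rule above might be approximated with a single regular expression in Python. This is an assumed, simplified encoding; real surveillance platforms typically expose these techniques through their own rule syntax rather than raw regex.

```python
import re

# Approximates: ("do not" / "don't" / "dnt") near ("disclose" / "disclosure"),
# NOT followed by "my age". Simplified and illustrative only.
PATTERN = re.compile(
    r"\b(?:do\s+not|don'?t|dnt)\b"   # permutations of the negation
    r"\W+(?:\w+\W+){0,3}?"           # allow up to three intervening words
    r"\bdisclos\w*\b"                # stem: disclose / disclosure / disclosing
    r"(?!.{0,20}\bmy\s+age\b)",      # exclusion: not followed by "my age"
    re.IGNORECASE,
)

for msg in [
    "Please do not disclose this to the client.",
    "dnt disclose the order size",
    "Don't disclose my age to HR, please!",
]:
    print(bool(PATTERN.search(msg)), "<-", msg)
# True  <- Please do not disclose this to the client.
# True  <- dnt disclose the order size
# False <- Don't disclose my age to HR, please!
```

Note that genuine fuzzy matching (catching arbitrary typos) requires edit-distance techniques beyond plain regular expressions; the pattern above only anticipates specific, pre-defined variants.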

+++ Applied Filters and Machine Learning

Examples:

  • Metadata filters: Communications involving only Front Office Desk employees OR those based in France AND sent to external parties AND NOT more than 4 participants

  • Machine learning: Exclude from alerting if the term is in a disclaimer or the email is a newsletter

The third level of sophistication uses additional intelligence to further reduce false positives and increase effectiveness. By applying contextual filters and machine learning capabilities, firms can create highly targeted searches that focus on the most relevant communications.
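To make the metadata-filter idea concrete, here is a hedged sketch of the first example as a simple predicate. The Message fields are invented for illustration; a real platform would source these attributes from its own message metadata.

```python
from dataclasses import dataclass

# Illustrative metadata fields, not a real platform schema.
@dataclass
class Message:
    desk: str
    country: str
    sent_externally: bool
    participant_count: int

def in_scope(m: Message) -> bool:
    """(Front Office Desk OR based in France) AND external AND NOT > 4 participants."""
    return (
        (m.desk == "Front Office" or m.country == "FR")
        and m.sent_externally
        and m.participant_count <= 4
    )

print(in_scope(Message("Front Office", "UK", True, 3)))  # True
print(in_scope(Message("Back Office", "FR", True, 6)))   # False: too many participants
```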

++++ Intelligent Scoring Using Artificial Intelligence

Building on the alerts that have been generated, AI can be used to provide context and score each alert according to its risk level, even allowing for automatic closure of false positives. We will explore this in more detail in Part 3 of this blog series.


Choosing the Right Approach

There are instances where basic keyword matching is sufficient or even preferred, particularly for phrases you absolutely do not want to miss, regardless of context. However, other scenarios require more sophisticated methods to create effective lexicons that balance comprehensive coverage with manageable alert volumes. This balance is critical to prevent keyword fatigue.


Lexicon Setup Principles


Tailor Your Approach

Different business lines and regions may require tailored lexicons. Market Manipulation terms may be more applicable to Rates desks, whereas Insider Trading terms may be more appropriate for Equities desks. It is good practice for lexicons to be applied to specific teams, departments, and/or jurisdictions to reduce false positives by targeting relevant risks.

Even within departments, an interest rate swaps (IRS) desk may look at different market abuse scenarios compared to a short-term interest rate (STIR) desk, so tailored policies help ensure more precise detection and reduce irrelevant alerts.

Start Simple, Build Complexity

When building your first lexicon, resist the temptation to create an exhaustive list immediately. Start with high-confidence terms that indicate risk, then expand based on performance data and stakeholder feedback.

Some AI-based systems can suggest new phrases and terms as you build your lexicon, which helps ensure your lexicon includes various language patterns and potential new risk scenarios.

Carry Out Backtesting

Before launching your lexicon, it is crucial to conduct backtesting using a relevant set of historical data across teams, departments, locations, languages, and communication channels. This testing phase helps uncover potential issues before they impact live operations.

For example, if you operate in America and Spain, consider how a word like "favor" would trigger differently across the two regions. It appears frequently in Spanish ("por favor") with a different meaning from the English word "favor". Backtesting on labelled historic data or a "golden source" lets you test against communications you know should or should not trigger alerts.
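A minimal sketch of this kind of backtest, assuming a labelled set of (message, should_alert) pairs and a deliberately naive rule; all names and data here are invented for illustration:

```python
# Backtesting a naive rule against a labelled "golden source".
def matches(text: str) -> bool:
    return "favor" in text.lower()  # deliberately naive, to show the problem

golden_source = [
    ("can you do me a favor and move the price", True),
    ("por favor, envíame el informe", False),  # innocent Spanish usage
    ("lunch at 1?", False),
]

true_pos = sum(1 for text, label in golden_source if matches(text) and label)
false_pos = sum(1 for text, label in golden_source if matches(text) and not label)
missed = sum(1 for text, label in golden_source if not matches(text) and label)
print(f"true positives={true_pos}, false positives={false_pos}, missed={missed}")
# -> true positives=1, false positives=1, missed=0
```

Run against a realistic sample, a test like this surfaces the Spanish false positive before the rule ever reaches production.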



Common Lexicon Mistakes that Lead to High Volumes of False Positives

Overusing Generic Terms

Common words like "sell", "buy", or "market" can produce thousands of irrelevant alerts. Focus on contextual phrases rather than individual common words. New implementations often fall into this trap, creating unmanageable alert volumes. Instead, think about using permutations, inclusions and exclusions (see the previous section) to create more targeted searches that capture the risk without too much noise.

Setting Search Terms that are Too Precise

Avoid creating overly specific terms to capture the risks identified in one historic situation. For example, if you uncovered that someone was using the term "umbrella" as a codeword for nefarious activity, simply adding that exact term to your lexicon won't help detect future coded language, which will likely use different terms entirely. Focus on broader behavioural patterns, such as quid pro quo or acting secretively, rather than one-off specific phrases.


Single Point of Failure

Avoid having just one person responsible for lexicon decisions. Include input from:

  • Compliance and surveillance teams that understand regulatory risk

  • Subject Matter Experts who know the current market language

  • HR or conduct teams where non-financial misconduct is in scope

  • Regional representatives for multinational firms

Even within these groups, seek multiple inputs to avoid bias and ensure a more comprehensive perspective on language usage and risk scenarios.

Setting and Forgetting

Lexicons should be dynamic, not set once and forgotten. Language evolves, new risks emerge, and trading strategies change. Plan for regular reviews at a frequency relevant to your business and risk profile. Regular reports on false positive ratios can help identify quick ways to optimise lexicons, while deeper analyses of missed risks may require more time and exploration. Consider tracking these basic metrics from the start (a minimal calculation sketch follows the list):

  • Alert volume: Total alerts generated across each scenario

  • Precision rate: Percentage of alerts that led to an investigation

  • False positive rate: Percentage of alerts that were identified as non-risks

  • Analyst time: How much time is spent reviewing alerts

  • Risk coverage: Are any risks being missed, either unintentionally or because they are currently out of scope?
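As a minimal sketch of how these metrics might be computed, assuming each closed alert carries an outcome label (the field names are illustrative, not a real case-management schema):

```python
# Illustrative alert records with outcome labels.
alerts = [
    {"scenario": "market_manipulation", "outcome": "investigated"},
    {"scenario": "market_manipulation", "outcome": "false_positive"},
    {"scenario": "sanctions", "outcome": "false_positive"},
    {"scenario": "sanctions", "outcome": "false_positive"},
]

volume = len(alerts)
investigated = sum(a["outcome"] == "investigated" for a in alerts)
precision_rate = investigated / volume
false_positive_rate = 1 - precision_rate
print(f"volume={volume}, precision={precision_rate:.0%}, "
      f"false positives={false_positive_rate:.0%}")
# -> volume=4, precision=25%, false positives=75%
```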

Inadequate Documentation from Day One

Don't delay your internal documentation. You may need to log how you built your lexicons, with what data, and for what purpose, including what is in and out of scope. This documentation becomes crucial for regulatory examinations and internal governance, providing the evidence trail that demonstrates thoughtful, risk-based decision making. We take a deeper look at documentation requirements in Part 2 of this series, which covers how to calibrate your existing lexicon.


Building Your Technology Foundation


Your lexicon strategy is only as strong as the technology platform that supports it. The right surveillance infrastructure can make the difference between a responsive, adaptive program and one that struggles to keep pace with evolving risks.

Core Platform Requirements

  • Flexible Lexicon Management: Your platform should support easy addition and modification of terms without requiring extensive technical intervention. Given how quickly language evolves, compliance teams need the ability to adapt lexicons rapidly as new risks emerge, without relying on external IT resources.

  • Advanced Search Capabilities: Look for contextual search features including inclusions/exclusions, permutations, stemming and fuzzy matching. Basic keyword matching is no longer sufficient for sophisticated surveillance needs.

  • Customisable Scenarios: Your platform should give you the tools to tailor alert generation to your business needs, allowing for customised alerting using metadata filtering, machine learning classification, and genAI capabilities.

  • Comprehensive Backtesting: The platform should facilitate seamless backtesting of new lexicon rules against both your production dataset and historical labelled data. This allows you to validate the impact of changes on alert volumes and precision rates before live deployment.

  • AI Keyword Suggestions: Your system should be able to interrogate your data in the background and suggest new keywords that might be relevant for your firm, providing insights that can expedite lexicon setup.

  • Robust Audit Trails: It is imperative that your system tracks all lexicon changes, decisions, and system interactions in a full audit trail. Regulators expect complete transparency in how surveillance rules evolve, and accountability should any reviews be needed.

Governance and Control Features

  • Maker-Checker Functionality: Implement "four eyes" approval processes to ensure all lexicon changes are reviewed and approved before implementation. This prevents unauthorised modifications and maintains quality control.

  • Performance Monitoring: The system should provide detailed reporting on alert volumes, patterns, and performance metrics. You need clear visibility into how your lexicons are performing and where adjustments may be needed.

  • AI Integration Readiness: Ensure your platform can seamlessly incorporate artificial intelligence models to complement lexicon-based approaches. (We'll explore this extensively in Part 3 of this series.)

Critical Integration Considerations

  • Voice Communications Handling: Assess how effectively the technology processes voice communications, particularly transcription accuracy. As lexicon-based monitoring depends on precise speech-to-text conversion, consider broader lexicons, using proximity settings and fuzzy matching, tailored to transcript data.

  • System Performance at Scale: Monitor performance as lexicon complexity increases. Your technology must handle growing rule sets and processing demands without compromising speed or efficiency, even during peak communication volumes.

  • Compliance Ecosystem Integration: Consider how keyword detection integrates with:

    • Employee privacy rights and data retention policies

    • Existing compliance workflows and case management systems

    • Reporting requirements to senior management and regulators

    • Cross-platform communication monitoring (email, chat, voice, social media).


What's Next?

This foundational approach to lexicon building provides the groundwork for effective communication surveillance, but it's just the beginning. As your firm grows and the regulatory, trading, and communications landscape evolves, your lexicons must evolve alongside them.

The reality is that surveillance is never "set and forget". Regular testing and refinement are essential, at a frequency that matches your firm's risk profile and operational tempo, to reduce false positives while maintaining comprehensive risk coverage. A lexicon that perfectly served your needs last year may leave critical gaps in today's landscape.

Moreover, your lexicons should be just one component of a broader communications surveillance framework. The most effective programs combine traditional keyword-based approaches with context-aware technologies and AI-driven models that can understand not just what words are used, but what they mean in context.

This is what we'll explore in the upcoming blogs in this series:

  • Part 2: Calibrating an Effective Lexicon Policy: We'll dive deep into best practices and proven techniques for fine-tuning your lexicons to achieve optimal performance with minimal false positives.

  • Part 3: AI-Enhanced Surveillance: We'll examine how artificial intelligence can enhance and complement traditional lexicon-based approaches, moving from simple pattern matching to true meaning detection. The future of communications surveillance isn't about choosing between lexicons and AI; it's about building intelligent, adaptive systems that leverage the strengths of both.

 


Experience Smarter Communications Surveillance

Ready to move beyond basic keyword matching? SteelEye's AI-enhanced Surveillance Lexicon delivers the intelligent, adaptive monitoring capabilities outlined in this guide, reducing keyword fatigue while catching real risks.
