
The Domain Speciality Paradox and Implications for AI-Assisted Analysis

This is a ‘living’ article which will be updated as progress is made in tests and as other research emerges. The original article was written on January 13, 2025, and major edits will be dated.

There are also demos of this process in action, the latest being the presentation to the Virtual AI Engineer Summit in NY in March 2025.

Updates to add

  • Utilize domain-specific heuristics (rules of thumb) for few-shot training

  • Include validation loops to review and assess results against heuristics, goals and constraints

  • Run scenarios on top of a worldsim to allow the model to draw general, real world knowledge into the deliberations

  • Challenges:

    • Domains that are highly System II dependent lack robust heuristics and rules of thumb

    • Strict rules could constrain creativity

Executive Summary

Large Language Models (LLMs) continue to improve, but data deficits in specialized domains and the fundamental differences between predictive completion and expert reasoning mean we are still a long way from creating specialist analytical agents for high-value, specialized tasks (e.g. negotiations, crisis management or intelligence analysis). Current LLM development remains founded on enhancing pattern-recognition systems through better training, which will not overcome this shortfall; nor will the introduction of chain-of-thought processes.

However, we have a human model for overcoming domain inexperience from management consultancy and other roles that require high-level problem-solving. Those who are not subject-matter experts (SMEs) can be provided with domain-specific heuristics, or rules of thumb, as guidance. These heuristics supplement their otherwise capable reasoning skills, allowing them to perform in unfamiliar conditions. We can take a similar approach with capable LLMs, providing them with SME-generated, domain-specific heuristics they can apply as necessary. Unlike few-shot training, the base model is not trained on these rules; instead, it recursively applies them as guidelines while it iterates through the problem. This approach is not without difficulty: many domains lack a clean set of higher-order guiding principles, so developing the heuristics will be time-consuming and complex in its own right.
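The loop described above can be sketched in code. This is a minimal, hypothetical illustration: the `heuristic_guided_answer` function, the stub model and the toy negotiation heuristic are all invented for demonstration, and in practice `ask_model` would wrap a real LLM API call while `check` might itself be an LLM-judged test.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Heuristic:
    rule: str                      # SME-authored rule of thumb, in plain language
    check: Callable[[str], bool]   # compliance test (programmatic or LLM-judged)

def heuristic_guided_answer(ask_model: Callable[[str], str],
                            task: str,
                            heuristics: List[Heuristic],
                            max_rounds: int = 3) -> str:
    """Iteratively refine a draft until it satisfies every heuristic.
    The heuristics are injected into each prompt -- the base model is
    never trained on them."""
    guidance = "\n".join(f"- {h.rule}" for h in heuristics)
    draft = ask_model(f"Task: {task}\nApply these domain heuristics:\n{guidance}")
    for _ in range(max_rounds):
        failed = [h for h in heuristics if not h.check(draft)]
        if not failed:
            break  # draft satisfies all heuristics
        feedback = "\n".join(f"- violated: {h.rule}" for h in failed)
        draft = ask_model(f"Task: {task}\nRevise this draft:\n{draft}\n"
                          f"It violates these heuristics:\n{feedback}")
    return draft

# Toy demonstration with a stub standing in for a real model call
heuristics = [Heuristic("Always state a walk-away point.",
                        lambda text: "walk-away" in text.lower())]
stub = lambda prompt: ("Open at $120k; walk-away point: $95k."
                       if "violates" in prompt else "Open at $120k.")
result = heuristic_guided_answer(stub, "Plan a salary negotiation.", heuristics)
```

Note that the failure feedback is fed back into the next prompt, so the model is corrected against the SME rules at every pass rather than once up front.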

Utilizing the world-simulation capabilities of some models further enhances performance, as it allows the model to build its reasoning on a realistic view of the world, ensuring that suggestions are plausible and take into account relevant factors not specified in the problem. Initial tests using corporate negotiations have been promising and proved the concept valid.

This Domain Speciality Paradox -- where the most complex, high-value tasks are least supported by artificial intelligence (AI) -- is likely to persist until new approaches to model architecture, training and application are discovered. However, while this gap persists, the application of domain-specific heuristics offers a straightforward solution to this problem, allowing us to apply these LLMs to high-value tasks with encouraging results.

The Domain Speciality Paradox

As a domain becomes more specialized, narrow and high-value, there appears to be an inverse relationship with the availability of suitable training data, creating a fundamental barrier to deploying general-purpose LLMs in these contexts. This paradox manifests in three critical ways:

  1. Data Scarcity: Specialized domains inherently generate less documentable data

  2. Data Privacy: High-value domains often have strict confidentiality requirements

  3. Data Quality: The available data may not capture crucial decision-making processes

These shortfalls are characteristics of high-level analytical tasks such as complex business negotiations, crisis management, or intelligence forecasting: activities we’ll define here as narrow specialties. This is distinct from broad specialties like trading, medicine or coding, where there is an abundance of publicly available, well-structured data.

This creates a fundamental challenge: the applications of AI most valuable for expert analysis may be the least suited to current frontier LLM approaches.

Differentiation of Tasks

Understanding the limitations of current approaches requires distinguishing between several types of analytical work:

  • General application: Broad, shallow analysis across multiple domains

  • Creative tasks: Generation of new content or ideas

  • Pattern matching analysis: Identification of known patterns or structures

  • Expert analysis: Deep, domain-specific reasoning with complex decision-making

See the exhibit at the end for an example that further differentiates between pattern matching and creative analysis.

General vs Specific, Complex vs Creative Comparison

Mapping these four applications across two axes -- general vs specific and complexity vs creativity -- helps differentiate further.

|                          | Task Complexity | Task Creativity |
| ------------------------ | --------------- | --------------- |
| High Domain Specificity  | Specialized Generators: code completion, domain writing, technical documentation, high-precision generation | Expert Analysis Systems: legal reasoning, financial modeling, complex domain logic, rules-based validation |
| Low Domain Specificity   | Pattern Recognition: medical imaging, trading patterns, statistical analysis, verifiable outputs | General Creative LLMs: GPT-4/Claude, broad content generation, high adaptability, lower reliability |

Ongoing Challenges Maintain Imbalance

In this four-quadrant structure, our concern is that progress is moving only from the bottom quadrants to the top left: the top-right quadrant of high-creativity, domain-specific work is not being addressed. Moreover, it is hard to see how these advances will change direction, due to some fundamental challenges.

Training Data Challenges

There is only a finite amount of data in the world, and we may have already used all available written data to train LLMs. OpenAI co-founder Ilya Sutskever recently stated that “We’ve achieved peak data and there’ll be no more.” Model trainers are turning to synthetic training data, which works well for generative tasks, but we can’t synthesize new laws or mathematical formulae. Nor can we effectively create synthetic data in narrow, specialist domains, as there’s still a shortage of original material to use as a reference point.

Importantly, a reliance on previous examples means that Black Swan events or truly innovative solutions are impossible to conceive, as there is no historical precedent to learn from. Human-generated examples can overcome this, but humans' limited capacity to produce high volumes of suitable data will be a bottleneck.

Architectural Challenges

Training AIs requires models to mimic outputs that are scored for accuracy over and over until the model can reliably produce the ‘correct’ answer. However, this focus on accurate replication is at odds with the ‘good enough’ common sense that humans rely upon to make deductions and decisions. Meta’s head of AI research, Yann LeCun, believes a wholly different approach to building AIs is required, describing today’s two main approaches as ‘dead ends’. Many are skeptical of LeCun’s suggestion, but even if his is the right approach, the solution is a long way off; LeCun himself admits he isn’t sure how to train a cognitive model, noting, “We need to figure out a good recipe to make this work, and we don’t have that recipe yet.”

However, even if LeCun is wrong about the solution, his observation that simply creating bigger and bigger LLMs will not produce high-level reasoning seems to be borne out by the limitations in creative analysis shown by the current crop of huge models.

Market Challenges

Despite the high value a decision-making AI could offer in a crisis, this is a small, very focused activity that is utilized very infrequently. Many other high-level decision-making activities in business and government are similarly low-frequency, high-impact events. This infrequency means there is little financial incentive to invest in such initiatives beforehand, so visible market demand for these solutions is weak. Furthermore, the narrowness of specialization means that practitioners are unaware of the parallel struggles that peers in other functions are experiencing. Market demand therefore appears smaller than it actually is, as the core requirement -- a creative analytical engine -- is diluted across several domains.

Contrast this with medical imaging, which is a significant domain in its own right where the segmentation takes place within the overall specialty but the size of the field is sufficient to generate market interest.

Short-Term Solutions

Despite these constraints and the lack of an obvious long-term answer, there are some approaches that can be implemented in the short term to use AI to augment and improve creative analysis.

Chaining Specialist AIs

We have identified that it is unrealistic to expect one AI to conduct a multi-step creative process, and this has been borne out in our experience. However, ‘chaining’ a series of narrowly trained, focused AIs and other specialist tools into a single workflow can produce high-quality results. This takes advantage of each AI’s specialist ability while maintaining control over inputs and outputs. Keeping humans in the loop increases the effectiveness of this approach: they can reorient any agents that stray off course, and review and audit results, applying their System II experiential insight.
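A chained workflow with a human review hook might look like the following sketch. The stage names and the `run_chain` helper are hypothetical; each lambda stands in for a narrowly focused model or tool call.

```python
from typing import Callable, List, Optional, Tuple

Stage = Callable[[str], str]

def run_chain(stages: List[Tuple[str, Stage]],
              payload: str,
              review: Optional[Callable[[str, str], str]] = None) -> str:
    """Pass the payload through each specialist stage in order.
    `review` is the human-in-the-loop hook: it sees each stage's
    output and may correct it before the next stage runs."""
    for name, stage in stages:
        payload = stage(payload)
        if review is not None:
            payload = review(name, payload)  # human can amend or approve
    return payload

# Toy specialists standing in for narrowly trained models/tools
extract   = lambda text: text.upper()          # e.g. an entity-extraction model
assess    = lambda text: f"ASSESSED[{text}]"   # e.g. a risk-scoring model
summarise = lambda text: text[:40]             # e.g. a brief-writing model

result = run_chain([("extract", extract),
                    ("assess", assess),
                    ("summarise", summarise)],
                   "quarterly supplier dispute")
```

The key design point is that each stage's output is auditable in isolation, so a reviewer can catch an agent drifting off course before the error propagates down the chain.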

Enhanced RAG

Retrieval Augmented Generation (RAG) is a technique in which the LLM provides the natural-language interface and search, but the knowledge base is augmented by, or limited to, a specific dataset. This is extremely effective for domain-specific work where the LLM is restricted to returning results only from the specified references. In that sense, RAG results are closer to enhanced search than to LLM-generated responses.

However, there are several techniques that extend a strict RAG system and could expand the creativity and breadth of an AI’s responses. Naveen Pandey has identified 25 such techniques. Many of these would maintain a narrow, domain-specific focus (the RAG element) but enhance the results with different types of additional input, such as web search, tasking a different model, etc.
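To make the pattern concrete, here is a minimal, hypothetical sketch of a RAG loop with one such enhancement: a fallback hook (e.g. web search) used only when nothing relevant is retrieved. The keyword-overlap scoring stands in for real vector-similarity search, and the function names are invented for illustration.

```python
from typing import Callable, Dict, List, Optional

def retrieve(query: str, corpus: Dict[str, str], k: int = 2) -> List[str]:
    """Rank documents by naive keyword overlap (a stand-in for a real
    vector-similarity search over an embedded knowledge base)."""
    terms = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda doc_id: len(terms & set(corpus[doc_id].lower().split())),
                    reverse=True)
    return ranked[:k]

def answer_with_rag(query: str,
                    corpus: Dict[str, str],
                    generate: Callable[[str], str],
                    fallback: Optional[Callable[[str], str]] = None) -> str:
    """Restrict generation to retrieved passages; optionally escalate to
    a fallback source (e.g. web search) when nothing relevant is found."""
    terms = set(query.lower().split())
    hits = [d for d in retrieve(query, corpus)
            if terms & set(corpus[d].lower().split())]  # drop zero-overlap docs
    if not hits and fallback is not None:
        return fallback(query)
    context = "\n".join(corpus[d] for d in hits)
    return generate(f"Answer ONLY from these passages:\n{context}\n\nQ: {query}")

# Toy demonstration: the 'generate' stub just echoes the grounded prompt
corpus = {"clause_12": "a force majeure clause suspends contractual obligations",
          "clause_04": "payment terms are net thirty days"}
result = answer_with_rag("force majeure effect", corpus, generate=lambda p: p)
```

The strict-RAG guarantee lives in the prompt constraint plus the filtered context; the fallback is where the enhancement techniques slot in without diluting the domain focus of the core path.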

Maze Running

This is untested and entirely speculative

Many of the techniques used by maze-running robots have applications to search and, potentially, decision-making. Some form of maze-running algorithm could therefore be adapted into an optimized decision-making tool. Given a goal and a set of constraints, the model could begin creating a map of potential options before checking each option to determine whether 1) it achieves the goal or 2) it contravenes the constraints.

Using a breadth-first approach would identify quick-win solutions and eliminate non-viable options early, allowing the model to evaluate a series of options at high speed.

Potentially, two or more models could run options for all participating parties, analysing each party's options and using their constraints and goals to determine their possible courses of action (CoA).
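As the section notes, this is speculative, but the core mechanism is plain breadth-first search: expand the option map level by level, prune any option that contravenes a constraint, and stop at the first (shallowest) option that achieves the goal. The sketch below uses invented names and a toy negotiation ladder as the option space.

```python
from collections import deque
from typing import Callable, Iterable, List, Optional

def search_options(start: int,
                   expand: Callable[[int], Iterable[int]],
                   is_goal: Callable[[int], bool],
                   violates: Callable[[int], bool]) -> Optional[List[int]]:
    """Breadth-first search over an option map: prune constraint
    violations immediately and return the shortest path to a goal."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if is_goal(state):
            return path               # quick win: shallowest goal found first
        for nxt in expand(state):
            if nxt in seen or violates(nxt):
                continue              # non-viable branch eliminated early
            seen.add(nxt)
            frontier.append(path + [nxt])
    return None                       # no viable course of action exists

# Toy negotiation ladder: offers rise in $5k steps toward acceptance,
# constrained by a hard budget cap
path = search_options(start=80,
                      expand=lambda offer: [offer + 5],
                      is_goal=lambda offer: offer >= 95,   # acceptance threshold
                      violates=lambda offer: offer > 100)  # budget constraint
```

Running one such search per party, with each party's own goal and constraint predicates, is how the multi-model CoA analysis above might be composed.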

Conclusion and Next Steps

The fundamental limitations of available data and of existing LLM architectures indicate that the Domain Speciality Paradox will remain for some time. A breakthrough in common-sense training of AIs would overcome this limitation but, in the meantime, subject-matter experts in narrow domains who wish to use AI to enhance their creative decision-making should recognize these limitations and use AI with care.

However, the chaining, enhanced RAG and maze-running techniques offer promise for overcoming these limitations and are worth further exploration.

Exhibit - Differentiating Pattern Matching from Creative Analysis

A legal example helps illustrate the difference between these two use cases which can appear similar but are very different.

High-Domain-Specificity / High Complexity

US legislation and case law are well recorded and clearly structured, making them excellent training materials for an LLM. The resultant model will perform well on pattern-based tasks such as researching similar cases and creating basic briefs using a standard template, in addition to general LLM tasks such as summarizing.

High-Domain-Specificity / High Creativity

Conducting a series of negotiations or designing a case strategy are informed by the law but require high levels of System II (experiential) thinking, emotional intelligence, creativity, adaptability and innovation. 

An LLM might be able to offer some boilerplate suggestions but will be heavily dependent on a small handful of well-recorded examples, limiting its abilities. Simply adding more case law examples will not overcome this limitation as the deliberation and discussion of the strategy will not be included in these records.

Fine-Tuning and RAG Augmentation

  • Fine-tuning: Potential for adjusting basic domain understanding

  • RAG (Retrieval-Augmented Generation): Essential for maintaining current knowledge and cited sources

  • Hybrid approaches: Combining multiple techniques for enhanced reliability

Architectural Considerations

  • The need for verifiable outputs

  • Integration of rule-based systems

  • Maintenance of clear information provenance

Implementation and Validation

Validation Frameworks

  • Metrics for evaluating analytical reliability

  • Testing methodologies for specialized systems

  • Comparative performance measures

Risk Management

  • Liability considerations in high-stakes analytical decisions

  • Accountability frameworks

  • Audit trails and decision documentation

Implementation Strategy

  • Migration paths from current systems

  • Required organizational capabilities

  • Resource and skill requirements

  • Integration with existing processes

Future Implications

The Domain Speciality Paradox suggests that:

  1. Development of truly expert AI systems may require fundamentally different approaches

  2. Current trends toward larger, more general models may not address specialized analytical needs

  3. Alternative architectures focusing on reliability and verification may be necessary

Looking Forward

The path forward likely involves:

  • Recognition of the limitations of current approaches

  • Development of specialized architectures for analytical work

  • Integration of multiple approaches for different types of tasks

  • Understanding the appropriate role of AI in analytical processes

The future of AI-assisted analysis may not lie in creating ever-larger general models, but in developing specialized systems that can reliably augment human expertise while maintaining verifiable outputs and clear decision trails.