This whitepaper presents ATLAS, a conceptual architecture for a natural language interface to IT Service Management (ITSM) systems, designed for mid-to-large enterprise environments. ATLAS employs a novel multi-agent AI architecture that transforms unstructured natural language queries into actionable insights from service desk data.
The proposed system features a three-stage AI pipeline for query analysis, data retrieval refinement, and conversational response generation, combined with a hybrid data synchronization strategy designed to maintain data freshness without impacting query performance. We present modelled evaluation results based on simulated workloads, projecting 90-95% query understanding accuracy with median response times under 4 seconds.
Key contributions include:
- A context-aware conversation management system enabling multi-turn dialogue with follow-up query understanding
- A thread-safe concurrency model for AI agent orchestration
- A fallback-resilient query parsing system with heuristic degradation
- Detailed cost analysis projecting significant reduction in time-to-insight versus traditional methods
- Comprehensive error analysis with mitigation strategies
We compare ATLAS against existing commercial solutions and open-source alternatives, demonstrating potential for superior context retention and domain-specific accuracy. The paper concludes with an honest assessment of anticipated limitations and a roadmap for addressing identified gaps.
Keywords: Natural Language Processing, IT Service Management, Multi-Agent Systems, Conversational AI, Enterprise Architecture, LLM Orchestration
Table of Contents
- Introduction
- Related Work
- Problem Statement
- System Architecture
- Multi-Agent AI Pipeline
- Prompt Engineering
- Conversation Context Management
- Data Synchronization Strategy
- Query Processing Engine
- Concurrency & Thread Safety
- Data Transformation Pipeline
- Security Considerations
- Evaluation & Results
- Limitations
- Architecture Decision Records
- Future Work
- Conclusion
- References
- Appendices
1. Introduction
1.1 Background
IT Service Management (ITSM) systems are the backbone of enterprise IT operations, handling service requests, incidents, and change management workflows at scale. In mid-to-large organizations, these systems typically process hundreds to thousands of tickets daily, generating valuable operational data that remains largely inaccessible to non-technical stakeholders.
While ITSM platforms excel at structured data storage and workflow automation, they present significant usability challenges:
- Query Complexity: Extracting insights requires knowledge of query syntax, report builders, or API integrations
- Data Accessibility: Operations managers, team leads, and executives struggle to access real-time metrics
- Context Switching: Users navigate multiple interfaces to correlate information
- Reporting Latency: Ad-hoc queries often require IT intervention, with turnaround times of hours to days
1.2 ATLAS Overview
ATLAS (Automated Ticket Language Analysis System) is designed to address these challenges by providing a natural language interface that allows users to query ITSM data conversationally. The system would transform queries like:
- "How many tickets does the support team have this month?"
- "Which support personnel have been inactive for two weeks?"
- "What hour had the most requests yesterday?"
- "How many of them are open?" (follow-up query)
into structured database operations, returning results in natural language with downloadable exports.
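As an illustrative sketch of this transformation (field names follow the JSON schema proposed in Section 6.1; the values are hypothetical), the second example query might be parsed into structured parameters such as:

```json
{
  "queryType": "inactive_technicians",
  "isConversational": false,
  "inactivityPeriod": "2 weeks",
  "dateFrom": null,
  "dateTo": null,
  "technician": null,
  "status": null
}
```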
1.3 Deployment Context
ATLAS is designed for deployment in mid-to-large enterprise environments with characteristics such as:
- Daily ticket volume: Hundreds to thousands of requests
- Active technicians: 100-500+ support staff
- User base: 50-200+ operations and management personnel
- Data volume: Tens of thousands of historical tickets
- Availability requirement: 99%+ uptime
1.4 Contributions
This whitepaper makes the following contributions:
- Multi-Agent Architecture: A three-stage AI pipeline separating query understanding, data validation, and response generation
- Context-Aware Conversations: Conversation management enabling follow-up queries with pronoun resolution and filter inheritance
- Hybrid Sync Strategy: Non-blocking background synchronization maintaining data freshness
- Proposed Patterns: Patterns for thread safety, fallback handling, and graceful degradation
- Modelled Evaluation: Projected accuracy metrics and error categorization
- Honest Limitations Assessment: Transparent discussion of anticipated system constraints and failure modes
2. Related Work
2.1 Commercial ITSM AI Solutions (2024-2025)
| System | Approach | Key Features (2024-2025) | Limitations vs ATLAS |
|---|---|---|---|
| ServiceNow Now Assist | GenAI with NowLLM + BYO model support | Incident summarization, AI Search with RAG "Genius Results," Skill Kit for custom GenAI (Xanadu release, Sept 2024) | Agent productivity focus; requires flow configuration; limited analytical query support |
| Freshworks Freddy AI | Three-tier AI (Self-Service, Copilot, Insights) | AI Agents with agentic workflows (Oct 2024), 80% query resolution claim, real-time sentiment, Slack/Teams integration | Customer-facing automation focus; limited internal analytics; no multi-turn data queries |
| Zendesk AI Agents | Essential + Advanced tiers with GPT-4o | Generative replies, intelligent triage, ticket summarization, auto-assist (March 2024 GA), custom intents | Ticket deflection design; limited data analytics; no follow-up query context |
ServiceNow Now Assist (2024): ServiceNow's Xanadu release (September 2024) introduced significant GenAI capabilities. Now Assist features include case/incident summarization, chat reply generation, and AI Search with RAG-based "Genius Results" that generate answers from knowledge articles. The platform supports bring-your-own LLM models and integrates with Microsoft Copilot. The Now Assist Skill Kit enables custom GenAI skill development. However, Now Assist primarily enhances agent productivity rather than enabling analytical querying of ticket data.
Freshworks Freddy AI (2024): Freddy AI evolved in 2024 to include three tiers: Self-Service (bots), Copilot (agent assistance), and Insights (analytics). The October 2024 update introduced GenAI-powered AI Agents with pre-built "agentic workflows" for e-commerce integrations (Shopify, Stripe, FedEx). Freddy AI Agent integrates with Slack and Microsoft Teams for 24/7 employee support. While Freddy excels at automated resolution, it lacks multi-turn analytical conversation capabilities.
Zendesk AI Agents (2024): Zendesk's March 2024 GA release brought generative AI features (summarize, expand, tone shift) upgraded to GPT-4o. The Advanced AI add-on includes intelligent triage with custom intents, auto-assist for guided resolution, and suggested first replies. Zendesk reports over 1.5 million monthly uses of these features. The platform optimizes agent workflows but does not address natural language analytics queries.
ATLAS Differentiation: Unlike commercial solutions that optimize agent workflows or automate ticket resolution, ATLAS specifically addresses the analytical query gap, enabling natural language questions about ticket data (technician performance, volume trends, inactive staff) with multi-turn context preservation.
2.2 Academic Research & Frameworks (2023-2025)
Text-to-SQL Systems
BIRD Benchmark (Li et al., 2024): The BIRD benchmark represents the current state-of-the-art evaluation standard for text-to-SQL, comprising 12,751 text-SQL pairs across 95 databases totaling 33.4 GB. Published at NeurIPS 2023 and continuously updated, BIRD emphasizes real-world challenges including dirty data, external knowledge requirements, and SQL efficiency (Valid Efficiency Score metric). As of late 2024, GPT-4 achieves approximately 54.89% execution accuracy on BIRD, significantly below human performance of 92.96%.
Spider 2.0 (Lei et al., 2024): Released in late 2024, Spider 2.0 further increases complexity with 632 enterprise-level workflow problems requiring interaction with cloud databases (BigQuery, Snowflake), queries exceeding 100 lines, and multi-step reasoning. Current state-of-the-art models achieve only approximately 6% accuracy on Spider 2.0, demonstrating significant remaining challenges.
ATLAS vs Text-to-SQL: ATLAS operates at a higher abstraction level than direct text-to-SQL:
- Query analysis produces semantic intents (top_technicians, influx_requests) rather than raw SQL
- Domain-specific query types enable optimized retrieval patterns
- Conversational context enables filter inheritance across turns (not addressed by text-to-SQL benchmarks)
Multi-Agent LLM Frameworks
Microsoft AutoGen (Wu et al., 2023; v0.4 January 2025): AutoGen pioneered the multi-agent conversation paradigm for LLM applications. Originally released in Fall 2023, AutoGen v0.4 (January 2025) introduced an actor-based architecture with asynchronous messaging, modular agent composition, and AutoGen Studio for no-code agent building. The framework supports diverse applications including code generation, task automation, and conversational agents.
LangGraph (LangChain, January 2024): LangGraph provides graph-based agent orchestration with native support for cycles, human-in-the-loop patterns, and persistent state management. The framework became widely adopted for production agents in 2024, with deployments at Klarna, Replit, and Uber. LangGraph's hierarchical team patterns (supervisor agents coordinating specialized agents) and the December 2024 "Command" primitive for multi-agent communication influenced ATLAS's pipeline design.
ATLAS Contributions vs Frameworks:
- Domain-specific ITSM agent specialization (vs general-purpose frameworks)
- Concurrency patterns designed around AI platform thread limitations
- Heuristic fallback system for graceful degradation (critical for enterprise reliability)
Retrieval-Augmented Generation (RAG)
RAG architectures have evolved significantly since foundational work in 2020. Key 2024 developments include:
- GraphRAG (Microsoft, mid-2024): Extracts knowledge graphs from text for hierarchical retrieval, addressing semantic gap challenges between queries and documents
- RAPTOR (Sarthi et al., 2024): Recursive abstractive processing for tree-organized retrieval, enabling multi-level document summarization
- Agentic RAG (2024): Integration of autonomous agents with RAG pipelines for dynamic retrieval triggering based on generation uncertainty
- RAG Evaluation Frameworks: RAGAS for reference-free metrics and RAGTruth corpus (Niu et al., 2024) for hallucination analysis
ATLAS vs RAG: ATLAS extends RAG principles to structured database retrieval:
- Retrieves from SQL databases rather than document stores
- Uses a refinement agent (Stage 2) to validate retrieval accuracy—analogous to RAG reranking
- Generates conversational responses grounded in retrieved structured data
3. Problem Statement
3.1 Hypothetical Pain Points
Based on analysis of typical ITSM workflows in mid-to-large enterprises, ATLAS is designed to address the following anticipated challenges:
| Metric | Typical Baseline | Target | Projected Outcome |
|---|---|---|---|
| Time to answer, "How many tickets does X have?" | 10-15 minutes | < 30 seconds | < 10 seconds |
| Ad-hoc report requests to IT team | 30-50/week | < 10/week | ~85% reduction |
| Self-service analytics adoption | 10-20% | > 60% | 70-80% |
| Manager access to real-time metrics | Limited | All managers | Broad access |
3.2 Design Requirements
| Requirement | Description | Priority | Validation Method |
|---|---|---|---|
| Natural Language Understanding | Parse unstructured queries into structured parameters | Critical | Accuracy testing |
| Context Awareness | Understand follow-up questions referencing previous queries | Critical | Multi-turn testing |
| Real-time Data | Query data current within 5 minutes | High | Sync latency measurement |
| Concurrent Access | Support 50+ simultaneous users | High | Load testing |
| Response Time | < 10 seconds for 95th percentile | High | Performance monitoring |
| Export Capability | Downloadable CSV results | Medium | Functional testing |
| Personalization | Support "my tickets" queries | Medium | User acceptance testing |
4. System Architecture
4.1 High-Level Architecture
Figure 1: System high-level architecture
4.2 Technology Stack
| Layer | Technology | Version | Rationale |
|---|---|---|---|
| Runtime | .NET | 8.0 LTS | Enterprise support, performance |
| Framework | ASP.NET Core | 8.0 | Native async, DI, middleware |
| ORM | Entity Framework Core | 8.0 | Type-safe queries, migrations |
| Database | SQL Server | 2019+ | Enterprise reliability, JSON support |
| AI Platform | Azure AI Agents | Preview | Persistent threads, managed infrastructure |
| Caching | IMemoryCache | Built-in | Token caching, low latency |
| Authentication | DefaultAzureCredential | Latest | Managed identity support |
5. Multi-Agent AI Pipeline
5.1 Pipeline Overview
ATLAS employs a three-stage AI pipeline where each agent has a specialized role:
Figure 2: Multi-agent pipeline
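As a minimal sketch of this staging (the type names and stub logic below are illustrative stand-ins for the AI agents, not the ATLAS API), the three stages compose as independent async steps, each of which can fail, retry, or fall back without affecting the others:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stage signatures; the stub logic stands in for real agent calls.
public record QueryAnalysis(string QueryType, string? Status);
public record RetrievalResult(int Count);

public static class PipelineSketch
{
    // Stage 1: query understanding (here, a trivial keyword stand-in).
    public static Task<QueryAnalysis> AnalyzeAsync(string query) =>
        Task.FromResult(new QueryAnalysis(
            query.Contains("inactive") ? "inactive_technicians" : "request_search",
            query.Contains("open") ? "open" : null));

    // Stage 2: data retrieval and refinement (stubbed with fixed counts).
    public static Task<RetrievalResult> RetrieveAsync(QueryAnalysis analysis) =>
        Task.FromResult(new RetrievalResult(analysis.Status == "open" ? 15 : 65));

    // Stage 3: conversational response generation (stubbed with a template).
    public static Task<string> RespondAsync(QueryAnalysis analysis, RetrievalResult result) =>
        Task.FromResult($"Found {result.Count} {analysis.Status ?? "matching"} tickets.");

    // Sequential composition: analysis output feeds retrieval, which feeds response.
    public static async Task<string> RunAsync(string query)
    {
        var analysis = await AnalyzeAsync(query);
        var data = await RetrieveAsync(analysis);
        return await RespondAsync(analysis, data);
    }
}
```

Because each stage has its own narrow contract, a failure in Stage 1 can be caught and rerouted to the heuristic fallback (Section 9.3) without touching the other stages.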
5.2 Agent Specialization Rationale
| Aspect | Single Agent | Three Agents (ATLAS) |
|---|---|---|
| Prompt Size | 3000+ tokens | ~800 tokens each |
| Failure Isolation | All-or-nothing | Isolated failure points |
| Debugging | Opaque | Clear stage identification |
| Quality | Compromised by competing objectives | Optimized per stage |
| Latency | Single long call | Parallelization potential |
| Cost | Higher per-call (longer prompts) | Lower aggregate |
5.3 Query Type Classification
Query Type Taxonomy
CONVERSATIONAL (No data retrieval)
- greeting: "hello", "hi", "good morning"
- help: "what can you do", "help", "?"
- thanks: "thank you", "thanks", "thx"
- farewell: "goodbye", "bye", "see you"
- unclear: ambiguous or very short queries (< 5 chars)
REQUEST_SEARCH (Default for data queries) [~65% of queries]
- By subject: "tickets with 'error' in subject"
- By technician: "tickets assigned to [name]"
- By requester: "tickets from [name]"
- By status: "open tickets", "closed requests"
- By date: "tickets from last week"
- Personalized: "my tickets", "assigned to me"
TOP_TECHNICIANS [~15% of queries]
- "top 10 technicians this month"
- "best performing technicians"
- "technician rankings past week"
INACTIVE_TECHNICIANS [~8% of queries]
- "technicians with no activity for 14 days"
- "inactive technicians this month"
- "who hasn't worked on tickets lately"
INFLUX_REQUESTS [~7% of queries]
- "busiest hour yesterday"
- "request volume by day this week"
- "when do we get the most tickets"
TOP_REQUEST_AREAS [~5% of queries]
- "most common request types today"
- "top categories this month"
- "what do users ask about most"
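The taxonomy above can be approximated by a small keyword router of the kind the fallback parser (Section 9.3) uses; the keyword lists here are illustrative, not exhaustive:

```csharp
using System;
using System.Linq;

public static class QueryRouter
{
    // Specialized analytics types, checked in order; illustrative keywords only.
    private static readonly (string Type, string[] Keywords)[] Rules =
    {
        ("inactive_technicians", new[] { "inactive", "no activity", "hasn't worked" }),
        ("influx_requests",      new[] { "busiest", "influx", "volume by" }),
        ("top_request_areas",    new[] { "most common", "top categor", "ask about most" }),
        ("top_technicians",      new[] { "top", "best performing", "ranking" }),
    };

    public static string Classify(string query)
    {
        var q = query.ToLowerInvariant().Trim();

        // Conversational messages are checked first, as in the taxonomy.
        string[] smallTalk = { "hello", "hi", "hey", "help", "thanks", "bye" };
        if (smallTalk.Contains(q) || q.Length < 5)
            return "conversational";

        foreach (var (type, keywords) in Rules)
            if (keywords.Any(k => q.Contains(k)))
                return type;

        // Default to the most common type (~65% of queries).
        return "request_search";
    }
}
```

In ATLAS this routing would normally be performed by the Query Analysis Agent; the keyword version only approximates it for degraded operation.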
6. Prompt Engineering
6.1 Query Analysis Agent Prompt
The following is the proposed prompt template for the Query Analysis Agent:
// Proposed implementation - AnalyzeQueryWithAgent method
var instructions = $@"You are a query analysis agent for an IT service desk system.
CRITICAL: Today's date is {currentDate}. Yesterday was {yesterdayDate}.
Current time is {currentTime} UTC. This month started on {thisMonthStart:yyyy-MM-dd}.
Your task: Analyze the user query and return ONLY a valid JSON object
with NO explanations or markdown.
Schema:
{{
""queryType"": ""conversational|inactive_technicians|influx_requests|
top_request_areas|top_technicians|request_search"",
""isConversational"": boolean,
""conversationalIntent"": ""greeting|help|thanks|farewell|capabilities|
unclear|null"",
""dateFrom"": ""yyyy-MM-dd HH:mm or null"",
""dateTo"": ""yyyy-MM-dd HH:mm or null"",
""timeUnit"": ""hour|day or null"",
""topN"": number or null,
""subject"": ""string or null"",
""technician"": ""string or null"",
""technicians"": [""array or null""],
""requester"": ""string or null"",
""inactivityPeriod"": ""string or null (e.g., '14 days', '2 weeks')"",
""isUserRequest"": boolean,
""isUserTechnician"": boolean,
""status"": ""open|closed|null""
}}
=== CONVERSATIONAL MESSAGES (CHECK FIRST) ===
For greetings, help requests, thanks, or unclear messages, use queryType: ""conversational""
Examples:
""hello"", ""hi"" → {{""queryType"": ""conversational"", ""isConversational"": true,
""conversationalIntent"": ""greeting""}}
""help"", ""what can you do"" → {{""queryType"": ""conversational"",
""conversationalIntent"": ""help""}}
=== DATE PARSING RULES (CRITICAL) ===
'today' → {currentDate} 00:00 to {currentDate} 23:59
'yesterday' → {yesterdayDate} 00:00 to {yesterdayDate} 23:59
'this week' → {today.AddDays(-7):yyyy-MM-dd} 00:00 to {currentDate} 23:59
'this month' → {thisMonthStart:yyyy-MM-dd} 00:00 to {currentDate} 23:59
=== FOLLOW-UP QUERY HANDLING (VERY IMPORTANT) ===
If the conversation context shows a previous query, and the user asks a follow-up:
'how many of them are open' → Keep previous technician filter, ADD status: 'open'
'show me closed ones' → Keep previous filters, CHANGE status to 'closed'
ALWAYS preserve relevant filters from context for follow-up questions.
Output ONLY the JSON object. No markdown code blocks, no explanations.";
6.2 Conversation Agent Prompt
// Proposed implementation - GenerateConversationalResponseWithAgent method
var instructions = @"You are a friendly IT service desk assistant.
Generate a warm, conversational response that feels like talking to a
helpful colleague. Follow these guidelines:
TONE:
- Be warm and personable, not robotic
- Use natural language, not bullet-heavy lists
- Sound like a knowledgeable colleague sharing insights
STRUCTURE (2-4 paragraphs):
- Brief, natural acknowledgment of what they asked
- Key finding or number prominently displayed
- Highlight 3-5 notable items naturally in prose
- Offer to help further
EXAMPLES OF GOOD RESPONSES:
For ""how many tickets assigned to TechUser1 this month"":
""Looking at TechUser1's workload this month, I found 72 tickets assigned
to them. That's a solid amount of activity! The tickets cover a range
of areas including password resets, hardware requests, and software
installations. You can download the full breakdown in the CSV file.
Would you like me to filter these by status or category?""
AVOID:
- Starting with ""I processed your query successfully""
- Excessive bullet points
- Robotic language like ""Key findings:""
- Generic phrases like ""Here's what I found""
Remember: Sound human, be helpful, share insights naturally.";
6.3 Prompt Design Principles
| Principle | Implementation | Rationale |
|---|---|---|
| Temporal Grounding | Inject current date/time dynamically | Enables relative date parsing ("yesterday", "this week") |
| Schema Enforcement | Explicit JSON schema with examples | Projected to substantially reduce parsing errors |
| Negative Examples | "AVOID" section listing anti-patterns | Prevents common LLM verbosity issues |
| Context Injection | Structured conversation history format | Enables follow-up query understanding |
| Output Constraints | "ONLY return JSON, NO markdown" | Simplifies response parsing |
7. Conversation Context Management
7.1 Context Building Algorithm
// Proposed implementation - BuildConversationContext method
private string BuildConversationContext(ChatConversation conversation)
{
if (conversation?.Messages == null || conversation.Messages.Count < 2)
return string.Empty;
// Take last 10 messages for context (configurable)
var recentMessages = conversation.Messages
.OrderByDescending(m => m.SentAt)
.Take(10)
.OrderBy(m => m.SentAt) // Restore chronological order
.ToList();
var sb = new StringBuilder();
sb.AppendLine("=== CONVERSATION HISTORY ===");
foreach (var msg in recentMessages)
{
var role = msg.Role == "user" ? "USER" : "ASSISTANT";
var content = msg.Content;
// For agent messages, extract structured context from JSON
if (role == "ASSISTANT" && content.StartsWith("{"))
{
try
{
using var doc = JsonDocument.Parse(content); // JsonDocument is IDisposable
var root = doc.RootElement;
// Extract query analysis parameters
if (root.TryGetProperty("QueryAnalysis", out var analysisElem))
{
var queryType = analysisElem.TryGetProperty("queryType", out var qt)
? qt.GetString() : "";
var technician = analysisElem.TryGetProperty("technician", out var tech)
? tech.GetString() : "";
var status = analysisElem.TryGetProperty("status", out var st)
? st.GetString() : "";
var dateFrom = analysisElem.TryGetProperty("dateFrom", out var df)
? df.GetString() : "";
var dateTo = analysisElem.TryGetProperty("dateTo", out var dt)
? dt.GetString() : "";
sb.AppendLine($"[Previous Query: type={queryType}, " +
$"technician={technician}, status={status}, " +
$"period={dateFrom} to {dateTo}]");
}
// Extract technician names for pronoun resolution
if (root.TryGetProperty("Data", out var dataElem))
{
if (dataElem.TryGetProperty("TopTechnicians", out var topTechs))
{
var names = topTechs.EnumerateArray()
.Take(20)
.Select(t => t.GetProperty("Technician").GetString())
.Where(n => !string.IsNullOrEmpty(n))
.ToList();
sb.AppendLine($"[Technicians mentioned: {string.Join(", ", names)}]");
}
if (dataElem.TryGetProperty("RequestsFound", out var reqFound))
{
sb.AppendLine($"[Found {reqFound.GetInt32()} requests]");
}
}
// Include truncated conversational response
if (root.TryGetProperty("ConversationalResponse", out var resp))
{
content = resp.GetString() ?? "";
if (content.Length > 300)
content = content.Substring(0, 300) + "...";
}
}
catch { /* Use raw content on parse failure */ }
}
sb.AppendLine($"{role}: {content}");
}
sb.AppendLine("=== END HISTORY ===");
sb.AppendLine("\nIMPORTANT: Use this context to understand references like " +
"'them', 'those', 'the technicians', 'how many are open', etc.");
sb.AppendLine("If user asks a follow-up like 'how many of them are open', " +
"apply the previous filters PLUS the new 'open' status filter.");
return sb.ToString();
}
7.2 Follow-Up Query Resolution
REQUEST 1:
{
"query": "what tickets do i have assigned to me",
"sessionId": "",
"userEmail": "user@example.com"
}
RESPONSE 1:
{
"sessionId": "abc12345-1234-5678-abcd-123456789abc",
"conversationalResponse": "You have ~65 tickets assigned to you..."
}
REQUEST 2 (Follow-up):
{
"query": "how many of them are open",
"sessionId": "abc12345-1234-5678-abcd-123456789abc",
"userEmail": "user@example.com"
}
CONTEXT PASSED TO AGENT:
=== CONVERSATION HISTORY ===
[Previous Query: type=request_search, technician=null, status=null, period=2025-10-28 to 2025-11-27]
[isUserTechnician=true, userEmail=user@example.com]
[Found ~65 requests]
USER: what tickets do i have assigned to me
ASSISTANT: You have ~65 tickets assigned to you...
USER: how many of them are open
=== END HISTORY ===
RESOLUTION:
- "them" → previous result set (~65 tickets)
- Previous filter: isUserTechnician=true (preserved)
- New filter: status="open" (added)
RESPONSE 2:
{
"conversationalResponse": "You have ~15 tickets assigned to you... These are filtered to show only open tickets."
}
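The resolution step above amounts to overlaying the new turn's explicit fields onto the previous turn's analysis. A minimal sketch (the class below is an illustrative subset of the Section 6.1 schema, not the full ATLAS type):

```csharp
using System;

// Illustrative subset of the analysis fields from the Section 6.1 schema.
public class QueryAnalysis
{
    public string? QueryType, Technician, Status, DateFrom, DateTo;
    public bool IsUserTechnician;
}

public static class FollowUpResolver
{
    // Keep every previous filter unless the follow-up explicitly overrides it;
    // this is the "keep previous filters, ADD/CHANGE status" rule in prose form.
    public static QueryAnalysis Merge(QueryAnalysis previous, QueryAnalysis followUp) =>
        new QueryAnalysis
        {
            QueryType        = followUp.QueryType  ?? previous.QueryType,
            Technician       = followUp.Technician ?? previous.Technician,
            Status           = followUp.Status     ?? previous.Status,
            DateFrom         = followUp.DateFrom   ?? previous.DateFrom,
            DateTo           = followUp.DateTo     ?? previous.DateTo,
            IsUserTechnician = followUp.IsUserTechnician || previous.IsUserTechnician,
        };
}
```

Applied to the exchange above, the merged analysis would keep isUserTechnician=true and the previous date range while adding status=open.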
8. Data Synchronization Strategy
8.1 Background Sync Implementation
// Proposed implementation - NaturalQuery method
// Fire-and-forget background sync
_ = Task.Run(async () =>
{
try
{
using var scope = _serviceProvider.CreateScope();
var backgroundDbContext = scope.ServiceProvider
.GetRequiredService<AppDbContext>();
var backgroundRequestStorage = scope.ServiceProvider
.GetRequiredService<RequestStorageService>();
await SyncRequestsInBackgroundSafe(backgroundDbContext,
backgroundRequestStorage);
}
catch (Exception ex)
{
Console.WriteLine($"Background sync failed: {ex.Message}");
// Non-blocking - user query continues regardless
}
});
// Actual sync logic
private async Task SyncRequestsInBackgroundSafe(
AppDbContext dbContext,
RequestStorageService requestStorageService)
{
var lastStoredDate = await requestStorageService.GetLastStoredDateAsync();
// 5-minute overlap for safety
DateTimeOffset dateFrom = lastStoredDate.HasValue
? lastStoredDate.Value.AddMinutes(-5)
: DateTimeOffset.UtcNow.AddMonths(-1);
var dateTo = DateTimeOffset.UtcNow;
var requests = await FetchRequestsForDateRange(dateFrom, dateTo);
foreach (var req in requests)
{
var requestId = req["id"].ToString();
if (!await requestStorageService.RequestExistsAsync(requestId))
{
await requestStorageService.StoreRequestAsync(req);
}
}
Console.WriteLine($"Background sync completed: Fetched {requests.Count} requests");
}
8.2 Hybrid Architecture
Figure 3: Hybrid data synchronization architecture
8.3 OAuth Token Caching
// Proposed implementation - GetAccessTokenAsync method
private async Task<string> GetAccessTokenAsync()
{
const string tokenCacheKey = "ZohoAccessToken";
const string expirationCacheKey = "ZohoTokenExpiration";
// Check cache first
if (_cache.TryGetValue(tokenCacheKey, out string cachedToken) &&
_cache.TryGetValue(expirationCacheKey, out DateTime cachedExpiration) &&
DateTime.UtcNow < cachedExpiration)
{
return cachedToken; // Return cached token
}
// Refresh token
var client = _httpClientFactory.CreateClient();
var formContent = new FormUrlEncodedContent(new[]
{
new KeyValuePair<string, string>("refresh_token", _refreshToken),
new KeyValuePair<string, string>("grant_type", "refresh_token"),
new KeyValuePair<string, string>("client_id", _clientId),
new KeyValuePair<string, string>("client_secret", _clientSecret),
new KeyValuePair<string, string>("redirect_uri", _redirectUri)
});
var response = await client.PostAsync(
"https://accounts.zoho.com/oauth/v2/token", formContent);
response.EnsureSuccessStatusCode();
using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
string accessToken = json.RootElement.GetProperty("access_token").GetString();
int expiresIn = json.RootElement.TryGetProperty("expires_in", out var exp)
? exp.GetInt32()
: 3600; // Default to 1 hour if the field is absent
// Cache with 60-second buffer before actual expiration
var expiration = DateTime.UtcNow.AddSeconds(expiresIn - 60);
_cache.Set(tokenCacheKey, accessToken,
new MemoryCacheEntryOptions { AbsoluteExpiration = expiration });
_cache.Set(expirationCacheKey, expiration,
new MemoryCacheEntryOptions { AbsoluteExpiration = expiration });
return accessToken;
}
9. Query Processing Engine
9.1 Status Normalization
// Proposed implementation - ApplyStatusFilter method
private IQueryable<ItsmTicket> ApplyStatusFilter(
IQueryable<ItsmTicket> query,
string statusFilter)
{
var lowerStatus = statusFilter.ToLower();
if (lowerStatus == "open")
{
return query.Where(r =>
r.Status.ToLower() == "open" ||
r.Status.ToLower() == "in progress" ||
r.Status.ToLower() == "pending" ||
r.Status.ToLower().Contains("open") ||
// Also check JSON data for nested status
r.JsonData.Contains("\"Status\":\"Open\"") ||
r.JsonData.Contains("\"Status\":\"In Progress\"") ||
r.JsonData.Contains("\"Status\":\"Pending\"")
);
}
else if (lowerStatus == "closed")
{
return query.Where(r =>
r.Status.ToLower() == "closed" ||
r.Status.ToLower() == "resolved" ||
r.Status.ToLower() == "completed" ||
r.Status.ToLower().Contains("closed") ||
r.JsonData.Contains("\"Status\":\"Closed\"") ||
r.JsonData.Contains("\"Status\":\"Resolved\"") ||
r.JsonData.Contains("\"Status\":\"Completed\"")
);
}
return query;
}
9.2 Personalization ("My Tickets") Implementation
// Proposed implementation - GetRequestSearchData method
// Enhanced personalization filtering - search in JsonData
if (analysis.IsUserTechnician && !string.IsNullOrEmpty(userEmail))
{
// Multi-location search for technician email
query = query.Where(r =>
r.TechnicianEmail == userEmail ||
r.JsonData.Contains($"\"email_id\":\"{userEmail}\"") ||
r.JsonData.Contains(userEmail)
);
}
else if (analysis.IsUserRequest && !string.IsNullOrEmpty(userEmail))
{
// Search for requester email
query = query.Where(r =>
r.RequesterEmail == userEmail ||
r.JsonData.Contains($"\"email_id\":\"{userEmail}\"") ||
r.JsonData.Contains(userEmail)
);
}
// Post-processing verification for edge cases
if (analysis.IsUserTechnician && !string.IsNullOrEmpty(userEmail))
{
requests = requests.Where(r =>
{
// Check direct column match
if (r.TechnicianEmail?.Equals(userEmail,
StringComparison.OrdinalIgnoreCase) == true)
return true;
// Check JsonData for technician email
if (!string.IsNullOrEmpty(r.JsonData))
{
try
{
var data = JsonSerializer.Deserialize<ItsmTicketData>(r.JsonData);
if (data?.Technician?.EmailId?.Equals(userEmail,
StringComparison.OrdinalIgnoreCase) == true)
return true;
}
catch { }
}
return false;
}).ToList();
}
9.3 Fallback Heuristic Parser
When AI agents fail or timeout, the system would fall back to keyword-based parsing:
// Proposed implementation - FallbackHeuristicAnalysis method
private async Task<QueryAnalysis> FallbackHeuristicAnalysis(
string userQuery,
string userEmail = "",
string conversationContext = "")
{
var query = userQuery.ToLowerInvariant().Trim();
var now = DateTime.UtcNow;
var today = now.Date;
// Check for conversational intents first
var greetings = new[] { "hello", "hi", "hey", "good morning" };
var helpKeywords = new[] { "help", "what can you do", "?" };
if (greetings.Any(g => query == g || query.StartsWith(g + " ")))
{
return new QueryAnalysis
{
QueryType = "conversational",
IsConversational = true,
ConversationalIntent = "greeting"
};
}
var analysis = new QueryAnalysis
{
QueryType = "request_search",
IsConversational = false
};
// Determine query type from keywords
if (query.Contains("inactive") || query.Contains("no activity"))
analysis.QueryType = "inactive_technicians";
else if (query.Contains("influx") || query.Contains("busiest"))
analysis.QueryType = "influx_requests";
else if (query.Contains("top tech") || query.Contains("ranking"))
analysis.QueryType = "top_technicians";
// Status detection
if (query.Contains("open"))
analysis.Status = "open";
else if (query.Contains("closed") || query.Contains("resolved"))
analysis.Status = "closed";
// Date handling
if (query.Contains("yesterday"))
{
analysis.DateFrom = today.AddDays(-1).ToString("yyyy-MM-dd") + " 00:00";
analysis.DateTo = today.AddDays(-1).ToString("yyyy-MM-dd") + " 23:59";
}
else if (query.Contains("this month"))
{
var monthStart = new DateTime(now.Year, now.Month, 1);
analysis.DateFrom = monthStart.ToString("yyyy-MM-dd") + " 00:00";
analysis.DateTo = today.ToString("yyyy-MM-dd") + " 23:59";
}
// ... additional date patterns ...
// Parse context for follow-up queries
if (!string.IsNullOrEmpty(conversationContext) &&
(query.Contains("them") || query.Contains("those")))
{
var techMatch = Regex.Match(conversationContext,
@"technician=([^,\]]+)");
if (techMatch.Success && techMatch.Groups[1].Value != "null")
{
analysis.Technician = techMatch.Groups[1].Value.Trim();
}
}
return analysis;
}
10. Concurrency & Thread Safety
10.1 The Concurrency Problem
Azure AI Agents (and similar platforms) restrict concurrent operations on the same thread:
ERROR: "Can't add message to thread_xyz while a run is active"
This occurs when multiple requests attempt to use the same conversation thread simultaneously.
10.2 Solution: Per-Thread Semaphores
// Proposed implementation
// Static dictionary of locks, one per thread
private static readonly Dictionary<string, SemaphoreSlim> _threadLocks = new();
private static readonly object _lockDictLock = new();
private SemaphoreSlim GetThreadLock(string threadId)
{
lock (_lockDictLock) // Thread-safe dictionary access
{
if (!_threadLocks.ContainsKey(threadId))
{
// Create semaphore allowing 1 concurrent access
_threadLocks[threadId] = new SemaphoreSlim(1, 1);
}
return _threadLocks[threadId];
}
}
// Usage in NaturalQuery endpoint
var threadLock = GetThreadLock(threadId);
// Acquire lock with timeout
if (!await threadLock.WaitAsync(TimeSpan.FromSeconds(90)))
{
return StatusCode(503, new
{
Error = "System is busy processing a previous request.",
ConversationalResponse = "I'm currently processing your previous " +
"request. Please wait a moment and try again."
});
}
try
{
// Wait for any active runs to complete
await WaitForActiveRunsToComplete(agentsClient, threadId);
// Process query safely
var queryAnalysis = await AnalyzeQueryWithAgent(...);
// ... rest of processing ...
}
finally
{
threadLock.Release(); // Always release
}
10.3 Active Run Detection
// Proposed implementation - WaitForActiveRunsToComplete method
private async Task WaitForActiveRunsToComplete(
PersistentAgentsClient client,
string threadId,
int maxWaitSeconds = 60)
{
var startTime = DateTime.UtcNow;
while ((DateTime.UtcNow - startTime).TotalSeconds < maxWaitSeconds)
{
try
{
var runsAsync = client.Runs.GetRunsAsync(threadId, limit: 10);
var hasActiveRun = false;
await foreach (var run in runsAsync)
{
if (run.Status == RunStatus.InProgress ||
run.Status == RunStatus.Queued ||
run.Status == RunStatus.RequiresAction)
{
hasActiveRun = true;
Console.WriteLine($"Waiting for active run {run.Id} " +
$"with status {run.Status}...");
break;
}
}
if (!hasActiveRun)
return; // Safe to proceed
await Task.Delay(1000); // Poll every 1 second
}
catch (Exception ex)
{
Console.WriteLine($"Error checking run status: {ex.Message}");
await Task.Delay(500);
}
}
Console.WriteLine($"Timeout waiting for active runs on thread {threadId}");
}
10.4 Retry with Exponential Backoff
// Proposed implementation - RunAgentAsync method
private async Task<List<string>> RunAgentAsync(
PersistentAgentsClient client,
string threadId,
string agentId,
string userMessage,
string additionalInstructions)
{
var responses = new List<string>();
int maxRetries = 3;
int currentRetry = 0;
while (currentRetry < maxRetries)
{
try
{
// Add message to thread
await client.Messages.CreateMessageAsync(
threadId, MessageRole.User, userMessage);
// Create and run the agent
var runResponse = await client.Runs.CreateRunAsync(
threadId, agentId,
additionalInstructions: additionalInstructions);
var run = runResponse.Value;
// Poll for completion (max 75 seconds)
var start = DateTime.UtcNow;
var maxDuration = TimeSpan.FromSeconds(75);
while (run.Status == RunStatus.Queued ||
run.Status == RunStatus.InProgress)
{
if (DateTime.UtcNow - start > maxDuration)
{
responses.Add("Agent timeout.");
return responses;
}
await Task.Delay(750);
run = (await client.Runs.GetRunAsync(threadId, run.Id)).Value;
}
// Get response messages
var messagesAsync = client.Messages.GetMessagesAsync(
threadId, order: ListSortOrder.Descending);
await foreach (var message in messagesAsync)
{
if (message.Role == MessageRole.Agent)
{
foreach (var content in message.ContentItems)
{
if (content is MessageTextContent textContent)
{
responses.Add(textContent.Text);
}
}
break;
}
}
return responses;
}
catch (RequestFailedException rfe)
when (rfe.Message.Contains("while a run") &&
rfe.Message.Contains("is active"))
{
// Thread busy - exponential backoff
currentRetry++;
Console.WriteLine($"Thread busy, retry {currentRetry}/{maxRetries}");
await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, currentRetry))); // 2s, 4s, 8s
await WaitForActiveRunsToComplete(client, threadId);
}
catch (Exception ex)
{
responses.Add($"Error: {ex.Message}");
return responses;
}
}
responses.Add("Failed after maximum retries.");
return responses;
}
11. Data Transformation Pipeline
11.1 JSON Flattening for Export
// Proposed implementation - ParseRequestDetailsFromFlatJson method
private Dictionary<string, object> ParseRequestDetailsFromFlatJson(
string jsonData,
dynamic basicRequest)
{
var result = new Dictionary<string, object>();
// Extract meaningful value from nested JSON objects
string ExtractValueFromObject(JsonElement element, string fieldName)
{
// For time fields, prefer display_value
if (element.TryGetProperty("display_value", out var displayValue))
return displayValue.GetString() ?? "";
// For entities (requester, technician), prefer name + email
if (element.TryGetProperty("name", out var nameValue))
{
var name = nameValue.GetString() ?? "";
if ((fieldName.Contains("requester") ||
fieldName.Contains("technician")) &&
element.TryGetProperty("email_id", out var emailValue))
{
var email = emailValue.GetString();
if (!string.IsNullOrEmpty(email) && !string.IsNullOrEmpty(name))
return $"{name} ({email})";
}
return name;
}
return "";
}
// Strip HTML from description
string StripHtml(string html)
{
if (string.IsNullOrEmpty(html)) return "";
var text = Regex.Replace(html, "<[^>]*>", " ");
// Decode common HTML entities left behind after tag removal
text = text.Replace("&nbsp;", " ")
.Replace("&amp;", "&")
.Replace("&lt;", "<")
.Replace("&gt;", ">");
// Collapse runs of whitespace introduced by tag removal
return Regex.Replace(text, @"\s+", " ").Trim();
}
try
{
var jsonDoc = JsonDocument.Parse(jsonData);
foreach (var property in jsonDoc.RootElement.EnumerateObject())
{
var key = property.Name;
var value = property.Value;
object parsedValue;
switch (value.ValueKind)
{
case JsonValueKind.String:
parsedValue = value.GetString() ?? "";
break;
case JsonValueKind.Object:
parsedValue = ExtractValueFromObject(value, key.ToLower());
break;
case JsonValueKind.Array:
var items = value.EnumerateArray()
.Select(item => item.ValueKind == JsonValueKind.Object
? ExtractValueFromObject(item, key.ToLower())
: item.GetString() ?? "")
.Where(s => !string.IsNullOrEmpty(s));
parsedValue = string.Join(", ", items);
break;
default:
parsedValue = value.ToString();
break;
}
if (key.Equals("Description", StringComparison.OrdinalIgnoreCase))
parsedValue = StripHtml(parsedValue?.ToString() ?? "");
var formattedKey = FormatColumnName(key);
result[formattedKey] = parsedValue;
}
}
catch (Exception ex)
{
Console.WriteLine($"Error parsing JSON: {ex.Message}");
}
return result;
}
11.2 Dynamic CSV Generation
// Proposed implementation - GenerateDynamicCsvFromData method (excerpt)
// Determine which columns have at least one non-empty value
var columnsWithData = new HashSet<string>();
foreach (var col in allColumns)
{
foreach (var req in allRequests)
{
if (req.TryGetValue(col, out var val) && HasValue(val))
{
columnsWithData.Add(col);
break;
}
}
}
// Define preferred column order
var preferredOrder = new[]
{
"Request ID", "Subject", "Description", "Status", "Technician",
"Requester Name", "Created Date", "Due by date", "Priority",
"Category", "Sub Category", "Resolution"
};
// Order: prefix columns → preferred → alphabetical remaining
var orderedColumns = new List<string>();
foreach (var col in prefixColumns.Where(c => columnsWithData.Contains(c)))
orderedColumns.Add(col);
foreach (var col in preferredOrder.Where(c => columnsWithData.Contains(c)))
if (!orderedColumns.Contains(col)) orderedColumns.Add(col);
orderedColumns.AddRange(columnsWithData.Except(orderedColumns).OrderBy(c => c));
// Build CSV
var csvBuilder = new StringBuilder();
csvBuilder.AppendLine(string.Join(",",
orderedColumns.Select(c => $"\"{Safe(c)}\"")));
foreach (var req in allRequests)
{
var values = orderedColumns.Select(col =>
$"\"{(req.TryGetValue(col, out var v) ? Safe(v) : "")}\"");
csvBuilder.AppendLine(string.Join(",", values));
}
return Encoding.UTF8.GetBytes(csvBuilder.ToString());
12. Security Considerations
12.1 Authentication Architecture
The system implements a multi-layered authentication strategy that ensures secure access across all integration points.
Layer 1 - Client Authentication: Client applications authenticate with the ATLAS API using organization-specific Bearer Tokens or API Keys, which are validated through middleware in the ASP.NET Core pipeline. This layer extracts and passes user identity as UserEmail for personalization and authorization.
Layer 2 - Azure AI Platform: ATLAS authenticates with the Azure AI Platform using DefaultAzureCredential with Managed Identity, eliminating the need for secrets in code or configuration files while benefiting from automatic token rotation managed by Azure.
Layer 3 - External ITSM API: ATLAS authenticates with the external ITSM API through the OAuth 2.0 Refresh Token Flow. Sensitive credentials are stored in Azure Key Vault or secure configuration, and access tokens are cached in-memory with a 60-second expiration buffer so that a token cannot expire mid-operation.
This layered approach provides defense in depth, with each layer employing appropriate authentication mechanisms for its specific security context and requirements.
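To illustrate the Layer 3 token caching described above, here is a minimal sketch of an access-token cache with an expiration buffer (Python for brevity; the `fetch_token` callable and the token lifetime used in the test are hypothetical):

```python
import time

class TokenCache:
    """Caches an OAuth access token, treating it as expired `buffer_seconds`
    before its reported expiry so it cannot lapse mid-operation."""

    def __init__(self, fetch_token, buffer_seconds: float = 60.0):
        self._fetch = fetch_token      # callable -> (token, expires_in_seconds)
        self._buffer = buffer_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            # Expire the cached token early by the buffer amount.
            self._expires_at = now + expires_in - self._buffer
        return self._token
```

The same pattern applies regardless of language: the only state needed is the token and its buffered expiry timestamp, and refreshes happen transparently on `get()`.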
12.2 Input Validation
// Path traversal prevention for file downloads
[HttpGet("download-result/{sessionId}/{fileName}")]
public async Task<IActionResult> DownloadResult(string sessionId, string fileName)
{
// Prevent path traversal attacks
if (fileName.Contains("..") ||
fileName.Contains("/") ||
fileName.Contains("\\"))
{
return BadRequest("Invalid file name.");
}
// Verify the session exists; a full ownership check would additionally
// compare the caller's UserEmail (supplied by the authentication layer)
var conversation = await _dbContext.ChatConversations
.FirstOrDefaultAsync(c => c.SessionId == sessionId);
if (conversation == null)
return NotFound("Conversation not found.");
// ... proceed with download ...
}
12.3 Data Privacy Controls
| Control | Implementation | Purpose |
|---|---|---|
| User Scoping | UserEmail filter on personalized queries | Prevent cross-user data access |
| Session Isolation | SessionId required for history retrieval | Prevent conversation leakage |
| Data Minimization | CSV exports only contain queried data | Reduce exposure surface |
| JSON Sanitization | HTML stripped from descriptions | Prevent XSS in exports |
| Audit Trail | All queries logged with timestamps | Compliance and debugging |
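The User Scoping control above amounts to filtering every personalized query by the caller's identity. A minimal sketch of that filter (Python; the ticket record shape and field names are hypothetical):

```python
def scope_to_user(tickets, user_email, as_technician=False):
    """Return only the tickets the caller may see: tickets they requested,
    or (when acting as a technician) tickets assigned to them."""
    field = "technician_email" if as_technician else "requester_email"
    # Case-insensitive match, since email casing varies across systems.
    return [t for t in tickets if t.get(field, "").lower() == user_email.lower()]
```

Applying this filter server-side, before results reach the response generator, is what prevents cross-user data access even if a query tries to reference another user's tickets.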
13. Evaluation & Results
13.1 Projected Query Understanding Accuracy
Methodology: Based on pilot testing and analysis of similar NLP systems
| Query Type | Projected Accuracy | Notes |
|---|---|---|
| Conversational | ~99% | Greetings, help requests |
| Request Search | ~94-96% | Core functionality |
| Top Technicians | ~95-97% | Aggregation queries |
| Inactive Technicians | ~93-95% | Period parsing challenges |
| Influx Analysis | ~91-93% | TimeUnit interpretation |
| Follow-up Queries | ~85-90% | Context retention dependent |
| Overall Target | ~90-95% | - |
13.2 Projected Follow-Up Query Success Rate
| Follow-up Type | Projected Rate | Example |
|---|---|---|
| Status filter addition | ~94% | "how many are open" |
| Date range change | ~90% | "what about last week" |
| Technician reference | ~86% | "show their tickets" |
| Multi-hop reference | ~70-75% | "how many of those were resolved" |
13.3 Hypothetical User Satisfaction Targets
Target Survey Outcomes:
| Question | Target: Agree/Strongly Agree |
|---|---|
| "ATLAS understands my questions" | > 75% |
| "ATLAS saves me time" | > 85% |
| "Response quality meets my needs" | > 70% |
| "I would recommend ATLAS" | > 85% |
Target Net Promoter Score (NPS): > +50 (Good to Excellent)
13.4 Projected Operational Metrics
| Metric | Target Value |
|---|---|
| Estimated queries/day | 400-600 |
| Target unique users | 50-100+ |
| System availability target | > 99.5% |
| Fallback activation rate | < 5% |
| Average response time | < 5 seconds |
| CSV export usage | 20-30% of queries |
14. Limitations
14.1 Known Constraints
| Limitation | Impact | Workaround | Priority to Fix |
|---|---|---|---|
| Single-language support | English only | None currently | Medium |
| No ticket creation | Read-only queries | Users must use ITSM directly | High |
| ~5-minute data latency | Not real-time | Background sync frequency tunable | Low |
| Complex boolean queries | "Open OR pending AND network" may fail | Rephrase as simpler queries | Medium |
| Cross-conversation context | New session loses history | Use same sessionId | Low |
| Attachment handling | Cannot search attachment contents | Not planned | Low |
14.2 Anticipated Scalability Limits
| Dimension | Estimated Limit | Behavior at Limit |
|---|---|---|
| Concurrent users | ~50 | Response time degrades ~30% |
| Queries per minute | ~100 | AI API rate limiting triggers |
| Conversation length | ~100 messages | Context truncation to last 10 |
| Result set size | ~500 tickets | Hard cap; pagination not implemented |
| Background sync batch | ~10,000 tickets | Memory pressure; batching required |
14.3 AI Model Dependencies
- Model availability: Dependent on Azure AI platform uptime (99.9% SLA)
- Model changes: GPT-4 behaviour changes could affect prompt effectiveness
- Cost volatility: API pricing changes could impact operational costs
- Latency variance: AI response times vary unpredictably between 0.5 and 5 seconds
15. Architecture Decision Records
ADR-001: Multi-Agent vs Single-Agent Architecture
Status: Accepted
Date: 2025-10-15
Context: Need to process natural language queries with high accuracy and maintainability.
Decision: Use three specialized agents instead of one general-purpose agent.
Consequences:
- (+) Clear separation of concerns
- (+) Easier debugging and prompt tuning
- (+) Lower per-agent prompt complexity
- (-) Higher latency (sequential calls)
- (-) More complex orchestration logic
Alternatives Considered:
- Single agent with long prompt: Rejected due to prompt complexity and debugging difficulty
- Two agents (analysis + response): Rejected due to missing validation step
ADR-002: Local Database Cache vs Direct API Queries
Status: Accepted
Date: 2025-10-18
Context: User queries require fast response times; external ITSM API has 2-5 second latency.
Decision: Cache ITSM data locally with background sync.
Consequences:
- (+) Sub-100ms query latency
- (+) Complex aggregations possible
- (+) Resilience to ITSM API outages
- (-) Data freshness delay (up to 5 minutes)
- (-) Storage overhead (~500MB for 50K tickets)
Alternatives Considered:
- Direct API queries: Rejected due to latency requirements
- Redis cache: Considered for future distributed deployment
ADR-003: Semaphore-Based Thread Locking
Status: Accepted
Date: 2025-11-01
Context: Azure AI Agents throw errors when multiple operations occur on same thread.
Decision: Implement per-thread SemaphoreSlim with dictionary lookup.
Consequences:
- (+) Prevents concurrent access errors
- (+) Graceful 503 response on timeout
- (-) Memory overhead for semaphore dictionary
- (-) Potential deadlock risk (mitigated by timeout)
Alternatives Considered:
- Global lock: Rejected due to throughput impact
- Thread-per-user: Rejected due to Azure thread limits
16. Future Work
16.1 Short-Term
| Enhancement | Complexity | Impact | Status |
|---|---|---|---|
| Redis distributed caching | Medium | Horizontal scaling | Planned |
| Real-time notifications | Medium | Proactive alerts | Planned |
| Voice input support | Low | Accessibility | Backlog |
| Mobile-optimized UI | Medium | User adoption | Backlog |
16.2 Medium-Term
| Enhancement | Complexity | Impact | Status |
|---|---|---|---|
| Ticket creation via NL | High | Bidirectional workflow | Research |
| Predictive SLA breach alerts | High | Proactive management | Research |
| Multi-language support | Medium | Global deployment | Backlog |
| Custom report scheduling | Medium | Automation | Backlog |
16.3 Long-Term
| Enhancement | Complexity | Impact | Status |
|---|---|---|---|
| Autonomous ticket triage | Very High | AI operations | Concept |
| Knowledge base integration | High | Auto-resolution | Concept |
| Fine-tuned domain model | Very High | Accuracy improvement | Research |
| Cross-system analytics | High | Enterprise insights | Concept |
17. Conclusion
ATLAS demonstrates that natural language interfaces for enterprise ITSM systems are not only feasible but could deliver significant operational value. The key architectural decisions enabling this potential include:
- Multi-Agent Pipeline: Separating query understanding, validation, and response generation is projected to improve accuracy (target: 90-95%) and maintainability
- Context-Aware Conversations: Structured history management could enable natural follow-up queries (target: 85-90% success rate)
- Hybrid Data Architecture: Background synchronization is designed to provide sub-5-second response times while maintaining data freshness
- Graceful Degradation: Heuristic fallbacks are intended to ensure high query success rates despite AI service variability
- Enterprise-Ready Concurrency: Thread-safe agent orchestration designed for multi-user workloads
The system is projected to achieve positive ROI through time-to-insight reduction compared to traditional report generation methods.
Key Takeaways for Practitioners:
- Multi-agent architectures trade latency for accuracy and maintainability
- Context preservation is essential for natural conversation flow
- Fallback mechanisms are critical in enterprise LLM systems
- Cost modeling should include AI API expenses early in design
ATLAS represents a conceptual template for enterprise NLI systems that balance sophistication with pragmatic engineering constraints. The architecture and patterns described in this whitepaper are intended to guide organizations exploring similar solutions.
18. References
Multi-Agent Systems & LLM Frameworks
- Wu, Q., Bansal, G., Zhang, J., Wu, Y., Zhang, S., Zhu, E., Li, B., Jiang, L., Zhang, X., & Wang, C. (2023). "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." arXiv:2308.08155. https://arxiv.org/abs/2308.08155
- LangChain. (2024). "LangGraph: Multi-Agent Workflows." LangChain Blog, January 2024. https://blog.langchain.com/langgraph-multi-agent-workflows/
- LangChain. (2024). "Command: A New Tool for Building Multi-Agent Architectures in LangGraph." December 2024. https://blog.langchain.com/command-a-new-tool-for-multi-agent-architectures-in-langgraph/
- Microsoft Research. (2025). "AutoGen v0.4 Release." January 2025. https://www.microsoft.com/en-us/research/project/autogen/
Text-to-SQL & Benchmarks
- Li, J., Hui, B., Qu, G., Yang, J., Li, B., Li, B., Wang, B., Qin, B., Geng, R., Huo, N., et al. (2024). "Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs (BIRD)." NeurIPS 2023. https://bird-bench.github.io/
- Lei, F., Chen, J., Ye, Y., Cao, R., Shin, D., Su, H., et al. (2024). "Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows." arXiv:2411.07763. https://spider2-sql.github.io/
- Ma, L., Pu, K., & Zhu, Y. (2024). "Evaluating LLMs for Text-to-SQL Generation With Complex SQL Workload." arXiv:2407.19517. https://arxiv.org/abs/2407.19517
Retrieval-Augmented Generation
- Edge, D., et al. (2024). "GraphRAG: Unlocking LLM Discovery on Narrative Private Data." Microsoft Research, 2024.
- Sarthi, P., et al. (2024). "RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval." ICLR 2024.
- Niu, X., et al. (2024). "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models." ACL 2024.
- Ranjan, R., et al. (2024). "A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions." arXiv:2410.12837. https://arxiv.org/abs/2410.12837
Commercial ITSM AI Solutions
- ServiceNow. (2024). "Now Platform Xanadu Release: Actionable AI." September 2024. https://www.servicenow.com/blogs/2024/now-platform-xanadu-release-actionable-ai
- ServiceNow. (2024). "Now Assist Documentation." https://www.servicenow.com/platform/now-assist.html
- Freshworks. (2024). "Introduction to Freddy AI Agent." October 2024. https://support.freshservice.com/support/solutions/articles/50000010306-introduction-to-freddy-ai-agent
- Freshworks. (2024). "Freddy AI Copilot." https://www.freshworks.com/freshdesk/omni/freddy-ai-copilot/
- Zendesk. (2024). "Announcing General Availability of Generative AI Features for Agents." March 2024. https://support.zendesk.com/hc/en-us/articles/6806752620314
- Zendesk. (2024). "About AI Agents." https://support.zendesk.com/hc/en-us/articles/6970583409690-About-AI-agents
- Zendesk. (2024). "Enhanced Generative AI Features with ChatGPT-4o." https://support.zendesk.com/hc/en-us/articles/7711631447450
Platform Documentation
- Microsoft. (2024). "Azure AI Agent Service Documentation." https://learn.microsoft.com/en-us/azure/ai-services/agents/
- Microsoft. (2024). "Retrieval Augmented Generation (RAG) in Azure AI Search." https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview
Industry Research
- McKinsey & Company. (2024). "What is RAG (Retrieval Augmented Generation)." October 2024. https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-retrieval-augmented-generation-rag
- Forrester. (2024). "Forrester's Guide to Retrieval-Augmented Generation." November 2024. https://www.forrester.com/blogs/forresters-guide-to-retrieval-augmented-generation-rag/
- LangChain. (2024). "Top 5 LangGraph Agents in Production 2024." December 2024. https://blog.langchain.com/top-5-langgraph-agents-in-production-2024/
Foundational Work (for historical context)
- ITIL Foundation. (2019). "ITIL 4 Foundation." Axelos.
19. Appendices
Appendix A: API Reference
POST /api/natural-query
Request:
{
"Query": "string (required)",
"SessionId": "string (optional, GUID)",
"UserEmail": "string (optional, for personalization)"
}
Response (Success):
{
"SessionId": "5ffe2d39-0a8f-43b4-b603-bed01492620f",
"ThreadId": "thread_ysZ6pR5HzCqEVSEef8T63DGh",
"ConversationalResponse": "Looking at daily trends...",
"ExcelFile": {
"FileName": "queryresult_20251127142004.csv",
"Url": "/api/Main/download-result/{sessionId}/{fileName}"
},
"Summary": {
"totalRequests": 2310,
"timeUnit": "Day"
}
}
Response (Busy):
{
"Error": "System is busy processing a previous request.",
"ConversationalResponse": "I'm currently processing your previous request..."
}
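For reference, a minimal client sketch for this endpoint (Python standard library only; `base_url` and bearer-token handling are deployment-specific assumptions, not part of the specification above):

```python
import json
import urllib.request

def build_query_payload(query, session_id=None, user_email=None):
    """Assemble the request body for POST /api/natural-query,
    omitting the optional fields when they are not supplied."""
    body = {"Query": query}
    if session_id:
        body["SessionId"] = session_id
    if user_email:
        body["UserEmail"] = user_email
    return body

def natural_query(base_url, query, session_id=None, user_email=None, token=None):
    """Send the query; the Authorization header is a deployment-specific assumption."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        f"{base_url}/api/natural-query",
        data=json.dumps(build_query_payload(query, session_id, user_email)).encode(),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Reusing the returned `SessionId` on subsequent calls is what enables the follow-up query handling described in Section 13.2.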
Appendix B: Query Analysis Schema
{
"queryType": "conversational|inactive_technicians|influx_requests|top_request_areas|top_technicians|request_search",
"isConversational": "boolean",
"conversationalIntent": "greeting|help|thanks|farewell|capabilities|unclear|null",
"dateFrom": "yyyy-MM-dd HH:mm|null",
"dateTo": "yyyy-MM-dd HH:mm|null",
"timeUnit": "hour|day|null",
"topN": "number|null",
"subject": "string|null",
"technician": "string|null",
"technicians": "[string]|null",
"requester": "string|null",
"inactivityPeriod": "string|null",
"isUserRequest": "boolean",
"isUserTechnician": "boolean",
"status": "open|closed|null"
}
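Because the agent's output is free-form JSON, a validation pass over this schema is advisable before downstream processing. A minimal sketch (Python; the fallback-to-`request_search` policy shown here is an assumption consistent with the paper's heuristic-degradation design, not a documented behavior):

```python
import json

# The queryType values enumerated in the schema above.
ALLOWED_QUERY_TYPES = {
    "conversational", "inactive_technicians", "influx_requests",
    "top_request_areas", "top_technicians", "request_search",
}

def validate_analysis(raw: str) -> dict:
    """Parse the agent's JSON output; coerce unknown queryType values
    to the safest default (hypothetical fallback policy)."""
    analysis = json.loads(raw)
    if analysis.get("queryType") not in ALLOWED_QUERY_TYPES:
        analysis["queryType"] = "request_search"
    return analysis
```

A stricter variant could also type-check fields such as `topN` and the date strings, rejecting the analysis entirely and triggering the heuristic parser described earlier.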
Appendix C: Hypothetical Query Examples
| Query | Analysis | Expected Result |
|---|---|---|
| "request volume this week" | influx_requests, timeUnit=day | ~2,000+ requests, peak mid-week |
| "what tickets do i have assigned to me" | request_search, isUserTechnician=true | User's assigned tickets |
| "how many of them are open" (follow-up) | request_search, isUserTechnician=true, status=open | Filtered to open status |
| "how many tickets assigned to TechUser1 this month" | request_search, technician=TechUser1 | ~100-150 tickets |
| "how many involved network" (follow-up) | request_search, technician=TechUser1, subject=network | Filtered subset |
| "top technicians based on requests handled past week" | top_technicians, topN=10 | Ranked list by volume |
| "technicians with no requests treated in the past 1 month" | inactive_technicians, inactivityPeriod=30 days | List of inactive techs |
This whitepaper presents a conceptual architecture for ATLAS. The design patterns, code examples, and projected metrics documented here are intended to guide similar implementations in enterprise environments. All examples use hypothetical data and anonymized placeholders, and all reported figures are modelled projections from simulated workloads rather than measured production results.
