AI Support Guide

The Ticket Lifecycle

How a support ticket flows through Simpli services from arrival to resolution.

Simpli services form a pipeline. Each service handles one step of the support workflow — classification, response generation, sentiment monitoring, quality scoring, and operational metrics. You can adopt a single service or chain them all together. There is no orchestrator; each service is a standalone API you call when you need it.

This page walks through the full lifecycle of a support ticket and shows how the services connect at each stage.

The pipeline at a glance

Ticket arrives
  |
  v
Triage (/classify + /route)
  |-- category, urgency, sentiment, confidence
  |-- assigned team + agent
  |
  v
Agent works the ticket
  |
  |-- Reply (/api/v1/draft)         generates a draft response
  |-- Sentiment (/analyze)          monitors each customer message (parallel)
  |
  v
Ticket resolved
  |
  |-- QA (/evaluate)                scores the full conversation
  |-- Pulse (/metrics, /forecast)   aggregates operational data

Triage is the entry point. Reply and Sentiment run during the conversation. QA and Pulse run after resolution. Every arrow is an HTTP call you control.

Step 1: Classification and routing (Triage)

When a ticket arrives in your helpdesk, the first call goes to Triage. It reads the subject and body, then returns a structured classification.

Classify the ticket

POST /classify

{
  "subject": "Can't process payment",
  "body": "I've been trying to pay for 2 hours and keep getting error 500",
  "metadata": {"account_type": "premium"}
}

Response:

{
  "category": "billing",
  "urgency": "high",
  "sentiment": "negative",
  "confidence": 0.87
}

The confidence score tells you how certain the model is. Tickets below your confidence threshold can fall back to manual triage.
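In client code, that fallback is a one-line guard on the classification response. A minimal sketch (the 0.75 threshold and the helper name are illustrative, not part of the API):

```python
CONFIDENCE_THRESHOLD = 0.75  # tune to your tolerance for misrouted tickets

def needs_manual_triage(classification: dict) -> bool:
    """Return True when Triage is not confident enough to auto-route."""
    return classification.get("confidence", 0.0) < CONFIDENCE_THRESHOLD
```

With the response above, `needs_manual_triage` returns False at 0.87 confidence, so the ticket proceeds to routing automatically.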

Route the ticket

Take the classification output and pass it straight to the routing endpoint.

POST /route

{
  "ticket_id": "T-1001",
  "category": "billing",
  "urgency": "high",
  "sentiment": "negative"
}

Response:

{
  "team": "billing",
  "agent": "A-012",
  "reason": "High urgency billing issue routed to specialist"
}

Routing rules are configurable via GET /rules. You define which categories map to which teams, set urgency-based priority overrides, and configure round-robin or load-balanced agent assignment.
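Applied client-side, the category-to-team portion of such a rule set could look like this sketch (the dictionary shape and team names are assumptions for illustration; the authoritative rules live behind GET /rules):

```python
# Hypothetical category-to-team map, standing in for what GET /rules returns.
CATEGORY_TEAMS = {
    "billing": "billing",
    "technical": "engineering-support",
    "account": "customer-success",
}

def team_for(category: str, default: str = "general") -> str:
    """Resolve a ticket category to a team, falling back to a default queue."""
    return CATEGORY_TEAMS.get(category, default)
```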

At this point the ticket is classified, prioritised, and sitting in the right agent's queue — without any human touching it.

Step 2: Draft response generation (Reply)

The assigned agent opens the ticket and wants to respond. Instead of writing from scratch, they request a draft from Reply.

POST /api/v1/draft

{
  "ticket_id": "T-1001",
  "conversation": [
    {
      "role": "customer",
      "content": "I've been trying to pay for 2 hours and keep getting error 500"
    }
  ],
  "style": "friendly",
  "language": "en"
}

The role field uses the AuthorType enum: customer, agent, or system. The style parameter adjusts tone — use friendly, formal, or concise.

Response:

{
  "draft_id": "d-abc123",
  "draft": "I'm sorry you're running into payment issues — that's frustrating, especially after trying for two hours. I'm looking into the error now and will have an update for you shortly. In the meantime, could you let me know which payment method you're using?",
  "confidence": 0.82,
  "suggested_template": null,
  "language": "en"
}

The agent reviews the draft, edits if needed, and sends. After sending, they can submit feedback via POST /api/v1/feedback to improve future drafts:

{
  "draft_id": "d-abc123",
  "accepted": true,
  "edited_text": null
}

Reply is a co-pilot, not an auto-responder. The agent always has the final say.

Step 3: Sentiment monitoring (Sentiment)

While the conversation is active, each customer message is analysed for sentiment and escalation risk. This happens in parallel — it does not block the agent workflow.

POST /analyze

{
  "customer_id": "C-42",
  "text": "I've been trying to pay for 2 hours and keep getting error 500",
  "channel": "email"
}

The channel field accepts values from the Channel enum: email, chat, phone, social, or web.

Response:

{
  "score": -0.58,
  "label": "negative",
  "escalation_risk": 0.42,
  "triggers": []
}
  • score: ranges from -1.0 (very negative) to 1.0 (very positive)
  • label: positive (score 0.3 or above), neutral, or negative (score -0.3 or below)
  • escalation_risk: 0.0 to 1.0 — alerts are created when this reaches 0.5 or higher
  • triggers: escalation keywords detected in the text (e.g. "cancel", "lawsuit", "manager", "refund")
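Those label bands translate directly into code; this sketch mirrors the documented cut-offs (the function name is ours, not part of the API):

```python
def label_for(score: float) -> str:
    """Map a sentiment score in [-1.0, 1.0] to the documented label bands."""
    if score >= 0.3:
        return "positive"
    if score <= -0.3:
        return "negative"
    return "neutral"
```

For the response above, a score of -0.58 lands in the negative band.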

If the customer's frustration escalates mid-conversation:

{
  "score": -0.85,
  "label": "negative",
  "escalation_risk": 0.78,
  "triggers": ["manager", "unacceptable"]
}

At this point, GET /alerts surfaces it to team leads so they can intervene before the situation deteriorates further.
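A team-lead dashboard polling GET /alerts might filter on the documented 0.5 threshold. A sketch, assuming each alert in the payload carries its escalation_risk:

```python
ESCALATION_THRESHOLD = 0.5  # alerts at or above this need a team lead's attention

def high_risk(alerts: list[dict]) -> list[dict]:
    """Keep only alerts whose escalation risk meets the documented threshold."""
    return [a for a in alerts if a.get("escalation_risk", 0.0) >= ESCALATION_THRESHOLD]

# alerts = httpx.get(f"{SENTIMENT}/alerts").json()  # then: high_risk(alerts)
```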

Sentiment analysis is keyword-based by default, which keeps latency under 50ms. For ML-based analysis, install the optional [ml] extra.

Step 4: Quality scoring (QA)

After the ticket is resolved, the full conversation is sent to QA for automated evaluation.

POST /evaluate

{
  "conversation_id": "T-1001",
  "agent_id": "A-012",
  "messages": [
    {
      "role": "customer",
      "content": "I've been trying to pay for 2 hours and keep getting error 500"
    },
    {
      "role": "agent",
      "content": "I'm sorry you're running into payment issues. I'm looking into the error now and will have an update shortly."
    },
    {
      "role": "agent",
      "content": "I found the issue — our payment processor had a temporary outage. It's resolved now. Could you try again?"
    },
    {
      "role": "customer",
      "content": "That worked, thank you!"
    }
  ],
  "rubric_id": "R-default"
}

Response:

{
  "conversation_id": "T-1001",
  "overall_score": 0.91,
  "dimensions": {
    "empathy": 0.93,
    "resolution": 0.90,
    "communication": 0.89
  },
  "coaching_notes": [
    "Strong empathy shown in opening response",
    "Quick root-cause identification",
    "Consider proactively offering a follow-up check"
  ]
}

Scores are per-dimension so agents get specific, actionable feedback rather than just a number. Coaching notes are generated by the LLM and highlight what went well and what could improve.

To see an agent's performance over time, pull their scorecard:

GET /scorecards/A-012

{
  "agent_id": "A-012",
  "average_score": 0.88,
  "total_reviews": 47,
  "trend": "improving",
  "weakest_dimension": "communication",
  "strongest_dimension": "empathy"
}

Step 5: Operational metrics (Pulse)

Pulse aggregates data across your support operation and provides real-time visibility for managers and team leads.

Current snapshot

GET /metrics

{
  "queue_depth": 23,
  "avg_wait_time_minutes": 4.5,
  "active_agents": 12,
  "open_tickets": 156,
  "sla_compliance_pct": 94.2
}

Volume forecasting

GET /forecast?hours=24

{
  "horizon_hours": 24,
  "predicted_volume": [45, 52, 61, 58, 43, 38],
  "confidence_interval": [
    {"lower": 38, "upper": 52},
    {"lower": 44, "upper": 60}
  ]
}

Use forecasts to plan staffing. If predicted volume spikes in 4 hours, you have time to adjust.
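A back-of-the-envelope staffing check divides predicted hourly volume by per-agent throughput (the 6-tickets-per-hour figure below is an assumption, not a product default):

```python
import math

def agents_needed(predicted_volume: list[int], per_agent_per_hour: int = 6) -> list[int]:
    """Convert an hourly volume forecast into the head-count each hour requires."""
    return [math.ceil(v / per_agent_per_hour) for v in predicted_volume]
```

Fed the forecast above, `agents_needed([45, 52, 61])` returns [8, 9, 11] — a jump from 8 to 11 agents within three hours.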

SLA compliance

GET /sla

Tracks compliance rates and predicts which tickets are at risk of breaching their SLA, so you can reprioritise before it happens.

Full integration example

Here is a complete Python script that chains all five services together. This is the full pipeline — from raw ticket text to quality score.

import httpx

TRIAGE = "http://localhost:8001"
REPLY = "http://localhost:8002"
SENTIMENT = "http://localhost:8003"
QA = "http://localhost:8004"
PULSE = "http://localhost:8005"

ticket_subject = "Can't process payment"
ticket_body = "I've been trying to pay for 2 hours and keep getting error 500"

# 1. Classify the incoming ticket
classification = httpx.post(f"{TRIAGE}/classify", json={
    "subject": ticket_subject,
    "body": ticket_body,
}).json()
print(f"Category: {classification['category']}, Urgency: {classification['urgency']}")

# 2. Route to the right team
routing = httpx.post(f"{TRIAGE}/route", json={
    "ticket_id": "T-1001",
    "category": classification["category"],
    "urgency": classification["urgency"],
    "sentiment": classification["sentiment"],
}).json()
print(f"Assigned to: {routing['agent']} ({routing['team']} team)")

# 3. Generate a draft response for the agent
draft = httpx.post(f"{REPLY}/api/v1/draft", json={
    "ticket_id": "T-1001",
    "conversation": [
        {"role": "customer", "content": ticket_body}
    ],
    "style": "friendly",
}).json()
print(f"Draft (confidence {draft['confidence']}): {draft['draft'][:80]}...")

# 4. Monitor sentiment in parallel during the conversation
sentiment = httpx.post(f"{SENTIMENT}/analyze", json={
    "customer_id": "C-42",
    "text": ticket_body,
    "channel": "email",
}).json()
print(f"Sentiment: {sentiment['label']} ({sentiment['score']}), "
      f"Escalation risk: {sentiment['escalation_risk']}")

# 5. After resolution — score the conversation
score = httpx.post(f"{QA}/evaluate", json={
    "conversation_id": "T-1001",
    "agent_id": routing["agent"],
    "messages": [
        {"role": "customer", "content": ticket_body},
        {"role": "agent", "content": draft["draft"]},
        {"role": "customer", "content": "That worked, thank you!"},
    ],
}).json()
print(f"QA score: {score['overall_score']}. Notes: {'; '.join(score['coaching_notes'])}")

# 6. Check operational health
metrics = httpx.get(f"{PULSE}/metrics").json()
print(f"Queue depth: {metrics['queue_depth']}, SLA: {metrics['sla_compliance_pct']}%")

Each httpx.post / httpx.get call is independent. In production, you would call Sentiment asynchronously (it does not need to complete before the agent responds) and QA would run as a post-resolution webhook.
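The fire-and-forget pattern for Sentiment can be sketched with asyncio; the live httpx call is stubbed out here (see the comment) so the example runs standalone:

```python
import asyncio

async def analyze_sentiment(text: str) -> dict:
    # In production this would be an async HTTP call, e.g.:
    #   async with httpx.AsyncClient() as client:
    #       resp = await client.post(f"{SENTIMENT}/analyze", json={...})
    #       return resp.json()
    # Stubbed with a canned response so the sketch runs without a live service.
    await asyncio.sleep(0)
    return {"label": "negative", "escalation_risk": 0.42}

async def handle_customer_message(text: str) -> dict:
    """Fire the sentiment call, keep working, collect the result at the end."""
    task = asyncio.create_task(analyze_sentiment(text))
    # ... draft generation and agent UI updates proceed here, unblocked ...
    return await task  # awaited only once the result is actually needed

result = asyncio.run(handle_customer_message("keep getting error 500"))
```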

Which services do you need?

You do not have to deploy all five. Each service is independently deployable — start with one and add more as you see value.

Configuration        Services used                        Best for
Triage only          Triage                               Auto-classification, reduce manual sorting
Triage + Reply       Triage, Reply                        Faster agent responses with AI drafts
Triage + Sentiment   Triage, Sentiment                    Escalation prevention, customer health tracking
Reply + QA           Reply, QA                            Draft quality feedback loop
Full pipeline        Triage, Reply, Sentiment, QA, Pulse  Complete AI-powered support operations

The most common starting point is Triage alone. Once your team sees tickets arriving pre-classified, the next step is usually Reply (to speed up responses) or Sentiment (to catch escalations early).

Latency expectations

All services respond fast enough for real-time use in a support workflow.

Service     Typical latency   What drives it
Triage      200-500ms         LLM classification call
Reply       1-3s              LLM generation (longer responses take longer)
Sentiment   < 50ms            Keyword-based analysis, no LLM call
QA          1-3s              LLM evaluation of full conversation
Pulse       < 50ms            Database aggregation queries

Sentiment and Pulse are the fastest because they do not make LLM calls. Triage, Reply, and QA depend on your LLM provider's response time — latency improves with faster models or local inference. See LLM Providers for configuration options.

Next steps

  • Integrations — Connect Simpli services to your existing helpdesk (Zendesk, Freshdesk, Intercom)
  • Quality Loop — Use QA scores to continuously improve Reply drafts and Triage accuracy
