Automating Troubleshooting in Kognitos
What if a support investigation could start with a vague prompt and end with a deterministic, repeatable workflow?
That’s what I tested in Kognitos.
I started with a simple request:
> Look at Sentry logs and replay for jerome@kognitos.com and tell me what this user was trying to do. Add the summary activity, error, and next steps.
From that one high-level instruction, Kognitos created an SOP that could investigate the issue step by step. Behind the scenes, it also generated SPY code so the workflow could move from an exploratory AI-driven draft into deterministic automation.
That shift is the interesting part. This was not just “AI gives me an answer.” It was “AI builds a troubleshooting workflow I can test, refine, and operationalize.”
What the SOP does
The workflow takes a user email, then:
- Pulls Sentry session replays from the last 24 hours
- Summarizes session behavior, pages visited, and frontend errors
- Identifies top transactions to understand what the user spent time doing
- Extracts the workspace ID from replay URLs
- Uses that workspace ID to query SigNoz for backend warnings and errors
- Produces a human-readable activity summary, error analysis, and prioritized next steps
In other words, it connects frontend behavior with backend signals and turns scattered telemetry into a support-ready narrative.
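The correlation logic above can be sketched in plain Python. Everything here is an assumption for illustration: `query_signoz` is a stand-in for the real SigNoz query, the replay-URL shape is guessed, and the actual SOP runs as Kognitos SPY, not Python.

```python
import re

def extract_workspace_id(replay_url):
    """Pull the workspace ID out of a replay URL (assumed path shape)."""
    match = re.search(r"/workspace/([A-Za-z0-9]+)", replay_url)
    return match.group(1) if match else None

def query_signoz(workspace_id, levels=("error", "warn")):
    # Stand-in for the real SigNoz log query; returns canned lines here.
    return [f"[{workspace_id}] ERROR No active draft - call begin_draft() first"]

def investigate(user_email, replays):
    """Correlate frontend replay errors with backend logs for one user."""
    workspace_ids = {extract_workspace_id(r["url"]) for r in replays}
    workspace_ids.discard(None)
    report = {"user": user_email, "frontend_errors": [], "backend_logs": []}
    for r in replays:
        report["frontend_errors"].extend(r.get("errors", []))
    for ws in sorted(workspace_ids):
        report["backend_logs"].extend(query_signoz(ws))
    return report

# Illustrative replay record; real Sentry payloads differ.
replays = [{
    "url": "https://app.kognitos.com/workspace/rVYWaOvLl49ZGeDix0mhG/replay/abc",
    "errors": ["TypeError: cannot read properties of undefined"],
}]
report = investigate("jerome@kognitos.com", replays)
```

The workspace ID is the pivot: it is the one identifier that appears on both the frontend (in replay URLs) and the backend (in SigNoz logs), which is what makes the correlation automatic rather than manual.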
SOP:
Why this matters
Support and engineering teams often spend too much time doing mechanical investigation work:
- Watch a replay
- Check logs
- Infer intent
- Correlate timestamps
- Guess which backend errors matter
- Write up next steps
This SOP compresses that process into something far more systematic.
Instead of manually stitching together Sentry, SigNoz, and product context, Kognitos assembled a workflow that did the correlation automatically and produced a usable report.
What the system found
For workspace rVYWaOvLl49ZGeDix0mhG, the automation surfaced 13 backend error/warn logs, all from kognitos.quill.
The failures fell into a few clear categories:
1. Automation creation failures
The most serious issue was a backend defect in Grimoire:
`create_automation - Failed to create automation in Grimoire: No active draft - call begin_draft() first`
This directly blocked automation creation.
2. SPY code generation errors
The agent generated invalid SPY in several places:
- Function definitions without required annotations
- An invalid `url` type
- Unsupported imports
These were code generation problems, not user mistakes.
3. File lifecycle issues
Some files were referenced before they existed in tmp/, which caused file-not-found failures.
4. Input ordering problems
The agent attempted to save a default input before the automation had defined any inputs with read_input().
5. Security policy violations
Some bash commands using pipes and redirection were blocked by policy.
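The triage into these five buckets amounts to pattern matching over log messages. A minimal sketch of that idea, with illustrative regexes that are my assumptions, not the actual Kognitos classifier:

```python
import re

# Illustrative patterns for the failure categories above (assumed).
CATEGORIES = [
    ("automation_creation", re.compile(r"No active draft")),
    ("spy_codegen", re.compile(r"annotation|Unsupported import|invalid .*type", re.I)),
    ("file_lifecycle", re.compile(r"No such file|file.not.found", re.I)),
    ("input_ordering", re.compile(r"read_input")),
    ("security_policy", re.compile(r"blocked by policy", re.I)),
]

def categorize(message):
    """Return the first category whose pattern matches the log message."""
    for name, pattern in CATEGORIES:
        if pattern.search(message):
            return name
    return "uncategorized"
```

Pattern order matters: the most specific, highest-severity signatures are checked first so a message that happens to match several buckets lands in the most actionable one.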
The real takeaway
The most valuable outcome was not just the list of errors. It was the system’s ability to separate symptoms from root causes and prioritize next actions.
Here’s the ordered view:
- P0: Fix the Grimoire draft lifecycle bug blocking `create_automation`
- P1: Fix Quill code generation so it produces valid SPY
- P1: Fix tool ordering so inputs are only set after they exist
- P2: Improve error messaging and file lifecycle handling
That is what good troubleshooting should do: not just describe what broke, but identify what matters first.
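Mechanically, a prioritized view like this is just a stable sort on the priority label, so P0 items surface first and ties keep their original order. A minimal sketch (the findings are the ones listed above; the data shape is my own):

```python
# (priority, finding) pairs from the triage above.
findings = [
    ("P1", "Fix Quill code generation so it produces valid SPY"),
    ("P0", "Fix the Grimoire draft lifecycle bug blocking create_automation"),
    ("P2", "Improve error messaging and file lifecycle handling"),
    ("P1", "Fix tool ordering so inputs are only set after they exist"),
]

# sorted() is stable: "P0" < "P1" < "P2" lexicographically, and equal
# priorities keep their original relative order.
ordered = sorted(findings, key=lambda f: f[0])
```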
From AI chat to reliable automation
What I like most about this flow is that it starts creatively and ends deterministically.
I can begin with an open-ended prompt, let the system assemble the investigation logic, test it in draft mode, and then promote it into a repeatable automation with observable runs.
That is the bridge between generative AI and operational reliability.
Once the defects were understood, the workflow itself ran as intended. The main blockers were not the design of the automation but the platform and codegen issues it uncovered during execution.
And that is exactly why this kind of workflow is useful: it doesn’t just solve support problems, it helps expose product and platform gaps that teams can actually fix.