Tuesday, April 21, 2026

Quick Kognitos Automation - picking a needle (customer info) in the haystack (logs)

 

The problem

As we were building the product, we kept hitting the same friction: there was no fast way to tie a user to their organization ID. That matters when something goes wrong for a specific org—we want to identify who to contact and act quickly, not chase breadcrumbs across tools.

We needed a way to proactively reach customers when org-related issues arise, with enough context to be helpful in the first message.

Where we started: a prompt to Kognitos

The first slice was intentionally small. We asked Kognitos to use the Sentry API (HTTP requests with Bearer token authentication) to:

  • Fetch the last 30 days of session replays from the organization’s replays endpoint (paginated, with a sensible limit and stats period).
  • Exclude anyone whose email contained our internal domain (kognitos.com), so we only saw real customers.
  • Update a Notion page with what we learned.

That single prompt was enough to prove the idea: session replays carry URLs and user context we could operationalize.

What we turned it into: a repeatable SOP

After a few rounds of iteration, Kognitos created a standard operating procedure from the chat prompts to keep the workflow consistent and auditable.

Overview

The pipeline pulls recent Sentry session replays, filters out internal users, and produces a table that maps each organization to account identifiers and the server hostname where those sessions were recorded.

Execution steps

  1. Fetch replays from Sentry. Call the Replays API for a rolling window (for example, up to 90 days and up to 1,000 results), scoped to the relevant project, authenticated with a Bearer token—matching how we already explore replays in the Sentry UI.
  2. Filter and extract org–user mappings. Drop internal addresses (e.g. kognitos.com or internal mail aliases). For each remaining replay, scan visited URLs for paths like /organizations/{id}/:
    • org_id — the segment right after /organizations/
    • server — the hostname from the absolute URL (e.g. app.us-1.kognitos.com)
    • account_name — derived from the user’s email: typically the domain stem for work email (e.g. al@wip.com → wip), or the full address for consumer providers like Gmail
    We emit one row per unique org_id, with multiple account names comma-separated and deduplicated. We skip relative URLs, URLs without /organizations/, and empty org IDs.
  3. Build org_users_table. Columns: account_name · server · org_id.
  4. Sync new orgs into Notion. Read existing rows from the target Notion database, compare org IDs to today’s run, and append a row for each org not yet present—same three columns as above.
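The filter-and-extract logic in steps 2-3 can be sketched in Python. This is a minimal sketch: the replay shape ({'email': ..., 'urls': [...]}) and the consumer-provider list are simplifying assumptions, and the real Sentry payload nests this data differently.

```python
from urllib.parse import urlparse

INTERNAL_DOMAINS = ("kognitos.com",)  # internal users to exclude
CONSUMER_PROVIDERS = {"gmail.com", "yahoo.com", "outlook.com"}  # assumption: treated as consumer email

def account_name(email):
    """Domain stem for work email (al@wip.com -> wip); full address for consumer providers."""
    domain = email.split("@", 1)[1].lower()
    if domain in CONSUMER_PROVIDERS:
        return email
    return domain.split(".", 1)[0]

def extract_org(url):
    """Return (org_id, server) from an absolute URL containing /organizations/{id}/, else None."""
    parsed = urlparse(url)
    if not parsed.hostname:  # skip relative URLs
        return None
    parts = parsed.path.split("/")
    if "organizations" not in parts:
        return None
    idx = parts.index("organizations")
    if idx + 1 >= len(parts) or not parts[idx + 1]:  # skip empty org IDs
        return None
    return parts[idx + 1], parsed.hostname

def build_rows(replays):
    """replays: iterable of {'email': ..., 'urls': [...]} (simplified shape)."""
    rows = {}  # org_id -> {"server": ..., "accounts": set()}
    for replay in replays:
        email = replay["email"]
        if any(dom in email for dom in INTERNAL_DOMAINS):
            continue
        for url in replay["urls"]:
            hit = extract_org(url)
            if hit:
                org_id, server = hit
                row = rows.setdefault(org_id, {"server": server, "accounts": set()})
                row["accounts"].add(account_name(email))
    return rows
```

One row per unique org_id falls out of the dictionary keying; deduplication of account names falls out of the set.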

The SOP becomes the living source of “who maps to which org, on which server,” without manual copy-paste from Sentry to the Notion table.

Running it on a schedule

Kognitos now runs this flow every day. New customer orgs surface automatically; the table stays aligned with what we’re actually seeing in session replays.

Example output (illustrative)

Shape of the data, not production values from Notion:

account_name | server | org_id
user@gmail.com | app.us-1.kognitos.com | AKp0…
user2@gmail.com | app.us-1.kognitos.com | BaCp…
abc-interiors | app.us-1.kognitos.com | CGwi…
maryXYZ@gmail.com | app.us-1.kognitos.com | Y0UU…
mihirZBC@gmail.com | app.us-1.kognitos.com | ZvJz…

Why it’s worth it

This isn’t glamour work—it’s glue. But the glue that connects observability (Sentry), customer context (email and org IDs from real sessions), and team workflow (Notion) is what lets support and engineering move in sync. One prompt started it; a clear SOP and a daily schedule keep it honest.

Kognitos Automation: From Problem to Solution

As we were developing the product, we had no quick way to map an org ID to a user. To fill that gap and proactively reach out to the customer when there is an issue with an org ID, this automation was created.

Quick Prompt to Kognitos - 

Using the Sentry API (via HTTP with Bearer token auth), fetch the last 30 days of session replays from https://sentry.io/api/0/organizations/kognitos/replays/?limit=100&statsPeriod=30d, excluding any users whose email contains "kognitos.com". Update the notion page.
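The same call can be sketched in Python with only the standard library. The endpoint and query parameters come straight from the prompt above; the token handling is illustrative, and pagination is omitted.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SENTRY_TOKEN = "REDACTED"  # illustrative; load the real token from a secret store

def build_replay_request(org="kognitos", limit=100, stats_period="30d"):
    """Build the authenticated GET request for the Sentry replays endpoint."""
    url = (f"https://sentry.io/api/0/organizations/{org}/replays/?"
           + urlencode({"limit": limit, "statsPeriod": stats_period}))
    return Request(url, headers={"Authorization": f"Bearer {SENTRY_TOKEN}"})

def fetch_replays():
    """Fetch one page of replays; following the Link header for more pages is omitted."""
    with urlopen(build_replay_request(), timeout=30) as resp:
        return json.load(resp).get("data", [])
```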

After a few further chats, Kognitos built this SOP:

Overview

Fetches the last 30 days of Sentry session replays, filters out internal kognitos.com users, and produces a table mapping each organization to its account names and server.

Execution Steps

  1. Fetch session replays from Sentry

  2. Filter replays and extract org-user mappings

    • Exclude users whose email contains "kognitos.com" or "kogmail"

    • For each remaining replay, scan its visited URLs for /organizations/{id}/ paths:

      • org_id: the path segment immediately after /organizations/

      • server: the hostname from the absolute URL (e.g. app.us-1.kognitos.com)

      • account_name: derived from the user's email domain stem (e.g. al@wip.com → wip); Gmail users use their full email address

    • One row per unique org_id; multiple account names are comma-separated and deduplicated

    • Skip relative URLs (no hostname), URLs without /organizations/, and entries with empty org IDs

  3. Build and output the org_users_table

    • Assemble one row per org_id with columns:

      • account_name: comma-separated deduplicated account name stems

      • server: hostname where this org's sessions were recorded

      • org_id: unique org ID extracted from /organizations/{id}/ in the URL

  4. Sync new orgs to Notion table

    • Notion page: https://www.notion.so/kognitos/v2-Org-to-User-Information

    • Read the existing rows from the Notion table (page ID: xxxx-4b61-8021-95ad-ec7xxxxx, table block ID: xxxx-4b61-8021-95ad-ec7xxxxx)

    • Collect all org_ids already present in Notion; compare against today's results

    • Append a new table_row block for each org_id not yet in Notion

    • Columns: account_name | server | org_id
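The sync in step 4 boils down to a set difference between org IDs already in Notion and today's results (a sketch; the Notion read and append calls themselves are omitted):

```python
def rows_to_append(existing_org_ids, todays_rows):
    """todays_rows maps org_id -> (account_name, server); returns only the new rows
    in the table's column order: account_name | server | org_id."""
    existing = set(existing_org_ids)
    return [(account, server, org_id)
            for org_id, (account, server) in sorted(todays_rows.items())
            if org_id not in existing]
```

Appending only the difference keeps the daily run idempotent: re-running it on the same day adds nothing.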

Notion is updated with this info, and Kognitos is scheduled to run every day to pick up new user info.
account_name | server | org_id
venXYZ@gmail.com | app.us-1.kognitos.com | AKp0bhpmTrUfzefOAb1ZA
divyZ@gmail.com | app.us-1.kognitos.com | BaCpUxEoE3dm2WRcL3vCa
abc-interiors | app.us-1.kognitos.com | CGwi5tKnM9hlm3uZoEWei
maryXYZ@gmail.com | app.us-1.kognitos.com | Y0UUP4IOSoy8PpzuqeHkV
mihirXYZ@gmail.com | app.us-1.kognitos.com | ZvJzUYvLJQw8nB8hwquHb

Monday, April 13, 2026

Automating Kognitos troubleshooting in Kognitos

 


What if a support investigation could start with a vague prompt and end with a deterministic, repeatable workflow?

That’s what I tested in Kognitos.

I started with a simple request:

Look at Sentry logs and replay for jerome@kognitos.com, review signoz logs for those runs and tell me what this user was trying to do. Add the summary activity, error, and next steps.

From that one high-level instruction, Kognitos created an SOP that could investigate the issue step by step. Behind the scenes, it also generated SPY code, allowing the workflow to move from an exploratory AI-driven draft to deterministic automation.

That shift is the interesting part. This was not just “AI gives me an answer.” It was “AI builds a troubleshooting workflow I can test, refine, and operationalize.”


Here is to drinking our own 🍷

What the SOP does

The workflow takes a user's email, then:

  • Pulls Sentry session replays from the last 24 hours
  • Summarizes session behavior, pages visited, and frontend errors
  • Identifies top transactions to understand what the user spent time doing
  • Extracts the workspace ID from replay URLs
  • Uses that workspace ID to query SigNoz for backend warnings and errors
  • Produces a human-readable activity summary, error analysis, and prioritized next steps
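The workspace-ID extraction in the steps above can be sketched as a small URL scan. The URL shape (/workspace/{id}) is an assumption here, not confirmed from the actual replay data:

```python
import re

# Assumed URL shape: the workspace ID follows /workspace/ in visited replay URLs.
WORKSPACE_RE = re.compile(r"/workspace/([A-Za-z0-9_-]+)")

def workspace_ids(urls):
    """Collect unique workspace IDs from replay URLs, preserving first-seen order."""
    found = []
    for url in urls:
        match = WORKSPACE_RE.search(url)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found
```

Each ID found this way becomes the join key for the SigNoz query in the next step.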

In other words, it connects frontend behavior with backend signals and turns scattered telemetry into a support-ready narrative.

Actual SOP in Kognitos:





Why this matters

Support and engineering teams often spend too much time doing mechanical investigation work:

  • Watch a replay
  • Check logs
  • Infer intent
  • Correlate timestamps
  • Guess which backend errors matter
  • Write up next steps

This SOP compresses that process into something far more systematic.

Instead of manually stitching together Sentry, SigNoz, and product context, Kognitos assembled a workflow that automatically performed the correlation and produced a usable report.

What the system found

Actual Output

Recorded for future audits:


Final Thoughts: From AI chat to reliable automation

What I like most about this flow is that it starts creatively and ends deterministically.

I can begin with an open-ended prompt, let the system assemble the investigation logic, test it in draft mode, and then promote it into a repeatable automation with observable runs.

That is the bridge between generative AI and operational reliability.

The runtime behavior itself worked once the defects were understood. The main blockers were not the automation idea, but the platform and codegen issues uncovered during execution.

And that is exactly why this kind of workflow is useful: it doesn’t just solve support problems, it helps expose product and platform gaps that teams can actually fix.


New Additions:

I prompted Kognitos to add this: "Review the error details and check them in Linear via existing connection - https://linear.app/kognitos/search , provide me with a table of errors and its corresponding Linear ticket number with Key-Ticketnumber. Search the linear ticket with V2, V2.1, V2,2, oncallv2 labels across MAN, OC, ENG, KOG teams", which added the following to the SOP.

This generated the following output for a specific user:



Tuesday, December 17, 2024

Mindful Software: Building Agentic Automations using GenAI

Currently, software development and automation are painful. The software or automation team has to complete almost 95% of the process, taking care of all corner cases or the tribal knowledge accumulated over the years. If the developer misses anything, it comes back as a bug, and only the software engineer or automation developer can fix it by including the corner case. In addition to fixing the code, this process has to go through the entire lengthy software development life cycle of change management, QA and deployment in the sandbox and production.

With Kognitos, our customers develop the basic logic for their processes using English syntaxes and improve their accuracy over time by adding learnings. The learnings could be a simple one-liner, new logic to address the corner case, or a new document type. Thus, we create a way to capture tribal knowledge methodically and keep the records forever.

Neuroplasticity: Kognitos's method is not new, but this is how our human brains are designed. Babies are born with fewer neural connections. Humans learn a lot from their surroundings, and other developed humans in the first few years. 



With Kognitos, when the system encounters a new condition/situation, the system prompts an exception and waits for the process owner to review the English exception. The business operations team provides guidance on addressing the new situation. Until the exception is handled, the process will not use any compute resources.

Kognitos supports multiple learnings for similar exceptions, and Gen AI guides the system to the best context based Learning for the current situation or document. Ref: https://caff-ai-nate.blogspot.com/2024/03/vector-databases.html 

Unlearning an obsolete condition:

It is as easy as deleting the Learning from the UI instead of having to rewrite the entire automation.

Example:

The Main Process was activated through an email titled "CarDealer---Customer-Service-PUBLISHED-to-review-an-email-7jxxxxx@sb.kognitos.com." As you can read, we ask ChatGPT to classify this email concisely and send an email accordingly.

get the email body as the email text

ask koncierge
  the openai model is "gpt-4o"
  the task is "Review {the email text} and {the email subject} and classify the email based on the following rules: For any email inquiring about when a new vehicle will be delivered, the output should be 'Vehicle Delivery Updates'. For any email about fuel for their car, the output should be 'EV Card Issues'. For any email with mileage questions, the output should be 'Mina'. For any email where the sender is stating that they have been in an accident or their vehicle has been damaged, the output should be 'Please refer abcd.sharepoint.com/dealing-in-an-accident'. For any email inquiring about service or repairs, the output should be 'Please refer abcd.sharepoint.com/how-to-service'. Be concise."
get the above as the output

split the email sender with
  the delimiter is "@"
get the above as the email values
get the first email value
get that as the username

send an email to the email sender where
  the subject is "RE: {the email subject}"
  the message is "Dear {the username},<br><br>Thank you for reaching out.<br><br>Based on the content of your message, I have determined that you should...<br><br><br>{the output}  <br><br><br>Thank you again for your inquiry. Please feel free to reach out with any further inquiries.<br><br>Cheers,<br>Kognitos<br><br><br>From: {the email sender}<br>Date: {the email date}<br>Subject: {the email subject}<br><br>{the email body}"

As we see, we forgot to add a condition to handle the case where the information is missing from the email.

Our excellent car salesman can answer the exception, and that answer is reused in similar situations. (Learning)

What is AgenticAI?
Agentic AI is a type of AI-driven automation that allows AI agents to perform complex tasks independently and to adapt to changing situations. It can analyse data, recognise patterns, and make decisions without human intervention.

This is a foray into an Agentic AI solution for your automation. As Kognitos Automation develops, all these exceptions can be used to learn as much as possible without the pitfalls of current GenAI (hallucinations and lack of predictable outcomes). These human interventions, i.e., exceptions, can be learnt. Thus, the Kognitos process created for automation can become more Agentic as the written process and LLMs evolve.

Our product has all the elements of Agentic AI except that we require minimal human intervention when encountering a new situation that needs to be considered while implementing the Kognitos process. As the Kognitos system learns these exceptions, it will eventually be trained to become Agentic. Our Kognitos system can generate new processes with minimal human input as the LLM models evolve.

Nonetheless, we are enhancing the process development lifecycle through the SDLC feature. Stay tuned for more updates.

Watch this demo to understand how our platform interacts with SAP - https://www.kognitos.com/resources/videos/extracting-information-from-sap-sales-order-with-kognitos/ 

Thursday, September 12, 2024

Kognitos: Your AI-Powered Automation System


Introduction:

Kognitos is a revolutionary business automation platform that harnesses the power of Generative AI (GenAI) to streamline your workflows. Our intuitive, English-based interface empowers you to create and manage automation without complex coding.

Analogy:

It is like a human learning a new skill, like how most of us learn to drive.

  1. Read the driver's manual (connect to Kognitos Books)

  2. Learn to drive the car with a learner's license (Playground testing)

  3. Pass the driver's test (move it to a process)

  4. Learn while driving on new roads and in new conditions - a new signal on the road (exception handling)

  5. Drive with less effort and zero accidents

Key Features:

  • Natural Language Automation: Describe your desired automation in plain English using our innovative FlexGrammar syntax.
  • Serverless Infrastructure: Leverage the efficiency and scalability of serverless architecture to reduce costs and complexity.
  • Third-Party Integrations: Seamlessly connect to your favorite tools and applications through our extensive library of integrations.
  • Patented Exception Handling: Kognitos learns from exceptions, adapting to new scenarios and avoiding costly downtime. 
  • Continuous Learning: Our platform improves its understanding of your unique processes over time, ensuring optimal performance with manual exception handling.

How It Works:

Architecture 



  1. Create Automation: Use our user-friendly interface to define your automation in plain English.
  2. Test and Refine: Experiment with your automation in the playground to ensure it meets your specific needs.
  3. Deploy and Scale: Promote your playground to a process. These processes/automation can be triggered via email or scheduled. 
  4. Continuous Improvement: Kognitos learns from exceptions and adapts to new scenarios based on input, ensuring your automation remains effective forever.

Benefits:

  • Increased Efficiency: Automate repetitive tasks, allowing your team to focus on higher-value work.
  • Reduced Errors: Minimize human error and ensure accuracy in your processes.
  • Faster Time-to-Value: Quickly implement automation without extensive technical coding expertise.
  • Scalability: Easily adapt your automation to changing business needs.
  • Cost Savings: Leverage the efficiency of serverless SaaS Platform

Example Automation:

  • Sample 1 (with our UI preview):
    • Extract a doc related to a vendor and translate it into different languages
  • Sample 2 (connect to an external app):

    Find the Name in the email body
    Get the above as the lead name
    Find the Title in the email body
    Get the above as the lead title
    Connect to Salesforce
    create a lead in Salesforce with
       the lead status is "New"
       the last name is the lead name
       the title is the lead title
       the lifecycle stage is "marketingqualifiedlead"



  • Sample 3 (with GPT prompts):

    process each file as follows
      get the file as a scanned document
      get the document's lines
      ask koncierge
        the task is "{the lines} \n-----\n You will be provided with a questionnaire. Find the following information in the document: telephone number, e-mail, nationality. Print the telephone number, e-mail, nationality. No explanation necessary."
        the openai model is "gpt-4o"
        the rules are "do not include any explanation", "do not include any description", "in the case that a value is not found, just print 'value not in the document' for it. Do not ask for further guidance", "make sure the output is ONLY a json list of rows"
        the response format is "table"
      create a table from the above answer

Conclusion:

Thanks for reading this blog. Kognitos is moving automation code to English, where domain experts in accounting/finance and HR can write and manage their day-to-day automation tasks with minimal help from IT/programming experts.

If you need more information about how Kognitos can help with your workflow/business automation, please visit http://www.kognitos.com.

Please read the Kognitos blogs from our CEO and other top-notch industry leaders: http://www.kognitos.com/blog.

Wednesday, March 27, 2024

Embedding AND Vector Databases - creating a long term memory




What Are Vector Databases? - The intelligent memory of GenAI


While traditional databases store data in rows and columns, a vector database stores data as math vectors. Each piece of data is represented as a point in high-dimensional space, with hundreds or thousands of dimensions. This allows very sophisticated relationships between data points to be captured.


Searching and analyzing vector databases relies on vector mathematics and similarity calculations. By comparing vector positions, highly relevant results can be returned, even if there are no exact keyword matches.
Vector databases index and store the vector embeddings/tokens for faster retrieval at interactive speeds and similarity search with capabilities like CRUD (create, read, update, and delete) operations, horizontal scaling, and serverless.

Why Are Vector Databases Important for AI?


Vector databases are ideal for managing and extracting insights from the enormous datasets required to train modern AI models. Key advantages include:

In the midst of the Gen AI revolution, efficient data processing is crucial not only for GenAI but also for efficient semantic search. GenAI and semantic search rely on vector embeddings/tokens. This vector data representation carries semantic information critical for the AI to gain understanding and maintain a long-term memory they can draw upon when executing complex tasks.

Embeddings/Tokens

LLMs generate embeddings with many attributes or features linked to each other to represent different dimensions essential to understanding patterns, relationships, etc., making their representation challenging to manage.

That is why we need a specialized database to handle this data type. Vector databases like Pinecone meet this by offering optimized storage and querying capabilities for embeddings. Vector databases have the capabilities of a traditional database that are absent in standalone vector indexes and the specialization of dealing with vector embeddings, which traditional scalar-based databases lack.

Embeddings (arrays of numbers) represent data: words and images transformed into numerical vectors that capture their essence. For example, the words puppy and dog will have similar embeddings, with vectors close to each other. These embeddings are stored in the vector DB.
Puppy = [0.3, 0.5, 0.9, 0.8, 0.4, ...]
Dog = [0.1, 0.51, 0.6, 0.2, 0.8, ...]
The numbers depend on the ML algorithm and model.

If you can convert a text, sentence, or image into a vector, you can compare vectors and find the closest matches using cosine similarity, semantic similarity, etc.
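Cosine similarity is a one-liner over the raw vectors. Here is a minimal sketch using the toy puppy/dog numbers above (these are illustrative values, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means 'semantically close'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

puppy = [0.3, 0.5, 0.9, 0.8, 0.4]  # toy embeddings, not real model output
dog = [0.1, 0.51, 0.6, 0.2, 0.8]
```

With these toy vectors, cosine_similarity(puppy, dog) comes out around 0.82, i.e. the two points lie in similar directions, which is exactly what "puppy and dog have similar embeddings" means numerically.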

OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are commonly used for:

  • Search (where results are ranked by relevance to a query string)
  • Clustering (where text strings are grouped by similarity)
  • Recommendations (where items with related text strings are recommended)
  • Anomaly detection (where outliers with little relatedness are identified)
  • Diversity measurement (where similarity distributions are analyzed)
  • Classification (where text strings are classified by their most similar label)

Embedding models: GloVe, OpenAI, Word2Vec
Vector DBs: Pinecone, Milvus, pgvector, Weaviate

Here is how to create an embedding for the text "food" via an OpenAI model:

curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "food",
    "model": "text-embedding-ada-002",
    "encoding_format": "float"
  }'


More details: (Credits)

1. https://platform.openai.com/docs/api-reference/embeddings

2. Good video course - explains the theory as well as setting up a vector DB

3. https://www.youtube.com/watch?v=ySus5ZS0b94

Right size the vector DB:

Setting up vector stores introduces new challenges. For example, correctly partitioning large data that cannot fit entirely in RAM in vector stores like Milvus is not easy.
- Under-partitioning can result in some queries taking up too much RAM and bringing the service down.
- RAG responsiveness depends significantly on reducing the number of probes required to find relevant documents, so avoid over-partitioning as well.

The Road Ahead

As GenAI moves into mainstream applications, vector databases' role will only grow. Their ability to organize and structure knowledge in a format tailored for AI aligns with the needs of next-gen generative models. 


Combining vector databases and transformers allows GenAI to understand language meaning rather than just keywords. This next-generation AI capability, powered by vector math, is what delivers natural, intelligent conversations.







Friday, March 15, 2024

Data for AI - Storage, ETL, Prepare, Clean and update the data


Taking your good data to AI



The most commonly used phrases

  • Garbage in, Garbage Out 
  • Bad input produces bad output 
  • Output can be only as good as input. 

Soon: Ethically Sourced, Organically Raised, Grass Fed Data at a Higher Price.

If we properly source and manage the data, LLMs will be trained on the correct data, causing fewer hallucinations. Unremembering or unlearning specific segments of an LLM will be one of the significant facets of GenAI in the future.

Teaching the kids wrong things is worse than not teaching them at all.

https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist


Why do we need to be careful about source data?

1. Incorrect Information: This could lead to AI providing answers that could be disruptive. We need to be careful when prescribing steps for a problem, as a wrong answer could lead to severe complications.
2. PII and Secure Data: Inadvertently sharing the secure private data of one client with another client. Data classification and desensitization using GenAI to preprocess the data to be utilized by AI is becoming a significant business proposition. There are quite a few startups in this space.
3. Feeding data driven by an agenda: IMHO, we all know about the Gemini fiasco, which provided results that were not truthful because the truth hurts or the truth is not politically correct.
4. Proprietary/Copyright Data: How do we monetize and attribute proprietary research data to the correct author and content creator to prevent plagiarism and reward the inventor? This would be another area for new startups.
5. Using publicly available data has its downsides as well.
"Generative AI copyright battles have been brewing for over a year, and many stakeholders, from authors, photographers and artists to lawyers, politicians, regulators and enterprise companies, want to know what data trained Sora and other models — and examine whether they really were publicly available, properly licensed, etc."

The legal side is a big part of this, but let us review the technical side.

Here are some thoughts on data - types of data, storing, accessing, cleaning, preparing and updating the data

1) Structured Data: Structured data fits neatly into data tables and includes discrete data types such as numbers, short text, and dates.
2) Unstructured Data: Unstructured data, such as audio and video files and large text documents, doesn't fit neatly into a data table because of its size or nature.
3) How to store the data - fast-storage vendors like VAST and Pure are seeing their stocks rise as demand for low-latency storage increases
4) Sourcing the data without latency - primary data accessed by business applications can't be used for observability via AI insights/analytics, because that would impact the performance of the production business applications. Backup data can't be used for analytics either, as it is generally a few days old and the answers would be stale. Databricks and Snowflake are pioneers in warehouse/data-lake/lakehouse technologies, with ETL pipelines using Apache Spark to manage both structured and unstructured data and the ability to run CPU-intensive queries on that data. This helps replicate the data almost immediately for LLM training and analytics purposes.
5) Preparing the data for AI - 
     a) Improve the data quality, 
     b) integrate multiple data sources - Data integration can help you access and analyze more data, enrich your data with additional attributes, and reduce data silos and inconsistencies. ETL with data sync can help. Databricks is helpful for this.
     c) Data labelling: To label your data, you can use tools and techniques such as data annotation, classification, segmentation, and verification.
     d) Data augmentation can help with data scarcity, reduce bias, and improve data generalization and robustness.
     e) Data Governance: Data governance involves defining and implementing policies, processes, roles, and metrics to manage your data throughout its lifecycle. It can help you ensure that your data quality, integration, labelling, augmentation, and privacy are aligned with your AI objectives, standards, and best practices. You can use frameworks and platforms such as data strategy, stewardship, catalogue, and lineage to establish your data governance. 

6) Desensitizing the data for AI: To protect your data privacy, you can use tools and techniques such as data encryption, anonymization, consent, and audit.
7) Data management with proper Authentication/Authorization (IAM): Store and isolate the data based on the users. Multitenancy should reduce cross-pollination of data without driving up cost; having one LLM per client would be an expensive proposition.

Secure-minded design to protect the data:
A tier structure of LLMs (general, domain-specific, and private) to protect the data, or RAG/grounding with hashed metadata embeddings in a VectorDB.
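One way to implement "hashed metadata embeddings" for tenant isolation is to hash the tenant identifier before it ever reaches the vector DB. This is a sketch assuming the store supports metadata filters (as Pinecone, Milvus, and Weaviate do); the salt name and key length are illustrative choices:

```python
import hashlib

SALT = "per-deployment-secret"  # illustrative; keep the real salt out of source control

def tenant_key(tenant_id):
    """Hash the tenant ID so raw client identifiers never appear in vector-DB metadata."""
    return hashlib.sha256((SALT + tenant_id).encode("utf-8")).hexdigest()[:16]

# Upsert: store {"tenant": tenant_key(t)} alongside each embedding.
# Query: filter on {"tenant": tenant_key(t)} so similarity search never crosses tenants.
```

The hash is stable, so the same tenant always filters to the same partition, while the raw client name never leaks into the index.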


