How to group and count similar field values

martineisenkoel · ‎11-20-2019

Hi,

Im looking for a way to group and count similar msg strings.
I have the following set of data in an transaction combinded event:

Servicename, msg
SVCA, hostnamexyz: AIX abc- asdf PARTIAL
SVCB, hostnamezyx: AIX abc- asdf PARTIAL
SVCA, hostnamezyx: AIX abc- asdf PARTIAL
SVCB, serice response error 3 of 3
SVCC, service response error of 3

What I would like to achive is a statistic like that:
hostname*: AIX abc- asdf PARTIAL - SVCA - 2
hostname*: AIX abc- asdf PARTIAL - SVCB - 1
service response error of 3 - SVCB -1
service response error of 3 - SVC -1

The values of the msg field arent known and cannot be predicted.

Is there any command/addon/performant way in SPL to do such a statistic based on some citeria like "at least 3 words in a field matching"?

Many thanks in advance!

martineisenkoel · ‎11-20-2019

thanks a lot for your tips!
Unfortunately I didnt phrase my question correctly.
The problem is that I dont know whats in the msg field. The lines above are just anonymised examples.
There are more than 500 different messages coming from various autonoumus monitoring systems where each individual admin could change a message any time.

Our main goal is to identify similar messages/events which are affecting more than one service.
For example similarity would mean to us at least 3 words are matching or 1 word matching and number of words are equal.

to4kawa · ‎11-20-2019

| makeresults
| eval _raw="Servicename, msg
SVCA, hostnamexyz: AIX abc- asdf PARTIAL
SVCB, hostnamezyx: AIX abc- asdf PARTIAL
SVCA, hostnamezyx: AIX abc- asdf PARTIAL
SVCB, serice response error 3 of 3
SVCC, service response error of 3"
| multikv forceheader=1
| table Servicename, msg
| rex field=msg "(?<key>response error|hostname)"
| stats count values(msg) as msg by key , Servicename

Hi, The key is a match for a specific word, and it is tabulated.
How about it?

KailA · ‎11-20-2019

Hello,

You will need to extract the relevant information you need in the msg field.
For example here

| makeresults 
| eval Servicename = "SVCA",msg = "hostnamexyz: AIX abc- asdf PARTIAL" 
| append 
    [| makeresults 
    | eval Servicename = "SVCB",msg = "hostnamezyx: AIX abc- asdf PARTIAL"] 
| append 
    [| makeresults 
    | eval Servicename = "SVCA",msg = "hostnamezyx: AIX abc- asdf PARTIAL"] 
| append 
    [| makeresults 
    | eval Servicename = "SVCB",msg = "service response error of 3"] 
| append 
    [| makeresults 
    | eval Servicename = "SVCC",msg = "service response error of 3"]
| table Servicename,msg
| rex field=msg "(?<newField>AIX.*PARTIAL)"
| eval newField = coalesce(newField,msg)
| stats count BY newField,Servicename

See this working example with your sample of data.
Let me know if it helps you 🙂

How to group and count similar field values

Data Management Digest – December 2025

Index This | What is broken 80% of the time by February?

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Join the Conversation

How to group and count similar field values

Data Management Digest – December 2025

Index This | What is broken 80% of the time by February?

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...