Splunk Search

Build table based on multiple results

j0k4b0
New Member

Hi,
I have an issue and have no idea how to solve.

There is a large log index. In this index are application logs amongst other things in the following format:
DATE [GroupID] [ProcessName] [ProcessId] [LogLevel]: Message

It's easy to build a table to check the logs formatted:

 \[*\]*\[(Error OR Warn OR Info)\] 
    | rex field=_raw " \[(?<CorrelationIdItem>.*)\]\ \[(?<ProcessModelNameItem>.*)\]\ \[(?<ProcessInstanceIdItem>.*)\]\ \[(?<Level>.*)\]: (?<Message>.*)"
    | dedup CorrelationIdItem ProcessInstanceIdItem ProcessModelNameItem Level Message 
    | table _time Level CorrelationIdItem ProcessInstanceIdItem ProcessModelNameItem Message 
    | sort -_time

Now I have a hard requirement:
I have to create a table with the Import file name, the numer of found records and have to count the "Finished"-Log entries.

Import file name:
It's a log entry like this: Starting import of file /Path/to/file.ending

Numer of Found Records
It's a log entry like this: 3 events read from file
(I will respect this requirement in step 2).

Finished-Events
Those are event logs. For each event there will be one output: "Finished Process instance in Flow Node"

All the information are grouped by "CorrelationIdItem".

Example Output:
Filename | Should count | Actually counted
file1.txt | 5 | 5
file2.txt | 10 | 8 -> so here I know there is something wrong
file3.txt | 12 | 12

I tried this to get the filename and the FinishedCount:

\[PG*\]*\[(Error OR Warn OR Info)\] Starting import of file
| rex field=_raw "\[(?<CorrelationIdItem>.*)\]\ \[(?<ProcessModelNameItem>.*)\]\ \[(?<ProcessInstanceIdItem>.*)\]\ \[(?<Level>.*)\]:\ Starting\ import\ of\ file\ (?<File>.*)"
| fields CorrelationIdItem ProcessInstanceIdItem ProcessModelNameItem Level File
[ search _raw="Finished Process instance in Flow Node" | stats count as countedFinished by CorrelationIdItem]
| table CorrelationIdItem File CountedFinished

I got all importing filenames correctly. But in the next step I need to count all Finished-Events for each file based on the CorrelationIdItem.

In SQL I would do something like this:

SELECT 
    filename, count(table2.id)
from table1

join table1 
   on table1.CorrelationIdItem=table2.CorrelationIdItem 
   where message like 'Finished Process instance in Flow Node%';

Any Idea how to solve in Splunk? What am I doing wrong?

0 Karma

nickhills
Ultra Champion

You might want to look at transaction for this.

If your logs are consistant in thier format, you could try something like
...|transaction CorrelationIdItem
This will group each "batch" into single events. Your existing regex can the be applied before the transaction events to extract multivalue values.

This will help with your counting requirement as transaction adds a new field eventcount which contains the number of events in the group.

This moves the problem to corrolating each batch with the filename and the exptected number of results, but with the benefit of knwing how you data actually looks, you may be able to use your existing approach.

If my comment helps, please give it a thumbs up!
0 Karma
Get Updates on the Splunk Community!

Unlock Database Monitoring with Splunk Observability Cloud

  In today’s fast-paced digital landscape, even minor database slowdowns can disrupt user experiences and ...

Purpose in Action: How Splunk Is Helping Power an Inclusive Future for All

At Cisco, purpose isn’t a tagline—it’s a commitment. Cisco’s FY25 Purpose Report outlines how the company is ...

[Upcoming Webinar] Demo Day: Transforming IT Operations with Splunk

Join us for a live Demo Day at the Cisco Store on January 21st 10:00am - 11:00am PST In the fast-paced world ...