Splunk Search

Can you help me create a search that efficiently filters consecutive events?

New Member

Some of my logs are generated by automatic jobs and I want to filter them out. What is the best way to filter out a known sequence of consecutive events after sorting?

For example, these are my events:

sessionID | logNo | logText
sess1 | 1 | abc
sess1 | 2 | def
sess1 | 3 | ghi
sess2 | 1 | abc
sess2 | 2 | def
sess2 | 3 | ghi
sess3 | 1 | keep
sess3 | 2 | this
sess4 | 1 | abc
sess4 | 2 | def
sess4 | 3 | ghi
sess4 | 4 | something else

base search | sort sessionID, logNo | ...

The end result is that I only want to retain sess3 and sess4 events, while filtering away sess1 and sess2.
To explain, I have identified a set of logs that are generated automatically having the same sessionID.
This is what I intend to filter away from showing up.

logNo | logText
1 | abc
2 | def
3 | ghi

However, I do not want to filter out sess4, because it contains an additional logNo 4, even though its logNo 1 to 3 match the sequence I want to filter out.

I have an idea to parse the events into something like:

sessionID | combined
sess1 | 1 abc 2 def 3 ghi
sess2 | 1 abc 2 def 3 ghi
sess3 | 1 keep 2 this
sess4 | 1 abc 2 def 3 ghi 4 something else

Then use where combined!="1 abc 2 def 3 ghi" to filter out the automatically generated logs in their entirety.

The solution has to scale to sequences of 3 to n consecutive events, and the combined logText for a sequence can be long, more than 1000 characters.

If this is a good approach, how can I go about doing it? If not, are there any better and computationally efficient ways to achieve this?

Thanks in advance!

0 Karma


I'm not sure it's clear what your end goal is. About 75% of your objective seems really easy, with something like

| inputlookup answers.csv
| eval unique_strings = logNo . "-" . logText
| stats values(unique_strings) AS uniques BY sessionID
| eval uniques = mvjoin(uniques, ",")
| dedup uniques

That gives output

sessionID       uniques 
sess1   1-abc,2-def,3-ghi
sess3   1-keep,2-this
sess4   1-abc,2-def,3-ghi,4-and_this 

(I substituted a CSV lookup of your data, comma-separated instead of pipe-separated, for wherever your data actually comes from.)

Anyway, that's easy enough, but ... I'm just not quite sure where you are going. Sometimes knowing a why can help us get you there or find another solution to the problem.

(And I think we can get to what might be your final solution - I'm just not quite sure I understand the why of it yet)

But, maybe that alone helps?


0 Karma

New Member

Thanks. I am looking to filter out a sequence of logs that is known in advance. For example, I know that because of some automation scripts, there will always be a set of logs that looks something like:

some session ID X | 1 | abc
some session ID X | 2 | def
some session ID X | 3 | ghi

I want to filter this sequence of logs away as they are not useful and clog up the report.
With the pipes, I was trying to show that these are different fields/columns.

0 Karma


Are these records time oriented? Using streamstats may also be an option (as part of the solution only).
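
With the goal clarified (drop only those sessions whose entire sequence matches the known pattern), the combined-string idea from the question could be sketched roughly as below. This is a minimal sketch, not a tested solution: "base search" is the placeholder from the question, and pair, pairs, and combined are field names made up for illustration. It uses eventstats rather than stats so the individual events are kept, and list() rather than values() because list() keeps values in the order the events arrive, whereas values() de-duplicates and sorts them lexicographically (so a logNo of 10 would land before 2).

```spl
base search
| sort 0 sessionID, num(logNo)
| eval pair = logNo . " " . logText
| eventstats list(pair) AS pairs BY sessionID
| eval combined = mvjoin(pairs, " ")
| where combined != "1 abc 2 def 3 ghi"
| fields - pair, pairs, combined
```

On the sample data this should drop sess1 and sess2 and keep every event of sess3 and sess4, since sess4's combined string also contains "4 something else". Two caveats: sort 0 lifts the default result limit on sort, which can be expensive on large result sets, and list() is capped by default at 100 values per group, so very long sessions may need a different approach.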

0 Karma