Splunk Search

How to filter data quickly

yuwtennis
Communicator

Hi!

I would like to get advice for search command.

I have a search that looks like below.

index=A | MySearch1 ....
| join type=outer fieldA [ search index=B | MySearch2 .... ]
| join type=outer fieldA [ search index=A | MySearch3 .... ]
| join type=outer fieldA [ search index=C | indexMySearch4 .... ]

I also have a list of fields which I want to search for. The list includes about 20,000 values
Fields
A
B
C
D
E
...

I want to use this list as a input and filter the result with just above values.
I understand that my choices are 1.Use Map, 2.Use inner join ,3. Use subsearch

But none of seems to work out...

I appreciate if someone can share experience of how to filter with thousands of values.

Thanks,
Yu

Tags (2)
0 Karma

yuwtennis
Communicator

Hello linu1988.

Thank you for the reply.

Since calling thousands of boolean expression causes the search performance extremely slow, I have decided to use the lookup by
calling in every join clause.

Thanks for the advice!

0 Karma

somesoni2
Revered Legend

You should be able to filter event from each index (for specific IDs) as follows:

index=A [|inputlookup YourIDLookup.csv | table fieldA | format]| MySearch1 .... 
| join type=outer fieldA [ search index=B [|inputlookup YourIDLookup.csv | table fieldA | format] | MySearch2 .... ] 
| join type=outer fieldA [ search index=A [|inputlookup YourIDLookup.csv | table fieldA | format]| MySearch3 .... ] 
| join type=outer fieldA [ search index=C [|inputlookup YourIDLookup.csv | table fieldA | format]| indexMySearch4 .... ]

This way each dataset to be joined will already be filtered for your specific IDs. However, there could a better solution (by avoiding joins) if you could provide the details about operations you're doing in each subsearch.

0 Karma

linu1988
Champion

same, easiest way to filter the required value is to tell which are the one not required.

e.g.

(index=a AND fieldA!="xyz") OR (...) OR (...)

OR

using lookup
(index=a AND ([|inputlookup lookup_file.csv|table fieldA]) OR ...

While using subsearch keep in mind there will be truncation if you have more than the defined limit (limits.conf)

0 Karma

yuwtennis
Communicator

Hi linu1988.

Thank you for the reply.

The point of my question was not summarize with certain field but filter using certain values in the fields.

So, for example, I have a lookup table like,

ID
1
2
3
4
5
...
10000

and I want to fetch the records that includes 1 - 10000 from index A ,B, C then summarize. What makes worse is that there are already millions of events in the index.

Reason why I was using so many join was all calculation inside the join has different perspectives so it is difficult to calculate in one stat.

Any ideas of how to filter above ID fastly?

Thanks,

0 Karma

MuS
Legend

Hi yuwtennis,

you should use join as last resort since there some down sides to this command. Mostly you can use stats, eventstats or streamstats. Since you're joining all sub searches on field fieldA I assume this field is available in all events and therefore you can use a search like @linu1988 wrote:

index=A OR index=B OR index=C | stats count by fieldA 

or

index=A OR index=B OR index=C | streamstats values(fieldA) AS fieldA | ....

cheers, MuS

0 Karma

yuwtennis
Communicator

Hi MuS.

Thank you for the reply.

I think I would need to clarify my question.
What I was intended was I wanted to know how to fetch records including certain values (please see below)

ID
1
2
3
4
....
10000

I have a list of above data as lookup and just want to fetch the records including above ID.

Any idea how to implement this without becoming the search slow?

0 Karma

linu1988
Champion

Hi Yu we could use a little specific information. The fields you want to use in the join sub searches can be filters as per your desired value. If all of them are having common fields then you can do a search like

index=a OR index=b ...|stats (fuction) by A

0 Karma
Get Updates on the Splunk Community!

Now Available: Cisco Talos Threat Intelligence Integrations for Splunk Security Cloud ...

At .conf24, we shared that we were in the process of integrating Cisco Talos threat intelligence into Splunk ...

Preparing your Splunk Environment for OpenSSL3

The Splunk platform will transition to OpenSSL version 3 in a future release. Actions are required to prepare ...

Easily Improve Agent Saturation with the Splunk Add-on for OpenTelemetry Collector

Agent Saturation What and Whys In application performance monitoring, saturation is defined as the total load ...