Splunk Search

How to filter data quickly

Communicator

Hi!

I would like to get advice for search command.

I have a search that looks like below.

index=A | MySearch1 ....
| join type=outer fieldA [ search index=B | MySearch2 .... ]
| join type=outer fieldA [ search index=A | MySearch3 .... ]
| join type=outer fieldA [ search index=C | indexMySearch4 .... ]

I also have a list of fields which I want to search for. The list includes about 20,000 values
Fields
A
B
C
D
E
...

I want to use this list as a input and filter the result with just above values.
I understand that my choices are 1.Use Map, 2.Use inner join ,3. Use subsearch

But none of seems to work out...

I appreciate if someone can share experience of how to filter with thousands of values.

Thanks,
Yu

Tags (2)
0 Karma

Communicator

Hello linu1988.

Thank you for the reply.

Since calling thousands of boolean expression causes the search performance extremely slow, I have decided to use the lookup by
calling in every join clause.

Thanks for the advice!

0 Karma

SplunkTrust
SplunkTrust

You should be able to filter event from each index (for specific IDs) as follows:

index=A [|inputlookup YourIDLookup.csv | table fieldA | format]| MySearch1 .... 
| join type=outer fieldA [ search index=B [|inputlookup YourIDLookup.csv | table fieldA | format] | MySearch2 .... ] 
| join type=outer fieldA [ search index=A [|inputlookup YourIDLookup.csv | table fieldA | format]| MySearch3 .... ] 
| join type=outer fieldA [ search index=C [|inputlookup YourIDLookup.csv | table fieldA | format]| indexMySearch4 .... ]

This way each dataset to be joined will already be filtered for your specific IDs. However, there could a better solution (by avoiding joins) if you could provide the details about operations you're doing in each subsearch.

0 Karma

Champion

same, easiest way to filter the required value is to tell which are the one not required.

e.g.

(index=a AND fieldA!="xyz") OR (...) OR (...)

OR

using lookup
(index=a AND ([|inputlookup lookup_file.csv|table fieldA]) OR ...

While using subsearch keep in mind there will be truncation if you have more than the defined limit (limits.conf)

0 Karma

Communicator

Hi linu1988.

Thank you for the reply.

The point of my question was not summarize with certain field but filter using certain values in the fields.

So, for example, I have a lookup table like,

ID
1
2
3
4
5
...
10000

and I want to fetch the records that includes 1 - 10000 from index A ,B, C then summarize. What makes worse is that there are already millions of events in the index.

Reason why I was using so many join was all calculation inside the join has different perspectives so it is difficult to calculate in one stat.

Any ideas of how to filter above ID fastly?

Thanks,

0 Karma

SplunkTrust
SplunkTrust

Hi yuwtennis,

you should use join as last resort since there some down sides to this command. Mostly you can use stats, eventstats or streamstats. Since you're joining all sub searches on field fieldA I assume this field is available in all events and therefore you can use a search like @linu1988 wrote:

index=A OR index=B OR index=C | stats count by fieldA 

or

index=A OR index=B OR index=C | streamstats values(fieldA) AS fieldA | ....

cheers, MuS

0 Karma

Communicator

Hi MuS.

Thank you for the reply.

I think I would need to clarify my question.
What I was intended was I wanted to know how to fetch records including certain values (please see below)

ID
1
2
3
4
....
10000

I have a list of above data as lookup and just want to fetch the records including above ID.

Any idea how to implement this without becoming the search slow?

0 Karma

Champion

Hi Yu we could use a little specific information. The fields you want to use in the join sub searches can be filters as per your desired value. If all of them are having common fields then you can do a search like

index=a OR index=b ...|stats (fuction) by A

0 Karma