Splunk Search

Exclude certain values in a field

aramakrishnan
New Member

I have two separate searches, both of which involve serial numbers that make a batch. I'm trying to compute the time taken for batch 1 (search 1=96 serial numbers) and batch 2 (search 2=100 serial numbers).

The issue is that I want to only include the serial numbers that are common to both searches, and from exporting raw data from both these searches on excel, it turns out that there are only 92 common serial numbers between the two searches.

(The search I'm doing for the batches of serial numbers is simply something like "NTEST$$$$B" and "NTEST$$$$A" where $$$$=any combination of numbers, and serial number is a parsable field on Splunk. For instance, for every NTEST1234A there exists an NTEST1234B. And this relationship is true only for 92 of the serial numbers - I verified this on Excel).

Ideally, this is what I'd like to do in raw terms..

"NTEST****A" NOT (....list of extra serial numbers that I want to exclude)| stats max(_time) as maxtime min(_time) as mintime | eval difference=maxtime-mintime

"NTEST****B" NOT (....list of extra serial numbers that I want to exclude)| stats max(_time) as maxtime min(_time) as mintime | eval difference=maxtime-mintime

I assume that the later part of my query will give me the time difference between the earliest event and the latest event for that search. The problem lies with the first part where I know I cannot exclude serial numbers by just typing NOT NTEST####A NTEST&&&&B and so on.

I'm OK with executing them as two separate queries, but I want to make sure I'm only using those 92 common devices. Thanks for any help with this!

Tags (2)
0 Karma

somesoni2
Revered Legend

Try this (assuming this SerialNumber is a field that is extracted in with values like "NTEST$$$$B" and "NTEST$$$$A". If not than first step will be extract those)

"NTEST*A" OR "NTESET*B" | table _time SerialNumber | rex field=SerialNumber "(?<SerialNumberOnly>.*)(?<Type>\w)" | eventstats values(Type) as Types by SerialNumberOnly | where mvcount(Types)=2 | fields - SerialNumberOnly Types | stats min(_time) as min max(_time) as max by Type
0 Karma

aramakrishnan
New Member

Hi @somesoni2, thanks for your prompt response! I have a couple questions.
(1) rex field=SerialNumber " "(? < SerialNumberOnly> . * ) (? < Type >\w)"" ---> what is SerialNumberOnly and Type? for instance, if my serial number is NTEST1234A, would I use "(? < N T E S T 1 2 3 4> . * ) (? < A > \ w)" ?
And if I have multiple such devices, how will I account for that?

To give you an idea of my data set, here's a smaller sample:
TYPE A TYPE B
NTEST1234A NTEST1234B
NTEST2345A NTEST2345B
NTEST3456A NTEST3456B
NTEST9999A NTEST8888B
NTEST1111A NTEST2222B

I basically only want to keep the first three pairs in both types and exclude serial numbers NTEST9999A, NTEST8888B, NTEST1111A, and NTEST2222B (....in the real data set there are about 8 such unique values)

(2) Also, should "Type" be a field? Coz it's not a parsable field on the logs I have.

Thanks so much for clarifying!

0 Karma

somesoni2
Revered Legend

I am assuming that you have your raw data like this

_time, SerialNumber, ...other fields if any
Time1,NTEST1234A,....
Time1,NTEST1234A,.... 
Time1,NTEST1234B,....
Time1,NTEST2345A,....
Time1,NTEST2345B,....
Time1,NTEST3456A,....
Time1,NTEST3456B,....
Time1,NTEST9999A,....
Time1,NTEST8888B,....
Time1,NTEST1111A,....
Time1,NTEST2222B,....

So basically you have fields called _time and SerialNumber. If you have different name for this field, use that.. From SerialNumber field I am extracting fields SerialNumber which will be NTEST$$$$ and Type which will be A OR B.

0 Karma
.conf21 Now Fully Virtual!
Register for FREE Today!

We've made .conf21 totally virtual and totally FREE! Our completely online experience will run from 10/19 through 10/20 with some additional events, too!