Splunk Search

How to generate a search that will sort an error according to its frequency?

dariux
New Member

Hi All,

I have an unknown number of devices generating a buffer error alarm any 125 seconds.
To find the error for a single IP, I simply use a query like this:

(10.299.299.27 "Receive Buffer Error Detected")

To find all errors of this kind in the network, I use this query:

( index = dvn2 "Receive Buffer Error Detected") | dedup _raw

I would like to know if there is a query to find all the IPs in the network that are generating this error ONLY any 125 seconds?

Thanks everybody in advance

1 Solution

gokadroid
Motivator

I'm assuming the events are indexed at the time they occurred, so I'm using _time as the reference for calculating the difference in seconds. Otherwise, the timestamp needs to be extracted first and the difference calculated from that. I also divided the difference by 60, assuming that would yield seconds (see the edit below: _time is already in epoch seconds, so the division is not needed).

index = dvn2 "Receive Buffer Error Detected" 
| rex "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" 
| reverse
| sort ip
| autoregress _time as newTime p=1
| autoregress ip as newIp p=1
| eval timeDiff=if(ip=newIp,(_time - newTime)/60, 0) 
| table ip,  _time , timeDiff, newIp
| where timeDiff=125

Edit, as per the comments:
Replace the last three lines of the query above with one of the following, as needed:

| eval timeDiff=if(ip=newIp,floor(_time - newTime), 0) 
| table ip,  _time , timeDiff, newIp
| where timeDiff=125

OR

| eval timeDiff=if(ip=newIp,(strptime(_time, "%Y-%m-%d %H:%M:%S") - strptime(newTime, "%Y-%m-%d %H:%M:%S")), 0) 
| table ip,  _time , timeDiff, newIp
| where timeDiff=125
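
As an alternative sketch (not part of the answer above), the per-IP gap can also be computed with streamstats, which avoids the reverse/autoregress pair. This assumes _time holds epoch seconds and that the ip field is extracted as in the query above. prevTime and hits are illustrative field names, not taken from the original query.

```spl
index=dvn2 "Receive Buffer Error Detected"
| rex "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| sort 0 _time
| streamstats current=f last(_time) as prevTime by ip
| eval timeDiff=floor(_time - prevTime)
| where timeDiff=125
| stats count as hits by ip
```

Here sort 0 _time puts the events in chronological order without the default result limit, and streamstats with current=f gives each event the timestamp of the previous event from the same ip. The final stats returns one row per IP with the number of 125-second gaps found.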

niekita
Engager

Thank you very much, gokadroid, very helpful. Now the problem: I can filter out similar results arriving within a certain interval (say, 10 to 12 seconds apart), but I am losing the last result of each sequence. If an event is logged 5 times, I get only 4 results; the 5th gets lost. Is there an easy way to solve that?
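
One possible explanation, assuming the accepted query is used unchanged: N events from one IP have only N-1 gaps between them, and the event with no predecessor gets timeDiff=0, so it fails the filter. A minimal sketch that keeps both ends of each matching gap compares every event with its neighbour on both sides. It is shown here on the thread's original search. prevTime, nextTime, sincePrev and untilNext are illustrative names, and the 10 to 12 second range is taken from the question above.

```spl
index=dvn2 "Receive Buffer Error Detected"
| rex "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| sort 0 _time
| streamstats current=f last(_time) as prevTime by ip
| sort 0 -_time
| streamstats current=f last(_time) as nextTime by ip
| eval sincePrev=_time - prevTime, untilNext=nextTime - _time
| fillnull value=-1 sincePrev untilNext
| where (sincePrev>=10 AND sincePrev<=12) OR (untilNext>=10 AND untilNext<=12)
| sort 0 ip _time
| table ip, _time, sincePrev, untilNext
```

The fillnull step replaces the missing gap on the first and last event of each IP with -1, so the where clause always compares numbers. An event is kept if the gap on either side falls in the range, which retains all 5 events of a 5-event sequence.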


dariux
New Member

Thanks for your fast reply Gokadroid.

Running your query I got this error:

Error in 'rex' command: Encountered the following error while compiling the regex '


gokadroid
Motivator

Try now. I have updated the regex; a couple of brackets were misplaced.

Also try running the query without the final | where timeDiff=125, so that you can work out the appropriate divisor in the eval for timeDiff. I divided by 60 assuming the epoch time would be a number of seconds for you. If it's bigger, we might need to divide additionally by 1000 etc., depending on which unit (milliseconds, nanoseconds, etc.) the result shows up in.

dariux
New Member

Hi Gokadroid,

Thanks once again for your help.
Running the query without the last 'where timeDiff=125', I got something like this (just part of the results):

#   ip         _time                timeDiff  newIp
1   10.60.0.8  2016-11-03 00:03:28  0
2   10.60.0.8  2016-11-03 00:03:28  0         10.60.0.8
3   10.60.0.8  2016-11-03 00:03:29  0.016667  10.60.0.8
4   10.60.0.8  2016-11-03 00:03:29  0         10.60.0.8
5   10.60.0.8  2016-11-03 00:03:35  0.100000  10.60.0.8
6   10.60.0.8  2016-11-03 00:03:35  0         10.60.0.8
7   10.60.0.8  2016-11-03 00:03:35  0         10.60.0.8
8   10.60.0.8  2016-11-03 00:03:35  0         10.60.0.8
9   10.60.0.8  2016-11-03 00:10:46  7.183333  10.60.0.8
10  10.60.0.8  2016-11-03 00:10:46  0         10.60.0.8
11  10.60.0.8  2016-11-03 00:11:13  0.450000  10.60.0.8
12  10.60.0.8  2016-11-03 00:11:13  0         10.60.0.8

Should I divide by 6000000?

The eval timeDiff result seems to be "too accurate".
I don't want milliseconds in my search, only seconds. Is that possible?

Thanks


gokadroid
Motivator

Looking at these lines of your output, I believe the time being returned is already in seconds and we do not need to divide anything (that is, if you divided by 60 in | eval timeDiff=if(ip=newIp,(_time - newTime)/60, 0)):

1   10.60.0.8  2016-11-03 00:03:28  0
2   10.60.0.8  2016-11-03 00:03:28  0         10.60.0.8
3   10.60.0.8  2016-11-03 00:03:29  0.016667  10.60.0.8
4   10.60.0.8  2016-11-03 00:03:29  0         10.60.0.8
5   10.60.0.8  2016-11-03 00:03:35  0.100000  10.60.0.8

I would suggest changing the following line in the query:
| eval timeDiff=if(ip=newIp,(_time - newTime)/60, 0)

Replace it with this one if you want to use floor:
| eval timeDiff=if(ip=newIp, floor(_time - newTime), 0)

or with this one if you want to use strptime to make it seconds-friendly:
| eval timeDiff=if(ip=newIp, (strptime(_time, "%Y-%m-%d %H:%M:%S") - strptime(newTime, "%Y-%m-%d %H:%M:%S")), 0)

and then complete the search with:
| table ip, _time , timeDiff, newIp
| where timeDiff=125
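
If some gaps turn out to be 124 or 126 seconds (an assumption; the thread does not show this), an exact match on 125 would miss them. A hedged variant of the closing lines allows a small tolerance and collapses the matches into one row per IP. The one-second tolerance is an illustrative choice, not something taken from the thread.

```spl
| eval timeDiff=if(ip=newIp, floor(_time - newTime), 0)
| where abs(timeDiff - 125) <= 1
| stats count by ip
```

The stats count by ip step replaces the table command, so the result is the list of IPs the original question asks for, each with the number of matching gaps.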


dariux
New Member

Thanks a lot! It works.


sundareshr
Legend

Try this to get the list of IPs generating the error. I'm not sure I understand what you mean by "any 125 seconds".

index = dvn2 "Receive Buffer Error Detected" | rex "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" | stats count by ip

This assumes the IP address occurs ONLY ONCE in your event. If it can occur more than once, please share some sample data.


dariux
New Member

Hi Sundareshr,

Thanks a lot for your reply.

I have about 150 devices in my network, and almost all of them generate buffer error logs during the day.

Only some of these devices generate this buffer error log at a regular interval of 125 seconds.

e.g.

ERR1 (10/23/16 10:34:08.000 PM)
<11>2016-10-24T09:34:08+11:00 10.245.241.21  tM8kCycle:  VRX-4Sch(2022) Slot=22 Receive Buffer Error Detected S3 L2

ERR2 (10/23/16 10:35:33.000 PM)
<11>2016-10-24T09:35:33+11:00 10.245.241.21  tM8kCycle:  VRX-4Sch(2022) Slot=22 Receive Buffer Error Detected S3 L2

I would like to find all the devices that generate this log at the mentioned interval of 125 seconds.

With this query I see all the buffer error logs in the network (all the IPs): index = dvn2 "Receive Buffer Error Detected"

I need to add another filter for the 125-second interval.

Is that possible?

Hope I've described it better this time 🙂

Thanks once again.
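
Based on the two sample events above, the device address appears right after the syslog priority and timestamp. A sketch of a rex anchored to that position, assuming every event follows the same layout, avoids picking up any other dotted number that might appear later in the message:

```spl
index=dvn2 "Receive Buffer Error Detected"
| rex "^<\d+>\S+\s+(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s"
| stats count by ip
```

The pattern skips the leading priority tag and the timestamp, then captures the first dotted quad that follows as ip.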
