Splunk Search

find 5 "Errors" peak points by server

indeed_2000
Motivator

Hi,

I need to find the 5 "Errors" peak points per server, sorted by date.

Here is my SPL:

index="myindex" err*
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<servername>\w+)"
| eventstats count as Errors by servername

Expected output:

servername   Time                      Peak points (Errors count)
server1      2021-11-19 02:00:00,000   500
             2021-11-19 10:00:00,000   450
             2021-11-19 18:00:00,000   300
             2021-11-19 20:00:00,000   800
             2021-11-19 23:00:00,000   9000

server2      2021-11-19 01:00:00,000   250
             2021-11-19 03:00:00,000   480
             2021-11-19 08:00:00,000   30000
             2021-11-19 09:00:00,000   463
             2021-11-19 10:00:00,000   100

PickleRick
SplunkTrust

I still don't understand what you want.

You parse the "product" field from the event yet don't use it.

What is this error peak? Is it a count of error occurrences per host? Or is it some value from the event? How does it correspond to the timestamp?

indeed_2000
Motivator

Here is the main goal:

I have a couple of servers. When I run the search below, it returns a green bar chart showing how many errors occurred in each hour of the last 24 hours:

index="myindex" err* earliest=-1d@d

In some hours of the day we have a high number of errors (e.g. 01:00, 18:00, 20:00).

I just need to show them in a table by server: the top 5 error peaks, that's it.

PickleRick
SplunkTrust

OK. So for each server you want the top 5 hours with the highest error count? Is that it?

indeed_2000
Motivator

Exactly.

PickleRick
SplunkTrust

<your search for errors>
| bin _time span=1h
| stats count by server _time
| sort 0 server -count
| streamstats count as tempcount by server
| where tempcount <= 5
| table server _time count

If you want to group them by server, you can replace the table at the end with:

| stats list(_time) as _time list(count) as count by server
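
For reference, here is a sketch of how those steps might fit together with the index and rex from the original question. It assumes the extracted field is named server (the question used servername), that "yesterday" means earliest=-1d@d latest=@d, and that the final sort by _time is what "sort by date" was asking for. None of these assumptions were confirmed in the thread.

```spl
index="myindex" err* earliest=-1d@d latest=@d
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<server>\w+)"
| bin _time span=1h
| stats count by server _time
| sort 0 server -count
| streamstats count as rank by server
| where rank <= 5
| sort 0 server _time
| table server _time count
```

The first sort ranks the hours by error count within each server; the second sort only reorders the surviving top 5 rows chronologically.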

 

ITWhisperer
SplunkTrust

It isn't clear over what time period you are measuring peaks, but since your example has different hours and the times are close to the end of the hour, I am going to assume you are counting error events by hour. The first block below only generates sample data; the rest does the work.

| gentimes start=-1 increment=1m 
| rename starttime as _time 
| eval server="server".mvindex(split("ABCD",""),random()%4)
| eval count=random()%10
| table _time server count



| bin _time as time span=1h
| eventstats sum(count) as total latest(_time) as peaktime by time server
| where _time=peaktime
| sort 0 server -total
| streamstats count as rank by server
| where rank < 6


| fieldformat peaktime=strftime(peaktime,"%F %T")
| fieldformat time=strftime(time,"%F %T")

indeed_2000
Motivator

The time scope is one day (yesterday), and you are right: I am counting error events by hour.

I tried the SPL you suggested, but I need to produce this:

servername   Time                      Peak points (Errors count)
server1      2021-11-19 02:00:00,000   500
             2021-11-19 10:00:00,000   450
             2021-11-19 18:00:00,000   300
             2021-11-19 20:00:00,000   800
             2021-11-19 23:00:00,000   9000

server2      2021-11-19 01:00:00,000   250
             2021-11-19 03:00:00,000   480
             2021-11-19 08:00:00,000   30000
             2021-11-19 09:00:00,000   463
             2021-11-19 10:00:00,000   100

ITWhisperer
SplunkTrust

So what does your search look like when you apply the ideas from my solution, what results do you get, and how do they differ from what you are after?

indeed_2000
Motivator

index="myindex" err* source="/data/product/*/server*/*"
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<server>\w+)"

| eventstats count as counter
| table _time server counter

| bin _time as time span=1h
| eventstats sum(counter) as total latest(_time) as peaktime by time server
| where _time=peaktime
| sort 0 server -total
| streamstats count as rank by server
| where rank < 6

| fieldformat peaktime=strftime(peaktime,"%F %T")
| fieldformat time=strftime(time,"%F %T")



_time                    server   counter  peaktime             rank  time                 total
2021-11-19 23:29:34.658  Server   9827     2021-11-19 23:29:34  1     2021-11-19 22:30:00  2191421
2021-11-19 20:28:27.490  Server   9827     2021-11-19 20:28:27  2     2021-11-19 19:30:00  2053843
2021-11-19 20:28:27.490  Server   9827     2021-11-19 20:28:27  3     2021-11-19 19:30:00  2053843
2021-11-19 04:29:52.897  Server   9827     2021-11-19 04:29:52  4     2021-11-19 03:30:00  2014535
2021-11-19 21:29:38.376  Server   9827     2021-11-19 21:29:38  5     2021-11-19 20:30:00  1975227
2021-11-19 18:29:58.330  Server2  9827     2021-11-19 18:29:58  1     2021-11-19 17:30:00  2368307
2021-11-19 18:29:58.330  Server2  9827     2021-11-19 18:29:58  2     2021-11-19 17:30:00  2368307
2021-11-19 11:29:47.954  Server2  9827     2021-11-19 11:29:47  3     2021-11-19 10:30:00  2289691
2021-11-19 20:29:58.899  Server2  9827     2021-11-19 20:29:58  4     2021-11-19 19:30:00  2171767
2021-11-19 23:29:54.958  Server2  9827     2021-11-19 23:29:54  5     2021-11-19 22:30:00  2083324
2021-11-19 23:55:41.719  Server3  9827     2021-11-19 23:55:41  1     2021-11-19 23:30:00  452042
2021-11-19 02:29:20.484  Server3  9827     2021-11-19 02:29:20  2     2021-11-19 01:30:00  383253
2021-11-19 18:29:39.514  Server3  9827     2021-11-19 18:29:39  3     2021-11-19 17:30:00  324291
2021-11-19 03:19:41.949  Server3  9827     2021-11-19 03:19:41  4     2021-11-19 02:30:00  265329
2021-11-19 19:21:27.495  Server3  9827     2021-11-19 19:21:27  5     2021-11-19 18:30:00  265329
2021-11-19 02:28:42.524  Server4  9827     2021-11-19 02:28:42  1     2021-11-19 01:30:00  452042

ITWhisperer
SplunkTrust

What is it that you think eventstats count as counter is doing in this situation?

indeed_2000
Motivator

It counts the number of errors, and it seems that is incorrect.

ITWhisperer
SplunkTrust

Yes, it is incorrect. Try this:

index="myindex" err* source="/data/product/*/server*/*"
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<server>\w+)"

| bin _time as time span=1h
| stats count as total latest(_time) as _time by time server
| sort 0 server -total
| streamstats count as rank by server
| where rank < 6
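
To illustrate why the earlier totals were inflated, here is a minimal sketch on three dummy events (makeresults is used only to fabricate them; it is not part of the real search). eventstats count without a by clause writes the overall event count onto every event, so summing that field afterwards multiplies it by the number of events. In the results posted above, counter is 9827 on every row, and the first total, 2191421, is exactly 9827 × 223.

```spl
| makeresults count=3
| eventstats count as counter
| stats count as rows sum(counter) as total
```

This returns rows=3 and total=9: each of the three events carries counter=3, and the sum adds that up three times. Counting directly with stats count by the hourly bucket and server, as in the search above, avoids the double counting.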

indeed_2000
Motivator

Fixed, but I think span is not working correctly.

1637352000  Server1131  2021-11-20 00:29:49.360  1
1637355600  Server1323  2021-11-20 01:28:59.736  2
1637359200  Server1136  2021-11-20 02:29:10.834  3
1637362800  Server1140  2021-11-20 03:29:58.342  4

Expected:

1637352000  Server1131  2021-11-20 00:00:00.000  1
1637355600  Server1323  2021-11-20 01:00:00.000  2
1637359200  Server1136  2021-11-20 02:00:00.000  3
1637362800  Server1140  2021-11-20 03:00:00.000  4

ITWhisperer
SplunkTrust

Your original expected output had times at different points during the hour, which is what you got. If you wanted the times to be the beginning of the hour, that value was in the time field. Please try to be more precise about what you are expecting the output to be.

indeed_2000
Motivator

Sorry about that, I have fixed the post.

ITWhisperer
SplunkTrust

That makes it simpler:

index="myindex" err* source="/data/product/*/server*/*"
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<server>\w+)"
| bin _time span=1h
| stats count as total by _time server
| sort 0 server -total
| streamstats count as rank by server
| where rank < 6
| rename total as count
| table server _time count

indeed_2000
Motivator

Current output:

server   _time                count
Server1  2021-11-21 11:30:00  28
Server1  2021-11-21 05:30:00  25
Server1  2021-11-21 07:30:00  25
Server1  2021-11-21 10:30:00  25
Server2  2021-11-21 18:30:00  2061
Server2  2021-11-21 13:30:00  668
Server2  2021-11-21 12:30:00  562
Server2  2021-11-21 11:30:00  481
Server3  2021-11-21 17:30:00  110
Server3  2021-11-21 12:30:00  73
Server3  2021-11-21 07:30:00  61
Server3  2021-11-21 18:30:00  60

Expected output:

server   _time                count
Server1  2021-11-21 11:00:00  28
         2021-11-21 05:00:00  25
         2021-11-21 07:00:00  25
         2021-11-21 10:00:00  25
Server2  2021-11-21 18:00:00  2061
         2021-11-21 13:00:00  668
         2021-11-21 12:00:00  562
         2021-11-21 11:00:00  481
Server3  2021-11-21 17:00:00  110
         2021-11-21 12:00:00  73
         2021-11-21 07:00:00  61
         2021-11-21 18:00:00  60

ITWhisperer
SplunkTrust

index="myindex" err* source="/data/product/*/server*/*"
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<server>\w+)"
| bin _time span=1h
| stats count as total by _time server
| sort 0 server -total
| streamstats count as rank by server
| where rank < 6
| stats list(_time) as _time list(total) as count by server
| table server _time count

indeed_2000
Motivator

It returns this:

server   _time                                                   count
Server1  1637611200,1637568000,1637571600,1637575200,1637557200
Server2  1637611200,1637571600,1637568000,1637575200,1637560800

ITWhisperer
SplunkTrust

You can format these times to be human readable. Add this before the final stats and list time instead of _time:

| eval time=strftime(_time,"%F %T")
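
Putting the pieces from this thread together, the complete search might look like the sketch below. It assumes the strftime eval goes before the final stats (so it runs on single-valued _time rather than on the multivalue list) and that the hourly count is carried in a field named total; adjust the names to your own search.

```spl
index="myindex" err* source="/data/product/*/server*/*"
| rex field=source "\/data\/(?<product>\w+)\/(?<date>\d+)\/(?<server>\w+)"
| bin _time span=1h
| stats count as total by _time server
| sort 0 server -total
| streamstats count as rank by server
| where rank < 6
| eval time=strftime(_time,"%F %T")
| stats list(time) as time list(total) as count by server
| table server time count
```

Each server then appears once, with its top hours and their counts as parallel multivalue lists in the same order.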