Splunk Search

How to match on a field, then get the next x (10 in this case) number of events and count them?

dbcase
Motivator

Hi

I have data that looks like this

654660,"reboot","Reason: Uc-keypad hung","TCA200","5","TCA200_Quadra_MR7","20160908070550.624",
20690,"reboot","Reason: Unknown","TCA203","9","TCA203_Quadra_MR7_HF0","20160908163431.250",
20690,"reboot","Reason: Firmware upgrade","TCA203","9","TCA203_Quadra_MR7_HF0","20160908163257.629",

etc etc

What I would like to do is match on the Reason field where it is equal to Uc_keypad hung. That's easy enough. What I'd like to do from there is capture the next 10 or so events and count them. This is the part that I'm unclear on how to do.

The reason we are trying to do this is we have hardware that is locking up with the Uc_keypad hung message, but we don't know why, so we want to capture the events leading up to the lockup and see if there is some sort of pattern.

0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

Give this a try (assuming field Reason is extracted. If not, replace Reason="Uc-keypad hung" with searchmatch("Reason: Uc-keypad hung") in the eval-if)

your base search giving events in reverse chronological order of time (latest event first) | eval sno=if(Reason="Uc-keypad hung",1,0) | accum sno | stats count by sno | where sno<=10

The last number (10) is the number of events that you want to show after the "Reason: Uc-keypad hung" event has happened.

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

Give this a try (assuming field Reason is extracted. If not, replace Reason="Uc-keypad hung" with searchmatch("Reason: Uc-keypad hung") in the eval-if)

your base search giving events in reverse chronological order of time (latest event first) | eval sno=if(Reason="Uc-keypad hung",1,0) | accum sno | stats count by sno | where sno<=10

The last number (10) is the number of events that you want to show after the "Reason: Uc-keypad hung" event has happened.

dbcase
Motivator

Hmmmmm.....

My search is wide open

index=top10_1  | eval sno=if(Reason="Uc-keypad hung",1,0) | accum sno | stats count by sno | where sno<=10

I think the above may be a step in the right direction but still not there yet.

Let me try to clarify a bit....

Below is a UC-keypad hung with 1 event leading up to it (reason=reboot by server request) as you can see there are several hours between the two events (hence why I can't use time).

What I'd like to have happen is a table that lists
reboot by server request with a count of 1 (using this example.... Ideally the count would be much higher)

So something like this

Find a Uc-keypad hung event
retrieve the preceding 10 events
Count the reason
Display a table with 2 columns Reason and Count

9/8/16
5:50:42.121 AM  
1002210,"reboot","Reason: Uc-keypad hung","TCA203","5","TCA203_Quadra_MR7","20160908055042.121",
COMMAND_TYPE = 5 CPE_CONVERTED = 20160908055042.121 HARDWARE_MODEL = TCA203 HARDWARE_REV = 5 PREMISE = 1002210 PREMISE_FK = 1002210 REBOOT_REASON = Reason: Uc-keypad hung RESULTS = 20160908055042.121 STATUS = TCA203_Quadra_MR7 date_hour = 5 date_mday = 8 date_minute = 50 date_month = september date_second = 42 date_wday = thursday date_year = 2016 date_zone = local eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Reboots_Summary.csv sourcetype = csv splunk_server = idx2.icontrol.splunkcloud.com timeendpos = 94 timestartpos = 76 unix_category = all_hosts unix_group = default


9/8/16
1:49:53.928 AM  
1002210,"reboot","Reason: Reboot by server request","TCA203","5","TCA203_Quadra_MR4_NHF_HF","20160908014953.928",
COMMAND_TYPE = 5 CPE_CONVERTED = 20160908014953.928 HARDWARE_MODEL = TCA203 HARDWARE_REV = 5 PREMISE = 1002210 PREMISE_FK = 1002210 REBOOT_REASON = Reason: Reboot by server request RESULTS = 20160908014953.928 STATUS = TCA203_Quadra_MR4_NHF_HF date_hour = 1 date_mday = 8 date_minute = 49 date_month = september date_second = 53 date_wday = thursday date_year = 2016 date_zone = local eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Reboots_Summary.csv sourcetype = csv splunk_server = idx2.icontrol.splunkcloud.com timeendpos = 111 timestartpos = 93 unix_category = all_hosts unix_group = default
0 Karma

somesoni2
SplunkTrust
SplunkTrust

Will the index=top10_1 have many other events before the Uc-hung event? May be we can add some filter to just show the reboot related events

index=top10_1 reboot  | eval sno=if(Reason="Uc-keypad hung",1,0) | accum sno | eventstats count by sno | where sno<=10 | stats count by Reason
0 Karma

dbcase
Motivator

It could have just 1 but likely there are several, hence the need to capture 10

Here is one that has several leading up to the keypad hung. Maybe another way to approach this is something like this:

Get the event with UC-Keypad hung, then get the PREMISE ID (1065710 in the below data) then search by PREMISE ID and _time (I know I know, but if I can get it to work by time then maybe I can tinker with it) where _time is -1hr. Any thoughts on going about it that way?

9/8/16
4:04:46.183 AM  
1065710,"reboot","Reason: Uc-keypad hung","TCA203","5","TCA203_Quadra_MR7","20160908040446.183",
COMMAND_TYPE = 5 CPE_CONVERTED = 20160908040446.183 HARDWARE_MODEL = TCA203 HARDWARE_REV = 5 PREMISE = 1065710 PREMISE_FK = 1065710 REBOOT_REASON = Reason: Uc-keypad hung RESULTS = 20160908040446.183 STATUS = TCA203_Quadra_MR7 date_hour = 4 date_mday = 8 date_minute = 4 date_month = september date_second = 46 date_wday = thursday date_year = 2016 date_zone = local eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Reboots_Summary.csv sourcetype = csv splunk_server = idx2.icontrol.splunkcloud.com timeendpos = 94 timestartpos = 76 unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111002029477,"","","1065710.1471316426","",6,6,"","Z","restore","","","","B","","Y",2500,2800,"-31/-31","255/255","20160901000135.505","20160901000135.867",1,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316426 CONTACT_ID =   CPE_CONVERTED = 20160901000135.505 CPE_GEN_ID = 1065710.1471316426 CREATION_DATE = 20160901000135.867 CRITICAL_FLAG =   EVENT_SUB_TYPE = restore EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 1 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -31/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111002029477 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111002219445,"","","1065710.1471316427","",6,6,"","Z","fault","","","","B","","Y",2500,2800,"-31/-31","255/255","20160901000227.212","20160901000227.527",2,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316427 CONTACT_ID =   CPE_CONVERTED = 20160901000227.212 CPE_GEN_ID = 1065710.1471316427 CREATION_DATE = 20160901000227.527 CRITICAL_FLAG =   EVENT_SUB_TYPE = fault EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 2 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -31/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111002219445 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111002234885,"","","1065710.1471316428","",6,6,"","Z","restore","","","","B","","Y",2500,2800,"-31/-31","255/255","20160901000231.141","20160901000231.704",3,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316428 CONTACT_ID =   CPE_CONVERTED = 20160901000231.141 CPE_GEN_ID = 1065710.1471316428 CREATION_DATE = 20160901000231.704 CRITICAL_FLAG =   EVENT_SUB_TYPE = restore EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 3 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -31/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111002234885 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111002962773,"","","1065710.1471316433","",6,6,"","Z","fault","","","","B","","Y",2500,2800,"-30/-31","255/255","20160901000550.201","20160901000550.611",4,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316433 CONTACT_ID =   CPE_CONVERTED = 20160901000550.201 CPE_GEN_ID = 1065710.1471316433 CREATION_DATE = 20160901000550.611 CRITICAL_FLAG =   EVENT_SUB_TYPE = fault EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 4 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -30/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111002962773 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111003018209,"","","1065710.1471316434","",6,6,"","Z","restore","","","","B","","Y",2500,2800,"-31/-31","255/255","20160901000605.340","20160901000605.796",5,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316434 CONTACT_ID =   CPE_CONVERTED = 20160901000605.340 CPE_GEN_ID = 1065710.1471316434 CREATION_DATE = 20160901000605.796 CRITICAL_FLAG =   EVENT_SUB_TYPE = restore EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 5 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -31/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111003018209 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111004391033,"","","1065710.1471316445","",6,6,"","Z","fault","","","","B","","Y",2500,2800,"-32/-31","255/255","20160901001228.356","20160901001228.685",6,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316445 CONTACT_ID =   CPE_CONVERTED = 20160901001228.356 CPE_GEN_ID = 1065710.1471316445 CREATION_DATE = 20160901001228.685 CRITICAL_FLAG =   EVENT_SUB_TYPE = fault EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 6 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -32/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111004391033 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
9/8/16
3:47:01.000 AM  
1065710,111004413501,"","","1065710.1471316446","",6,6,"","Z","restore","","","","B","","Y",2500,2800,"-31/-31","255/255","20160901001234.500","20160901001234.924",7,"090909",
ACTOR_ID = 6 ALARM_REQUEST_ID =   ALARM_REQ_STATUS =   ALARM_SESSION_FK =   ARM_SRC_TYPE =   CHANNEL = B COMMAND_TYPE = 1065710.1471316446 CONTACT_ID =   CPE_CONVERTED = 20160901001234.500 CPE_GEN_ID = 1065710.1471316446 CREATION_DATE = 20160901001234.924 CRITICAL_FLAG =   EVENT_SUB_TYPE = restore EVENT_TYPE = Z MARKED = Y MSG =   PERIPHERAL_CPE_ID =   PREMISE = 1065710 PREMISE_FK = 1065710 PREMISE_RANK = 7 RESULTS = 6 RESULT_TIME = Z SENSOR_BATTERY_VOLTAGE = 2800 SENSOR_CPE_ID = 6 SENSOR_NEAR_FAR_RF = -31/-31 SENSOR_NEAR_FAR_SIGNAL = 255/255 SENSOR_TEMPERATURE = 2500 SE_ID = 111004413501 TICKET_CODE = 090909 ZONE_CPE_ID = 6 eventtype = external-referer  eventtype = visitor-type-referred host = datamine.icontrol.com index = top10_1 linecount = 1 source = /home/oracle/uckpd_workdir/Security_Events.csv sourcetype = csv splunk_server = idx3.icontrol.splunkcloud.com timestamp = none unix_category = all_hosts unix_group = default
0 Karma

somesoni2
SplunkTrust
SplunkTrust

Something like that is possible. Give this a try (assuming both Reason and PremiseID is extracted, if not need to set that up/use inline rex)

index=top10_1 Reason="Uc-keypad hung" | table _time PremiseID | eval endtime=_time+1 | eval starttime=relative_time(_time,"-1h")  | map maxsearches=10 search="search index=top10_1 PremiseID=$PremiseID$ " earliest=$starttime$ latest=$endtime$ "
0 Karma

dbcase
Motivator

Looks like there is some sort of syntax error but I've not used map before so I'm not sure how to correct it.

The error is Error in 'map' command: Unable to find saved search 'maxsearches=10'

0 Karma

somesoni2
SplunkTrust
SplunkTrust

My bad. Just corrected the syntax (was missing search= portion)

0 Karma

dbcase
Motivator

Thank you! We are closer but still not quite there I think

There is unbalanced quotes so I took out the ones just behind $PremiseID$ (not sure if that is the right way to go so the query looks like this

index=top10_1"Uc-keypad hung" | table _time PremiseID | eval endtime=_time+1 | eval starttime=relative_time(_time,"-1h")  | map maxsearches=10 search="search index=top10_1 PremiseID=$PremiseID$  earliest=$starttime$ latest=$endtime$ "

and Splunk complains

Error in 'map': Did not find value for required attribute 'PremiseID'.

0 Karma

dbcase
Motivator

Tried this one too

index=top10_1 "Uc-keypad hung" | table _time PremiseID | eval endtime=_time+1 | eval starttime=relative_time(_time,"-1h")  | map maxsearches=10 search="search index=top10_1 PremiseID="$PremiseID$"  earliest=$starttime$ latest=$endtime$ "

The buzzer sounded with this error message

Error in 'map': Did not find value for required attribute 'PremiseID'.

0 Karma

dbcase
Motivator

ah HA!!!

THis works!

index=top10_1 "Uc-keypad hung" | table _time PREMISE | eval endtime=_time+1 | eval starttime=relative_time(_time,"-1h")  | map maxsearches=10 search="search index=top10_1 PREMISE="$PREMISE$"  earliest=$starttime$ latest=$endtime$ "

sundareshr
Legend

I need to use the accum command more.

sjalexander
Path Finder

I think a transaction like the following will suit your need:

| transaction hangten endswith="*Reason: Uc-keypad hung*" maxevents=10

The thing I am not clear on from your description is the requirement to count the events - what does the end event look like? are you looking for a duration?

All that said, there's built-in functionality in the search GUI that lets you dig in to your data - if you click on the timestamp of an event in search, you can look at events before, after, or at the time of the event, and it'll let you specify a timespan in milliseconds-weeks units.

0 Karma

dbcase
Motivator

I tried that one but all Splunk returns on the transactions is the Uc-keypad hung events. Which is what I need as the last piece. But leading up to the Uc-keypad hung events could be things like, firmware upgrade, reboot by user, etherloss, acpowerloss, etc etc. These are the ones that I need to count so we can see if (example) if everytime we have a uc-keypad hung we also have an etherloss. These lead up events could occur over a few seconds, minutes or even hours, which is why I need the window to be event count vs time.

0 Karma

sjalexander
Path Finder

OK, that depends on the search - it sounds like you are filtering out everything but the "Uc-keypad hung" messages. If so, all you will get in the transaction is "Uc-keypad hung" messages. Can you broaden your search to include the other events?

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...