Splunk Enterprise

Search the previous one-hour window and alert hourly

mathiasy123
Path Finder

Greetings,

I want to search the data in the previous one-hour window. For example, if it is 11:00 AM today, the search should look at data from 10:00 AM until 11:00 AM. I tried this in the search bar and it did not return anything, even though I have data at 10:05 AM, 10:10 AM, and 10:15 AM.

What I want is:

I want to find the user IDs that have more than 5 transactions per hour, which is why I tried bin _time span=1h to count the transactions in each 1-hour range. The alert will run hourly and search the hour before the current hour. For example, if it is now 11:00 AM, the alert will check the data from 10:00 AM to 11:00 AM, and so on. If any user ID has more than 5 transactions, the alert should fire.

So my problems are:

A. Configure the search for a time window (each run checks the hour before the current hour)

B. Configure the alert

This is my search and time range configuration:

mathiasy123_1-1593751211676.png

This is my alert:

mathiasy123_2-1593751294897.png
0 Karma

bowesmana
SplunkTrust

If you are scheduling the search on the hour with cron, it would be more sensible to run it one or two minutes past the hour. Otherwise you are assuming that data for transactions at 10:59:59.999 will already be ingested and indexed in Splunk before the search runs.

You say your search does not work, but I can see from your search (line 1 not visible) that you are manipulating _time based on transaction date.

So, does your transaction date differ from the timestamp of the data in Splunk? If so, this may account for why you are not seeing data, as you are asking Splunk to search from 10-11 AM on the event timestamp and only manipulating _time after that.

If your search window is only 1 hour then you do not need to bin at all, as

| stats values(transactionID) as TransactionId dc(transactionID) as CountedTransaction by userID
| where CountedTransaction>5

would work, assuming the field is named transactionID (field names are case-sensitive in Splunk).
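Putting the pieces together, a minimal sketch of the whole alert search might look like the following. The index and sourcetype names are placeholders I have made up, and I am assuming the field is spelled transactionID, so adjust these to your data. The time range is written inline here; setting the same earliest and latest values in the alert's time range should be equivalent.

```spl
index=your_index sourcetype=your_sourcetype earliest=-1h@h latest=@h
| stats values(transactionID) as TransactionId dc(transactionID) as CountedTransaction by userID
| where CountedTransaction>5
```

With a cron schedule such as 2 * * * * (two minutes past every hour) and a trigger condition of "Number of Results is greater than 0", this should fire whenever at least one userID had more than 5 distinct transactions in the previous whole hour.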

Hope this helps

0 Karma

mathiasy123
Path Finder

Hi, @bowesmana 

 

Yes, my transaction date and Splunk's default _time can differ, which is why I want to set the transaction date as the default _time in Splunk.

 

Okay, so this is my new search:

mathiasy123_0-1593760731432.png

Is it right?

Should I define the time range here?

mathiasy123_1-1593760759242.png

Another problem: when I remove the where filter, why is data from yesterday displayed too? It should only be data from 13:00 (01:00 PM).

Does the time range use the original default _time values instead of the _time I replaced with transactionDate?

mathiasy123_0-1593760895958.png

 

0 Karma

bowesmana
SplunkTrust

@mathiasy123 

The search time range specified in the time picker for the alert will always be the one that controls which events are retrieved from the data stored in Splunk. The _time field of stored data is really important, so is there a good reason why your _time field is NOT the transaction date? Does your _time field have some other meaning that you need?

If you have an event with _time of 12:30 but a transactionDate of 11:30, then a search from 12:00-13:00 will return that event even though the transaction belongs to the 11:00-12:00 hour, and a search from 11:00-12:00 will not find it.

In the alert definition you can set the time range from the beginning of the previous hour to the beginning of the current hour, in which case you will get exactly a 1 hour window. At the moment you appear to be searching up to 'Now', which means you will get anywhere from 1-2 hours.

Using 

earliest=-h@h latest=@h

is the same as searching the exact previous hour window (using 'snap to').
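If you want to see what those modifiers resolve to, a quick check along these lines may help. It is only a sketch: the field names window_start and window_end are made up for illustration, and relative_time is used here to evaluate the same modifier strings against the current time.

```spl
| makeresults
| eval window_start=relative_time(now(), "-1h@h"), window_end=relative_time(now(), "@h")
| eval window_start=strftime(window_start, "%Y-%m-%d %H:%M:%S"), window_end=strftime(window_end, "%Y-%m-%d %H:%M:%S")
| table window_start window_end
```

Run at 11:02, this should show a window starting at 10:00:00 and ending at 11:00:00, which is the whole previous hour regardless of the minute at which the schedule fires.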

The fact that you are seeing yesterday's transactions would indicate that your mixing of time is a problem.

If you search the previous hour, you will get all records whose _time falls in the previous hour. However, when you then replace _time with the calculated transactionDate, those times will be whatever they are, and they can fall outside the window you searched.
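If you cannot change how _time is set right away, one possible stopgap is to search a wider _time range and then filter on the parsed transactionDate yourself. This is a sketch under several assumptions: the index and sourcetype are placeholders, transactionDate is assumed to look like 2020-06-02 08:43:52, the field is assumed to be spelled transactionID, and the 24 hour lookback is an arbitrary guess at how far _time and transactionDate can drift apart.

```spl
index=your_index sourcetype=your_sourcetype earliest=-24h@h latest=now
| eval txn_time=strptime(transactionDate, "%Y-%m-%d %H:%M:%S")
| where txn_time>=relative_time(now(), "-1h@h") AND txn_time<relative_time(now(), "@h")
| stats values(transactionID) as TransactionId dc(transactionID) as CountedTransaction by userID
| where CountedTransaction>5
```

This scans far more events than it needs to on every run, so having _time reflect the transaction time in the first place remains the cleaner approach.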

I suggest you look at whether your existing use of _time is correct.

 

0 Karma

mathiasy123
Path Finder

@bowesmana 

Okay, so the alert uses _time, right?

But there are still some things I don't understand:

1.) If I stored my data at, let's say, Tuesday, 2 June 2020 09:00:00, then should _time be 02/06/2020 09:00:00?

2.) In this case, my expectation is that the Splunk alert checks the search based on transactionDate instead of _time. Why? Because there is a chance that _time can differ from transactionDate (my focus is on transactionDate).

0 Karma

bowesmana
SplunkTrust

@mathiasy123 

When you index data in Splunk, it will extract a timestamp from the data. This is a fundamental part of how Splunk works.

All searches will use that time to retrieve data. Splunk also has a field called _indextime, which is the time the data was indexed, but generally this is not that useful and not often used. _time should ALWAYS be what you want to use when searching for data by the generation time of the event itself, which in your case would seem to be transactionDate.

If you store your data on Tuesday, 2 June 2020 at 09:00:00 then _indextime will reflect that storage time, but if that stored data has a transactionDate of Tuesday, 2 June 2020 08:43:52, then you would normally plan to have that reflected in Splunk's _time field.
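To see how the two times relate in your own data, a quick comparison like the one below can be useful. It is a sketch only: the index and sourcetype are placeholders, and indexed_at and lag_seconds are names invented for this example.

```spl
index=your_index sourcetype=your_sourcetype earliest=-1h@h latest=@h
| eval indexed_at=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| eval lag_seconds=_indextime-_time
| table _time indexed_at lag_seconds transactionDate
```

If _time lines up with transactionDate, timestamp extraction is probably doing what you want. If _time tracks indexed_at instead, that would suggest Splunk is not picking up the transaction date as the event timestamp.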

I suggest you read through this to get a better handle on the use of _time and how to ensure it's extracted correctly.

https://docs.splunk.com/Documentation/Splunk/8.0.4/Data/Configuretimestamprecognition

Data is stored internally in Splunk in buckets, each of which covers a time range, and this is how Splunk finds your data when you run a search, whether via an alert or an ad hoc search.

 

0 Karma

mathiasy123
Path Finder

@bowesmana 

 

Ah! I see now. So basically Splunk looks for a timestamp in my log file, and if one exists it becomes the _time value. Am I correct?

 

What if there are two timestamps in one log file? Let's say I have a transactionDate field with format YY-mm-dd HH:MM:SS and a logDate field, also a timestamp, with format dd-mm-yy HH:MM:SS. What would the _time value be?

0 Karma

bowesmana
SplunkTrust

Go read that link about timestamp recognition. The timestamp you want is the one you will configure Splunk to recognise.

 

0 Karma

mathiasy123
Path Finder

@bowesmana 

 

Okay, thanks for the deep explanation! Big thanks to you!

0 Karma