Splunk Search

For data indexed after the hour, how do I prevent events from showing up for the wrong day?

Builder

We are pulling in data from the previous hour at 5 minutes after the current hour, because the source data is not complete until the hour is up. The data comes from a REST API.

We are then using | eval _time=_time-3600 to shift the events back one hour so the "stats" function at the end of the search places them in the right time slot. So, data pulled in at 01:05 will show up in the 00:00 line. That works fine and well, except when used in combination with the time range picker defaults "Today" or "Yesterday": stats is showing data from the 23:00 hour of the previous day. Obviously this is because the data was indexed at 00:05, and so "Today" is going to show it. How might I fix this?
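For what it's worth, the hour-shift arithmetic itself checks out; the epoch value below is a made-up example for an event indexed at 01:05:00 UTC on 2016-01-22:

```shell
indexed=1453424700                     # 2016-01-22 01:05:00 UTC (example value)
shifted=$(( indexed - 3600 ))          # the eval _time=_time-3600 step
bucket=$(( shifted / 3600 * 3600 ))    # what bucket _time span=1h does
date -u -d "@$bucket" '+%m/%d %H:%M'   # prints 01/22 00:00
```

The shifted event does land in the 00:00 bucket; the problem is only that the time range picker filters on the original _time before the eval runs.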


Here's the big nasty search.

index=main (source=rest://rigor_ally_hourly)
| spath
| eval _time=_time-3600
| eval MaintEnd=strptime("2015/11/1 12:00:00 AM", "%Y/%m/%d %I:%M:%S %p")
| eval MaintStart=strptime("2015/11/1 12:00:00 AM", "%Y/%m/%d %I:%M:%S %p")
| where _time > MaintEnd OR _time < MaintStart
| rename stats.avg_response_time AS AvgResponseTime, stats.errors AS Errors, uptime.percentage_uptime AS Uptime, stats.max_response_time AS MaxResponse, stats.min_response_time AS MinResponse, uptime.run_count AS RunCount
| eval Source=case(source="rest://rigor_ally_uptime","Rigor Ally", source="rest://Apollo_Tufts_Uptime","Apollo Tufts", source="rest://rigor_ally_hourly","Rigor Hourly", source="rest://rigor_ally_hourly1","Rigor Hourly1", 1=1,"Other")
| bucket _time span=1h
| eval Max=(MaxResponse/1000)
| eval Min=(MinResponse/1000)
| eval AvgResponse=(AvgResponseTime/1000)
| eval Time=strftime(_time,"%m/%d/%y %H:%M")
| eventstats count(Uptime) as TotalEvents by _time
| eventstats sum(Uptime) as SumUpTime by _time
| eval UpPercent=(SumUpTime/TotalEvents)
| stats last(RunCount) avg(UpPercent) avg(Uptime) last(TotalEvents) last(SumUpTime) sum(RunCount) count(eval(Errors>=1)) as E2 last(AvgResponse) avg(AvgResponse) max(Max) as WorstResponse min(Min) as BestResponse avg(Max) as AvgMax avg(Min) as AvgMin by Source, _time

Re: For data indexed after the hour, how do I prevent events from showing up for the wrong day?

Legend

I assume that the actual data has no time information in it? How are you collecting the data?



Builder

Actually, it does have an epoch timestamp in it (1453453200000). The data is being collected via the REST API (http://dev.splunk.com/restapi).

Would editing a props.conf for this input and shifting its timezone by 1 hour work?
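A TZ override in props.conf shifts how wall-clock timestamps are interpreted, but since the feed carries an epoch value, it may be cleaner to point Splunk at that directly. A rough props.conf sketch; the sourcetype name and the TIME_PREFIX regex are assumptions, not tested against this feed:

```
[rigor_hourly]
# Pull the event time from the first "x": epoch-milliseconds value in the JSON
TIME_PREFIX = "x":
TIME_FORMAT = %s%3N
MAX_TIMESTAMP_LOOKAHEAD = 20
```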

The raw data looks like:

{"graph":{"xAxis":{"dateTimeLabelFormats":{"day":"%b %e","hour":"%l:%M%P","month":"%b %e","year":"%b","week":"%b %e"},"type":"datetime","title":{"text":""}},"credits":{"enabled":false},"series":[{"color":"#C0392B","name":"Downtime","data":[{"x":1453453200000.0,"y":0.0,"interval":"hour"}]},{"color":"#2ECC71","name":"Uptime","data":[{"x":1453453200000.0,"y":1.0,"interval":"hour"}]}],"yAxis":{"tickWidth":1,"min":0,"labels":{"format":"{value}%","enabled":true},"title":{"text":""},"gridLineWidth":0},"legend":{"enabled":false},"title":{"text":""},"exporting":{"filename":"Apollo-Ally-Uptime Uptime History: 01/22/2016","chartOptions":{"subtitle":{"text":"01/22/2016"},"title":{"text":"Apollo-Ally-Uptime Uptime History"}},"enabled":false},"chart":{"zoomType":"x","type":"column"},"plotOptions":{"column":{"stacking":"percent"},"series":{"point":{"events":{}},"groupPadding":0.05,"pointPadding":0,"cursor":"pointer"},"area":{"marker":{"enabled":false},"stacking":"percent"}},"tooltip":{"shared":true,"useHTML":true,"followPointer":true}},"uptime":{"run_count":4,"maximum_response_time":8049,"minimum_response_time":4628,"percentage_uptime":100.0,"average_response_time":5851},"stats":{"max_response_time":8049,"run_count":4,"percentage_uptime":100.0,"avg_response_time":5851,"errors":0,"min_response_time":4628}}
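As a sanity check, the epoch value in that sample is in milliseconds (divide by 1000 for seconds) and decodes to the top of an hour on the same date as the export filename:

```shell
epoch_ms=1453453200000
epoch_s=$(( epoch_ms / 1000 ))
date -u -d "@$epoch_s" '+%Y-%m-%d %H:%M:%S'   # prints 2016-01-22 09:00:00 (UTC)
```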


Builder

This epoch timestamp matches the hour the data is supposed to fall in.



SplunkTrust

Use a script to read the data from the API and write that data to a file, including an appropriate timestamp at the front of each line in a format Splunk will pick up easily. Use cron to schedule that script at 5 minutes after the hour, then have Splunk monitor the file that's created.
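For the monitoring step, a minimal inputs.conf stanza might look like the following; the path and sourcetype are assumptions to be adapted:

```
[monitor:///home/splunk/scripts/rigor]
sourcetype = rigor_hourly
disabled = false
```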

Be sure to either include that file in logrotate or have the script delete and recreate it each time (I'd recommend the former).
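If you go the logrotate route, a minimal entry (path assumed, retention numbers illustrative) could look like:

```
/home/splunk/scripts/rigor/* {
    daily
    rotate 7
    compress
    missingok
    notifempty
}
```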

You could also do something very similar with a scripted input.




Builder

Yep. I ended up making a bash script to grab the data. We also had to tell Splunk to wait a few minutes before reading the file, since writing the timestamp is a separate operation. Also, since this runs for multiple feeds, the script loops over the entries in feeds.txt.

#!/bin/bash
# Fetch Script, Jon Duke 5-12-16
# This is cron'd under the splunk account. It runs at 15 * * * *.
# Add a line to feeds.txt to ingest data.

# Change directory
cd /home/splunk/scripts || exit 1

# Create timestamp for the previous hour
date -d '1 hour ago' "+%m/%d/%Y %H:%M:%S" > /home/splunk/scripts/tmp/timestamp.txt

# Loop over feeds.txt: each line is the output filename, and the part
# before the first underscore is the feed's numeric ID
while read -r line; do
   feedNum="$(echo "$line" | cut -d_ -f1)"
   cat /home/splunk/scripts/tmp/timestamp.txt >> "/home/splunk/scripts/rigor/$line"
   wget "https://my.APISITE.com/reports/uptimes/${feedNum}.xml?&api_key=REDACTED&location=all&start_date=recent_hour" -O - >> "/home/splunk/scripts/rigor/$line"
done < feeds.txt
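For reference, the loop assumes each feeds.txt line starts with the numeric feed ID followed by an underscore, and the whole line doubles as the output filename. A hypothetical entry parses like this:

```shell
line="12345_rigor_ally_hourly"          # hypothetical feeds.txt entry
feedNum="$(echo "$line" | cut -d_ -f1)" # everything before the first underscore
echo "$feedNum"                          # prints 12345
```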


SplunkTrust

Thank you very much for providing the details of the solution! That will surely help the next person who stumbles across this question and answer!
