Hello, we have a proxy network appliance running Websense that sends its logs to Splunk via syslog.
We have a data latency alert configured to fire when latency is large:
search $search_args$ _index_earliest=-1d@d _index_latest=@d | eval lag_sec = (_indextime-_time) | eval lag_hrs = lag_sec/(60*60) | eval delay_hrs = if( lag_hrs > 0.5, lag_hrs, "") | eval future_sec = if( lag_sec < -1, -1*lag_sec, "") | eval containsGap = if(delay_hrs!="" OR future_sec!="", "true", "false") | stats max(delay_hrs), max(future_sec), count(eval(containsGap="true")) as countGaps, count(_raw) as countEvents by splunk_server index host sourcetype source | eval percentGaps = countGaps / countEvents*100 | where percentGaps>5 | sort host, sourcetype, source
In the last few days we started to see large latency (a 2-hour, i.e. 7200-second, gap between an event's timestamp and the time it is indexed), and I am trying to determine what's causing it.
We don't have a forwarder on this network device, and we aren't seeing any additional network bottlenecks or traffic. Where can I look to troubleshoot this data latency?
This is almost always due to incorrect interpretation of time zones (usually because there are no TZ values in the timestamps and there is no TZ= in any props.conf, so each indexer uses the TZ value of its host OS, which shouldn't be, but might be, different on each indexer).
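To make the arithmetic concrete, here is a minimal Python sketch (not Splunk code) of how a timestamp with no zone information turns into a constant apparent lag when it is parsed in the wrong zone. The function name, the sample timestamp, and the two zone names are assumptions chosen only for illustration; substitute the zones that apply to your device and indexer.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def apparent_lag_seconds(raw_timestamp: str, actual_tz: str, assumed_tz: str) -> float:
    """Return the lag (_indextime - _time) caused purely by a zone mismatch.

    raw_timestamp: wall-clock time as written by the device, with no zone info.
    actual_tz:     zone the device really used when writing the timestamp.
    assumed_tz:    zone the parser applies because none was specified.
    """
    naive = datetime.strptime(raw_timestamp, "%Y-%m-%d %H:%M:%S")
    # The instant the event really happened (roughly when it gets indexed).
    true_instant = naive.replace(tzinfo=ZoneInfo(actual_tz))
    # The instant the parser believes the event happened (what becomes _time).
    parsed_instant = naive.replace(tzinfo=ZoneInfo(assumed_tz))
    return (true_instant - parsed_instant).total_seconds()


# Hypothetical example: device logs in Mountain time, parser assumes Eastern time.
lag = apparent_lag_seconds("2024-01-15 10:00:00", "America/Edmonton", "America/New_York")
print(lag)
```

With these assumed zones the result is a fixed two-hour offset for every event, which is the signature to look for: a lag that is suspiciously close to a whole number of hours usually points at time zone handling rather than at the network.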
I checked the indexer; it has the host configured with the right TZ:
[root@cgysplunk01 /opt/splunk]# cat ./etc/system/local/props.conf
TZ = America/Edmonton
The indexer itself is in the EST time zone:
[root@cgysplunk01 /opt/splunk]# cat /etc/sysconfig/clock