Splunk Search

Is there a way in Splunk to change the timezone of a raw event from UTC to EST or vice versa?

djreschke
Communicator

I am trying to correlate two different logs: one is in EST and the other is in UTC.

For the UTC logs, I have tried to specify the time zone as UTC in props.conf and then let my user timezone preference do the conversion. props.conf was only updated on the search head.

[sourcetype]

TZ = UTC

This does not seem to work, and I have read different posts saying that updating the props file should work, but not for historical events.

Is there a way to do this for events that are already indexed?

0 Karma

PickleRick
SplunkTrust

If you want to change the values of events stored in indexes - there is no way to do that other than to "delete" and re-ingest.

You can, however, recalculate it dynamically at search time with

<your search> | eval _time=_time+<your_offset_in_seconds>

It ain't pretty. Quite the opposite. And you have to remember to do this every time you use this data. And have to remember when to use it if you fix your ingestion.
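For the offset itself, here is a minimal Python sketch of the arithmetic, not anything Splunk runs. It assumes EST is a fixed UTC-5 offset (daylight saving is ignored), and the function name `correction_offset_seconds` is made up for illustration. It takes a timestamp without a zone and returns the number of seconds to add to the stored epoch value.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST is treated as a fixed UTC-5 offset; daylight saving is ignored.
EST = timezone(timedelta(hours=-5))


def correction_offset_seconds(raw_timestamp, assumed_tz, actual_tz):
    """Seconds to add to the stored epoch so it reflects actual_tz instead of assumed_tz."""
    naive = datetime.strptime(raw_timestamp, "%Y-%m-%d %H:%M:%S")
    stored = naive.replace(tzinfo=assumed_tz).timestamp()  # epoch as it was indexed
    correct = naive.replace(tzinfo=actual_tz).timestamp()  # epoch it should have had
    return int(correct - stored)


# A UTC timestamp that was read as EST ends up five hours too late.
offset = correction_offset_seconds("2021-11-22 08:35:55", EST, timezone.utc)
print(offset)  # -18000
```

Under those assumptions the search-time fix would be | eval _time=_time-18000. A fixed offset like this stops being right across a daylight saving change.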

0 Karma

djreschke
Communicator

@PickleRick So does the change need to be made on the indexing side, or would a change in the log format be the best route?

 

The event is being indexed at the right time, but since there is no time zone specified in the logs (i.e. 2021-11-22 08:35:55 versus 2021-11-22:08:35:55 -0500), the event time is showing as 08 AM EST when it was indexed at 08 AM UTC.

_time = 2021-11-22 08:38:18

IndexTime = 11/22/2021 03:38:49

index = index

diff = 0

 

Search used: 

 

index=index
| rename _indextime as IndexTime
| eval diff=IndexTime-_time
| convert ctime(IndexTime) as IndexTime
| eval diff=if(diff < 0, "0", diff)
| table _time IndexTime diff index

0 Karma

PickleRick
SplunkTrust

_time is rendered by default in the timezone set in your user's configuration and is shown without the timezone information. But if you expand a single event, the line with the _time field value shows a string representation that includes your timezone.

So I don't quite understand.

Can you please show a single event expanded?

djreschke
Communicator

@PickleRick 

 

Here is the expanded log and the event for reference. I included indextime so you can see the difference. 

 

c_ip = 2.2.2.2
cs_method = GET
cs_uri_stem = /test/test.aspx
date_hour = 7
date_wday = tuesday
date_zone = local
host = machine
index = test
sc_status = 404
sc_substatus = 0
IndexTime = 11/23/2021 02:06:29
X_FORWARDED_FOR = 99.99.99.99
blnStartHere = True
cs_Cookie_ = -
cs_Referer_ = -
cs_User_Agent_ = Mozilla/5.0+(Windows+NT+10.0;+WOW64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/96.0.4664.45+Safari/537.36
cs_bytes = 0
cs_username = -
cs_version = HTTP/1.1
date = 2021-11-23
s_ip = 1.0.0.1
s_port = 999
s_sitename = W3SVC2
sc_bytes = 0
sc_win32_status = 0
time = 07:06:28
time_taken = 98
_time = 2021-11-23T07:06:28.000-05:00


Event

2021-11-23 07:06:28 W3SVC2 host 2.2.2.2 GET /test/test.aspx 999 - 1.0.0.1. HTTP/1.1 Mozilla/5.0+(Windows+NT+10.0;+WOW64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/96.0.4664.45+Safari/537.36 - - test.com 404 0 0 0 0 99.99.99.99

0 Karma

PickleRick
SplunkTrust

OK. So assuming that you didn't have too much delay on the ingestion queue, you have:

IndexTime = 11/23/2021 02:06:29 (I assume that it's rendered in your local timezone)

_time = 2021-11-23T07:06:28.000-05:00

Whereas in raw event you have 2021-11-23 07:06:28

From what you wrote earlier, the IndexTime is the right time for the event. That means that the raw event, which is apparently sent in UTC, was improperly interpreted as being in your local timezone based on the local settings of the ingesting/parsing component (HF/Indexer).

Maybe (but I have never tried it and anyway it's a very very ugly idea) you could eval the _time as a calculated field from the original _time value, but it would be a horrible idea. Anyway, the timestamp is already stored with the event, so regardless of the timezone you set in your user's settings, the event is already indexed at the wrong time and you can't change it.

The proper solution (but applying only to newly ingested events) is to set a proper timezone for this source/sourcetype/host on the ingesting component (first HF or Indexer that's receiving and parsing data).
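The values quoted above can be checked with a small Python sketch. It assumes IndexTime was rendered in EST, and that EST is a fixed UTC-5 offset with no daylight saving; the helper `index_lag_seconds` is a made-up name. It compares the indexing lag when the raw timestamp is read as EST against the lag when it is read as UTC.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST is a fixed UTC-5 offset and IndexTime was rendered in EST.
EST = timezone(timedelta(hours=-5))

raw = datetime.strptime("2021-11-23 07:06:28", "%Y-%m-%d %H:%M:%S")
index_time = datetime.strptime(
    "11/23/2021 02:06:29", "%m/%d/%Y %H:%M:%S"
).replace(tzinfo=EST)


def index_lag_seconds(event_tz):
    """Seconds between the event time (read in event_tz) and the index time."""
    return int((index_time - raw.replace(tzinfo=event_tz)).total_seconds())


lag_as_est = index_lag_seconds(EST)           # how the event was actually parsed
lag_as_utc = index_lag_seconds(timezone.utc)  # how it should have been parsed
print(lag_as_est, lag_as_utc)  # -17999 1
```

Read as EST, the event appears to be indexed almost five hours before it happened. Read as UTC, the lag is one second, which is consistent with the raw timestamp being UTC.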

djreschke
Communicator

@PickleRick 

 

Correct, no delay in indexing. The raw log is in UTC, but because there is no timezone following the timestamp, the search head is interpreting it as EST. Indextime is the correct time of the event, so wouldn't that indicate that the event is being indexed at the right time, and the search head is misinterpreting it based on props.conf?

 

So from what you are telling me, there is no way to change the historical events. Would a better fix be to change the format of the logs at the original source?

How would I accomplish this? The indexers do not have any custom props deployed for the sourcetype, unless I am missing something. Data for this log is ingested from a UF that monitors a file directory.

"The proper solution (but applying only to newly ingested events) is to set a proper timezone for this source/sourcetype/host on the ingesting component (first HF or Indexer that's receiving and parsing data)."

 

0 Karma

PickleRick
SplunkTrust

No. Once the raw logs are indexed, they stay in that form forever (or at least until their retention time is over). You could "reingest" them using the collect command, but that (unless you're using the sourcetype of "stash" IIRC, which is not very useful) consumes license again.

So you'd still have to ingest the events again but this time you'd create them from already existing events.

The UF doesn't parse events. The UF sends the data to either an HF or - if you're not using HFs - straight to the indexer(s). In that case you need to set the timezone for this sourcetype, source or host on the indexers.

https://docs.splunk.com/Documentation/Splunk/8.2.3/Data/HowSplunkextractstimestamps

PickleRick
SplunkTrust

Sorry, did some testing and it seems that if you | collect into an index, you cannot modify the indexed timestamp of the original event. If the search results do not contain a timestamp at all, one is generated for them. But if they do, it is retained. So there is no way of recalculating and reingesting existing events.

So it seems that, if you still have the old files (the events are read from files as I remember, right?), it would be best to stop the input (or the whole forwarder) and reconfigure the source, sourcetype or host on the first HF or indexer, as I said before, so it parses the data properly. Then, depending on whether you really badly want the old data properly timed, either just re-enable the input/restart the UF, or:

1) | delete old data from index (or simply delete and recreate index if you have no other data in it and want to reclaim space)

2) reset metadata for the monitored files on the UF

3) re-enable input/restart UF so the files get re-ingested

djreschke
Communicator

@PickleRick 

 

Thank you for doing some additional testing and validation. I'll need to kick this up the flag pole to get the fix implemented. Kind of sucks that there is more than one way that Splunk reads time. I would think you would want to go off index time for everything and then change the time zone based on the user's location. I know there are other variables, like the time zone of the log source.

 

Thanks for all of the back and forth.

0 Karma

PickleRick
SplunkTrust

No. It's not that easy (and it is often more complicated with some other SIEM/log management solutions :-)).

Index time is one thing and it's just a "technical" timestamp connected to the indexing process itself and it tells you only when the event was written to index. That's all. And that's stored in _indextime.

But _time is the event time, and the event could obviously have happened way before indexing (for example if you're reading a whole day's logfile at once). So event time is typically parsed from the event itself or - in some cases, like HEC - might be provided independently of the raw event contents. Therefore Splunk usually applies parsing rules to the event and reads the timestamp from there, but if the time format does not contain timezone information, some timezone info must be inferred in order for Splunk to decide when this event really happened. That's why there are some rules as to how Splunk applies timezone information to an event if none is given in the event and/or in the configuration for the given source/sourcetype/host.

It is quite logical (although seems kinda chaotic at first).

Anyway, it's usually good practice performance-wise to define the time format for sources, because then Splunk just parses the time using the provided format and doesn't have to guess, which costs valuable CPU time.
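The point about inferring a timezone can be illustrated with a rough Python sketch. This is not Splunk's parsing logic; the function `parse_event_time` and its `fallback_tz` parameter are made up for illustration. It parses with an explicit format and only uses the fallback zone when the timestamp text has no offset.

```python
from datetime import datetime, timedelta, timezone


def parse_event_time(raw_timestamp, fallback_tz):
    """Return an aware datetime; fallback_tz is used only when the text has no offset."""
    try:
        # Explicit format with a UTC offset such as "-0500": nothing to infer.
        return datetime.strptime(raw_timestamp, "%Y-%m-%d %H:%M:%S %z")
    except ValueError:
        # No offset in the text: the time zone has to come from outside the event.
        naive = datetime.strptime(raw_timestamp, "%Y-%m-%d %H:%M:%S")
        return naive.replace(tzinfo=fallback_tz)


with_offset = parse_event_time("2021-11-22 08:35:55 -0500", timezone.utc)
without_offset = parse_event_time("2021-11-22 08:35:55", timezone.utc)

print(with_offset.isoformat())     # 2021-11-22T08:35:55-05:00
print(without_offset.isoformat())  # 2021-11-22T08:35:55+00:00
```

The same wall-clock text ends up five hours apart depending on whether the offset is in the event or has to be supplied, which is the situation this thread ran into.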

0 Karma