Solved: How do I keep date fields from doubling at ingesti...

Nanuk · ‎03-29-2020

So I'm working on a project where i'm ingesting csv files.

These file's time stamp can't be read until I pass the:

Timestamp format = %Y-%m-%dT%H:%M:%S.%z
Timestamp fields = time (which isn't in the csv, but shows up in the "Set Source Type" page of "Add Data" )

Without it I get the errors:
- Could not use strptime to parse timestamp from "2020 february tuesday 11 3 2 local 18 2020-02-18T11:03:02.000-0600".
- Could not use regex to parse timestamp from "11 3 2 local 18 2020-02-18T1".

my problem is that after ingestion i get double fields of every time related field.
i.e.
date_year = 2020
date_year = 2020
date_month = february
date_month = february
date_wday = tuesday
date_wday = tuesday
date_mday = 18
date_mday = 18
date_hour = 11
date_hour = 11
date_minute = 3
date_minute = 3
date_second = 6
date_second = 6
date_zone = local
date_zone = local

It still shows up in the fields sidebar as just single fields, but when I add them as an interesting field they double in the events page or when I list them using a table command.

I've tried also setting Extraction to current time in "Set Source Type" page of "Add Data", but I still get double.

*sample of when I table the date fields

*sample of what the field dates look like in events

Nanuk · ‎03-30-2020

Figured it out.

Seems I wasn't given a raw log file, but an extracted table of a sourcetype from and index. So it included the fields that Splunk generates on ingestion. Once I went through the log and deleted those fields it solved my problem.

Thanks @woodcock for your take regarding this, it tipped me in the right direction.

View solution in original post

Nanuk · ‎03-30-2020

Figured it out.

Seems I wasn't given a raw log file, but an extracted table of a sourcetype from and index. So it included the fields that Splunk generates on ingestion. Once I went through the log and deleted those fields it solved my problem.

Thanks @woodcock for your take regarding this, it tipped me in the right direction.

woodcock · ‎03-29-2020

I don't know what is causing this HOWEVER I do know that these fields are FAR less "interesting" than you think that they are. These fields are for DEBUGGING purposes; not for general use. They are leftovers of the timesamping and timezoning process and are PRE-NORMALIZED meaning that they are NOT for general use. If you do not believe me then run this search and see for yourself:

index=* OR index=_* date_hour=*
| eval real = strftime(_time, "%A %B %d %T %Y %z")
| rex field=real "^(?<real_wday>\S+)\s+(?<real_month>\S+)\s+(?<real_mday>\S+)\s+(?<real_hour>\d+):(?<real_minute>\d+):(?<real_second>\d+)\s+(?<real_year>\S+)\s+(?<real_zone>\S+)"
| table real* date*
| foreach  real* date* [ eval <<FIELD>> = lower(<<FIELD>>)]
| stats count count(eval(real_wday!=date_wday)) AS diff_wday count(eval(real_month!=date_month)) AS diff_month count(eval(real_mday!=date_mday)) AS diff_mday count(eval(real_hour!=date_hour)) AS diff_hour count(eval(real_minute!=date_minute)) AS diff_minute count(eval(real_second!=date_second)) AS diff_second count(eval(real_year!=date_year)) AS diff_year count(eval(real_zone!=date_zone)) AS diff_zone

Just think about it for a second. If you have a different personal Time zone setting than somebody else and you both run the same search for "yesterday", you are GUARANTEED to get different answers because the events brought back will be different AND because you are using date_* fields that are pre-normalized. NEVER use the date_* fields; always calculate your own.

How do I keep date fields from doubling at ingestion?

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Introducing ITSI 5.0: Unified Visibility and Actionable Insights

Inside Splunk Agent Observability: Understanding Agent Behavior, Tokens & Costs

From Data to Insight: Announcing the Winners of the Splunk Dashboard Contest

Join the Conversation