I have a JSON file which contains a "date" field. The date field in my data can be in either the format %Y-%m-%d %H:%M:%S (e.g. 2025-05-23 9:35:35 PM) or %Y-%m-%d (e.g. 2025-05-23). The only way to ingest this JSON is via manual ingestion.
When trying to set the _time field on ingest, setting the timestamp format to %Y-%m-%d %H:%M:%S fails and defaults to the wrong _time value for date fields with format %Y-%m-%d. However, setting the timestamp format to %Y-%m-%d won't capture the HMS part. Is there a way to coalesce these, so that it checks whether HMS is present and, if so, applies the %Y-%m-%d %H:%M:%S format? Or is there a workaround so that at least the _time value is accurate on ingestion?
Hi @cherrypick
This sounds like a job for INGEST_EVAL - there are great examples at https://github.com/silkyrich/ingest_eval_examples/blob/master/default/transforms.conf from Rich Morgan @ Splunk.
However, for your specific example, the following config should hopefully work. It checks the time format first, then sets _time accordingly:
== props.conf ==
[yourSourceType]
TRANSFORMS-setCustomTime = setJSONTime
== transforms.conf ==
[setJSONTime]
INGEST_EVAL = _time=if(match(date, "\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} [AP]M"), strptime(date, "%Y-%m-%d %I:%M:%S %p"), strptime(date, "%Y-%m-%d"))
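The same conditional logic can be sanity-checked outside Splunk before touching transforms.conf. Here is a minimal Python sketch (not Splunk code) of what the INGEST_EVAL above does: if the value matches the date-time pattern, parse with %I:%M:%S %p; otherwise parse as a bare date.

```python
import re
from datetime import datetime

# Mirrors the INGEST_EVAL: pick the strptime format based on whether
# an H:M:S AM/PM part is present in the "date" field.
DATETIME_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} [AP]M")

def parse_event_time(date_value: str) -> datetime:
    if DATETIME_RE.search(date_value):
        return datetime.strptime(date_value, "%Y-%m-%d %I:%M:%S %p")
    # Date-only values fall back to midnight, like strptime(date, "%Y-%m-%d")
    return datetime.strptime(date_value, "%Y-%m-%d")

print(parse_event_time("2025-05-23 9:35:35 PM"))  # 2025-05-23 21:35:35
print(parse_event_time("2025-05-23"))             # 2025-05-23 00:00:00
```

Note %I with a single-digit hour ("9") parses fine, which matters for the sample data.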
Interestingly, I did consider this as I posted it, and decided not to use := based on the examples of setting _time in Rich's examples at https://github.com/silkyrich/ingest_eval_examples/blob/master/default/transforms.conf, which only use = (but oddly only on _time fields). I wonder why this is the case? I might check, but I thought it wasn't possible to set a multi-value field for _time (and _raw) using INGEST_EVAL?
Again, this does not work for me, because it filters events at search time rather than fixing _time at ingest, so it doesn't respect the shared time picker.
Let's say I have an event that has been indexed with date 2025-04-01, but I ingest it so that _time is 2025-04-02 (so date and _time are mismatched). If I use a timechart command to filter alerts over 2025-04-01, this event will not appear on the timechart, because events are first filtered on _time. Even if I run timechart by date, this event will not appear.
My core issue is how to ensure _time and date fields are the same in the index (NOT SEARCH TIME) when ingesting data with mismatched formats.
I need this to be done at ingest and not during search. The reason is that Splunk first filters on _time, so incorrect _time values will filter out results that shouldn't be filtered out.
Splunk cannot natively parse multiple timestamp formats for the same field at index time—it only allows a single TIME_FORMAT per sourcetype.
If you can preprocess or route events differently, you can assign different sourcetypes based on the date format:
# props.conf
[test_json]
TRANSFORMS-set_sourcetype = set_sourcetype_datetime, set_sourcetype_dateonly
[test_json_datetime]
TIME_PREFIX = "date":\s*"
TIME_FORMAT = %Y-%m-%d %I:%M:%S %p
[test_json_dateonly]
TIME_PREFIX = "date":\s*"
TIME_FORMAT = %Y-%m-%d
# transforms.conf
[set_sourcetype_datetime]
REGEX = "date":\s*"\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} [AP]M"
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::test_json_datetime
[set_sourcetype_dateonly]
REGEX = "date":\s*"\d{4}-\d{2}-\d{2}"
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::test_json_dateonly
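As a sanity check on the routing regexes (again a Python sketch, not Splunk): the date-only pattern ends in a closing quote, so it does not accidentally match events that have an HMS part, and each sample event lands in exactly one sourcetype.

```python
import re

# The two REGEX patterns from transforms.conf, in evaluation order.
ROUTES = [
    ("test_json_datetime",
     re.compile(r'"date":\s*"\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2} [AP]M"')),
    ("test_json_dateonly",
     re.compile(r'"date":\s*"\d{4}-\d{2}-\d{2}"')),
]

def route(event: str) -> str:
    for sourcetype, pattern in ROUTES:
        if pattern.search(event):
            return sourcetype
    return "test_json"  # no match: keep the original sourcetype

print(route('{"date": "2025-05-23 9:35:35 PM", "event": "Login"}'))  # test_json_datetime
print(route('{"date": "2025-05-23", "event": "Logout"}'))            # test_json_dateonly
```

A date-time value like "2025-05-23 9:35:35 PM" is followed by a space rather than a closing quote after the date, so the date-only pattern can never claim it.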
Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a kudos. Thanks!
Then you can try this:
props.conf
[json_splunk]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = "date":\s*"
TIME_FORMAT = %Y-%m-%d %I:%M:%S %p
MAX_TIMESTAMP_LOOKAHEAD = 60
TRANSFORMS-normalize = fix_date_field, fix_time_hour
transforms.conf
[fix_date_field]
REGEX = ("date":\s*")(\d{4}-\d{2}-\d{2}|\d{2}-\d{2}-\d{2})(")
FORMAT = $1$2 12:00:00 AM$3
DEST_KEY = _raw
[fix_time_hour]
REGEX = ("date":\s*".*?\s)(\d{1})(:\d{2}:\d{2}\s(?:AM|PM))
FORMAT = $10$2$3
DEST_KEY = _raw
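The two _raw rewrites can be exercised outside Splunk as well. This is a hedged Python equivalent (the patterns mirror the transforms above, applied in the same order): fix_date_field appends a midnight time to date-only values, and fix_time_hour zero-pads single-digit hours so the %I format parses consistently.

```python
import re

# fix_date_field: date-only values get " 12:00:00 AM" appended.
DATE_ONLY = re.compile(r'("date":\s*")(\d{4}-\d{2}-\d{2}|\d{2}-\d{2}-\d{2})(")')
# fix_time_hour: single-digit hours get a leading zero.
ONE_DIGIT_HOUR = re.compile(r'("date":\s*".*?\s)(\d)(:\d{2}:\d{2}\s(?:AM|PM))')

def normalize(raw: str) -> str:
    raw = DATE_ONLY.sub(r'\g<1>\g<2> 12:00:00 AM\g<3>', raw)
    raw = ONE_DIGIT_HOUR.sub(r'\g<1>0\g<2>\g<3>', raw)
    return raw

print(normalize('{"date": "2025-05-23", "event": "Logout"}'))
# {"date": "2025-05-23 12:00:00 AM", "event": "Logout"}
print(normalize('{"date": "2025-05-23 9:35:35 PM", "event": "Login"}'))
# {"date": "2025-05-23 09:35:35 PM", "event": "Login"}
```

Order matters: the midnight time added by the first rewrite already has a two-digit hour, so the second rewrite leaves it alone.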
Sample events I tried:
{"date": "2025-05-23 9:35:35 PM", "event": "Login"}
{"date": "2025-05-23", "event": "Logout"}
{"date": "2025-05-24 10:15:00 AM", "event": "Login"}
{"date": "2025-05-24", "event": "Logout"}
{"date": "2025-05-25 11:45:00 AM", "event": "Update"}
{"date": "2025-05-25", "event": "Login"}
{"date": "2025-05-26 12:00:00 PM", "event": "Logout"}
{"date": "2025-05-26", "event": "Update"}
{"date": "2025-05-27 1:30:00 PM", "event": "Login"}
{"date": "2025-05-27", "event": "Logout"}
SPL test with dummy data using makeresults:
| makeresults count=10
| streamstats count as id
| eval raw_json=case(
id=1, "{\"date\": \"2025-05-23 9:35:35 PM\", \"event\": \"Login\"}",
id=2, "{\"date\": \"2025-05-23\", \"event\": \"Logout\"}",
id=3, "{\"date\": \"2025-05-24 10:15:00 AM\", \"event\": \"Login\"}",
id=4, "{\"date\": \"2025-05-24\", \"event\": \"Logout\"}",
id=5, "{\"date\": \"2025-05-25 11:45:00 AM\", \"event\": \"Update\"}",
id=6, "{\"date\": \"2025-05-25\", \"event\": \"Login\"}",
id=7, "{\"date\": \"2025-05-26 12:00:00 PM\", \"event\": \"Logout\"}",
id=8, "{\"date\": \"2025-05-26\", \"event\": \"Update\"}",
id=9, "{\"date\": \"2025-05-27 1:30:00 PM\", \"event\": \"Login\"}",
id=10, "{\"date\": \"2025-05-27\", \"event\": \"Logout\"}"
)
| spath input=raw_json
| eval parsed_time = if(match(date, "\\d{4}-\\d{2}-\\d{2} \\d{1,2}:\\d{2}:\\d{2} [APap][Mm]"),
strptime(date, "%Y-%m-%d %I:%M:%S %p"),
strptime(date, "%Y-%m-%d"))
| eval _time = parsed_time
| table _time, date, event
| makeresults count=4
| streamstats count AS row
| eval _raw=case(
row=1, "{\"date\":\"2025-05-23 21:35:35\"}",
row=2, "{\"date\":\"2025-05-22\"}",
row=3, "{\"date\":\"2025-05-21 15:20:00\"}",
row=4, "{\"date\":\"2025-05-20\"}"
)
| spath
| eval _time=if(match(date, "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
strptime(date, "%Y-%m-%d %H:%M:%S"),
strptime(date, "%Y-%m-%d"))
| table date _time
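For completeness, the 24-hour variant in the SPL above can be sketched the same way in Python (assumption: the date field is either %Y-%m-%d %H:%M:%S or %Y-%m-%d, as in the makeresults sample):

```python
import re
from datetime import datetime

def parse_24h(date_value: str) -> datetime:
    # Mirrors the SPL if/match: try the full timestamp pattern first,
    # fall back to the bare date (midnight) otherwise.
    if re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", date_value):
        return datetime.strptime(date_value, "%Y-%m-%d %H:%M:%S")
    return datetime.strptime(date_value, "%Y-%m-%d")

print(parse_24h("2025-05-23 21:35:35"))  # 2025-05-23 21:35:35
print(parse_24h("2025-05-22"))           # 2025-05-22 00:00:00
```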