Timestamping different formats in same sourcetype

baseballnut8200
Explorer

All...

I'm looking for thoughts on handling different timestamp formats within the same sourcetype. We are ingesting CrowdStrike data that is simply being dumped into an S3 bucket. Some of the data lands in specific directories within the bucket, so I can set the sourcetype at the source level for those. However, other data arrives in the same bucket, and even in the same file, with differing timestamp formats. Examples of what we are seeing:

"modified_time":"2022-01-10T23:58:25.865570789Z"

"timestamp":"2022-01-21T20:37:37Z"

We have tried defining a datetime.xml and have used the following props settings:

[crowdstrike:edr]
LINE_BREAKER = ([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 30
SHOULD_LINEMERGE = false
#TIME_FORMAT = %s%3N
TIME_PREFIX = "timestamp":|"modified_time":|"_time":|"Time":
#TIME_PREFIX = timestamp
DATETIME_CONFIG = /etc/apps/fmac_crowdstrike_props/datetime.xml
TRANSFORMS-filter-edr-splunkd = crowdstrike_filter_splunk,crowdstrike_filter_splunkforwarder,crowdstrike_filter_endofprocess
TRUNCATE = 999999
disabled = false
kv_mode = json

Please let me know if you have any thoughts on this or ideas that will help.  Thanks!


baseballnut8200
Explorer

Right... so I read the article and felt like this might be a good solution. I have implemented it on our testing box, but now the events are getting stamped with the index time. It seems like DATETIME_CONFIG = CURRENT is winning and the transform is not doing what I expect. Here are the props and transforms I am using; maybe I am missing something:

Props:

[crowdstrike:edr]
DATETIME_CONFIG = CURRENT
LINE_BREAKER = ([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 30
SHOULD_LINEMERGE = false
TIME_PREFIX = \"timestamp\":|\"modified_time\":|\"_time\":|\"Time\":
TRUNCATE = 999999
disabled = false
kv_mode = json
TRANSFORMS-extract_date = multiple_timestamp_format

transforms:

[multiple_timestamp_format]
INGEST_EVAL= _time=case(isnotnull(strptime(_raw, "%Y-%m-%dT%H:%M:%S.%QZ")), strptime(_raw, "%Y-%m-%dT%H:%M:%S.%QZ"), isnotnull(strptime(_raw,"%s%3N")), strptime(_raw, "%s%3N"))

Just let me know what you think...  Thanks!

PickleRick
SplunkTrust

Just for clarity, I'd lose the isnotnull and use coalesce instead. It's more readable that way.

Also, if your parsing is not working (and thus the events are getting index time), you can add a constant "fallback" at the end so the expression always matches. That tells you whether the EVAL is matching wrongly or is not being run at all.

Like

INGEST_EVAL= _time=coalesce(strptime(_raw, "%Y-%m-%dT%H:%M:%S.%QZ"), strptime(_raw, "%s%3N"), 1)

This way, if none of the strptime calls produces a non-null result, your event should get indexed in 1970 🙂
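
To make the fallback idea concrete outside of Splunk, here is a rough Python sketch of the same "try each format, else use a sentinel" logic. It is only an illustration, not Splunk's own parsing: the function names are made up, the key list is copied from the TIME_PREFIX in this thread, and the epoch-milliseconds branch is an assumption based on the commented-out %s%3N format. Note that it first pulls the timestamp value out of the event and only then tries to parse it.

```python
import re
from datetime import datetime, timezone

# Keys copied from the TIME_PREFIX setting in this thread.
KEY_RE = re.compile(r'"(?:timestamp|modified_time|_time|Time)":\s*"?([^",}]+)')
FALLBACK_EPOCH = 1.0  # sentinel, mirrors the constant 1 in the coalesce example


def parse_iso_z(value):
    """Parse 2022-01-21T20:37:37Z, with an optional fractional part of any length."""
    m = re.fullmatch(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z", value)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    # strptime's %f stops at 6 digits, so the fraction is added by hand.
    frac = float("0." + m.group(2)) if m.group(2) else 0.0
    return base.timestamp() + frac


def parse_epoch_millis(value):
    """Parse a 13-digit epoch-milliseconds value (assumed format)."""
    if re.fullmatch(r"\d{13}", value):
        return int(value) / 1000.0
    return None


def event_time(raw):
    """Return epoch seconds for the event, or the sentinel if nothing parses."""
    m = KEY_RE.search(raw)
    if m is None:
        return FALLBACK_EPOCH
    value = m.group(1)
    for parser in (parse_iso_z, parse_epoch_millis):
        result = parser(value)
        if result is not None:
            return result
    return FALLBACK_EPOCH


if __name__ == "__main__":
    print(event_time('{"modified_time":"2022-01-10T23:58:25.865570789Z"}'))
    print(event_time('{"timestamp":"2022-01-21T20:37:37Z"}'))
    print(event_time('{"unrelated":"value"}'))  # falls back to the sentinel
```

Any event that comes back with the sentinel value shows that the logic ran but none of the formats matched, which is the same diagnostic the constant 1 gives in the coalesce example.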


isoutamo
SplunkTrust
Hi
When you are configuring _time recognition manually, it's usually best to set "DATETIME_CONFIG = " (empty value) to avoid any surprises.
r. Ismo

baseballnut8200
Explorer

Right...  according to the .conf presentation, you are supposed to set the DATETIME_CONFIG = CURRENT, which is what I tried.  I also commented out the DATETIME_CONFIG to see if that would help, but no luck there.  I can try setting the DATETIME_CONFIG="" to see what that gets, but not sure that gets me what I am looking for.  Will let you know...


isoutamo
SplunkTrust
Hi
You could use INGEST_EVAL in transforms.conf (referenced from props.conf via a TRANSFORMS- setting) with the needed if/case logic to select or calculate the correct value for _time. If I recall correctly, there is a .conf presentation where this is one of the examples.
r. Ismo

PickleRick
SplunkTrust

Yup. If I remember correctly, date parsing happens relatively early in the processing pipeline, so you can't modify the message prior to timestamp extraction. Therefore you can only modify it "post-mortem" 😉 with ingest-time eval.

You want this pdf - https://conf.splunk.com/files/2020/slides/PLA1154C.pdf

Especially pages 26 onward.
