We are collecting various data from security equipment.
The data is stored in index=sec_A and arrives with sourcetype=A_syslog.
In props.conf, the data is filtered as follows, and the filtered events are split into different sourcetypes and indexes.
[A_syslog]
TRANSFORMS-<class_A> = a, b, c, d
TRANSFORMS-<class_B> = e, f, g
I now want to add more data to be filtered by b, but this data differs from the data currently being collected, including in the REGEX needed for its timestamp, so I think I need to collect it in a different way.
Is there a way to specify a different timestamp setting only for the data being added, while the existing data collection continues?
Others have already commented on this, so here are only some additions and clarifications.
In Splunk, you should think of one sourcetype as one lexical format of event. So if events have a different number of fields, a different field order, differently formatted timestamps, or timestamps in different places, you should use separate sourcetypes for them.
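A separate sourcetype for the new data could look like the sketch below. It assumes the new events can be given their own sourcetype before parsing, for example at the input, because timestamp extraction uses the sourcetype the event arrives with, and a sourcetype rewritten later by a transform does not trigger a second timestamp extraction. The stanza name A_syslog_new and the timestamp layout are hypothetical; adjust them to the real data.

```
# props.conf (sketch; stanza name and timestamp layout are assumptions)
[A_syslog_new]
# Timestamp is assumed to sit at the very start of the event
TIME_PREFIX = ^
# Assumed layout, e.g. 2024-05-01 12:34:56
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Only inspect the first 19 characters when looking for the timestamp
MAX_TIMESTAMP_LOOKAHEAD = 19
```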
As @livehybrid shows, you can extract different timestamp formats and evaluate them correctly with INGEST_EVAL. There are a couple of examples in the community, and some .conf presentations have additional examples.
The easiest way to test this is to ingest the events into your test environment or test indexes and then use SPL with a one-line eval to check how to get the correct format. You could see e.g.
Those contain some examples.
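As a rough illustration of that kind of one-line check, the search below builds a single fake event and runs the same coalesce/strptime expression against it. The sample _raw string and the field names parsed and readable are made up for the test; replace the format string and the substr length with whatever matches the real events.

```
| makeresults
| eval _raw="2024-05-01 12:34:56 fw01 action=blocked"
| eval parsed=coalesce(strptime(substr(_raw,1,19),"%Y-%m-%d %H:%M:%S"), _time)
| eval readable=strftime(parsed,"%Y-%m-%d %H:%M:%S")
| table _raw parsed readable
```

If strptime cannot parse the text it returns null, so parsed falls back to the _time that makeresults assigned. That makes a wrong format string easy to spot, because readable then shows the current time rather than the time in the event.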
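Also check whether you need to use := instead of = (in INGEST_EVAL, := overwrites an existing field value, while = can add a second value to a field that already exists).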
r. Ismo
I'm not 100% sure what you want to do and you're being quite vague about it. As @livehybrid already said, there are some ways to overwrite the default timestamp recognition but I'll add to it that it's needlessly complicated, might be difficult to maintain and adds extra load on the indexers since the timestamp has to be parsed twice out of the event.
While dynamic routing to another index is pretty common, and recasting one general sourcetype into "subsourcetypes" that are parsed slightly differently into fields at search time is also not unusual, splitting a single sourcetype/source/host stream into completely differently treated events is typically an indication that someone didn't bother to properly classify and split the data upstream (like reading the whole /var/log/messages or receiving syslog from the whole environment as the "syslog" sourcetype).
Hi @blanky
It can get pretty complicated trying to extract two different timestamp formats from the same sourcetype, but it isn't impossible.
You could try something like this:
== props.conf ==
[yourSourcetype]
TRANSFORMS-overwriteTime = overwriteTime
== transforms.conf ==
[overwriteTime]
INGEST_EVAL = _time=coalesce(strptime(substr(_raw,1,25),"%Y-%m-%d %H:%M:%S"),_time)
This tries to extract the time from the first 25 characters of the _raw event using the format provided (adjust accordingly); if that fails, it falls back on the previously determined _time.
This allows you to overwrite the _time extraction for your other data. You can develop this further depending on the various events coming in if necessary.
For more context on this check out Richard Morgan's fantastic props/transforms examples at https://github.com/silkyrich/ingest_eval_examples/blob/master/default/transforms.conf#L9
For time format variables see https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Commontimeformatvariables