Getting Data In

Set index-time settings (timestamp, line breaking, etc.) for a sourcetype set by transforms

salem34
Path Finder

Hi Ninjas

I'm struggling with the following scenario:

I have a heavy forwarder that collects a merged data stream with the sourcetype "generic_sourcetype". For example, the stream contains events in the following formats:

Event 1

Sep 24 18:22:16 - 209.160.24.63 - - [24/Sep/2017:18:22:16.885] "GET /product.screen?productId=WC-SH-A02&JSESSIONID=SD0SL6FF7ADFF4953 HTTP 1.1" 200 3878 "http://www.google.com" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.46 Safari/536.5" 349 sourcetype:a

Event 2

Sep 24 00:15:03 - 209.160.24.63 - - Thu Sep 24 2017 00:15:02.554 www1 sshd[4747]: Failed password for invalid user jabber from 118.142.68.222 port 3187 ssh2 sourcetype:b

This arrives as one merged data stream (no, I can't influence that), so I built a "routing" with transforms.conf on the heavy forwarder like this:

props.conf

[generic_sourcetype]
TRANSFORMS-route_st = route_st_a, route_st_b

transforms.conf

[route_st_a]
REGEX = sourcetype:a
FORMAT = sourcetype::a
DEST_KEY = MetaData:Sourcetype

[route_st_b]
REGEX = sourcetype:b
FORMAT = sourcetype::b
DEST_KEY = MetaData:Sourcetype

So far so good: this config works and both sourcetypes are indexed properly. Now the problem is the following:
Both events have a detailed timestamp with milliseconds after the header, which I want to use as the indexed timestamp. So I configured parsing settings in props.conf for both sourcetypes (a and b) on the heavy forwarder like this:

props.conf

[a]
TIME_PREFIX = $ProperSetting
TIME_FORMAT = $ProperSetting
MAX_TIMESTAMP_LOOKAHEAD = $ProperSetting

[b]
TIME_PREFIX = $ProperSetting
TIME_FORMAT = $ProperSetting
MAX_TIMESTAMP_LOOKAHEAD = $ProperSetting

Testing these settings with a oneshot input, with the dedicated sourcetype set at input time, shows that my configs are correct: the right timestamp is extracted for both events.

But it does not work with my generic stream. The stream is split into the two sourcetypes, but my timestamp configuration is ignored and the first timestamp (the one in the header) is indexed for both events.

So it seems that the heavy forwarder assigns a timestamp for generic_sourcetype first, then processes the transforms that rewrite the sourcetype, and then sends the events on directly instead of re-parsing them with the settings for the new sourcetype.

Is this how Splunk handles this kind of data? Or am I missing something (or somewhere)?

Thanks as always


maciep
Champion

I'm pretty sure that is how Splunk handles the data. Timestamp recognition happens before your transforms are called, and Splunk won't re-evaluate it after the new sourcetype is assigned. Typically, the sourcetype assignment is more of a last step to prepare for sourcetype-specific field extractions on the search head (or indexed extractions).

There's a very nice flow chart here:
https://wiki.splunk.com/Community:HowIndexingWorks
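
Given that order, one thing that might be worth trying is to put the timestamp settings on the stanza the events still have at parsing time, [generic_sourcetype], instead of on [a] and [b]. This is only a sketch and I haven't tested it against your data. The TIME_PREFIX regex and the lookahead value are my guesses based on the two sample events. TIME_FORMAT takes a single format, so I've left it out and assumed that Splunk's automatic detection can read both layouts once the prefix is skipped. Check it with a oneshot before relying on it.

props.conf

[generic_sourcetype]
# Existing routing, unchanged
TRANSFORMS-route_st = route_st_a, route_st_b
# Assumption: skip the header up to the " - - " that precedes the detailed
# timestamp in both sample events, plus the optional "[" used by format a
TIME_PREFIX = \s-\s-\s\[?
# Assumption: 30 characters covers the longer sample ("Thu Sep 24 2017 00:15:02.554")
MAX_TIMESTAMP_LOOKAHEAD = 30

If automatic detection doesn't pick up one of the two layouts, a single stanza like this won't be enough, because one TIME_FORMAT can't describe both.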
