Getting Data In

How do I properly extract fields from syslog events that were split into different sourcetypes?

StephenD1
Path Finder

This is a follow-up question to the solution in this thread:

https://community.splunk.com/t5/Getting-Data-In/create-multiple-sourcetypes-from-single-syslog-sourc...

I'm trying to do exactly what the original question asked, but I need to apply different DELIMS/FIELDS values to the different sourcetypes I create this way.

The solution says that once the new sourcetype is created, you can "...just use additional transforms entries with regular expressions that fit the specific subset of data...". Does this mean that if I want to extract further fields from the new sourcetype, I can only do that using TRANSFORMS from that point forward? Or can I put a new stanza further down in props.conf for [my_new_st] and use additional REPORTs or EXTRACTs that apply only to that new sourcetype?

For example, can I do something like the following?
Description: first split the individual events based on the value regex-matched in the 5th field, then apply different field extractions to each of the new sourcetypes.

props.conf:

[syslog]
TRANSFORMS-create_sourcetype1 = create_sourcetype1
TRANSFORMS-create_sourcetype2 = create_sourcetype2

[sourcetype1]
REPORT-extract = custom_delim_sourcetype1

[sourcetype2]
REPORT-extract = custom_delim_sourcetype2

transforms.conf:

[create_sourcetype1]
REGEX = ^(?:[^ \n]* ){5}(my_log_name_1:)\s
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::sourcetype1

[create_sourcetype2]
REGEX = ^(?:[^ \n]* ){5}(my_log_name_2:)\s
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::sourcetype2

[custom_delim_sourcetype1]
DELIMS = " "
FIELDS = d_month,d_date,d_time,d_source,d_logname,d_info,cs_url,cs_bytes,cs_port

[custom_delim_sourcetype2]
DELIMS = " "
FIELDS = d_month,d_date,d_time,d_source,d_logname,d_info,cs_username,sc_http_status

PickleRick
SplunkTrust

Something like that.

Explanation: apart from the maintenance work that happens behind the scenes, Splunk generally processes data in two stages.

One set of operations happens while an event is being ingested: the so-called index-time operations. After the event is indexed, search-time operations run whenever you search the indexes and process the results further.

So during indexing you rewrite the sourcetype metadata field using TRANSFORMS. The event is then indexed with the new sourcetype.
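As a rough illustration of that index-time step, here is an annotated transforms.conf stanza in the same shape as the ones in the question. The stanza name, sourcetype3, and my_log_name_3 are hypothetical. The regex assumes the log name is the fifth whitespace-separated token, so it skips four tokens before it. Adjust the count to match the real events.

```ini
# transforms.conf (sketch; the names below are hypothetical)
[create_sourcetype3]
# Skip four whitespace-separated tokens, then require the log name as the fifth.
# \S+\s+ also tolerates the double space syslog puts before single-digit days.
REGEX = ^(?:\S+\s+){4}my_log_name_3:\s
# Write the result into the event's sourcetype metadata.
DEST_KEY = MetaData:Sourcetype
# The value must carry the sourcetype:: prefix.
FORMAT = sourcetype::sourcetype3
```

It would be referenced from the [syslog] stanza in props.conf with a line such as TRANSFORMS-create_sourcetype3 = create_sourcetype3. Events that match none of the rewriting regexes simply keep the syslog sourcetype.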

Then, when you search, the event is parsed according to the search-time extractions defined for its sourcetype (REPORT and EXTRACT settings), and those are defined separately for each of the "new" sourcetypes.
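For instance, a regex-based EXTRACT could stand in for the delimiter-based REPORT on one of the new sourcetypes. This is only a sketch. The class name user_status is made up, and the pattern assumes that the first six fields in the question's FIELDS list are each a single whitespace-free token and that the status is a three-digit code.

```ini
# props.conf (sketch; applied at search time, only to events indexed as sourcetype2)
[sourcetype2]
# Skip the six leading tokens (d_month ... d_info), then capture the next two.
EXTRACT-user_status = ^(?:\S+\s+){6}(?<cs_username>\S+)\s+(?<sc_http_status>\d{3})
```

Because the stanza is keyed on sourcetype2, it never touches events indexed as sourcetype1 or syslog.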

This is actually quite a typical use case: split a "combined" sourcetype into separate ones during indexing and define different search-time configurations for each of them.

StephenD1
Path Finder

OK, got it. So, if I'm understanding you correctly, configs similar to my example should split my syslog events by regex at index time. Then, at search time, Splunk applies the REPORT/EXTRACT settings for whichever new sourcetype the TRANSFORMS already wrote into the index, correct?

PickleRick
SplunkTrust

Yes. During ingestion you overwrite the original sourcetype. From then on, Splunk has no idea what the original sourcetype was. At search time it behaves exactly as if you had ingested the events with the new sourcetypes from scratch. Splunk has no idea at search time what happened at index time; it only sees the indexed results of the index-time operations.
