Getting Data In

Put more than one data source into one index

tfechner
Path Finder

We have 4 servers running applications that should log to Splunk.
Log types are:
2x Apache = sourcetype=access_combined
1 app x log4j = sourcetype=log4j
1 app x log4j = sourcetype=log4j but with different contents

inputs.conf will be
udp:4444
sourcetype=????

Question:
Is Splunk able to select the correct parser? How is this done?
Which sourcetype do I have to enter in inputs.conf?
How can I expand the field extraction in props/transforms if I do not know the sourcetype for this special input? (I may have some other log4j inputs, too)

1 Solution

jconger
Splunk Employee

Technically speaking, you can do something like this:

props.conf

[source::udp:4444]
TRANSFORMS-set_sourcetype = set_sourcetype_access_combined, set_sourcetype_log4j, set_sourcetype_something_else

transforms.conf

[set_sourcetype_access_combined]
DEST_KEY = MetaData:Sourcetype
REGEX = some regex that matches your access_combined data
# For example, the REGEX used for Cisco ASA matching is REGEX = %ASA-\d-\d{6}
# Check out the Splunk Add-on for Cisco ASA for more examples.
FORMAT = sourcetype::access_combined

[set_sourcetype_log4j]
DEST_KEY = MetaData:Sourcetype
REGEX = some regex that matches your log4j data
FORMAT = sourcetype::log4j

[set_sourcetype_something_else]
DEST_KEY = MetaData:Sourcetype
REGEX = some regex that matches your other data (you get the idea by now...)
FORMAT = sourcetype::your_sourcetype

But, as @FrankVl pointed out, maintaining these regexes can become cumbersome over time if your data changes.
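For a concrete illustration, here is one possible pattern for the access_combined matcher. The regex below is only a sketch that keys on the leading "IP - user [timestamp]" shape of Apache combined access logs; it is an assumption for this example, not an official Splunk-supplied expression, so test it against your own data first.

transforms.conf

[set_sourcetype_access_combined]
DEST_KEY = MetaData:Sourcetype
# Assumed pattern: IPv4 address, two fields, then "[dd/Mon/yyyy" of the timestamp
REGEX = ^\d{1,3}(\.\d{1,3}){3} \S+ \S+ \[\d{2}/\w{3}/\d{4}
FORMAT = sourcetype::access_combined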



tfechner
Path Finder

OK, this might come with some limitations, but it helps me in a similar context where I have to split the events from one data source into different indexes based on a regex (to distinguish user access rights).
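For reference, a minimal sketch of that index-routing variant. The DEST_KEY for index routing is _MetaData:Index (note the leading underscore), and FORMAT is just the target index name; the transform name, regex placeholder, and index name below are made-up examples.

props.conf

[source::udp:4444]
TRANSFORMS-route_index = route_to_restricted

transforms.conf

[route_to_restricted]
# Assumed: a regex that matches the events meant for the restricted audience
REGEX = some regex that matches the restricted events
DEST_KEY = _MetaData:Index
FORMAT = restricted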


FrankVl
Ultra Champion

Thanks for that elaboration on what can be done, @jconger 🙂

It is not just cumbersome, though. If the events require different index-time processing (e.g. a different TIME_FORMAT setting, a different LINE_BREAKER, etc.), it is simply impossible to use a combined input like this, as you cannot use metadata field overrides to influence the index-time configuration that gets applied.

E.g., following your example, if you added the following to props.conf, it would be completely ignored for this data.

[access_combined]
TIME_FORMAT = foo
LINE_BREAKER = yada

[log4j]
TIME_FORMAT = bar
LINE_BREAKER = bla

FrankVl
Ultra Champion

I would really suggest figuring out a different solution than sending several very different log types to a single UDP input. At least send different log types to different ports, but even better: just put a Universal Forwarder on each of those servers and read the logs locally from files.

Although there are ways to set a generic sourcetype in inputs.conf and then use props and transforms to override it based on event content, that gets messy quite quickly, and certain index-time configurations (like TIME_FORMAT) cannot use the overridden sourcetype.
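As a sketch of those two alternatives (the ports, file paths, and sourcetype assignments below are assumptions for illustration):

inputs.conf (separate ports, explicit sourcetypes)

[udp://4444]
sourcetype = access_combined

[udp://4445]
sourcetype = log4j

inputs.conf (on a Universal Forwarder, reading local files)

[monitor:///var/log/httpd/access_log]
sourcetype = access_combined

[monitor:///opt/app1/logs/app.log]
sourcetype = log4j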
