I'm getting logs from a dockerized in-house developed application and ingesting them into Splunk.
Three types of logs are written to the same log file:
1. Application logs (single line, internal format)
2. UWSGI logs (multiline)
3. ModSecurity serial logging (multiline)
The logs are forwarded to a remote syslog server and then ingested into Splunk with a universal forwarder. Because the logs are in different formats, I want to separate them into different indexes so that each type can be processed differently.
Is there any good documentation, forum post, or tutorial that describes an effective way to separate different log types from a mixed source?
Thank you!
Hi @rubenmuradyan,
you don't need to put the logs in different indexes. Usually a log goes into a separate index only when it needs a different retention period or different access grants.
Instead, you have to assign a different sourcetype to each kind of log, because they have different formats and because the sourcetype, not the index, is the real differentiator between logs. Remember that Splunk isn't a database, where the table is what distinguishes the data.
Anyway, the correct approach is to override the sourcetype on the Indexers or (if present) on the Heavy Forwarders, following the instructions at https://docs.splunk.com/Documentation/Splunk/8.2.5/Data/Advancedsourcetypeoverrides
In short, you have to find a regex that identifies each kind of log and then create a stanza for each destination sourcetype, in:
props.conf
[origin_sourcetype]
TRANSFORMS-sourcetype = override_sourcetype1, override_sourcetype2, override_sourcetype3
and in transforms.conf
[override_sourcetype1]
REGEX = sourcetype1_regex
FORMAT = sourcetype::sourcetype1
DEST_KEY = MetaData:Sourcetype
[override_sourcetype2]
REGEX = sourcetype2_regex
FORMAT = sourcetype::sourcetype2
DEST_KEY = MetaData:Sourcetype
[override_sourcetype3]
REGEX = sourcetype3_regex
FORMAT = sourcetype::sourcetype3
DEST_KEY = MetaData:Sourcetype
Remember that these conf files must be on the Indexers or, when present, on the Heavy Forwarders.
Then remember to restart Splunk on the modified server.
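As an illustration only, here is how two of the placeholder stanzas above might look once filled in. Everything specific in it is an assumption: app_mixed is a hypothetical name for the sourcetype assigned at input time, modsec:audit and uwsgi:request are hypothetical destination sourcetypes, and the regexes assume ModSecurity serial audit entries that begin with a boundary line such as --a1b2c3d4-A-- and uWSGI request lines in the default "[pid: ...|app: ...|req: ...]" layout. If the syslog server prepends its own header to each line, the ^ anchor will not match and should be dropped.

```ini
# props.conf (on the Indexers or Heavy Forwarders)
# "app_mixed" is a hypothetical name for the original sourcetype
[app_mixed]
TRANSFORMS-sourcetype = set_st_modsec, set_st_uwsgi

# transforms.conf
[set_st_modsec]
# assumption: the event starts with a ModSecurity section-A boundary
REGEX = ^--[0-9a-fA-F]+-A--
FORMAT = sourcetype::modsec:audit
DEST_KEY = MetaData:Sourcetype

[set_st_uwsgi]
# assumption: uWSGI default request log layout
REGEX = \[pid: \d+\|app: 
FORMAT = sourcetype::uwsgi:request
DEST_KEY = MetaData:Sourcetype
```

The application logs would need a third stanza with a regex for the internal format, which isn't shown in this thread. It is worth checking each regex against real sample events before deploying.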
Ciao.
Giuseppe
Thank you so much @gcusello, that really helped.
Perhaps you know (or can point me to the right documentation) how to distinguish between single-line and multiline entries coming in the same log file?
I'm not sure it is a good idea to combine two kinds of regexes (single-line and multiline) for one log. Additionally, Splunk's default approach of starting a multiline event at a timestamp will obviously not work with ModSecurity entries: they are not prefixed with a timestamp, at least not in serial audit logging.
Thank you!
Hi @rubenmuradyan,
you have to use a multiline sourcetype (SHOULD_LINEMERGE = true) for the original sourcetype, so that both cases are handled; then, in the sourcetype override, you can set the correct sourcetype.
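A minimal sketch of what that could look like, with several assumptions: app_mixed is a hypothetical name for the original sourcetype, ModSecurity entries are assumed to start with a boundary line such as --a1b2c3d4-A--, uWSGI entries with "[pid: ", and the application's single-line entries with an ISO-style date (the real internal format isn't shown here, so that part of the regex is only a stand-in). BREAK_ONLY_BEFORE_DATE is switched off so that the timestamp inside a ModSecurity entry doesn't split it.

```ini
# props.conf (on the Indexers or Heavy Forwarders)
# "app_mixed" is a hypothetical name for the original sourcetype
[app_mixed]
SHOULD_LINEMERGE = true
# don't start a new event just because a line contains a date
BREAK_ONLY_BEFORE_DATE = false
# start a new event only at a line that opens one of the three entry types
# (the last alternative is a placeholder for the application log format)
BREAK_ONLY_BEFORE = ^(--[0-9a-fA-F]+-A--|\[pid: \d+\||\d{4}-\d{2}-\d{2})
```

Lines that match none of the alternatives are merged into the preceding event, so a single-line application entry that follows a ModSecurity block is only split off correctly if its pattern is included in the regex.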
Ciao.
Giuseppe
Thank you so much, @gcusello !