How to use LINE_BREAKER from one source with multiple sourcetypes?

Path Finder

Hello,

I'd like to use LINE_BREAKER and SHOULD_LINEMERGE for logs that arrive from a single source but originate from multiple devices.

inputs.conf

[tcp://34065]
connection_host = none
host = us_forwarder
index = index1
source = us_forwarder

props.conf

[us_forwarder]
## PA, Trend Micro, Fireeye logs
TRANSFORMS-sourcetype_and_host_override = us_paloalto_hostoverride, us_paloalto_sourcetypeoverride, us_fireeye_hostoverride, us_fireeye_sourcetypeoverride
TZ = US/Eastern

The FireEye logs need to be reassembled because I receive them in JSON format; the others (Palo Alto, Trend Micro) do not:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(^\{.*\}$)

If I add these two lines to the stanza in props.conf, all the Palo Alto logs get aggregated into a single entry.

I can't figure out how to achieve this. Any ideas? Thanks.

Influencer

@lquinn is correct: LINE_BREAKER is applied in the parsing pipeline, which runs before TRANSFORMS are applied in the typing pipeline. For an in-depth look at all the steps your data takes from input to index, check out this reference: https://wiki.splunk.com/Community:HowIndexingWorks
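
To make the ordering concrete, here is a rough sketch of the kind of setup that cannot work. The sourcetype name fireeye_json is made up for illustration; the transform name is taken from the question.

```ini
# props.conf (illustration of what does NOT work)

[us_forwarder]
# This is the sourcetype assigned at input time, so events are broken
# using the line-breaking settings (or defaults) of THIS stanza.
TRANSFORMS-sourcetype_and_host_override = us_fireeye_sourcetypeoverride

[fireeye_json]
# Hypothetical target sourcetype written by the transform above.
# The rename happens in the typing pipeline, after line breaking is done,
# so these settings are not applied to data that arrived as us_forwarder.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
```

The line-breaking settings in a stanza are only consulted for data that already carries that sourcetype (or matching source/host) when it reaches the parsing pipeline.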

As you have found, this is where we can find sage advice from a 1984 film:
Don't cross the streams

So how do we get around this... I can immediately see two potential options, but there might be more:

1) Have your Palo Altos and FireEye send to different TCP ports so you can assign different sourcetypes at input time, and thus use different LINE_BREAKER (and TRANSFORMS) settings.

2) Assuming this is syslog, don't send syslog directly into Splunk. Instead, set up a syslog server that writes to files on disk split by host name, then have the universal forwarder pick up the files and assign different sourcetypes based on the origin host (a naming convention for your network devices and appropriate PTR records help immensely here). Fellow SplunkTrust member @starcher covers this in a blog post: http://www.georgestarcher.com/splunk-success-with-syslog/
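
As a rough sketch of option 1: the second port (34066) and the sourcetype names us_paloalto and us_fireeye are placeholders I made up; the index and transform names come from the question. The LINE_BREAKER also assumes each FireEye event is a JSON object whose opening brace sits at the very start of a line, and that no line inside the object begins with a brace.

```ini
# inputs.conf: one raw TCP input per device family
[tcp://34065]
connection_host = none
index = index1
sourcetype = us_paloalto

# 34066 is a hypothetical second port for FireEye
[tcp://34066]
connection_host = none
index = index1
sourcetype = us_fireeye

# props.conf: each sourcetype now gets its own parsing settings
[us_paloalto]
TZ = US/Eastern
TRANSFORMS-host_override = us_paloalto_hostoverride

[us_fireeye]
TZ = US/Eastern
SHOULD_LINEMERGE = false
# Break only where a run of newlines is followed by an opening brace.
# The first capture group is discarded; the brace starts the next event.
LINE_BREAKER = ([\r\n]+)\{
TRANSFORMS-host_override = us_fireeye_hostoverride
```

For option 2, a universal forwarder input along these lines could pick up the files written by the syslog server. The directory layout /var/log/remote/<device type>/<host>/ is an assumption, not something from the question, and the props.conf stanzas above would still live on the indexer or heavy forwarder that does the parsing.

```ini
# inputs.conf on the universal forwarder (hypothetical paths)
[monitor:///var/log/remote/paloalto/*/*.log]
index = index1
sourcetype = us_paloalto
# Take the host from the 5th path segment: /var/log/remote/paloalto/<host>/
host_segment = 5

[monitor:///var/log/remote/fireeye/*/*.log]
index = index1
sourcetype = us_fireeye
host_segment = 5
```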

Path Finder

OK, thanks. It's now clear that it will never work the way I thought.

The FireEye and PA logs are first collected by a standalone Splunk instance on site, so I will ask for the forwarding configuration to be changed to send the logs to different _TCP_ROUTING outputs.
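
For reference, a change like that might look roughly like the sketch below. Everything specific here is an assumption: the thread does not show how the on-site instance receives the logs, so the two UDP inputs, the output group names, the host central.example.com, and port 34066 are placeholders. It also assumes the two device types already arrive on separate inputs on site, since _TCP_ROUTING is set per input stanza.

```ini
# outputs.conf on the on-site instance (hypothetical names and host)
[tcpout:us_paloalto_out]
server = central.example.com:34065
# Send raw data, because the receiver listens with a raw [tcp://...] input
sendCookedData = false

[tcpout:us_fireeye_out]
server = central.example.com:34066
sendCookedData = false

# inputs.conf on the on-site instance (hypothetical inputs)
[udp://5514]
# Palo Alto devices send here; route to the Palo Alto output group
_TCP_ROUTING = us_paloalto_out

[udp://5515]
# FireEye devices send here; route to the FireEye output group
_TCP_ROUTING = us_fireeye_out
```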

Path Finder

Well, my inputs.conf is this:

[tcp://34065]
connection_host = none
index = index1
source = us_forwarder
sourcetype = us_forwarder

My props.conf:

[us_forwarder]
TRANSFORMS-sourcetype_and_host_override = paloalto_hostoverride, fireeye_source_and_sourcetypeoverride
TZ = US/Eastern

I receive logs from FireEye and Palo Alto on the same TCP input. However, the FireEye logs are sent in JSON format, and without any specific configuration each one is split across multiple entries, whereas the PA logs are displayed properly (one log = one entry).

So I have to do something to display each FireEye log as one entry, which is why I want to use the LINE_BREAKER setting. Perhaps there is another way to achieve this?

Contributor

I'm pretty sure the LINE_BREAKER setting will be applied before your sourcetype transform. Also, I assume you assign sourcetype=us_forwarder in your inputs.conf?

SplunkTrust

Please share some sample entries for each type of log.
