Getting Data In

Double sourcetype overriding

gcusello
SplunkTrust

Hi all,

I have a problem similar to one already solved by @PickleRick in a previous question:

I have a flow from a concentrator (a different one from the previous case) that sends many logs to my HF via syslog.

It's a mixture of many kinds of logs (Juniper, Infoblox, Fortinet, etc.).

I have to override the sourcetype value to assign the correct one, but the issue is that the related add-ons then have to perform a second override of the sourcetype value, and this second override doesn't work.

I'd like a hint about the two solutions I've thought of, or a different one:

  • if it's possible, how can I do a double sourcetype override?
  • do you think it's better to modify the add-ons so that the first override isn't needed?

Thank you for your help.

Ciao.

Giuseppe

tscroggins
Influencer

Hi @gcusello,

Have you considered creating one new local tcp (or udp) input and a corresponding syslog output per source, and using _SYSLOG_ROUTING in a transform to send the events to the new local inputs for re-parsing? Each new input would have a default sourcetype defined for its source. Splunk's syslog outputs have hard-coded queue maxSize values, so you'll need to increase parallelIngestionPipelines to scale if needed.

                    +-- type1 --- outputA -------+
                    |                            |
syslog --> input1 --+-- type2 --- outputB ----+  |
                    |                         |  |
                    +-- type3 --- outputC -+  |  |
                                           |  |  |
   +---------------------------------------+  |  |
   |  +---------------------------------------+  |
   |  |  +---------------------------------------+
   |  |  |
   |  |  +-> inputA --- - - -
   |  |
   |  +----> inputB --- - - -
   |
   +-------> inputC --- - - -

You may also be able to do the same with an additional local splunktcp input, output group, _TCP_ROUTING, and a modified default route to re-parse cooked data, but I'd be wary of unintentional feedback loops.

gcusello
SplunkTrust

Hi @tscroggins ,

could you describe your solution in more detail?

On two load-balanced HFs I have:

  • one source: tcp://10514
  • one sourcetype: syslog
  • one index: network

these values are assigned to logs in one inputs.conf:

[tcp://10514]
sourcetype = syslog
index = network
disabled = 0

then I have the props.conf and transforms.conf I already shared (three tries!) that should transform e.g. the syslog sourcetype to fgt_log and then to fortigate_traffic, fortigate_log, fortigate_utm, or fortigate_event based on regex.

The first transformation (to fgt_log) works correctly, but the second one doesn't.

Ciao.

Giuseppe


tscroggins
Influencer

Hi @gcusello,

Sorry for the delay. Did you find a working solution? My suggestion was something like:

# inputs.conf

[tcp://10514]
sourcetype = syslog
index = network

[udp://10515]
index = network
sourcetype = infoblox:port

[udp://10516]
index = network
sourcetype = juniper

[udp://10517]
index = network
sourcetype = fgt_log

# outputs.conf

[syslog:infoblox]
server = localhost:10515
type = udp
priority = NO_PRI

[syslog:juniper]
server = localhost:10516
type = udp
priority = NO_PRI

[syslog:fortinet]
server = localhost:10517
type = udp
priority = NO_PRI

# props.conf

[source::tcp:10514]
TRANSFORMS-reroute_syslog = route_infoblox, route_juniper, route_fortinet

# transforms.conf

[route_infoblox]
REGEX = \<\d+\>\w+\s+\d+\s+\d+:\d+\d+:\d+\s+\w+-dns-\w+
DEST_KEY = _SYSLOG_ROUTING
FORMAT = infoblox

[route_juniper]
REGEX = ^\<\d+\>\d+\s+\d+-\d+-\d+\w+:\d+:\d+\.\d+\w(\+|-)\d+:\d+\s\w+-edget-fw
DEST_KEY = _SYSLOG_ROUTING
FORMAT = juniper

[route_fortinet]
REGEX = ^\<\d+\>date\=\d+-\d+-\d+\s+time\=\d+:\d+:\d+\s+devname\=\"[^\"]+\"\s+devid
DEST_KEY = _SYSLOG_ROUTING
FORMAT = fortinet

All events sent to the 10514/tcp input will hit the specified transforms. On match, the event will be rerouted to one of the udp inputs using _SYSLOG_ROUTING. If the default syslog output queue size (97 KiB) isn't large enough, you can scale by increasing parallelIngestionPipelines (and resources if the HF performs other functions). I haven't tried increasing the syslog output queue size in some time, but it was hard-coded in the past.
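
As a rough sketch of the scaling step mentioned above: parallelIngestionPipelines is set in server.conf under the [general] stanza. The value 2 is only an illustrative assumption, not a recommendation from this thread; each extra pipeline set uses additional CPU and memory on the HF, and the change needs a restart.

```ini
# server.conf on the heavy forwarder
[general]
# Illustrative value (assumption): size it to the cores and memory you actually have.
parallelIngestionPipelines = 2
```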

You can also use tcp inputs and type = tcp in syslog outputs, but when forwarding packets locally, the risk of loss comes from buffer/queue overruns, not the network.

All that said, rsyslog or syslog-ng (my preference) installed on the same host is a better solution. You can preferably write and monitor files, or you can relay to local Splunk tcp/udp inputs. If you use files, you'll need adequate local storage for buffering and e.g. logrotate to manage retention. Both rsyslog and syslog-ng have mature and robust parsing languages.
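
A minimal sketch of the file-based variant, assuming the syslog daemon writes one directory per log type under /var/log/remote (a hypothetical layout, not something stated in this thread). The sourcetypes and index are the ones used earlier in the example.

```ini
# inputs.conf
# All paths below are hypothetical; adjust them to wherever rsyslog/syslog-ng writes.
[monitor:///var/log/remote/infoblox/*.log]
sourcetype = infoblox:port
index = network

[monitor:///var/log/remote/juniper/*.log]
sourcetype = juniper
index = network

[monitor:///var/log/remote/fortinet/*.log]
sourcetype = fgt_log
index = network
```

Because each monitor stanza assigns the base sourcetype at input time, the add-ons' own index-time overrides should then apply without a prior override in Splunk.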

gcusello
SplunkTrust

Hi @tscroggins ,

this solution isn't applicable to my situation because we are receiving one data flow, from one host, with all the data mixed together, so I cannot assign the sourcetype at input.

I solved it by analyzing the data and identifying the kinds of data sources. Then I took the related add-ons (Juniper, cisco:ios, cisco:ise, proofpoint, etc.) and modified all their props.conf files, moving the transformations from the usual sourcetype (e.g. fgt_log) to the sourcetype I have in my data flow.

In this way I parsed all the data flows.

Anyway, thank you for your help.

Ciao.

Giuseppe
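
A minimal sketch of that approach for one add-on, assuming the events arrive with sourcetype syslog as in the inputs.conf shown earlier. The transform name is the one quoted elsewhere in this thread for the Fortinet add-on; placing the change in the add-on's local directory instead of editing default is an assumption about good practice, not something the post states. The idea is that index-time props are selected by the sourcetype an event has when it enters parsing, so the add-on's override has to be attached to that sourcetype.

```ini
# Splunk_TA_fortinet_fortigate/local/props.conf  (location is an assumption)
# Attach the add-on's sourcetype override to the sourcetype the events
# actually arrive with, instead of the add-on's usual base sourcetype (e.g. fgt_log).
[syslog]
TRANSFORMS-force_sourcetype_fgt = force_sourcetype_fortigate
```

The same pattern would be repeated for each add-on (Juniper, Cisco, and so on), reusing the transform names that add-on already defines in its transforms.conf.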


tscroggins
Influencer

The solution above used a single external input (10514/tcp) and transforms to route events to three internal inputs (10515-10517/udp) based on content, but I'm glad you worked it out! In practice, I use syslog-ng.

tscroggins
Influencer

(In the example solution, you'll also need to add input/output configuration and/or parsing to strip unwanted extra timestamps and hosts from syslog messages.)


PickleRick
SplunkTrust

What's your actual config regarding those overrides?

I have a feeling that you should rather get a decent syslog receiver in front of your HF 😉

gcusello
SplunkTrust

Hi @PickleRick,

thank you for your support.

I tried three different configurations:

at first, in a dedicated app:

#props.conf
[source::tcp:5514]
TRANSFORMS-00_hostname = set_hostname_fortinet
TRANSFORMS-10_sourcetype = set_sourcetype_infoblox
TRANSFORMS-11_sourcetype = set_sourcetype_juniper
TRANSFORMS-12_sourcetype = set_sourcetype_fortinet

#transforms.conf
######################### Sourcetype #######################
# infoblox:port
[set_sourcetype_infoblox]
REGEX = \<\d+\>\w+\s+\d+\s+\d+:\d+\d+:\d+\s+\w+-dns-\w+
FORMAT = sourcetype::infoblox:port
DEST_KEY = MetaData:Sourcetype

# Juniper #
[set_sourcetype_juniper]
REGEX = ^\<\d+\>\d+\s+\d+-\d+-\d+\w+:\d+:\d+\.\d+\w(\+|-)\d+:\d+\s\w+-edget-fw
FORMAT = sourcetype::juniper
DEST_KEY = MetaData:Sourcetype

# Fortinet #
[set_sourcetype_fortinet]
REGEX = ^\<\d+\>date\=\d+-\d+-\d+\s+time\=\d+:\d+:\d+\s+devname\=\"[^\"]+\"\s+devid
FORMAT = sourcetype::fgt_log
DEST_KEY = MetaData:Sourcetype

############################## hostname ############################
# Fortinet #
[set_hostname_fortinet]
REGEX = devname\=\"([^\"]+)\"
FORMAT = host::$1

then I tried (following your previous hint), in a dedicated app:

#props.conf
[source::tcp:5514]
TRANSFORMS-00_hostname = set_hostname_fortinet
TRANSFORMS-10_sourcetype = set_sourcetype_infoblox
TRANSFORMS-11_sourcetype = set_sourcetype_juniper
TRANSFORMS-12_sourcetype = set_sourcetype_fortinet
TRANSFORMS-50_drop_dead = drop_dead_infoblox
TRANSFORMS-51_drop_dead = drop_dead_juniper
TRANSFORMS-52_drop_dead = drop_dead_fortinet

#transforms.conf
######################### Sourcetype #######################
# infoblox:port
[set_sourcetype_infoblox]
REGEX = \<\d+\>\w+\s+\d+\s+\d+:\d+\d+:\d+\s+\w+-dns-\w+
CLONE_SOURCETYPE = infoblox:port

# Juniper #
[set_sourcetype_juniper]
REGEX = ^\<\d+\>\d+\s+\d+-\d+-\d+\w+:\d+:\d+\.\d+\w(\+|-)\d+:\d+\s\w+-edget-fw
CLONE_SOURCETYPE = juniper

# Fortinet #
[set_sourcetype_fortinet]
REGEX = ^\<\d+\>date\=\d+-\d+-\d+\s+time\=\d+:\d+:\d+\s+devname\=\"[^\"]+\"\s+devid
CLONE_SOURCETYPE = fgt_log

############################## hostname ############################
# Fortinet #
[set_hostname_fortinet]
REGEX = devname\=\"([^\"]+)\"
FORMAT = host::$1

############################# original log removing ################
# infoblox:port
[drop_dead_infoblox]
REGEX = \<\d+\>\w+\s+\d+\s+\d+:\d+\d+:\d+\s+\w+-dns-\w+
FORMAT = nullQueue
DEST_KEY = queue

# Juniper #
[drop_dead_juniper]
REGEX = ^\<\d+\>\d+\s+\d+-\d+-\d+\w+:\d+:\d+\.\d+\w(\+|-)\d+:\d+\s\w+-edget-fw
FORMAT = nullQueue
DEST_KEY = queue

# Fortinet #
[drop_dead_fortinet]
REGEX = ^\<\d+\>date\=\d+-\d+-\d+\s+time\=\d+:\d+:\d+\s+devname\=\"[^\"]+\"\s+devid
FORMAT = nullQueue
DEST_KEY = queue

For the third try, in the Splunk_TA_fortinet_fortigate app I added the same transformation that the add-on defines, but starting from the syslog sourcetype:

#props.conf
[syslog]
TRANSFORMS-force_sourcetype_fgt = force_sourcetype_fortigate
SHOULD_LINEMERGE = false
EVENT_BREAKER_ENABLE = true

and in the app Splunk_TA_juniper

#props.conf 
[syslog]
SHOULD_LINEMERGE = false
EVENT_BREAKER_ENABLE = true
TRANSFORMS-force_info_for_juniper = force_host_for_netscreen_firewall,force_sourcetype_for_netscreen_firewall,force_sourcetype_for_junos_idp_structured,force_sourcetype_for_junos_idp,force_sourcetype_for_junos_aamw,force_sourcetype_for_junos_secintel,force_sourcetype_for_junos_firewall_structured,force_sourcetype_for_junos_firewall, force_sourcetype_for_junos_snmp,force_sourcetype_for_junos_firewall_rpd

But I still get only the first three overridden sourcetypes plus the original syslog sourcetype:

  • syslog
  • infoblox:port
  • fgt_log
  • juniper

instead of all the other sourcetypes that the add-ons should assign.

About your question: for the moment I'm still using Splunk as the syslog receiver, but I have in mind to use rsyslog.

Ciao.

Giuseppe


PickleRick
SplunkTrust

OK. So I assume your logs come in on the 5514 port with sourcetype syslog, right?

So if you cast them to another sourcetype using the CLONE_SOURCETYPE mechanism and then drop the original instance (the syslog one) from the processing queue, you can't use props for the syslog sourcetype.

To be fully honest, I showed you how the CLONE_SOURCETYPE thingy works and told you how it can be put to use in a relatively simple use case, but in my opinion it's a non-maintainable and completely non-scalable solution, so I'd probably not use it in production.

Also remember that CLONE_SOURCETYPE applies to all events matching by sourcetype, source or host so you can't limit it by regex. So if you clone it four times, you'll have to do much uglier filtering in the "destination" sourcetypes. Which again makes it not scalable.

BTW, if you want rsyslog instead of the usual SC4S, I can help 🙂
