Getting Data In

Forcing sourcetype on Heavy Forwarder

Na_Kang_Lim
Path Finder

Hi,

So I have a HF instance which receives multiple types of syslog on many different ports. Ideally, you would have a different port for each sourcetype; however, due to misconfigurations on some servers, we ended up with multiple types of syslog arriving on the same udp://514 port.

Since changing the port requires reconfiguration and a new network change request, we figured we could change the sourcetype based on a regex to minimize the work.

I looked into some of the apps that actually do this kind of sourcetype overriding, found that the Cisco app has such a configuration, and mimicked it.

So here are my configurations on HF:

$SPLUNK_HOME/etc/system/local/inputs.conf:

[udp://514]
acceptFrom = <many_hosts>
index = my_syslog
sourcetype = syslog

$SPLUNK_HOME/etc/system/local/transforms.conf:

[force_sourcetype_for_peplink]
DEST_KEY = MetaData:Sourcetype
REGEX = ^(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(?:0?[1-9]|[12]\d|3[01])\s+(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d\s+(?:pepline_host1|pepline_host2|pepline_host3)\s+(?=.*[A-Z][A-Za-z]+:\s).*
FORMAT = sourcetype::peplink

$SPLUNK_HOME/etc/system/local/props.conf:

[syslog]
TRANSFORMS-force_sourcetype_for_peplink = force_sourcetype_for_peplink

After HF, we have our data route into an Indexer cluster.

However, after restarting the HF, I saw no change. The logs from the peplink_host* servers still had their sourcetype set to syslog.

So what could be the reason here?
1 Solution

Na_Kang_Lim
Path Finder

It turned out the problem was due to a misconfiguration in the sending log pipeline: the logs were not sent through the Heavy Forwarder but directly to the Indexers.

And in that case, of course, the transforming configurations have to be on the Indexer itself.


isoutamo
SplunkTrust
SplunkTrust
One additional comment.

Don't use the $SPLUNK_HOME/etc/system/local directory for (almost) any configuration. You cannot change those settings in any way other than editing the files manually (or with an external configuration tool).

It's always better to create a separate Splunk app for them and install it on the HF/UF. That way you keep better control over the settings.
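As a sketch, such an app (the name my_syslog_overrides is made up) could be laid out like this:

```
$SPLUNK_HOME/etc/apps/my_syslog_overrides/
    default/
        app.conf          # app metadata
    local/
        props.conf        # [syslog] stanza with the TRANSFORMS- line
        transforms.conf   # [force_sourcetype_for_peplink]
```

The settings then merge with the rest of the configuration under Splunk's normal precedence rules, and the app can be version-controlled and deployed as a unit.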

livehybrid
SplunkTrust
SplunkTrust

Hi @Na_Kang_Lim 

Can you please confirm, is it your HF which is listening on port 514 to receive this data? Or is it arriving from another host?


Na_Kang_Lim
Path Finder

Yeah, it is definitely my HF listening on udp:514.

When I run a search on the data, it says the source is udp:514; that port is not open on our Indexers, and the host is the IP of the device. Moreover, as I mentioned before, the network is quite strict, so the device simply cannot send logs to any other host. If it weren't strict, we would have just changed the device's log-forwarding config to a different port and resolved the issue by specifying the sourcetype in that port's stanza.
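A quick SPL sanity check along those lines (the index name is taken from the inputs.conf above):

```
index=my_syslog source="udp:514"
| stats count by host, source, sourcetype
```

If every event from the peplink hosts still shows sourcetype=syslog here, the transform never fired on the component that ingested the data.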


PickleRick
SplunkTrust
SplunkTrust
WRITE_META = <boolean>
* Whether or not the Splunk platform writes REGEX values to the _meta 'DEST_KEY'.
* When the Splunk platform writes REGEX values to metadata, those REGEX values
  become index-time field extractions.
* This setting is required for all index-time field extractions except for
  those where "DEST_KEY = _meta." See the description of the 'DEST_KEY' setting
  for more information.
* Where applicable, set "WRITE_META = true" instead of setting "DEST_KEY = 
  _meta".
* A value of "true" means that the Splunk platform writes REGEX values to 
  the _meta DEST_KEY. In other words, the platform writes REGEX values to
  metadata.  
* A value of "false" means that the Splunk platform does not write 
  REGEX values to metadata.
* Default: false

Na_Kang_Lim
Path Finder

I added that, but nothing changed. Since I specified DEST_KEY and the field I am writing to is sourcetype (which already exists), I also don't think WRITE_META is needed.

I think the problem is with my HF. What settings do I have to check to make sure my HF is actually applying props.conf and transforms.conf?


PickleRick
SplunkTrust
SplunkTrust

As I understand it, you're receiving data directly on that HF's network port, right? So your props should work, provided they are properly specified and match the data.

So either your stanza doesn't match the sourcetype (which seems unlikely), something else is overriding your transform definition (again unlikely, because you have a fairly unique transform class name), or your regex is not matching the data.


Na_Kang_Lim
Path Finder

Maybe the problem is with my Indexer? My data is forwarded from the HF to the Indexer and is indexed there.

I have this question in mind: what would happen if I had the same app on both my HF and my Indexer? Would the data be parsed twice?


PickleRick
SplunkTrust
SplunkTrust

Doubt it. Events are generally parsed on the first "heavy" component in the event's path. Since you're receiving your data on a HF, it gets parsed there and is sent to the indexers as already-parsed data.

Parsed data is not parsed again unless you use rulesets, and normal transforms are not applied to parsed data.

Anyway:

1) I'd first check whether the transform is called at all - change REGEX to a single dot so that it matches every event, and see if the sourcetype gets overwritten for all events.

2) It seems strange that you're receiving on port 514 with a HF. I don't recall splunkd getting the CAP_NET_BIND_SERVICE capability. Are you running splunkd as root?
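A minimal sketch of that catch-all test from step 1, using a made-up stanza name (force_sourcetype_debug) alongside the original files:

```
# transforms.conf - REGEX of a single dot fires on every (non-empty) event,
# so anything that reaches this transform gets the debug sourcetype
[force_sourcetype_debug]
REGEX = .
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::peplink_debug

# props.conf
[syslog]
TRANSFORMS-force_sourcetype_debug = force_sourcetype_debug
```

If the sourcetype still does not change after a restart, the transform is not being invoked at all on that component, which points away from the regex and toward routing or stanza-matching.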


Na_Kang_Lim
Path Finder

Now I believe the problem is on my HF, which maybe I somehow configured wrong.

1. I created another stanza in props.conf, [host::<peplink_host1_ip>], with a single TRANSFORMS entry, almost the same as the previous one except that the regex is now only a single dot (.). This should apply to all the logs from that host, but still nothing changes.

2. Yeah, I know it is not recommended, but I am running Splunk as root.

So can you give me some commands or docs on how to check whether my HF is configured correctly? Like how to check my license? Because if I'm thinking about this right, there must be a license and configuration for the instance to do the transforms, since that is what Splunk charges you for - I would be charged for parsing and indexing, not forwarding, right? Right now it feels like it is working as just a "big" forwarder.
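Not an authoritative checklist, but btool on the HF shows the merged configuration and which file each setting comes from, which helps rule out precedence problems (the stanza names below are the ones from this thread):

```shell
# Effective props for the syslog sourcetype (--debug prints the source file per line)
$SPLUNK_HOME/bin/splunk btool props list syslog --debug

# Effective definition of the transform itself
$SPLUNK_HOME/bin/splunk btool transforms list force_sourcetype_for_peplink --debug

# The UDP input and the outputs the HF forwards to
$SPLUNK_HOME/bin/splunk btool inputs list udp://514 --debug
$SPLUNK_HOME/bin/splunk btool outputs list --debug
```

If the props/transforms stanzas show up here with the expected values, the configuration is loaded and the question becomes whether the data actually passes through this instance's parsing pipeline.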


PickleRick
SplunkTrust
SplunkTrust

Wait.

What does your input say? What is connection_host set to?

Remember that if you're using the built-in syslog sourcetype, it calls a transform that overrides the host field, so what you see in the resulting indexed event might not be what was set at the beginning of the ingestion pipeline (which is when Splunk decides which props/transforms to apply).
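Paraphrasing from memory - check $SPLUNK_HOME/etc/system/default/props.conf and transforms.conf on your own installation for the exact text - the shipped defaults for the built-in sourcetype look roughly like:

```
# $SPLUNK_HOME/etc/system/default/props.conf (abridged)
[syslog]
TRANSFORMS = syslog-host

# $SPLUNK_HOME/etc/system/default/transforms.conf (abridged, REGEX omitted)
[syslog-host]
DEST_KEY = MetaData:Host
FORMAT = host::$1
```

So the host value you see at search time can differ from the host value the pipeline matched against when choosing which [host::...] props stanzas to apply.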


Na_Kang_Lim
Path Finder

For that udp://514, I set connection_host = ip


livehybrid
SplunkTrust
SplunkTrust

Out of interest @Na_Kang_Lim - what happens if you don't specify the sourcetype in inputs.conf and instead use:

[source::udp:514]
TRANSFORMS-force_sourcetype_for_peplink = force_sourcetype_for_peplink


Na_Kang_Lim
Path Finder

I tried what you suggested, but it still did not work.

Any suggestions for troubleshooting are appreciated.

Since switching to a source stanza did not work either, maybe there is something wrong with my HF configuration in general?


Na_Kang_Lim
Path Finder

Oh, I forgot to add this:

To check whether the regex was working properly, I used the regex command:

index=my_syslog host=peplink_host*
| regex _raw="<above_regex>"
| stats count by host

And I could see that the regex was working just fine.
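For what it's worth, the same check can be run outside Splunk with Python's re module; the sample line below is made up to match the shape the transform expects:

```python
import re

# The transform's REGEX, verbatim from the transforms.conf in the question
PATTERN = re.compile(
    r"^(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+"
    r"(?:0?[1-9]|[12]\d|3[01])\s+"
    r"(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d\s+"
    r"(?:pepline_host1|pepline_host2|pepline_host3)\s+"
    r"(?=.*[A-Z][A-Za-z]+:\s).*"
)

# Hypothetical syslog line in the expected shape
sample = "Jan 15 10:23:45 pepline_host1 Router: WAN link 1 is up"

print(bool(PATTERN.match(sample)))  # prints True when the REGEX would fire
```

This only verifies the regex itself, of course; it says nothing about whether the transform is actually invoked in the ingestion pipeline.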


VLaw
Splunk Employee
Splunk Employee

I believe your use case is covered by the following docs, which have detailed instructions on how to override sourcetypes. In your case this should be configured on the Heavy Forwarder. It requires a splunkd restart on the HF, and the change applies only to events ingested after the restart.

https://help.splunk.com/en/splunk-enterprise/get-started/get-data-in/9.4/configure-source-types/over... 


Na_Kang_Lim
Path Finder

It is exactly what I was doing, but it did not work in my case.

So I think there is something wrong with how I set up the HF, like it doesn't 'know' that it should also do the parsing; right now it feels like it is only a really big forwarder.
