Getting Data In

HF Data routing - syslog to index / sourcetype by host

tmarlette
Motivator

I have a heavy forwarder that is receiving an array of data on port 514. In this case I would like to break out the ESXi syslog data, and I can match it via REGEX quite easily; however, when I make the configurations on the HF, it doesn't separate the data before sending it to the indexers. I would like to keep the parsing load off of my indexers if at all possible.

Here is my config on my HF:
props.conf

[syslog_pool]
TRANSFORMS-index_assign = esx_index
TRANSFORMS-sourcetype_assign = esx_syslog_sourcetype

transforms.conf

[esx_index]
REGEX = myesxhost01
DEST_KEY = _MetaData:Index
FORMAT = main

[esx_syslog_sourcetype]
REGEX = myesxhost01
DEST_KEY = _MetaData:Sourcetype
FORMAT = vmw-syslog

Is there something I'm doing wrong, or am I just not understanding how the parsing works?
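To illustrate what the two transforms are meant to do, here is a minimal Python sketch of the same per-event logic. The event strings and the fall-through defaults are hypothetical; only the myesxhost01 pattern and the main / vmw-syslog values come from the config above:

```python
import re

# Pattern from the transforms above; it matches anywhere in the raw event.
ESX_PATTERN = re.compile(r"myesxhost01")

def route(raw_event):
    """Return (index, sourcetype) the way the two transforms would:
    if the regex matches the raw event, override both metadata keys;
    otherwise leave the (hypothetical) defaults untouched."""
    if ESX_PATTERN.search(raw_event):
        return ("main", "vmw-syslog")
    return ("default", "syslog")

print(route("Jan 10 12:00:01 myesxhost01 vmkernel: heartbeat"))
print(route("Jan 10 12:00:02 otherhost sshd[123]: session opened"))
```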

1 Solution

tmarlette
Motivator

With the help of my Splunk engineering team, I have figured this out. These are the final settings I used on my HF to route data to the correct index before it is sent to the indexers.

props.conf

[source::udp:514]
TRANSFORMS-esx_handling = esx_index,esx_syslog_sourcetype

transforms.conf

[esx_index]
REGEX = myHostName
FORMAT = main
DEST_KEY = _MetaData:Index


[esx_syslog_sourcetype]
REGEX = myHostName
FORMAT = sourcetype::vmw-syslog
DEST_KEY = MetaData:Sourcetype
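Note that the props stanza here keys off the data's source (source::udp:514) rather than its sourcetype, so it matches regardless of how the sourcetype was assigned at input time. For context, a minimal sketch of the corresponding UDP input on the HF; only the port comes from the thread, the other settings are assumptions:

```
# inputs.conf on the HF -- hypothetical stanza; only the port is taken
# from this thread. Events received here get source = udp:514, which is
# what the [source::udp:514] props stanza matches on.
[udp://514]
sourcetype = syslog
connection_host = ip
```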


starcher
SplunkTrust

Sorry, I had to repost using a different Answers account. So, to get back to the discussion:

Splunk cannot receive syslog traffic as well as a dedicated syslog service like syslog-ng or rsyslog. If you are having restart issues, then I suspect you are not log rotating your files and blacklisting the .tgz extension via the inputs definitions. Without that, the UF will check and track files that haven't changed and won't change again, which causes checksum performance issues for the UF on start and ulimit issues for the OS.

My recommendation is to log rotate them to an archive and blacklist that extension. If storage is an issue, mount cheap storage and have logrotate also move the files out of the live monitored path after, say, a day or two, once the UF has had a chance to pick them up.
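As a sketch of that setup; the paths, rotation schedule, and extensions here are assumptions, not from the thread:

```
# /etc/logrotate.d/remote-syslog -- hypothetical; rotate the files the
# syslog service writes and compress old copies into archives
/var/log/remote/*/*.log {
    daily
    rotate 7
    compress
    missingok
}

# inputs.conf on the UF -- blacklist the compressed archives so the UF
# never checksums and tracks files that will not change again
[monitor:///var/log/remote]
blacklist = \.(gz|tgz)$
```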

So I stand by my suggestion that a native syslog service with a solid configuration and log archival is the best solution. If you don't have cheap storage to mount, you can always rely on Splunk's indexed-data retention mechanisms, if they suit your requirements.

http://www.georgestarcher.com/splunk-success-with-syslog/
If you write to a folder structure based on a good naming convention, you don't have to edit the inputs every time you add a new device, so index/sourcetype edits are minimal for new devices of a type you have already defined.
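For example, if the syslog service writes one directory per sending host, a single wildcarded monitor picks up new devices automatically; a sketch, where the paths and host_segment value are assumptions:

```
# syslog-ng destination -- hypothetical; one subdirectory per sending host
destination d_remote { file("/var/log/remote/${HOST}/messages.log"); };

# inputs.conf on the UF -- host_segment = 4 tells Splunk to take the host
# name from the 4th path segment (/var/log/remote/<host>/...), so new
# devices need no inputs.conf edits at all
[monitor:///var/log/remote]
host_segment = 4
sourcetype = syslog
```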
