Getting Data In

What is recommended to monitor multiple files in the same directory and assign them each different sourcetypes?

jwalzerpitt
Influencer

We have the opportunity to pull our EZproxy logs from the vendor via SFTP. During my testing in development, I uploaded the three individual files separately into a test index, created unique sourcetypes, and performed field extractions, and then verified the field extractions/regexes worked.

With that, what is the best way to monitor the files? I was planning on creating three separate directories, one for each log file, and configuring Splunk to monitor each one while sending everything to the same index. However, I've been reading about props.conf/transforms.conf, and I understand I could instead drop all three files into the same directory (via a daily SFTP process in the morning) and have Splunk monitor them there. What I'm trying to figure out is how to set up the props.conf and transforms.conf files to handle this.

Any help would be appreciated.

Thx,
Jeff

cpetterborg
SplunkTrust

I'm going to give you some partial files from my configuration that do what you want: they separate different files placed in the same directory into different sourcetypes, and they let SFTP'd files come in with dates in their filenames while keeping the source the same for all of them (I assume that is how your data will come across). This is what I believe to be the best way to do it. I'm only showing two file types, but you can expand this concept to as many file types as you like.

inputs.conf (goes on forwarder):

[monitor:///opt/niku/clarity/logs/*access*.log]
sourcetype = clarity_access

[monitor:///opt/niku/clarity/logs/*system.log]
sourcetype = clarity_system

transforms.conf (goes on indexer):

#  strip date from source name
[name_date_strip]
# Remove -YYYY-MM-DD  style date from the filename
# /opt/niku/clarity/logs/app-access-2014-03-25.log
#
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Source
REGEX = source::(/\w{3}/\w+/clarity/\w+/\S+access)\-\d{4}-\d{2}-\d{2}(.*)
FORMAT = source::$1$2

props.conf (goes on indexer):

# clarity *-access
[clarity_access]
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %d/%b/%Y:%H:%M:%S %Z
TIME_PREFIX = \|\[
MAX_TIMESTAMP_LOOKAHEAD = 50
KV_MODE=none
TRUNCATE = 999999
TRANSFORMS-namedatestrip = name_date_strip

# clarity *-system.log
[clarity_system]
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y/%m/%d %H:%M:%S.%N
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
KV_MODE=none
TRUNCATE = 999999

From these you can see how to strip the date from the filename so that the source stays the same as the data gets indexed, and how to send different files to different sourcetypes.

jwalzerpitt
Influencer

Thx for the reply and the info; I greatly appreciate it.

rphillips_splk
Splunk Employee

Jeff -
You would want to install the Splunk Universal Forwarder on the server where the files reside. Then you will need to configure inputs.conf (http://docs.splunk.com/Documentation/Splunk/6.2.2/admin/inputsconf#inputs.conf.example) to define which files or directories you want to monitor, and outputs.conf (http://docs.splunk.com/Documentation/Splunk/latest/Admin/outputsconf?r=searchtip#outputs.conf.exampl...) to define where to send that data (i.e., your indexers).

You don't need to put the files in different directories, as you can assign a sourcetype name to each file via its monitor stanza in inputs.conf.

Example (configure this in inputs.conf on your forwarder; Splunk must be restarted after you make the change):

[monitor:///var/log/fileA.log]
sourcetype = sourcetypeA
disabled = 0
index = yourindexname
# if index is undefined, events default to the main index

[monitor:///var/log/fileB.log]
sourcetype = sourcetypeB
disabled = 0
index = yourindexname

[monitor:///var/log/fileC.log]
sourcetype = sourcetypeC
disabled = 0
index = yourindexname
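
If the files arrive with a date in their names, as SFTP drops often do, the same idea could be expressed with wildcards in the monitor path. This is only a sketch: the dated file names, the directory, and the sourcetype and index names below are hypothetical placeholders, not the real EZproxy names.

```
# inputs.conf on the forwarder (sketch; all paths and names are assumptions)
# Would match e.g. /var/log/fileA-2015-04-01.log
[monitor:///var/log/fileA*.log]
sourcetype = sourcetypeA
disabled = 0
index = yourindexname

# Would match e.g. /var/log/fileB-2015-04-01.log
[monitor:///var/log/fileB*.log]
sourcetype = sourcetypeB
disabled = 0
index = yourindexname
```

Each stanza matches only its own file name pattern, so files sharing a directory still get distinct sourcetypes. Make sure the patterns do not overlap, or a file could be picked up by more than one stanza.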

jwalzerpitt
Influencer

Thx for the reply and the info; I greatly appreciate it.

markthompson
Builder

I don't really get what you're trying to do here. Couldn't you set up blacklists/whitelists and set different Host values?

jwalzerpitt
Influencer

I've looked at how some apps, such as Symantec for Splunk, used the props.conf and transforms.conf files to distinguish between different sourcetypes within the same index.

I'm trying to figure out/understand how Splunk would know which sourcetype to apply to which file if all three files were loaded into the same directory. So every day when the three files are copied into the same directory, Splunk would need to parse/ingest the files as follows:

fileA needs to equal sourcetypeA
fileB needs to equal sourcetypeB
fileC needs to equal sourcetypeC
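
For reference, the props.conf/transforms.conf style of mapping described here could be sketched roughly as below. It monitors the whole directory under one input and then rewrites the sourcetype based on the source path at parse time. The directory /data/ezproxy, the stanza names, and the file name patterns are assumptions made for illustration.

```
# props.conf on the indexer or heavy forwarder (sketch; path is an assumption)
[source::/data/ezproxy/*.log]
TRANSFORMS-set_sourcetype = set_sourcetype_a, set_sourcetype_b, set_sourcetype_c

# transforms.conf on the same instance (stanza names are hypothetical)
[set_sourcetype_a]
SOURCE_KEY = MetaData:Source
REGEX = fileA
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::sourcetypeA

[set_sourcetype_b]
SOURCE_KEY = MetaData:Source
REGEX = fileB
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::sourcetypeB

[set_sourcetype_c]
SOURCE_KEY = MetaData:Source
REGEX = fileC
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::sourcetypeC
```

One caveat with this approach: because the sourcetype is rewritten during parsing, index-time settings (line breaking, timestamp extraction) defined under the new sourcetype's props.conf stanza are generally not applied to those events, while search-time field extractions are. Setting the sourcetype per file in inputs.conf avoids that limitation.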
