What are the best practices for a huge log directory with files that are overwritten with each entry?

twinspop
Influencer

We are using Informatica software. The logs produced are dumped into 1 directory. Currently there are 1000+ log files produced from various runs. Each file needs to be consumed as 1 event. A new run will overwrite the log file from a similar run.

My current inputs entry:

[monitor:///apps/informatica/powercenter91/server/infa_shared/SessLogs/*.log]
host = etltest1
ignoreOlderThan = 1d
index = main
sourcetype = etl_logs

And props entry:

[source:*SessLogs/*.log]
CHECK_METHOD = modtime

Is this the optimum config?

Thanks!

jayannah
Builder

I think the configuration below should work if the log uses a standard timestamp format that Splunk recognizes automatically. Otherwise, please post a sample log file and we can suggest the timestamp recognition settings needed in props.conf.

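Since you want the entire file content as one event: SHOULD_LINEMERGE = false turns off Splunk's line merging, so event boundaries are decided by LINE_BREAKER alone, and TRUNCATE = 0 tells the indexer not to truncate events longer than the default limit of 10,000 bytes. Note that with the default LINE_BREAKER each line would still become a separate event, so keeping the whole file together also needs a LINE_BREAKER that never matches.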

props.conf:

[etl_logs]
SHOULD_LINEMERGE = false
TRUNCATE = 0
CHECK_METHOD = modtime
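
For reference, the settings discussed in this thread might be combined as in the sketch below. It rests on a few assumptions not confirmed in the thread: that CHECK_METHOD belongs in a [source::...] stanza (note the double colon) because the props.conf spec describes it as valid only there, that a LINE_BREAKER which can never match is an acceptable way to keep the whole file as a single event, and that the source pattern shown actually matches the monitored path. Treat it as a starting point to test on a non-production instance, not as a verified configuration.

```
# props.conf (sketch, untested against this data)

# Assumption: CHECK_METHOD is only honored in source:: stanzas, so it is
# placed here rather than under the sourcetype. "..." matches any leading path.
[source::.../SessLogs/*.log]
CHECK_METHOD = modtime

[etl_logs]
# Skip the line-merging pass; event boundaries come from LINE_BREAKER only.
SHOULD_LINEMERGE = false
# Assumption: an empty negative lookahead never matches, so the file is never split.
LINE_BREAKER = ((?!))
# Do not cut events off at the default 10,000-byte limit.
TRUNCATE = 0
```

With CHECK_METHOD = modtime, an overwritten file should be re-read in full whenever its modification time changes, which fits the "new run overwrites the old log" pattern described in the question. If forwarders are involved, the source:: stanza likely needs to live where the file is read, and the sourcetype stanza where parsing happens.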