Getting Data In

Batching gzipped files from 4 directories into Splunk: is there a way to run parallel batches on a Splunk 6.2.6 Linux universal forwarder?

lisaac
Path Finder

I am batching gzipped files into Splunk. The files reside in 4 directories. Splunk, per splunkd.log, appears to be reading only the files in the first batch statement. Is there a way to run parallel batches?

I have a 64-bit Linux Universal Forwarder running Splunk 6.2.6. I have set maxKBps to 0 in limits.conf, and I have reniced the Splunk UF processes to a priority of -20 on the Linux VM.

I have batch statements listed as follows:

[batch:///leroylogs2/multicast/archive/data11/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data12/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data21/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data22/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data11/*CAS*.gz]
[batch:///leroylogs2/multicast/archive/data12/*CAS*.gz]
[batch:///leroylogs2/multicast/archive/data21/*CAS*.gz]
[batch:///leroylogs2/multicast/archive/data22/*CAS*.gz]

10-28-2015 09:37:37.372 -0700 INFO  ArchiveProcessor - Finished processing file '/leroylogs2/multicast/archive/data11/2015-08-29-08_30-PRODtrans_svs.log.gz', removing from stats
10-28-2015 09:37:37.433 -0700 INFO  ArchiveProcessor - handling file=/leroylogs2/multicast/archive/data11/2015-08-29-07_10-CAStrans_svs.log.gz
10-28-2015 09:37:37.434 -0700 INFO  ArchiveProcessor - reading path=/leroylogs2/multicast/archive/data11/2015-08-29-07_10-CAStrans_svs.log.gz (seek=0 len=32496625)
10-28-2015 09:37:37.551 -0700 WARN  TcpOutputProc - The event is missing source information. Event :
10-28-2015 09:37:38.655 -0700 ERROR ArchiveContext - From archive='/leroylogs2/multicast/archive/data11/2015-08-29-07_10-CAStrans_svs.log.gz':  gzip: stdout: Broken pipe

anekkanti_splun
Splunk Employee

Before 6.3: the only way to read archives or files in parallel is to run multiple Splunk instances on the forwarder host.
Splunk 6.3 adds a feature that lets you run multiple ingestion pipelines, each of which can read one archive or file independently. With multiple ingestion pipelines, Splunk reads multiple archives or files in parallel.
Documentation: http://docs.splunk.com/Documentation/Splunk/6.3.0/Indexer/Pipelinesets
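
As a rough sketch of what the 6.3 approach looks like, the number of pipeline sets is controlled by parallelIngestionPipelines in server.conf. This assumes the forwarder has been upgraded from 6.2.6 to 6.3 or later. The value 2 is only an illustration, not a recommendation: each additional pipeline set uses more CPU and memory, so it should be sized against the host's spare capacity. A restart of the forwarder is needed for the change to take effect.

```ini
# $SPLUNK_HOME/etc/system/local/server.conf on a 6.3+ forwarder
# Illustrative value only; pick a count the host has CPU and memory for.
[general]
parallelIngestionPipelines = 2
```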


lisaac
Path Finder

I suppose that I could run 2 UFs on the same host, but I would prefer to skip this approach.
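
For a forwarder that has to stay on 6.2.x, the two-UF workaround could be sketched roughly as follows. This is an illustration under assumptions: the install paths /opt/splunkforwarder_a and /opt/splunkforwarder_b are hypothetical, the split of directories between the two instances is arbitrary, and the second instance needs its own management port so that it does not collide with the first. The batch input type requires move_policy = sinkhole, which deletes each file after it has been indexed.

```ini
# Instance A (hypothetical path): /opt/splunkforwarder_a/etc/system/local/inputs.conf
[batch:///leroylogs2/multicast/archive/data11/*PROD*.gz]
move_policy = sinkhole

[batch:///leroylogs2/multicast/archive/data12/*PROD*.gz]
move_policy = sinkhole

# Instance B (hypothetical path): /opt/splunkforwarder_b/etc/system/local/inputs.conf
[batch:///leroylogs2/multicast/archive/data21/*PROD*.gz]
move_policy = sinkhole

[batch:///leroylogs2/multicast/archive/data22/*PROD*.gz]
move_policy = sinkhole

# The *CAS*.gz stanzas would be divided between the two instances the same way.
```

Because each instance is a separate splunkd process with its own pipeline, the two sets of directories are read independently of each other.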
