Getting Data In

Batching gzipped files residing in 4 directories into Splunk, is there a way to run parallel batches on a Splunk 6.2.6 Linux universal forwarder?

lisaac
Path Finder

I am batching gzipped files into Splunk. The files reside in 4 directories. Splunk, per splunkd.log, appears to be reading only the files in the first batch statement. Is there a way to run parallel batches?

I have a 64-bit Linux Universal Forwarder running Splunk 6.2.6. I have set maxKBPS to 0 in limits.conf, and I have reniced the Splunk UF processes to a priority of -20 on the Linux VM.

I have batch statements listed as follows:

[batch:///leroylogs2/multicast/archive/data11/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data12/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data21/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data22/*PROD*.gz]
[batch:///leroylogs2/multicast/archive/data11/*CAS*.gz]
[batch:///leroylogs2/multicast/archive/data12/*CAS*.gz]
[batch:///leroylogs2/multicast/archive/data21/*CAS*.gz]
[batch:///leroylogs2/multicast/archive/data22/*CAS*.gz]

10-28-2015 09:37:37.372 -0700 INFO  ArchiveProcessor - Finished processing file '/leroylogs2/multicast/archive/data11/2015-08-29-08_30-PRODtrans_svs.log.gz', removing from stats
10-28-2015 09:37:37.433 -0700 INFO  ArchiveProcessor - handling file=/leroylogs2/multicast/archive/data11/2015-08-29-07_10-CAStrans_svs.log.gz
10-28-2015 09:37:37.434 -0700 INFO  ArchiveProcessor - reading path=/leroylogs2/multicast/archive/data11/2015-08-29-07_10-CAStrans_svs.log.gz (seek=0 len=32496625)
10-28-2015 09:37:37.551 -0700 WARN  TcpOutputProc - The event is missing source information. Event :
10-28-2015 09:37:38.655 -0700 ERROR ArchiveContext - From archive='/leroylogs2/multicast/archive/data11/2015-08-29-07_10-CAStrans_svs.log.gz':  gzip: stdout: Broken pipe

anekkanti_splun
Splunk Employee

Before 6.3, the only way to read archives or files in parallel is to run multiple instances of Splunk on the forwarder host.
Splunk 6.3 adds a feature that lets you configure multiple ingestion pipelines, where each pipeline can read one archive or file independently. With multiple ingestion pipelines, Splunk reads multiple archives or files in parallel.
Documentation: http://docs.splunk.com/Documentation/Splunk/6.3.0/Indexer/Pipelinesets
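
As a rough sketch of what that looks like after upgrading the forwarder to 6.3 or later: the number of pipeline sets is controlled by parallelIngestionPipelines in the [general] stanza of server.conf. The file location and the value of 2 below are assumptions for illustration, not a tuned recommendation. Each additional pipeline set uses more CPU and memory, so check the host's headroom before raising it.

```ini
# Assumed location: $SPLUNK_HOME/etc/system/local/server.conf on the forwarder
# Requires Splunk 6.3 or later; the setting does not exist in 6.2.x.
[general]
# Default is 1 (a single pipeline set). The value 2 is illustrative only.
parallelIngestionPipelines = 2
```

Restart the forwarder after changing the setting. The existing batch stanzas in inputs.conf should not need to change.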


lisaac
Path Finder

I suppose that I could run 2 UFs on the same host, but I would prefer to skip this approach.
