Getting Data In

File Monitor: Avoid indexing files during the copy process

BMacher
Path Finder

Hello,

I have a problem with the file monitor. I suspect I'm not using it correctly, but I don't know of any other method for this kind of import.

CSV files of 1 to 20 MB are periodically copied from a remote server into a directory that I have a file monitor on. The problem is that they are indexed during the copy process as binary events. Can I force Splunk to wait until the copy finishes before it starts indexing?

Thank you 🙂

1 Solution

BMacher
Path Finder

I only had to set NO_BINARY_CHECK to false in props.conf; for some reason it had been set to true.

mayurr98
Super Champion

Hey, there are some settings in inputs.conf that you can configure.
Set the integer values to suit your environment.

pollPeriod = <integer>
* How often, in seconds, to check a directory for changes.
* Defaults to 3600 seconds (1 hour).

interval = <integer>
* How often, in seconds, to poll for new data.
* This setting is required, and the input will not run if the setting is
  not present.
* The recommended setting depends on the Performance Monitor object,
  counter(s) and instance(s) that you define in the input, and how much
  performance data you require.
  * Objects with numerous instantaneous or per-second counters, such
    as "Memory," "Processor" and "PhysicalDisk" should have shorter
    interval times specified (anywhere from 1-3 seconds).
  * Less volatile counters such as "Terminal Services", "Paging File",
    and "Print Queue" can have longer times configured.
* Default is 300 seconds.

readInterval = <integer>
* How often, in milliseconds, that the input should read the network
  kernel driver for events.
* Advanced option. Use the default value unless there is a problem
  with input performance.
* Set this to adjust the frequency of calls into the network kernel driver.
* Choosing lower values (higher frequencies) can reduce network
  performance, while higher numbers (lower frequencies) can cause event
  loss.
* The minimum allowed value is 10 and the maximum allowed value is 1000.
* Defaults to unset, handled as 100 (msec).

Let me know if this helps!


BMacher
Path Finder

Thank you for your reply, but your answer does not really fit my needs. Even if I set an interval, I cannot guarantee that no file is still being copied when the file monitor runs again.
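
One common way to sidestep the timing problem described above is to make sure a file only appears in the monitored directory once it is complete. The sketch below is not from this thread and is not a Splunk feature: it assumes the copy job can be scripted in Python on the machine that writes the files, and that a staging directory exists on the same filesystem as the monitored directory. The function name `deliver_atomically` and both directory arguments are hypothetical.

```python
import os
import shutil
import tempfile


def deliver_atomically(src_path, monitored_dir, staging_dir):
    """Copy src_path into staging_dir, then rename it into monitored_dir.

    Assumption: staging_dir and monitored_dir are on the same filesystem,
    so the final rename is a single atomic step and the monitor never
    sees a partially written file.
    """
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(monitored_dir, exist_ok=True)
    final_path = os.path.join(monitored_dir, os.path.basename(src_path))

    # Reserve a temporary name outside the monitored directory.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, suffix=".part")
    os.close(fd)
    try:
        # The slow part: the full copy happens in the staging directory.
        shutil.copyfile(src_path, tmp_path)
        # The fast part: move the finished file into place in one step.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial copy if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

If the staging directory is on a different filesystem, `os.replace` can fail or stop being atomic, so the sketch only holds under the same-filesystem assumption.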
