Getting Data In

File Monitor: Avoid indexing files during the copy process

BMacher
Path Finder

Hello,

I have a problem with the file monitor. I suspect I'm not using it the right way, but I don't know any other method for this kind of import.

CSV files of 1 to 20 MB are periodically copied from a remote server into a directory that I have a file monitor on. The problem is that they are indexed during the copy process as binary events. Can I force Splunk to wait until the copy has finished before it starts indexing?

Thank you 🙂

1 Solution

BMacher
Path Finder

I only had to set NO_BINARY_CHECK to false in props.conf; it had been set to true for some reason.
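
For illustration, a minimal sketch of what that change might look like, assuming the setting lives in a source type stanza in props.conf. The stanza name `my_csv` is hypothetical and not taken from this thread; use whatever source type or `[source::...]` stanza applies to the monitored files.

```
# props.conf
# Hypothetical source type name; replace with the one assigned to the CSV files.
[my_csv]
# false (the documented default) makes Splunk check whether a file looks binary
# and ignore it if so; true makes Splunk process binary files.
NO_BINARY_CHECK = false
```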

mayurr98
Super Champion

Hey, there are some settings in inputs.conf that you can configure.
Set each one to an appropriate integer value.

pollPeriod = <integer>
* How often, in seconds, to check a directory for changes.
* Defaults to 3600 seconds (1 hour).

interval = <integer>
* How often, in seconds, to poll for new data.
* This setting is required, and the input will not run if the setting is
  not present.
* The recommended setting depends on the Performance Monitor object,
  counter(s) and instance(s) that you define in the input, and how much
  performance data you require.
  * Objects with numerous instantaneous or per-second counters, such
    as "Memory," "Processor" and "PhysicalDisk" should have shorter
    interval times specified (anywhere from 1-3 seconds).
  * Less volatile counters such as "Terminal Services", "Paging File",
    and "Print Queue" can have longer times configured.
* Default is 300 seconds.

readInterval = <integer>
* How often, in milliseconds, that the input should read the network
  kernel driver for events.
* Advanced option. Use the default value unless there is a problem
  with input performance.
* Set this to adjust the frequency of calls into the network kernel driver.
* Choosing lower values (higher frequencies) can reduce network
  performance, while higher numbers (lower frequencies) can cause event
  loss.
* The minimum allowed value is 10 and the maximum allowed value is 1000.
* Defaults to unset, handled as 100 (msec).

Let me know if this helps!

BMacher
Path Finder

Thank you for your reply, but your answer does not really fit my needs. Even if I set an interval, I cannot be sure that no file is still being copied when the file monitor runs again.
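
One approach that is sometimes used for this gap, sketched here under assumptions not stated in the thread: the copy job writes each file under a temporary suffix and renames it once the transfer is complete, and the monitor stanza excludes that suffix. The directory path and the `.part` suffix below are hypothetical, and this only helps if the copying side can be changed to write and rename this way.

```
# inputs.conf
# Hypothetical monitored directory; replace with the real path.
[monitor:///data/incoming/csv]
# Skip files that are still being copied (assumed to carry a .part suffix).
# The regular expression is matched against the full file path.
blacklist = \.part$
```

With this in place, a file would only become visible to the monitor after the rename, which happens as a single step when source and destination are on the same filesystem.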
