Getting Data In

How to monitor and index data every 30 minutes (incremental)?

Ravan
Path Finder

We have a log file that a team wants to index in Splunk every 30 minutes. We would also like to keep the log data at the source after Splunk has indexed it. What options do we have?

0 Karma

splunker12er
Motivator

Here is an example that monitors the Apache web server access_log and error_log files. I created the stanza below in inputs.conf.
You do not need to specify an interval such as 5 or 30 minutes. Whenever the file content changes, the new log data is picked up and sent for indexing.

[monitor://<path>]
  • This directs Splunk to watch all files in <path>.
  • <path> can be an entire directory or just a single file.
  • You must specify the input type and then the path, so put three slashes in your path if you are starting at the root (to include the slash that goes before the root directory).

E.g. inputs.conf

[monitor:///var/log/httpd]
sourcetype = access_combined
index = httpd_logs
ignoreOlderThan = 7d

How does the monitor processor work?
Specify a path to a file or directory and the monitor processor consumes any new data written to that file or directory. This is how you can monitor live application logs such as those coming from Web access logs, Java 2 Platform Enterprise Edition (J2EE) or .NET applications, and so on.

Splunk software monitors and indexes the file or directory as new data appears. You can also specify a mounted or shared directory, including network file systems, as long as Splunk software can read from the directory. If the specified directory contains subdirectories, the monitor process recursively examines them for new files, as long as the directories can be read.

You can include or exclude files or directories from being read by using whitelists and blacklists.
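
As a rough sketch of the whitelist and blacklist settings mentioned above: the path and index name are reused from the example stanza earlier in this post, while the two regular expressions are illustrative assumptions, not values from the original poster's environment. Both settings take a regular expression that is matched against the full file path.

```ini
# inputs.conf (sketch; the regexes are assumptions for illustration)
[monitor:///var/log/httpd]
# Only read files whose path ends in access_log or error_log
whitelist = (access|error)_log$
# Skip compressed, rotated copies
blacklist = \.(gz|bz2|zip)$
index = httpd_logs
```

If both settings match the same file, the blacklist takes precedence, so a rotated and compressed copy is skipped even when its name contains access_log.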

If you disable or delete a monitor input, Splunk software does not stop indexing the files that the input references. It only stops checking those files again. To stop all in-process data indexing, the Splunk server must be stopped and restarted.

Interval parameter

E.g. interval = 300 (the value is in seconds, so the script runs once every 5 minutes)

Use the interval parameter to schedule and monitor scripts. The interval parameter specifies how long a script waits before it restarts.

The interval parameter is useful for a script that performs a task periodically. The script performs a specific task and then exits. The interval parameter specifies when the script restarts to perform the task again.

The interval parameter is also useful to ensure that a script restarts, even if a previous instance of the script exits unexpectedly.

Entering an empty value for interval results in a script only being executed on start and/or endpoint reload (on edit).
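
Note that interval applies to scripted inputs, not to [monitor://] stanzas. For the 30-minute case in the question, a minimal sketch could look like the stanza below. The app name my_app, the script tail_app_log.sh, the index app_logs, and the sourcetype app_log are all hypothetical names chosen for illustration.

```ini
# inputs.conf (sketch; app, script, index and sourcetype names are hypothetical)
[script://$SPLUNK_HOME/etc/apps/my_app/bin/tail_app_log.sh]
# Run the script every 1800 seconds (30 minutes)
interval = 1800
index = app_logs
sourcetype = app_log
disabled = 0
```

With this approach, whatever the script prints to standard output is indexed, so the script itself would have to remember how far it has read and print only the new lines on each run. A plain monitor input does that bookkeeping for you.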

0 Karma

lycollicott
Motivator

I'm confused by your question.
First, why can't you just monitor it normally and let Splunk index the new events as they occur?

Second, Splunk doesn't delete your source data - it just reads it - so it will still be at your source.
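
A minimal sketch of what "monitor it normally" could look like for a single append-only application log. The file path, index, and sourcetype here are hypothetical placeholders, not values from this thread.

```ini
# inputs.conf (sketch; path, index and sourcetype are hypothetical)
[monitor:///opt/myapp/logs/application.log]
index = app_logs
sourcetype = myapp_log
disabled = 0
```

Splunk tracks how far it has read in the file, so only newly appended lines are indexed, and the file itself is left in place.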

0 Karma

somesoni2
Revered Legend

How does the file content change? Do you want to grab only the difference from the last 30 minutes?

0 Karma

Ravan
Path Finder

Yes, only the new data. It's an application log file, and data only gets appended to it.

0 Karma

somesoni2
Revered Legend

Is there any specific reason for monitoring every 30 minutes? If your data has timestamps, then there can be data for every minute.

0 Karma

Ravan
Path Finder

The 30-minute interval was decided by the customer to avoid some performance issues. Can we increase that from 1 minute to 30 minutes if we have timestamps in the logs?

0 Karma

lycollicott
Motivator

Is it being read by a Universal Forwarder?

Are they experiencing a performance issue that they are trying to work around, or do they just think that a 30-minute interval will be better?

If this log file is very busy, then you could see a larger performance impact if you try to ingest it in larger 30-minute chunks. Little bites versus big bites.

Please describe the performance issue and show us the inputs.conf stanza.

0 Karma