I have a state file that is rewritten every 5 minutes. The file is always the same size, but the data inside is only sometimes different. For example, it might contain "event A = true", and on the next write it might be "event A = false".
I don't want it to reindex everything and spike the usage limit. Is there a way for me to index just the change from true to false?
Thanks
Write a scripted input that runs every 5 minutes (off cycle from the file update if possible).
In your script, read the file and emit the value of eventA (or whatever you want) to stdout.
Don't monitor the file in any other way. Here is a Python script that might do the trick:
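import re

# Read the whole state file (replace filePathHere with the real path)
with open("filePathHere") as f:
    data = f.read()

# Find the first "eventA = value" pair; \s*? allows optional whitespace around "="
matchFound = re.search(r"(?P<matchString>eventA\s*?=\s*?\S+)", data)
if matchFound:
    print(matchFound.group('matchString'))
else:
    print("EventA not found")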
This will work best if the file is not too large. It will look for the first occurrence of eventA=something and print what it finds. It allows for optional whitespace around the equals sign (that's what \s*? does).
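If the file does turn out to be large, one option is to scan it line by line and stop at the first match instead of reading it all at once. This is only a sketch: find_first_event is a made-up helper name, and it assumes the name and the value sit on the same line.

```python
import re

# Same pattern as the script above: "eventA", optional whitespace, "=", then a value.
PATTERN = re.compile(r"(?P<matchString>eventA\s*?=\s*?\S+)")


def find_first_event(path):
    """Return the first eventA=... text found in the file, or None if absent."""
    with open(path) as f:
        for line in f:  # iterates one line at a time, not the whole file
            match = PATTERN.search(line)
            if match:
                return match.group("matchString")
    return None
```

You would then print find_first_event("filePathHere"), or "EventA not found" when it returns None, exactly as in the script above.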
If you set this up as a scripted input in Splunk, you will see only the following added to Splunk every 5 minutes:
eventA=something
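If you want an event only when the value actually flips (true to false or back), the script could remember what it printed last time and stay silent when nothing changed. The following is a rough sketch, not something from the Splunk docs: emit_if_changed and the state_path file are hypothetical names, and it assumes the script is allowed to write a small file somewhere.

```python
import os
import re

PATTERN = re.compile(r"(?P<matchString>eventA\s*?=\s*?\S+)")


def emit_if_changed(data_path, state_path):
    """Print the eventA text only if it differs from the last printed one.

    Returns the printed string, or None when nothing was printed.
    """
    with open(data_path) as f:
        match = PATTERN.search(f.read())
    current = match.group("matchString") if match else "EventA not found"

    # Load whatever the previous run printed, if there was a previous run.
    previous = None
    if os.path.exists(state_path):
        with open(state_path) as f:
            previous = f.read()

    if current == previous:
        return None  # unchanged since the last run: emit nothing

    # Remember the new value, then emit it on stdout.
    with open(state_path, "w") as f:
        f.write(current)
    print(current)
    return current
```

Called once per run, for example emit_if_changed("filePathHere", "lastValueFileHere"), this prints on the first run and after each change, and prints nothing otherwise.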
In the Splunk Manager, go to Data Inputs >> Scripts and click the New button. Fill in the info, including the settings under More Options. Note that your script has to be placed in a particular directory.
After you complete this setup, Splunk will run the script at the interval you have selected, and will index the output of the script.
For more info: http://docs.splunk.com/Documentation/Splunk/latest/Data/Setupcustominputs#Add_a_scripted_input_in_Sp...
How would the script integrate into Splunk? Would it just retrieve the latest updates and insert them?