
How to detect that a file has been indexed

JeremyHagan
Communicator

I have an automated process running on a Windows server that has the Universal Forwarder installed. It drops files for indexing in a specific folder and I want it to be able to clean up and delete files that have been indexed.

In my testing I found that the splunkd.log file had entries like this:

11-06-2013 22:08:49.817 +1100 INFO BatchReader - Removed from queue file='U:\WebsenseExport\Exported\20131106100010.csv'.

My script was parsing this file, and if it found this entry it would delete the file. This morning I found that a bunch of files had been indexed but were not noted in splunkd.log. I suspect this is because they are small and the "BatchReader" doesn't handle them, but this is just a guess.
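For reference, the log-scraping approach described above could look roughly like the sketch below. It is a minimal Python illustration, not the original script: the function name `removed_files` and the regular expression are assumptions based only on the sample entry shown earlier, and it will miss any file that splunkd.log never reports.

```python
import re

# Matches the "Removed from queue" entry shown above and captures the path.
# \s+ tolerates the entry being wrapped across lines.
REMOVED = re.compile(
    r"BatchReader\s+-\s+Removed from queue\s+file='([^']+)'"
)


def removed_files(log_text):
    """Return every file path that BatchReader reports as removed from its queue."""
    return REMOVED.findall(log_text)
```

A cleanup script would read splunkd.log, call `removed_files` on its contents, and delete only the paths returned.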

Is there a foolproof way I can tell, from the forwarder, that a file has definitely been indexed?


MuS
SplunkTrust

Hi JeremyHagan,

You can use the REST endpoint

 /services/admin/inputstatus/TailingProcessor:FileStatus

to track whether a universal forwarder is still reading a monitored file or has finished sending its events.

The endpoint lists each file with a status such as "open file" or "finished reading".

Some details about these statuses when the percent is 100%:

"finished reading" means that the file has been read and forwarded to the end.

"open file" means the same, but in addition the handle on the file is still open (because it has been less than 3 seconds, because the file is being 'tailed', or because the file has just been reopened for an update or rotation).

Splunk will monitor every file, because Splunk assumes that a new event can be added to any file.
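As a rough illustration of checking this from the forwarder itself, here is a minimal Python sketch. Several things in it are assumptions rather than confirmed behaviour: that the management port is the usual 8089 on localhost, that the endpoint accepts `output_mode=json` on your forwarder version, and that the per-file entries sit under `entry[0].content.inputs` with `type` and `percent` keys. The function names are made up for the example, so inspect the real response before relying on it.

```python
import base64
import json
import ssl
import urllib.request

# Assumed URL: local forwarder, default management port, JSON output.
ENDPOINT = ("https://localhost:8089/services/admin/inputstatus/"
            "TailingProcessor:FileStatus?output_mode=json")


def fetch_file_status(user, password, url=ENDPOINT):
    """Fetch the per-file status mapping from the local forwarder."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": "Basic " + token})
    # Forwarders commonly use a self-signed certificate, so certificate
    # verification is switched off for this local call.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        payload = json.load(response)
    # Assumed layout of the JSON response.
    return payload["entry"][0]["content"]["inputs"]


def is_finished(status):
    """True when an entry is 'finished reading' at 100 percent."""
    return (status.get("type") == "finished reading"
            and float(status.get("percent", 0)) >= 100.0)


def finished_files(inputs):
    """Return the sorted paths whose status passes is_finished."""
    return sorted(path for path, status in inputs.items()
                  if isinstance(status, dict) and is_finished(status))
```

Calling `finished_files(fetch_file_status(user, password))` would then give the paths the forwarder reports as fully read and forwarded. Entries still marked "open file" are skipped here on purpose, since the handle is still open.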

hope this helps...

cheers, MuS

lguinn2
Legend

I like the answer from @MuS

To follow up, you might also take a look at this blog post. It's a bit old, but you can grab some code that does something like what you want...

I love code samples

http://blogs.splunk.com/2011/01/02/did-i-miss-christmas-2/

JeremyHagan
Communicator

Thanks for the answer, but I need to be able to check from the forwarder and programmatically.

Your suggestion would require access to the Splunk indexer.


somesoni2
Revered Legend

You might search the index where the data is indexed, using source=yourfilename, and see whether any events are found there.
