Getting Data In

Problem with monitored files

benjiminhugh
Explorer

I chose "Continuously index data from a file or directory this Splunk instance can access" to add a file as an input.
For example, the file contains three records: a, b, c. When I add a fourth record, d, the search result in Splunk is correct: a,b,c,d.
But when I delete one record, so the file goes from a,b,c to a,b, the search result is a,b,a,b,c.
I want it to be a,b.
How can I fix this?

1 Solution

Ayn
Legend

I think you're misunderstanding how Splunk stores events. Splunk reads events from the sources it monitors and adds them to its own index. So if you remove data from a file that Splunk is monitoring, you will not remove that data from Splunk's index. Instead, what is likely happening in your case is that Splunk detects that the file has changed, and because it can't seek to the last position it previously read in the file (you've made the file smaller, so that position no longer exists), it reindexes the whole file instead.

So, first of all, Splunk adds events a,b,c to its index. After your modification, the source file contains only a,b. Splunk looks at the file, sees that it has changed, and because it can't find a valid position to seek to (for the reason described above), it reindexes the whole file, which now means events a,b. This results in the events a,b,c,a,b in Splunk's index.
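To make the sequence above concrete, here is a minimal sketch of the behavior in Python. It is a toy model, not Splunk's actual code: `ToyMonitor`, `write_events`, and `demo_truncation` are hypothetical names, and the real monitor input tracks more state than a single byte offset. The only assumptions are the ones described above: the index is append-only, and a file that has shrunk below the saved position is read again from the start.

```python
import os
import tempfile


class ToyMonitor:
    """Toy model of a file monitor feeding an append-only index (not Splunk's real code)."""

    def __init__(self, path):
        self.path = path
        self.offset = 0   # last byte position read from the file
        self.index = []   # append-only store: events are never removed

    def poll(self):
        size = os.path.getsize(self.path)
        if size < self.offset:
            # The file shrank, so the saved position no longer exists: start over.
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        # New events are appended; nothing already indexed is touched.
        self.index.extend(data.decode().splitlines())


def write_events(path, events):
    """Overwrite the file with one event per line."""
    with open(path, "w") as f:
        f.write("".join(e + "\n" for e in events))


def demo_truncation():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "monitored.log")
        monitor = ToyMonitor(path)
        write_events(path, ["a", "b", "c"])
        monitor.poll()                    # index: a, b, c
        write_events(path, ["a", "b"])    # delete record c from the source file
        monitor.poll()                    # file is smaller, so it is read again from the start
        return monitor.index


print(demo_truncation())  # ['a', 'b', 'c', 'a', 'b']
```

The deleted record c stays in the index because nothing in `poll` ever removes events; the second pass only adds the file's current contents again.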

Ayn
Legend

No, you can't "avoid" it - it's an integral part of how Splunk works.

As for the second question, if you add an event to the file, Splunk will just carry on from its last known position (c) and read until the new end of the file (after d), so it will not have to reindex the whole file.
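The append case can be sketched the same way. This is only an illustration under the assumption that the reader remembers a byte offset between reads; `read_from` and `demo_append` are hypothetical names, not Splunk functions.

```python
import os
import tempfile


def read_from(path, offset):
    """Return (new_events, new_offset), reading only the bytes after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
        return data.decode().splitlines(), f.tell()


def demo_append():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "monitored.log")
        with open(path, "w") as f:
            f.write("a\nb\nc\n")
        first, offset = read_from(path, 0)        # initial read of the whole file
        with open(path, "a") as f:
            f.write("d\n")                        # add record d at the end
        second, offset = read_from(path, offset)  # resume from the saved position
        return first, second


first, second = demo_append()
print(first)   # ['a', 'b', 'c']
print(second)  # ['d']
```

Because the saved position is still valid after an append, the second read returns only d, and a,b,c are not indexed a second time.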


benjiminhugh
Explorer

And why, when I add d to the file, is the result a,b,c,d? Why isn't it a,b,c,a,b,c,d?


benjiminhugh
Explorer

Is there any way to avoid this?
