Getting Data In

How to configure Splunk to NOT repeatedly read and index old data?

Communicator

I'm new to Splunk. I need the Splunk server to pick up data from a .log file that is constantly being written to. I expected Splunk to index only the new data each time something is added to the file. Instead, it also re-reads the old data, which makes the index volume far too large. How can I solve this problem?

1 Solution

SplunkTrust

The difference is that Splunk expects log files to grow only by having data appended at the end. If the file changes anywhere else, Splunk treats it as a new file and reindexes it entirely.
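To make the append-only assumption concrete, here is a minimal Python sketch of a tailing reader. It is not Splunk's actual implementation: the function name `read_new_data` and the `state` dictionary are hypothetical, and hashing the whole previously read prefix is a simplification chosen for clarity. The idea it illustrates is the one described above: if everything read so far is unchanged, only the appended tail is returned; if anything earlier changed, the file is treated as new and read again from the start.

```python
import hashlib


def read_new_data(path, state):
    """Return bytes not seen before; re-read everything if old content changed.

    `state` is a plain dict (hypothetical) that remembers how many bytes were
    read last time and a hash of those bytes.
    """
    with open(path, "rb") as f:
        content = f.read()

    offset = state.get("offset", 0)
    prefix_hash = hashlib.sha256(content[:offset]).hexdigest()

    # The file counts as "the same file, appended to" only if it has not
    # shrunk and the part we already read is byte-for-byte identical.
    unchanged = len(content) >= offset and prefix_hash == state.get("prefix_hash")
    if not unchanged:
        offset = 0  # treat as a new file and start over

    new_data = content[offset:]
    state["offset"] = len(content)
    state["prefix_hash"] = hashlib.sha256(content).hexdigest()
    return new_data
```

With this logic, appending a line yields just that line on the next call, while inserting a line in the middle changes the already-read prefix and causes the whole file to be returned again, which matches the reindexing behaviour described in this thread.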

Communicator

Is it possible to make Splunk read only the new data by changing some configuration?

Legend

No. Splunk works on streams of log data, where it is assumed that whatever data is added comes at the end. Otherwise Splunk would have to somehow keep a record of what the file looked like, do a diff and then grab only stuff that's been added, which would be very resource intensive and lead to all kinds of weird problems.
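As a rough illustration of why the diff approach is expensive, here is a Python sketch using the standard library's `difflib`. The helper name `added_lines` is hypothetical and this is not something Splunk does; the point is that the caller has to keep a full copy of the previous file contents and compare it against the current contents on every change.

```python
import difflib


def added_lines(old_lines, new_lines):
    """Return the lines in new_lines that have no match in old_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both contribute lines that are new in b.
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return added
```

Both the stored snapshot and the comparison grow with the size of the file, and a modified line shows up as a "new" line, which hints at the kind of weird problems mentioned above.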

SplunkTrust

That should by default lead to Splunk only reading in new lines from the end.

Do post the configuration you used to read this file.

Communicator

Sorry, the new data is actually inserted in the middle of the file, because the file contains several different sections of data. What's the difference between these two situations?

SplunkTrust

Are you appending to the end of the log file, or inserting data somewhere else?

Communicator

Appending to the end of the .log file.