Solved: Re: How to configure Splunk to NOT repeatedly read...

LuiesCui · ‎02-04-2015

New to Splunk... I need the Splunk server to get data from a .log file that is constantly having data written to it. I expected the indexer to read only the new data once it is added to the .log file. But what happens is that it also re-reads the old data, which makes the index volume too large. How can I solve this problem?

martin_mueller · ‎02-04-2015

The difference is that Splunk expects logfiles to be appended to the end. If the file changes anywhere else it's considered to be a new file, and therefore reindexed entirely.
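To make the append-versus-modify distinction concrete, here is a minimal Python sketch of an append-only file reader. It is an illustration only, not Splunk's implementation: the `TailTracker` class, its `poll` method, and the whole-prefix SHA-256 check are all assumptions made for the example (Splunk keeps its own checksums and read position internally). The idea is the same, though: remember how far you have read, and if the bytes you already read no longer match, treat the file as new and read it again from the start.

```python
import hashlib
import os
import tempfile


class TailTracker:
    """Toy append-only reader. Hypothetical; NOT how Splunk is implemented."""

    def __init__(self, path):
        self.path = path
        self.offset = 0          # number of bytes already consumed
        self.prefix_hash = None  # hash of the bytes already consumed

    def poll(self):
        """Return (data, reread): unread bytes, and whether we started over."""
        with open(self.path, "rb") as f:
            content = f.read()
        reread = False
        if self.offset:
            seen = content[: self.offset]
            changed = hashlib.sha256(seen).hexdigest() != self.prefix_hash
            if len(seen) < self.offset or changed:
                # Already-read bytes were altered or truncated:
                # treat this as a new file and start from byte 0.
                self.offset = 0
                reread = True
        data = content[self.offset:]
        self.offset = len(content)
        self.prefix_hash = hashlib.sha256(content).hexdigest()
        return data, reread


def demo():
    """Append once, then insert in the middle, polling after each step."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        with open(path, "wb") as f:
            f.write(b"line1\nline2\n")
        tracker = TailTracker(path)
        first = tracker.poll()
        with open(path, "ab") as f:  # append at the end
            f.write(b"line3\n")
        second = tracker.poll()
        with open(path, "wb") as f:  # rewrite with a line inserted mid-file
            f.write(b"line1\nINSERTED\nline2\nline3\n")
        third = tracker.poll()
        return first, second, third
```

In `demo()`, the append yields only `line3`, while the mid-file insert makes the tracker hand back the entire file again, which is the reindexing behavior described above.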

LuiesCui · ‎02-04-2015

Is it possible to make Splunk read only the new data by changing some configuration?

Ayn · ‎02-04-2015

No. Splunk works on streams of log data, where it is assumed that whatever data is added comes at the end. Otherwise Splunk would have to somehow keep a record of what the file looked like, do a diff and then grab only stuff that's been added, which would be very resource intensive and lead to all kinds of weird problems.
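For a sense of what the diff-based alternative would involve, here is a rough Python sketch using the standard library's `difflib`. The helper name `added_lines` is made up for this example, and nothing here reflects anything Splunk actually does. Note that it needs a full copy of the previous file contents for every monitored file, and it compares old and new in their entirety on every change.

```python
import difflib


def added_lines(old_lines, new_lines):
    """Return lines present in new_lines that the diff reports as added.

    Hypothetical helper: requires the complete previous snapshot (old_lines).
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" = brand-new lines; "replace" = old lines swapped for new ones
        if tag in ("insert", "replace"):
            added.extend(new_lines[j1:j2])
    return added


old = ["part A: 1", "part B: 1"]
new = ["part A: 1", "part A: 2", "part B: 1"]
print(added_lines(old, new))
```

Even in this toy form the ambiguity shows up quickly: if a file of identical lines such as `["ok", "ok"]` grows to `["ok", "ok", "ok"]`, a diff can say that one line was added but not which one, so there is no reliable way to tell where the event belongs.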

martin_mueller · ‎02-04-2015

That should by default lead to Splunk only reading in new lines from the end.

Do post the configuration you used to read this file.

LuiesCui · ‎02-04-2015

Sorry, the new data is inserted in the middle of the file, because the file contains several different sections of data. What's the difference between these situations?

martin_mueller · ‎02-04-2015

Are you appending to the end of the logfile or inserting data somewhere else?

LuiesCui · ‎02-04-2015

Appending to the end of the .log

How to configure Splunk to NOT repeatedly read and index old data?
