Good morning,
I need to monitor a very long file containing data from 2021 onwards.
I'm only interested in data from last week onwards.
Is there a way to tell the agent where to start analyzing the data?
As others mentioned, Splunk cannot start reading a single large file from a specific line or position. It always reads files sequentially from the beginning unless it has indexed that file before.
However, I would suggest preprocessing the file with a script or a tool like awk (filtering for the dates you want) and writing the result to a new file, which Splunk can then monitor instead.
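As a rough illustration of that preprocessing step, here is a minimal Python sketch rather than awk. It assumes every event line starts with a timestamp in the form `YYYY-MM-DD HH:MM:SS`; that format, the function names, and the file paths are all hypothetical, so adjust them to match your data.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator

# Assumption: each event starts with a timestamp like "2025-01-31 08:15:00".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19  # len("YYYY-MM-DD HH:MM:SS")


def filter_recent(lines: Iterable[str], cutoff: datetime) -> Iterator[str]:
    """Yield lines whose leading timestamp is at or after cutoff.

    Lines without a parseable timestamp (e.g. multi-line continuations)
    follow the decision made for the most recent timestamped line.
    """
    keep = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            pass  # continuation line: reuse the previous decision
        else:
            keep = stamp >= cutoff
        if keep:
            yield line


def write_recent(source_path: str, target_path: str, days: int = 7) -> int:
    """Copy events from the last `days` days to target_path; return lines written."""
    cutoff = datetime.now() - timedelta(days=days)
    written = 0
    with open(source_path, encoding="utf-8", errors="replace") as src, \
         open(target_path, "w", encoding="utf-8") as dst:
        for line in filter_recent(src, cutoff):
            dst.write(line)
            written += 1
    return written
```

For example, `write_recent("/path/to/big.log", "/path/to/recent.log")` would write only the last seven days of events to the new file (both paths are placeholders). The file is streamed line by line, so its overall size should not be a memory concern.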
Regards,
Prewin
When Splunk monitors a file, it monitors the entire file. There is no mechanism for starting somewhere in the middle.
This is not to be confused with the ignoreOlderThan setting, which tells Splunk to skip an entire file whose modification time is too old. It does not filter out old events within a file.
Yup. There is a tool (btprobe) that lets you inspect, to some degree, the database Splunk keeps to track the state of input files (the fishbucket), and possibly clean some entries so that you can re-ingest those files (or other files with the same "header hash"). Beyond that, though, it doesn't let you manipulate that database.