I want to ingest a very large file that has no usable timestamps. I want to set:
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
The problem is that thousands of rows get the same timestamp down to the millisecond. This makes searching extremely slow, because all the records are clumped together on one indexer.
Is there a way to force Splunk to break up the file and assign slightly varying timestamps on ingestion?
Hi @reed.kelly A couple of good suggestions were made, which I have converted to answers. If either of them helped, please accept it as the answer and upvote.
If the proposed solutions don't work, please update us with any other details to see if there are any other suggestions.
You could initially ingest the data into a temp index, then adjust the indexed _time field with something like this:
| streamstats count as time_modifier
| eval time_modifier = time_modifier * 5
| eval _time = _time + time_modifier
Since _time is already epoch seconds, adding the offset directly avoids the round trip through strftime/strptime (which also truncated _time to a string).
and output it to a new index.
A good idea, but make sure you don't change the sourcetype when you output with | collect, otherwise it will cost you 2x the volume against your licence.
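Putting the two steps together, a sketch of the full search might look like this (the index names are placeholders, and the 5-second spacing is just the multiplier from the example above; collect's default sourcetype of stash is what keeps the re-indexed data from counting against your licence):

```
index=temp_index
| streamstats count as time_modifier
| eval _time = _time + (time_modifier * 5)
| collect index=final_index
```

Once you have verified the events in final_index, the temp index can be cleaned out.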
If you wish to append slightly varying timestamps to the raw logs, I think it's best to do this pre-ingestion, e.g. via a Python script that loops through all the log lines and prepends a timestamp to each.
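The loop above could be sketched roughly like this. This is only an illustration, not a tested ingestion tool: the file names, the timestamp format, and the 5 ms spacing between lines are all assumptions you would adapt to your data.

```python
# Pre-ingestion sketch: give every line of a timestamp-less log file an
# artificial, strictly increasing timestamp so Splunk can spread the
# events out. The 5 ms step and the %Y-%m-%d %H:%M:%S.%f format are
# assumptions, not requirements.
from datetime import datetime, timedelta


def append_timestamps(lines, start=None, step_ms=5):
    """Yield each input line prefixed with a timestamp that advances
    by step_ms milliseconds per line."""
    ts = start or datetime.now()
    for line in lines:
        # %f is microseconds; trim the last three digits to get milliseconds.
        stamp = ts.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
        yield f"{stamp} {line.rstrip()}\n"
        ts += timedelta(milliseconds=step_ms)


# Usage: rewrite input.log into timestamped.log, line by line
# (hypothetical file names).
# with open("input.log") as src, open("timestamped.log", "w") as dst:
#     dst.writelines(append_timestamps(src))
```

With DATETIME_CONFIG left at its default, Splunk would then pick the timestamp off the front of each event instead of stamping the whole file with the ingestion time.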