I want to ingest a very large file that has no usable timestamps. I want to set:
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
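For context, a props.conf stanza with those settings might look like the sketch below (the sourcetype name and LINE_BREAKER pattern are placeholders, not from the original post):

```ini
# Hypothetical sourcetype for a file with no usable timestamps.
[my_custom_sourcetype]
# Treat every line as its own event rather than merging lines.
SHOULD_LINEMERGE = false
# Break events on newlines (default, shown here for clarity).
LINE_BREAKER = ([\r\n]+)
# Stamp each event with the indexer's current time at ingestion.
DATETIME_CONFIG = CURRENT
```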
The problem is that thousands of rows all get the same timestamp, down to the millisecond. This makes searching extremely slow, because all the records are clumped together on one indexer.
Is there a way to force Splunk to break up the file and assign slightly varying timestamps on ingestion?
Hi @reed.kelly A couple of good suggestions were made, which I have converted to answers. If either of them helped, please accept the answer and upvote.
If the proposed solutions don't work, please update us with any other details to see if there are any other suggestions.
@reed.kelly
| head 20
Could you provide a sample of your logs, like the output above?
You could initially upload the data to a temporary index, then modify the indexed _time field using something like this:
index="index_name"
| streamstats count as time_modifier
| eval time_modifier = time_modifier * 5
| eval _time = _time + time_modifier
and output it to a new index. (Since _time is epoch seconds, adding the offset directly is simpler and safer than round-tripping through strftime/strptime, which would also truncate sub-second precision and leave _time as a string.)
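Putting the pieces together, the full pipeline including the output step might look like the sketch below (the index names are placeholders, and `collect` is one way to write results into another index):

```
index="temp_index"
| streamstats count as time_modifier
| eval _time = _time + time_modifier
| collect index=final_index
```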
converted to an answer
thanks @nickhillscpl
A good idea, but make sure you don't change the sourcetype when you output with | collect,
otherwise it will cost you 2x the volume against your licence.
^ probably the fastest way to do it
If you wish to append slightly varying timestamps to the raw logs, it's probably best to do this pre-ingestion, e.g. via a Python script that loops through all the lines and prepends a timestamp to each one.
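A minimal sketch of that pre-ingestion pass, assuming a 1 ms step between events and an ISO-8601 prefix (both arbitrary choices — adjust to whatever your timestamp extraction expects):

```python
# Prepend a strictly increasing timestamp to each raw line so that no two
# events share the same time once ingested. The start time, 1 ms step, and
# ISO-8601 format are assumptions, not part of the original question.
from datetime import datetime, timedelta


def stamp_lines(lines, start=None, step_ms=1):
    """Yield each line prefixed with a timestamp that advances by step_ms."""
    t = start if start is not None else datetime.now()
    for line in lines:
        yield f"{t.isoformat(timespec='milliseconds')} {line}"
        t += timedelta(milliseconds=step_ms)
```

You would stream the large file through this once, write the stamped lines to a new file, and point Splunk at that file with a sourcetype whose TIME_PREFIX/TIME_FORMAT match the prefix you chose.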
converted to an answer