I want to ingest a very large file that has no usable timestamps. I want to set:
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
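For context, a props.conf stanza with those settings might look like the sketch below (the sourcetype name and LINE_BREAKER pattern are placeholders, not from the original post):

```ini
# Hypothetical sourcetype for a file with no usable timestamps.
[my_custom_sourcetype]
# Treat every line as its own event rather than merging lines.
SHOULD_LINEMERGE = false
# Break events on newlines (default, shown here for clarity).
LINE_BREAKER = ([\r\n]+)
# Stamp each event with the indexer's current time at ingestion.
DATETIME_CONFIG = CURRENT
```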
The problem is that thousands of rows all get the same timestamp, down to the millisecond. This makes searching extremely slow, because all the records are clumped together on one indexer.
Is there a way to force Splunk to break up the file and assign slightly varying timestamps on ingestion?
Hi @reed.kelly A couple of good suggestions were made, which I have converted to answers. If either of them helped, please accept the answer and upvote.
If the proposed solutions don't work, please update us with any other details to see if there are any other suggestions.
@reed.kelly
| head 20
Could you provide a sample of your logs, like the output above?
You could initially upload the data to a temporary index, then modify the indexed _time field using something like this:
index="index_name"
| streamstats count as time_modifier
| eval time_modifier = time_modifier * 5
| eval _time = _time + time_modifier
and output it to a new index. (Since _time is epoch seconds, adding the offset directly is simpler and safer than round-tripping through strftime/strptime, which would also truncate sub-second precision and leave _time as a string.)
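Putting the pieces together, the full pipeline including the output step might look like the sketch below (the index names are placeholders, and `collect` is one way to write results into another index):

```
index="temp_index"
| streamstats count as time_modifier
| eval _time = _time + time_modifier
| collect index=final_index
```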
converted to an answer
thanks @nickhillscpl
A good idea, but make sure you don't change the sourcetype when you output with | collect,
otherwise it will cost you 2x the volume against your licence.
^ probably the fastest way to do it
If you wish to append slightly varying timestamps to the raw logs, it's probably best to do this pre-ingestion, e.g. via a Python script that loops through all the lines and prepends a timestamp to each one.
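A minimal sketch of that pre-ingestion pass, assuming a 1 ms step between events and an ISO-8601 prefix (both arbitrary choices — adjust to whatever your timestamp extraction expects):

```python
# Prepend a strictly increasing timestamp to each raw line so that no two
# events share the same time once ingested. The start time, 1 ms step, and
# ISO-8601 format are assumptions, not part of the original question.
from datetime import datetime, timedelta


def stamp_lines(lines, start=None, step_ms=1):
    """Yield each line prefixed with a timestamp that advances by step_ms."""
    t = start if start is not None else datetime.now()
    for line in lines:
        yield f"{t.isoformat(timespec='milliseconds')} {line}"
        t += timedelta(milliseconds=step_ms)
```

You would stream the large file through this once, write the stamped lines to a new file, and point Splunk at that file with a sourcetype whose TIME_PREFIX/TIME_FORMAT match the prefix you chose.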
converted to an answer