Solved: What is the best way to add timestamps to a large ...

dskillman · ‎04-12-2010

I have a file with ~6M events that gets FTP'd to Splunk on a daily basis. Unfortunately, I don't have control of the output and the events have no timestamps. Using CURRENT_TIME breaks things, since all events show up with the same time and I have to search across an entire day at a time.

Any thoughts on how to get enough distinct timestamps so that I don't run into search limitations?

I was thinking of using an LWF (lightweight forwarder) to receive the FTP'd file and tweaking maxKBps in limits.conf so that CURRENT_TIME is spread across tens or hundreds of seconds. Thoughts?

gkanapathy · ‎04-12-2010

The easiest way would simply be to name the file with the date/timestamp in a way that datetime.xml can get the timestamp, assuming the events are all supposed to have the same timestamp. Then, Splunk should extract the date/time from the file name, and auto-increment the extracted time as it finds that it's getting too many repeats.
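As a rough sketch of the renaming step, something like the following could run on the receiving host before Splunk picks the file up. The function names and the `YYYY-MM-DD_HH-MM-SS` layout are my own illustration; whether your datetime.xml recognizes that exact layout in a file name is an assumption you would need to verify against your configuration.

```python
from datetime import datetime
from pathlib import Path


def stamped_name(path: Path, when: datetime) -> Path:
    """Build a new file name with a date/time stamp before the extension.

    The stamp layout is an assumed example, not a format confirmed
    to match datetime.xml.
    """
    stamp = when.strftime("%Y-%m-%d_%H-%M-%S")
    return path.with_name(f"{path.stem}_{stamp}{path.suffix}")


def rename_with_timestamp(path: Path, when: datetime) -> Path:
    """Rename the file on disk and return its new path."""
    target = stamped_name(path, when)
    path.rename(target)
    return target
```

For example, `rename_with_timestamp(Path("/data/events.log"), datetime.now())` would leave a file such as `/data/events_2010-04-12_03-00-00.log` in the monitored directory (the path here is hypothetical).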

Similarly, if you can manipulate the file, you could prepend a single timestamp at the top of the file and subsequent events lacking a timestamp should get that timestamp.
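If you go the prepend route, a minimal sketch might look like this. It writes the timestamp line to a temporary file, streams the original after it, and then swaps the two, so a ~6M-event file never has to fit in memory. The function name and the `YYYY-MM-DD HH:MM:SS` format are assumptions for illustration; use whatever format your Splunk timestamp recognition is set up to parse.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def prepend_timestamp(path: Path, when: datetime) -> None:
    """Insert a single timestamp line at the top of the file."""
    # Assumed format; adjust to what your timestamp extraction expects.
    header = when.strftime("%Y-%m-%d %H:%M:%S").encode("ascii") + b"\n"
    # Create the temp file next to the original so the final replace
    # is a rename on the same filesystem.
    with tempfile.NamedTemporaryFile("wb", dir=path.parent, delete=False) as tmp:
        tmp.write(header)
        with path.open("rb") as src:
            shutil.copyfileobj(src, tmp)  # streams in chunks
    Path(tmp.name).replace(path)
```

Run this before the file lands in the directory Splunk monitors, so Splunk never sees the version without the header line.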

If more than 100,000 events come in for the same host/source/sourcetype in sequence with the same second timestamp, Splunk will auto-increment timestamps by 1 second, specifically to avoid this issue, so either of these solutions should work.
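To get a feel for what that means for a ~6M-event file, here is a back-of-the-envelope calculation. It is a simplified model of the behavior described above (one new second for every 100,000 events), not Splunk's actual implementation, and the function name is made up for illustration.

```python
def modeled_distinct_seconds(event_count: int, per_second_limit: int = 100_000) -> int:
    """Estimate how many distinct one-second timestamps the events span.

    Simplified model: every block of `per_second_limit` events shares
    one second, and the next block moves to the following second.
    """
    return -(-event_count // per_second_limit)  # ceiling division


# Roughly 6M events per daily file, as in the question.
seconds_spanned = modeled_distinct_seconds(6_000_000)
```

Under this model the daily file spreads over about a minute of timestamps instead of a single second.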
