Splunk Enterprise

How do I calculate the time taken for each log file in Splunk?

Path Finder

I have 100+ log files. None of the log files contains a timestamp, and each line is a single event in Splunk.
My requirement is to calculate, for each log file, the start time, end time, and time taken, based on the time the data was uploaded to Splunk.
Please suggest how to achieve this.

My output should be like below.

FileName     StartTime       EndTime         TimeTaken
deploy1.log  5/6/2020 07:30  5/6/2020 07:35  5 mins
deploy2.log  5/6/2020 08:23  5/6/2020 08:35  12 mins
deploy3.log  5/6/2020 09:12  5/6/2020 09:27  15 mins

1 Solution

Splunk Employee

@georgear7 If you need a specific source, change the where clause in the first line (the tstats command), as in the following example.

 

where index=test source=*restart.log

 

You can also filter by index, sourcetype, host, and so on, or use the time picker to restrict the time range.

 

where index=test sourcetype=access_combined_wcookie source=*access.log

 

Alternatively, you can filter by appending the search command to the end of the SPL. However, this method may not be efficient in a large-scale environment: tstats is fast because it works against indexed fields, so filter with the where clause and a time range whenever possible.

 

| search FileName=*restart.log
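For example, appended to the _indextime query used elsewhere in this thread (a sketch; change index=* to your own index), the filter sits at the very end:

```spl
| tstats min(_indextime) as StartTime max(_indextime) as EndTime where index=* by source
| rename source as FileName
| eval TimeTaken=round((EndTime - StartTime)/60,2)
| search FileName=*restart.log
```

Because the filter runs after tstats has already aggregated every source, prefer the where-clause form above when the data volume is large.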

 

Reference documents:

https://docs.splunk.com/Documentation/Splunk/8.0.4/SearchReference/Tstats


Splunk Employee

Here's an example using index time (_indextime). Please change index=* to your specific index.

 

| tstats min(_indextime) as StartTime max(_indextime) as EndTime where index=* by source
| rename source as FileName
| eval TimeTaken= round((EndTime - StartTime)/60,2)
| eval StartTime=strftime(StartTime, "%d/%m/%Y %H:%M:%S"),EndTime=strftime(EndTime, "%d/%m/%Y %H:%M:%S")
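As a quick check of the arithmetic, here is a standalone sketch using makeresults with hypothetical epoch values for 06 May 2020 07:30:00 and 07:35:00 UTC; the difference of 300 seconds divided by 60 yields TimeTaken=5.00 minutes:

```spl
| makeresults
| eval StartTime=1588750200, EndTime=1588750500
| eval TimeTaken=round((EndTime - StartTime)/60,2)
| eval StartTime=strftime(StartTime, "%d/%m/%Y %H:%M:%S"), EndTime=strftime(EndTime, "%d/%m/%Y %H:%M:%S")
```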

 

 

Path Finder

@skakehi_splunk Thanks for your reply. This gives output for all the sources. What if I need time values for only certain sources, e.g. source=*restart.log?


SplunkTrust

Try this:

| tstats latest(_time) as latest earliest(_time) as earliest where source=*restart.log by source 
| eval diff=tostring(latest-earliest, "duration")
| convert ctime(latest) ctime(earliest)

 


Path Finder

@skakehi_splunk I tried this already. It gives me the error below.

Error in 'TsidxStats': Missing field list after 'GROUPBY' or 'by' keyword

Query used:

| tstats min(_indextime) as StartTime max(_indextime) as EndTime where index=* by source=*.restart.log
| rename source as FileName
| eval TimeTaken= round((EndTime - StartTime)/60,2)
| eval StartTime=strftime(StartTime, "%d/%m/%Y %H:%M:%S"),EndTime=strftime(EndTime, "%d/%m/%Y %H:%M:%S")

The last method seems to be working for me. Thanks a bunch for your help.

| search FileName=*restart.log

Splunk Employee

@georgear7 FYI, the tstats command does not support wildcard characters in field values used in aggregate functions or BY clauses.

Here's a sample that moves the wildcard into the WHERE clause.

 

| tstats min(_indextime) as StartTime max(_indextime) as EndTime where index=* source=*.restart.log by source
| rename source as FileName
| eval TimeTaken= round((EndTime - StartTime)/60,2)
| eval StartTime=strftime(StartTime, "%d/%m/%Y %H:%M:%S"),EndTime=strftime(EndTime, "%d/%m/%Y %H:%M:%S")

 

Path Finder

@skakehi_splunk I got one new requirement from this. Now I need the average time of the logs per day, and I need it as a line chart.

I tried the query below:

| stats avg(TimeTaken) by date_mday

But I didn't get any result. Please suggest!


Splunk Employee

@georgear7 First of all, you need an _time field to plot a line chart; the results must be time-series data.

In this case,

1) Assign a new _time like this. Do it before the fields are overwritten by strftime.

| eval _time=StartTime

Note: if you search events directly (without tstats and _indextime), you don't need to set _time yourself; _time is included with each event.
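For example, the same per-file timing can be computed from a plain event search (a sketch, assuming an index named test; _indextime is copied into a named field first because internal fields are not passed to stats directly):

```spl
index=test source=*restart.log
| eval itime=_indextime
| stats min(itime) as StartTime max(itime) as EndTime by source
| rename source as FileName
| eval TimeTaken=round((EndTime - StartTime)/60,2)
```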

2) Using the timechart command with the avg() aggregation function is the simplest way to plot a line chart.
You can do the same thing with the stats command, but don't forget _time.

Modify span=1d to change the aggregation time span.
If you don't need the grouping, remove the BY clause.

| timechart span=1d avg(TimeTaken) by FileName

Here is a sample.

| tstats min(_indextime) as StartTime max(_indextime) as EndTime where index=* source=*.restart.log by source
| rename source as FileName
| eval _time=StartTime
| eval TimeTaken= round((EndTime - StartTime)/60,2)
| eval StartTime=strftime(StartTime, "%d/%m/%Y %H:%M:%S"),EndTime=strftime(EndTime, "%d/%m/%Y %H:%M:%S")
| timechart span=1d avg(TimeTaken) by FileName

Path Finder

@skakehi_splunk This worked perfectly, thank you. But I am not clear on the use of the line below in the query. Would you mind explaining how it is useful?

| eval _time=StartTime

 


Path Finder

@skakehi_splunk Now it's working fine. Thanks 🙂


SplunkTrust

Perhaps use _indextime after the file is indexed?

This would of course assume no delays in getting the data from the forwarder to the indexer in Splunk.
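A minimal sketch of that approach, assuming an index named test (it mirrors the _indextime query earlier in this thread):

```spl
| tstats min(_indextime) as StartTime max(_indextime) as EndTime where index=test by source
| rename source as FileName
| eval TimeTakenMins=round((EndTime - StartTime)/60,2)
```

If the forwarder-to-indexer delay is significant, the _indextime spread will overstate the actual duration covered by the file.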