Splunk Search

Including Indextime in Table

dbastidas
New Member

I am a fairly new Splunk user. I have 5 different sourcetypes. Each sourcetype represents a unique txt file that is generated every half hour. Each of the five txt files is written to a subdirectory of D:\testing\, e.g., D:\testing\subdir1, D:\testing\subdir2, etc.

I need to accomplish 4 goals:

  1. Search only the last log file generated for each sourcetype (the -30m@m timeframe).
  2. Count the number of Lines in each file (subtracting 1 line from each).
  3. Identify the Indextime for each of the 5 log files.
  4. Display "Sourcetype", "Count", and "Indextime" in one table (sorted by count), for a total of 5 rows and 3 columns of data.

Search #1 - Displays Sourcetype and Count in a table with no problems.

earliest=-30m@m | search source="D:\\testing\\*" | stats sum(linecount) as "linecount" by sourcetype | eval Count=linecount-1 | sort 0 - "Count" | table "sourcetype" "Count"

Search #2 - Displays Sourcetype and Indextime in a table with no problems.

earliest=-30m@m | search source="D:\\testing\\*" | eval "Indextime"=strftime(_indextime,"%+")| table "sourcetype" "Indextime"

Search #3 - When I try to combine both searches into one, I get results similar to Search #1 but with no data in the Indextime column.

earliest=-30m@m | search source="D:\\testing\\*" | stats sum(linecount) as "linecount" by sourcetype | eval Count=linecount-1 | eval "Indextime"=strftime(_indextime,"%+") | sort 0 "Count" | table "sourcetype" "Count" "Indextime"

I've been struggling with this for a couple of days and would appreciate it if someone could help me come up with a solution that I can try.

Note that I have no choice but to use Indextime because there are no timestamps in these txt files.

0 Karma
1 Solution

lguinn2
Legend

First, you do have a timestamp. Splunk always creates a timestamp. If there is no timestamp in the events, Splunk will use the file mod time as the timestamp. If there is no file mod time (for example in a scripted input), Splunk will use the index time.
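
For example, a quick way to see this is to compare the parsed timestamp with the index time on a few events (a throwaway check with made-up field names, not part of the final search):

source="D:\testing\*" earliest=-30m@m | eval EventTime=strftime(_time,"%F %T"), IndexTime=strftime(_indextime,"%F %T") | table source EventTime IndexTime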

Second, I think your searches are more complicated than they need to be. Try these instead:

Single search:

source="D:\testing\*" earliest=-30m@m | stats count as Count by sourcetype | eval Count=Count-1 

Combined search:

source="D:\testing\*" earliest=-30m@m 
| stats count as Count latest(_time) as LatestTime by sourcetype source
| sort -LatestTime
| dedup sourcetype
| eval Count=Count-1 

The second search calculates the event count and timestamp for every file (assuming there will be multiple files per sourcetype). It then sorts the table with the most recent sources first. dedup keeps only the first (therefore most recent) entry for each sourcetype.

BTW, the count function in the stats command will work great if your text file has one line per event, and it is very efficient. However, if you have multi-line events, you can replace "count as Count" with "sum(linecount) as Count" in order to count actual lines instead of events.
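
For example, here is an untested sketch of the combined search that also shows an Indextime column for your table, copying _indextime into an ordinary field (itime is just a scratch name) so it survives the stats:

source="D:\testing\*" earliest=-30m@m 
| eval itime=_indextime
| stats sum(linecount) as Count latest(itime) as itime by sourcetype source
| sort -itime
| dedup sourcetype
| eval Count=Count-1
| eval Indextime=strftime(itime,"%+")
| table sourcetype Count Indextime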

lguinn2
Legend

You are not getting the latest count, you are getting the largest count. Look at your sort command. If that's what you want, then okay. But to test it, run the search with and without the dedup in it. You can see that the sort will put the largest count first, and dedup keeps the first result and discards the rest for the same sourcetype...
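
For example, a sketch (untested) that keeps the most recent file per sourcetype by grouping on source as well and sorting on the timestamp before the dedup:

earliest=-30m@m source="D:\testing\*"
| stats sum(linecount) as Count latest(_time) as LatestTime by sourcetype source
| sort -LatestTime
| dedup sourcetype
| eval Count=Count-1
| eval Timestamp=strftime(LatestTime,"%D %H:%M %Z")
| table sourcetype Count Timestamp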

0 Karma

dbastidas
New Member

I was looking to use linecount, since one txt file = one event. You led me in the right direction, thanks! Here is the search I used that accomplishes all 4 goals.

earliest=-30m@m | search source="D:\testing\*"
| stats sum(linecount) as Count latest(_time) as LatestTime by sourcetype
| sort -Count
| dedup sourcetype
| eval Count=Count-1
| eval Timestamp=strftime(LatestTime,"%D %H:%M %Z")
| table sourcetype Count Timestamp

0 Karma