Splunk Search

Splunk conditional count/aggregation

bomran
Explorer

I have some CSV data about files imported in to Splunk. The data looks like this:

"\\domain\path\to\file\","<filename>","<fsize>","<ext>","<Last_access>","<last_write>","<creation_time>","<attributes>","<owner>"

I have converted all the date strings to epoch using:

| eval epoch_LastAccessTime=strptime(LastAccessTime, "%d/%m/%Y %H:%M:%S")
...
...

I want to get:

  • A percentage of files last accessed between 6 months and 3 years ago
  • A percentage of files last accessed 3 years or more ago.

This is the search query that I have tried before getting stuck:

index="<my_index>" sourcetype="<my_sourcetype>" 
| rex field=DirectoryName "\\\domain\.org\\\teams\\\(?<Team>[^\\\]*)" 
offset_field=_extracted_fields_bounds  
| eval epoch_LastAccessTime=strptime(LastAccessTime, "%d/%m/%Y 
%H:%M:%S") 
| eval _time=epoch_LastAccessTime
| timechart span=6mon count

I've tried using commands along the lines of:

| where epoch_LastAccessTime>=three_year_ago_from_now AND 
epoch_LastAccessTime<=six_months_ago_from_now

However, this excludes everything else (3y+)

I want the result to look something like:

TimeRange  Perc
6m-3y      60%
3y+        40%
1 Solution

Lowell
Super Champion

Your nearly on the right track. Instead of using a where to limit your results, use an eval to build a new categorical field. At that point you'll no longer need the timechart.

Something like:

| eval TimeRange=case(epoch_LastAccessTime>=three_year_ago_from_now, "3y+",
                      epoch_LastAccessTime>=six_months_ago_from_now, "6m-3y",
                      0=0, "less than 6m")
| stats count by TimeRange
| eventstats sum(count) as total_count
| eval pct=100*count/total_count

Note "0=0" is used to provide a "default" option, since it's always true.

View solution in original post

0 Karma

Lowell
Super Champion

Your nearly on the right track. Instead of using a where to limit your results, use an eval to build a new categorical field. At that point you'll no longer need the timechart.

Something like:

| eval TimeRange=case(epoch_LastAccessTime>=three_year_ago_from_now, "3y+",
                      epoch_LastAccessTime>=six_months_ago_from_now, "6m-3y",
                      0=0, "less than 6m")
| stats count by TimeRange
| eventstats sum(count) as total_count
| eval pct=100*count/total_count

Note "0=0" is used to provide a "default" option, since it's always true.

0 Karma

bomran
Explorer

Exactly what I was looking for, thanks for your help.

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...