Splunk Search

How to count XML filenames by prefix, count filenames without a prefix, and add the column totals for a custom time range?


I have a field "filename" which is an xml going through a component. I want to count the number of them with a specific prefix so that I get a table like the following.

FOO   |   23
BAR   |      0
BAZ    | 104
other |  340
Total  |  467

given that FOO is a prefix in an xml called FOO_239cjase93912ds.xml, BAR and BAZ, etc.

SO far I have the following rex

| rex field=filename max_match=1 "(?(FOO|BAR|BAZ)*.xml)" | stats count by prefix | addcoltotals

Problems and Questions:
1. I don't know how to count "other" (filename's that don't have any of the given prefixes)
2. I want to have a custom range of "last business day" that means on Monday I want to count all files from Saturday 00:00 to Monday Now() and up to last midnight on weekdays. (e.g. on Thursday apply Today())

Thank you


I'll deal with #1 first. Try this:

  base search | rex field=filename "^(?<prefix>FOO|BAR|BAZ).*\.xml$"
| eval prefix = coalesce(prefix, "Other") | stats count by prefix | addcoltotals labelfield=prefix

As for #2, I'm not quite sure if I understand your Saturday-Monday example correctly. I'll draw up a table:

Current day  earliest    latest
Monday       Sat 0000    Mon now()
Tuesday      Tue 0000    Tue now()
Friday       Fri 0000    Fri now()
Saturday     Sat 0000    Sat now()
Sunday       Sat 0000    Sun now()

Is that what you need? In English, should the time range "Your Kind of Business Day" be the same as "Today" except that Saturday, Sunday, and Monday are considered as one loooong day?


Time for number two... (hah!)

I don't see a way to build this into a stored time range that you can select from the time range picker... maybe someone else does of course.

However, you can specify that within the search string like this:

earliest=`mybusinessday_earliest` latest=now index=foo bar=baz | stats ...

Where mybusinessday_earliest is an eval-based macro with this body:

if((tonumber(strftime(time(), "%u"))%6)==1, "@w6", "@d")

This grabs today's day of the week (Mo=1, ..., Su=7) and returns @w6 ("skip to the start of last Saturday") for Monday and Sunday or @d for Tuesday through Saturday.

0 Karma


Concerning #1: Wrap a case() around the coalesce() like this to filter out non-XML files:

... | eval prefix = case(match(filename, "\.xml$"), coalesce(prefix, "Other")) | ...

Note, all those regular expressions are case sensitive. If you might expect mixed or upper case filenames you may want to prefix the regexes with the (?i) flag.

I'll deal with #2 later.

0 Karma


Your first solution totals up all the prefixes into "Other", but by "Other" I meant all the xml's that did not fit into any of the prefixes. so that I would need...

FOO      7
BAR       0
BAZ       1
Other    2
Total   10

And Yes, basically Monday is one long day. Saturday and Sunday and Monday "earliest" is Saturday 00:00 and latest is always now(), no matter which day.

0 Karma