Splunk Search

Case statement tiered matching omitting some results

Path Finder

Hello fellow Splunkers,

Pretty new to using case statements in Splunk and I've run into an odd problem that I have no clue how to resolve. I'm trying to get counts based on departments. In some cases we'd like to pull numbers for specific sub-departments within a broader department; ie; mathematics and mathematics - research and mathematics - analysis - etc...

What I noticed is that whenever I try to get a total count for items that have child departments those counts get omitted from the parent department's total count.

ie: let's say I have several departments, some will have sub-departments and others won't. From a numbers standpoint, I only need specific one's for the sub-departments. At the same time, I need to get numbers for parent and all child-depts that fall under that parent.

mathematics
mathematics - research
mathematics - research adv
mathematics - analysis
mathematics - analysis tier2
mathematics - analysis tier3
mathematics - analysis tier4
english
science

|searchHere
|eval deptSummary = case(  
dept LIKE "%mathematics - analysis%", "Mth_A",  
dept LIKE "%mathematics - analysis tier%", "Mth_A_t", 
dept LIKE "%english%", "Eng",
dept LIKE "%mathematics%", "Mth",
dept LIKE "%science%, ""Scn")
|stats count(dept) by deptSummary

The problem I've run into is that the parents counts don't include numbers for any that are captured by the other LIKE statements.

So if I have:
mathematics - 10
mathematics - research - 1
mathematics - research adv - 5
mathematics - analysis - 5
mathematics - analysis tier1 - 10
mathematics - analysis tier2 - 1
mathematics - analysis tier3 - 1

dept LIKE "%mathematics%", "Mth", --->would come out to a total of 33
dept LIKE "%mathematics - analysis%", "Mth_A", --> would come out to 17
dept LIKE "%mathematics - analysis tier%", "Mth_A_t", --> would come out to 12

What's happening is that counts captured by the LIKE statements with the sub-departments aren't included in the parent level LIKE statement. So if I cant to see the total for the parent department I'm left with just the events that didn't match any of the other LIKE statements. The counts for the departments which have no sub-departments are accurate, so it seems to just be a problem with the tiered matching.

Any suggestions how to go about this?

Tags (1)
0 Karma
1 Solution

Splunk Employee
Splunk Employee

The following shows basic construct, you can optionally change row values if needed
Query -

index=455116 
| rex field=department "(?<mth>.*mathematics.*)" | eval mth_count = if(isnotnull(mth),count,0)
| rex field=department "(?<mth_a>.*mathematics - analysis.*)" | eval mth_a_count = if(isnotnull(mth_a),count,0)
| rex field=department "(?<mth_a_t>.*mathematics - analysis tier.*)" | eval mth_a_t_count = if(isnotnull(mth_a_t),count,0)
| rex field=department "(?<scn>.*science.*)" | eval scn_count = if(isnotnull(scn),count,0)
| rex field=department "(?<eng>.*english.*)" | eval eng_count = if(isnotnull(eng),count,0)
| table department, mth, mth_count, mth_a mth_a_count, mth_a_t, mth_a_t_count, scn, scn_count, eng, eng_count, count
| stats sum(mth_count), sum(mth_a_count), sum(mth_a_t_count), sum(scn_count), sum(eng_count)
| eval id=1
| untable id Metric Value
| fields - id

Result -
alt text

View solution in original post

0 Karma

Splunk Employee
Splunk Employee

The following shows basic construct, you can optionally change row values if needed
Query -

index=455116 
| rex field=department "(?<mth>.*mathematics.*)" | eval mth_count = if(isnotnull(mth),count,0)
| rex field=department "(?<mth_a>.*mathematics - analysis.*)" | eval mth_a_count = if(isnotnull(mth_a),count,0)
| rex field=department "(?<mth_a_t>.*mathematics - analysis tier.*)" | eval mth_a_t_count = if(isnotnull(mth_a_t),count,0)
| rex field=department "(?<scn>.*science.*)" | eval scn_count = if(isnotnull(scn),count,0)
| rex field=department "(?<eng>.*english.*)" | eval eng_count = if(isnotnull(eng),count,0)
| table department, mth, mth_count, mth_a mth_a_count, mth_a_t, mth_a_t_count, scn, scn_count, eng, eng_count, count
| stats sum(mth_count), sum(mth_a_count), sum(mth_a_t_count), sum(scn_count), sum(eng_count)
| eval id=1
| untable id Metric Value
| fields - id

Result -
alt text

View solution in original post

0 Karma

Path Finder

You da man, that did the trick. Thanks!

0 Karma

Splunk Employee
Splunk Employee

So assuming only 3 levels for the departments and there is a consistent way to name departments. The following can be used.

CSV data -

department,count
mathematics,10
mathematics - research,1
mathematics - research adv,5
mathematics - analysis,5
mathematics - analysis tier1,10
mathematics - analysis tier2,1
mathematics - analysis tier3,1
english,101 
science,42

Splunk Query -

index=455116
| rex field=department "(?<parent>[^\s]*)(\s-\s)*(?<child>[^\s]*)(\s)*(?<grandchild>.*)" 
| stats sum(count) by parent
| rename parent AS department
| append 
    [ search index=455116 
    | rex field=department "(?<parent_child>[^\s]*\s-\s[^\s]*)(\s)*.*" 
    | stats sum(count) by parent_child
    | rename parent_child AS department 
    ] 
| append 
    [ search index=455116 
    | rex field=department "(?<parent_child_child>[^\s]+\s-\s[^\s]+\s.+)" 
    | stats sum(count) by parent_child_child
    | rename parent_child_child AS department ]

Results -
alt text

This solution uses regex to determine the parent, parent_child, and parent_child_child for each of the departments.
Then totals the count up for each parent, parent_child, and parent_child_child.
Then combines the totals together.

The following query shows how the departments are put into their different levels, which is the basis of the previous query.

index=455116 
| rex field=department "(?<parent>[^\s]*)(\s-\s)*(?<child>[^\s]*)(\s)*(?<grandchild>.*)" 
| table department, parent, child, grandchild, count 
0 Karma

Path Finder

Thanks for the reply, Unfortunately the naming convention is not consistent. In some cases some department are bundled in within site/codes which is why I went with the LIKE statements. I also need to tie a pre-defined name to each of the matches, which is why I'm trying to use the case statement.

0 Karma

Splunk Employee
Splunk Employee

If you only ever have 3 levels another option might be to map the department to parent, child, grandchild using a lookup table. Then something similar to above could be reused.

0 Karma

Splunk Employee
Splunk Employee

Is there a maximum number of levels within the department? Will it always be 3? Is there also a consistent way to name the sub departments? For example

  • parent
  • parent - child ( dash to separate )
  • parent - child grandchild ( dash and spaces to separate )

If so you could use the eval statement to calculate fields for parent, child, grandchild for each event then do some totals on those new calculated fields.

0 Karma

Path Finder

Nope, in some cases since these are legacy, various site and division codes are in front and behind some of the values I'm trying to match with no common structure

0 Karma