Archive
Highlighted

How to Group data by a few factors?

Engager

Hello,

I need to prepare statistics of some events occurrences and this is my data in splunk:

``````07-03-18;11:55:14;id:2222222;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;<c> f:Pool-C,l:959;<d> f:Pool-C,l:454;<e> f:Pool-D,l:959;Stat a:0
07-03-18;11:55:14;id:3333333;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;Stat a:10
07-03-18;11:56:14;id:2222222;Stat a:0
07-03-18;11:56:14;id:2222222;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;Stat a:10
07-03-18;11:56:14;id:3333333;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;Stat a:10
``````

Now I'd like to display how many times f:Pool-X (where Pool-X could be any string) with the same value of "l:" (for example l:2066) occurs for each node represented by "id".
As you can see for given node there may be few f:XXXX values (in different order) or non in one entry. There is a lot of entries for the same nodes.

This is what I'd like to achieve:

So far I have no idea how to do this. Maybe some of you could help with this not trivial issue.

Tags (3)
1 Solution
Highlighted

Re: How to Group data by a few factors?

SplunkTrust

Give this a try (assuming no field extraction is done, if it has been, you can ignore the rex commands)

``````your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| stats list(count) as count list(Pool) as Pool list(l) as l by id
``````
Highlighted

Re: How to Group data by a few factors?

Engager

Hi somesoni2, your query is perfect 🙂 However now I see that this is not exactly what I need. Could you please help again and reorganize this query to group data by the same Pool and l then calculate how many ids belongs to this set. Then display ids with occurrences counter. Example:

``````Pool-AA       2066     25       3333333(20), 2222222(5)
245     10       3333333(4), 4444444(5), 22222222(1)
``````

where:
- Pool-AA is Pool
- 2066 and 245 are l
- 25 and 10 are total counter of pair Pool and l
- 3333333(20), 2222222(5) - list of ids with number of occurrences

This query is pretty close to the result I'd like to achieve:

``````stats count by Pool l id
| stats list(count) as count list(id) as ID list(l) as l by Pool
``````

I'm totally newbie in splunk so any help or advice are very welcome.

Highlighted

Re: How to Group data by a few factors?

SplunkTrust

Try this (first 4 lines are same as previous version, rest are the code to format the output per your need)

``````your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| eval id=id."(".count.")"
| stats sum(count) as count list(id) as id by Pool l delim="," | nomv id
| stats list(l) as l list(count) as count list(id) as id by Pool
``````

You can learn more about stats command and available options by using this Splunk documentation page: http://docs.splunk.com/Documentation/Splunk/7.0.2/SearchReference/Stats. You can see other command on left side navigation tree. To understand the commands/steps better, add one part at a time to see how it's changing the output.

Highlighted

Re: How to Group data by a few factors?

Engager

Wow, you are awesome. Your query works like a charm.
Many thanks.

Highlighted

Re: How to Group data by a few factors?

Engager

Do you know if splunk has some limitation for number of displayed data? I see that counter for pair Pool:l shows for example 259 ids but there is 100 ids in id column. Is it expected result? For sure, I took into consideration values from brackets.

Highlighted

Re: How to Group data by a few factors?

SplunkTrust

The count columns gives total count (e.g. 25 in your example dat) and id column will have all unique ids (e.g 2 in your example data). You're saying you have more unique ids and they are not shown?

Highlighted

Re: How to Group data by a few factors?

Engager

Yes, I have more unique ids than they are shown. However, I think that this make sense because how to display thousands of data? I didn't mention that I use splunk web search tool not API.

Highlighted

Re: How to Group data by a few factors?

SplunkTrust

The multivalued command do have limitation of memory usage (see limits.conf specification to more details) which may cause truncation. However, showing 100's of entries in single result row may not be readable anyways. See if you're ok with format of the result with this query.

``````your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| eval id=id."(".count.")"
| stats sum(count) as count list(id) as id by Pool l delim="," | nomv id
``````
Highlighted

Re: How to Group data by a few factors?

Engager

Yes, this query is also nice. When you add extra space to delim (e.g delim=", ") then ids are displayed as multi-lines what is really nice. Thanks again for your help.