Hello,
I need to prepare statistics of some events occurrences and this is my data in splunk:
07-03-18;11:55:14;id:2222222;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;<c> f:Pool-C,l:959;<d> f:Pool-C,l:454;<e> f:Pool-D,l:959;Stat a:0
07-03-18;11:55:14;id:3333333;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;Stat a:10
07-03-18;11:56:14;id:2222222;Stat a:0
07-03-18;11:56:14;id:2222222;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;Stat a:10
07-03-18;11:56:14;id:3333333;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;Stat a:10
Now I'd like to display how many times f:Pool-X (where Pool-X could be any string) with the same value of "l:" (for example l:2066) occurs for each node represented by "id".
As you can see for given node there may be few f:XXXX values (in different order) or non in one entry. There is a lot of entries for the same nodes.
This is what I'd like to achieve:
So far I have no idea how to do this. Maybe some of you could help with this not trivial issue.
Give this a try (assuming no field extraction is done, if it has been, you can ignore the rex commands)
your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| stats list(count) as count list(Pool) as Pool list(l) as l by id
Give this a try (assuming no field extraction is done, if it has been, you can ignore the rex commands)
your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| stats list(count) as count list(Pool) as Pool list(l) as l by id
Hi somesoni2, your query is perfect 🙂 However now I see that this is not exactly what I need. Could you please help again and reorganize this query to group data by the same Pool and l then calculate how many ids belongs to this set. Then display ids with occurrences counter. Example:
Pool-AA 2066 25 3333333(20), 2222222(5)
245 10 3333333(4), 4444444(5), 22222222(1)
where:
- Pool-AA is Pool
- 2066 and 245 are l
- 25 and 10 are total counter of pair Pool and l
- 3333333(20), 2222222(5) - list of ids with number of occurrences
This query is pretty close to the result I'd like to achieve:
stats count by Pool l id
| stats list(count) as count list(id) as ID list(l) as l by Pool
I'm totally newbie in splunk so any help or advice are very welcome.
Many thanks in advance. I really appreciate your help,
Try this (first 4 lines are same as previous version, rest are the code to format the output per your need)
your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| eval id=id."(".count.")"
| stats sum(count) as count list(id) as id by Pool l delim="," | nomv id
| stats list(l) as l list(count) as count list(id) as id by Pool
You can learn more about stats command and available options by using this Splunk documentation page: http://docs.splunk.com/Documentation/Splunk/7.0.2/SearchReference/Stats. You can see other command on left side navigation tree. To understand the commands/steps better, add one part at a time to see how it's changing the output.
Do you know if splunk has some limitation for number of displayed data? I see that counter for pair Pool:l shows for example 259 ids but there is 100 ids in id column. Is it expected result? For sure, I took into consideration values from brackets.
The count columns gives total count (e.g. 25 in your example dat) and id column will have all unique ids (e.g 2 in your example data). You're saying you have more unique ids and they are not shown?
Yes, I have more unique ids than they are shown. However, I think that this make sense because how to display thousands of data? I didn't mention that I use splunk web search tool not API.
The multivalued command do have limitation of memory usage (see limits.conf specification to more details) which may cause truncation. However, showing 100's of entries in single result row may not be readable anyways. See if you're ok with format of the result with this query.
your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| stats count by id Pool l
| eval id=id."(".count.")"
| stats sum(count) as count list(id) as id by Pool l delim="," | nomv id
Yes, this query is also nice. When you add extra space to delim (e.g delim=", ") then ids are displayed as multi-lines what is really nice. Thanks again for your help.
Unfortunately count is not working as expected. For the following input data it shows that pair Pool-A,l:2066 for id:2222222 occurs only once:
07-03-18;11:55:14;id:2222222;<a> f:Pool-A,l:2066
07-03-18;11:55:14;id:2222222;<a> f:Pool-A,l:2066;<b> f:Pool-B,l:2066;<c> f:Pool-C,l:959;<d> f:Pool-C,l:454;<e> f:Pool-D,l:959;Stat a:0
It seems that only entries with one pair of f:Pool-A,l:2066 is counted. When more pairs than not counted at all.
My previous thoughts about an issue with count is not correct. It is something wrong with this part:
rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
running this query
your base search
| rex "id:(?<id>[^\;]+)"
| rex max_match=0 "f\:(?<Pool>[^,]+),(?<l>[^\;]+)"
| table Pool
results in table with few Pools values and empty rows, but number of Pool-XX is less than expected.
I was able to manage my last issue. The solution is to keep Pool and l as one value:
rex max_match=0 "f\:(?<Pool_l>[^,]+,l:\d+)"
Wow, you are awesome. Your query works like a charm.
Many thanks.