Splunk Search

How to use distinct count of multiple fields?

JoeHubner
Explorer

I have data that looks like the following:

Week               Employee        Project#

6/3/2022         A                      001

6/3/2022         A                      002

6/10/2022       A                      002

6/10/2022       B                      002

6/17/2022       A                      003

6/17/2022       B                      001

6/17/2022       B                      002

6/24/2022       B                      001

I would like to get a count of the total of the number of distinct weeks that employees appear in the data regardless of how many projects they have an entry for .  So, for the above the count should be 6 as below:

6/3/2022 > Employee A > Count=1

6/10/2022 > Employee A and B > Count=2

6/17/2022 > Employee A and B > Count=2

6/24/2022 > Employee B > Count=1

Is there some way I can use multiple fields in Distinct Count to accomplish this?

 

Labels (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Something like this?

| stats values(Employee) as EmployeeID dc(Employee) as EmployeeCount
| eval EmployeeID = mvjoin(EmployeeID, ", ")

View solution in original post

0 Karma

JoeHubner
Explorer

That just returned each of the values of Employee and not the total number of weeks the total employees appears in the data.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

add by Week to the stats command, i.e.

| stats values(Employee) as EmployeeID dc(Employee) as EmployeeCount by Week

Note that will not actually do by a 7 day period, but will split by the 'Week' field.

If you want to make it really split by a 7 day window in case your dates are not always 7 day periods, then you will have to parse time and use the stats + bin command to group by a real week.

 

0 Karma

JoeHubner
Explorer

Making progress on this.  My data now show the correct EmployeeCount for each week.  How can get the total employee count over each month?  That would be the final number that I would need.  Thanks for the help so far.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

The solution to getting the appropriate time spans is to use the bin command along with stats by _time, 

| eval _time=strptime(Week, "%m/%d/%Y")
| bin _time span=1mon
| stats values(Employee) as EmployeeID dc(Employee) as EmployeeCount by _time

The span=1mon will case the date, as specified by the original Week value to be aggregated based on the month. The alternate is the string based approach

| eval Month = mvindex(split(Week, "/"), 0)
| stats values(Employee) as EmployeeID dc(Employee) as EmployeeCount by Month

Note that this would not handle > 1 year, as the month would duplicate.

 

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

The actual solution will depend on how you want to list results.  Here's an idea

| stats values(Employee) as EmployeeID dc(Employee) as EmployeeCount by Week
| eval Month = mvindex(split(Week, "/"), 0)
| eventstats sum(EmployeeCount) as EmployeeMonthlyCount by Month

As bowesmana noted, Week and Month are not based on time here, just based on strings.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Something like this?

| stats values(Employee) as EmployeeID dc(Employee) as EmployeeCount
| eval EmployeeID = mvjoin(EmployeeID, ", ")
0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...