Splunk Search

How to create a search to find a percentage using my data set and display this in a timechart?

jojujose
New Member

First of all I am very new to splunk! 🙂
My data can be simplified to look something like this.

Employee = (UniqueId Id, EmployeeId ManagerId)

So we have an employee record which has a Id field and a managerId field. The managerId field contains the Id of the manager who is also an employee.
So basically there can be records in employee which don't have managerId in them.
I wanted to do a splunk search that determines the % of employees that have manager.
Currently, I am doing something like this, which returns me the employees that have managers:

index=* host=* logRecordType=mytype managerId=* | timechart dc(Id) AS "Number of employees with manager" span=1d

However, I need to do something which will give me the % of employees that have manager over time:

index=* host=* logRecordType=wousa|eval(managerId="*") as employeesWithManagers,count as Employees| eval(employeesWithManagers/Employees*100) as "% of total employess with managers"| --somehow timechart this over time?

The above search returns 0 for employeesWithManagers so I know that's incorrect.
Any suggestions on how to get to the promised land will help!

0 Karma
1 Solution

Richfez
SplunkTrust
SplunkTrust

Well, first off let me say welcome!

Secondly, let me say EDIT FAIL. Somehow my browser got confused and posted this while I was editing it. Sorry. 🙂

The first technique that came to mind would be something like the following.

index=X host=Y logRecordType=mytype | eval HasManager=if(isnull(managerId),0,1) | timechart eval(count/sum(HasManager)) as percentWithManager

In that, we make sure our base search returns both events with managerId and those without. We create a new field with a value 1 or 0 for if managerId is in that event or not - this makes it easy to do "count" style calculations. Finally, we use a timechart of an eval'd field.

Please note that if I did things in my usual way, I did my division backwards and you'll have to fix that.

I would only use span=1d in your timechart only if you actually need that so that the search will auto-time span itself depending on the overall timeframe in use, but if you need it, add it. Also, if you go to the visualization tab it will probably pick line chart, which is appropriate, but you might need to "Format" your line chart, and in the section General, Null Values, select the right-most option "Connect". On the other hand you may not need to - I just thought I'd mention it in case.

Anyway, give that a try and reply back with any additional assistance needed!

View solution in original post

woodcock
Esteemed Legend

Like this:

index=* host=* logRecordType=wousa | outputcsv TEMP.csv | search managerId="*" | dedup EmployeeId | timechart span=1d count AS "employeesWithManagers"
| appendpipe [ | inputcsv TEMP.csv | search NOT managerId="*"| dedup EmployeeId | timechart span=1d count AS "employeesWithoutManagers" ]
| eval  pctEmployeesWithManagers = 100 * (employeesWithManagers/(employeesWithManagers + employeesWithoutManagers))
| fields - employeesWithManagers employeesWithoutManagers
0 Karma

Richfez
SplunkTrust
SplunkTrust

Well, first off let me say welcome!

Secondly, let me say EDIT FAIL. Somehow my browser got confused and posted this while I was editing it. Sorry. 🙂

The first technique that came to mind would be something like the following.

index=X host=Y logRecordType=mytype | eval HasManager=if(isnull(managerId),0,1) | timechart eval(count/sum(HasManager)) as percentWithManager

In that, we make sure our base search returns both events with managerId and those without. We create a new field with a value 1 or 0 for if managerId is in that event or not - this makes it easy to do "count" style calculations. Finally, we use a timechart of an eval'd field.

Please note that if I did things in my usual way, I did my division backwards and you'll have to fix that.

I would only use span=1d in your timechart only if you actually need that so that the search will auto-time span itself depending on the overall timeframe in use, but if you need it, add it. Also, if you go to the visualization tab it will probably pick line chart, which is appropriate, but you might need to "Format" your line chart, and in the section General, Null Values, select the right-most option "Connect". On the other hand you may not need to - I just thought I'd mention it in case.

Anyway, give that a try and reply back with any additional assistance needed!

Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...