Splunk Search

How to create a search to find a percentage using my data set and display this in a timechart?

jojujose
New Member

First of all I am very new to splunk! 🙂
My data can be simplified to look something like this.

Employee = (UniqueId Id, EmployeeId ManagerId)

So we have an employee record which has a Id field and a managerId field. The managerId field contains the Id of the manager who is also an employee.
So basically there can be records in employee which don't have managerId in them.
I wanted to do a splunk search that determines the % of employees that have manager.
Currently, I am doing something like this, which returns me the employees that have managers:

index=* host=* logRecordType=mytype managerId=* | timechart dc(Id) AS "Number of employees with manager" span=1d

However, I need to do something which will give me the % of employees that have manager over time:

index=* host=* logRecordType=wousa|eval(managerId="*") as employeesWithManagers,count as Employees| eval(employeesWithManagers/Employees*100) as "% of total employess with managers"| --somehow timechart this over time?

The above search returns 0 for employeesWithManagers so I know that's incorrect.
Any suggestions on how to get to the promised land will help!

0 Karma
1 Solution

Richfez
SplunkTrust
SplunkTrust

Well, first off let me say welcome!

Secondly, let me say EDIT FAIL. Somehow my browser got confused and posted this while I was editing it. Sorry. 🙂

The first technique that came to mind would be something like the following.

index=X host=Y logRecordType=mytype | eval HasManager=if(isnull(managerId),0,1) | timechart eval(count/sum(HasManager)) as percentWithManager

In that, we make sure our base search returns both events with managerId and those without. We create a new field with a value 1 or 0 for if managerId is in that event or not - this makes it easy to do "count" style calculations. Finally, we use a timechart of an eval'd field.

Please note that if I did things in my usual way, I did my division backwards and you'll have to fix that.

I would only use span=1d in your timechart only if you actually need that so that the search will auto-time span itself depending on the overall timeframe in use, but if you need it, add it. Also, if you go to the visualization tab it will probably pick line chart, which is appropriate, but you might need to "Format" your line chart, and in the section General, Null Values, select the right-most option "Connect". On the other hand you may not need to - I just thought I'd mention it in case.

Anyway, give that a try and reply back with any additional assistance needed!

View solution in original post

woodcock
Esteemed Legend

Like this:

index=* host=* logRecordType=wousa | outputcsv TEMP.csv | search managerId="*" | dedup EmployeeId | timechart span=1d count AS "employeesWithManagers"
| appendpipe [ | inputcsv TEMP.csv | search NOT managerId="*"| dedup EmployeeId | timechart span=1d count AS "employeesWithoutManagers" ]
| eval  pctEmployeesWithManagers = 100 * (employeesWithManagers/(employeesWithManagers + employeesWithoutManagers))
| fields - employeesWithManagers employeesWithoutManagers
0 Karma

Richfez
SplunkTrust
SplunkTrust

Well, first off let me say welcome!

Secondly, let me say EDIT FAIL. Somehow my browser got confused and posted this while I was editing it. Sorry. 🙂

The first technique that came to mind would be something like the following.

index=X host=Y logRecordType=mytype | eval HasManager=if(isnull(managerId),0,1) | timechart eval(count/sum(HasManager)) as percentWithManager

In that, we make sure our base search returns both events with managerId and those without. We create a new field with a value 1 or 0 for if managerId is in that event or not - this makes it easy to do "count" style calculations. Finally, we use a timechart of an eval'd field.

Please note that if I did things in my usual way, I did my division backwards and you'll have to fix that.

I would only use span=1d in your timechart only if you actually need that so that the search will auto-time span itself depending on the overall timeframe in use, but if you need it, add it. Also, if you go to the visualization tab it will probably pick line chart, which is appropriate, but you might need to "Format" your line chart, and in the section General, Null Values, select the right-most option "Connect". On the other hand you may not need to - I just thought I'd mention it in case.

Anyway, give that a try and reply back with any additional assistance needed!

Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...