Splunk Search

How to find the most prevalent values for a field

bigtyma
Communicator

Title Position Security_Template
NonEE - Volunteer F01-7121-600002 IP VIEW ONLY CLINICAL
NonEE - Volunteer F01-7121-600002 FRONT DESK AND IP ORDERS CLERK
NonEE - Volunteer F01-7121-600002 IP VIEW ONLY CLINICAL
NonEE - Volunteer F01-7121-600002 IP VIEW ONLY CLINICAL
NonEE - Volunteer F01-7121-600002 ELINK VIEW-ONLY CLINICAL

Using the example above, I would like to have a report listing the most frequent or most common value of 'Security_Template' for every Position. In the example above the most prevalent would be 'IP VIEW-ONLY'

Bonus point is if we can list the percentage of records that have each value.

Thank you!

Tags (3)
0 Karma

tgow
Splunk Employee
Splunk Employee

If the field is already extracted then you can run the "top" command. Here is a quick example:

source=... | top Security_Template, PositionID

0 Karma

bigtyma
Communicator

woohoo! this works. I had to to change it slightly to ' | top Security_Template by PositionID

0 Karma

bigtyma
Communicator

Thank you for replying, unfortunately I did not get the results I was hoping for. Security_Template is already extracted.

In this scenario I should see

PositionID SecurityTemplate Percentage
F01-7121-600002 IP VIEW ONLY CLINICAL 50%

0 Karma

bigtyma
Communicator

"wdd-d{4}-d{6}" format?

Yes.

0 Karma

tgow
Splunk Employee
Splunk Employee

You will need to extract the field first. Here is a quick example that might work:

source = .... | rex field=_raw "-\d+\s+(?[^\n]+) | stats count by Security_Template | sort -count

If you want it by percentage then use the "top" search command.

source = .... | rex field=_raw "-\d+\s+(?[^\n]+) | top Security_Template

If this is a CSV file with a header then you might be able to set the CHECK_FOR_HEADER in the props.conf file. Here is a link to more information:

http://docs.splunk.com/Documentation/Splunk/5.0.1/admin/Propsconf

0 Karma

bigtyma
Communicator

Thank you for replying, unfortunately I did not get the results I was hoping for. Security_Template is already extracted.

In this scenario I should see

PositionID SecurityTemplate Percentage
F01-7121-600002 IP VIEW ONLY CLINICAL 50%

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

The data is not easy to split into fields because of the number of spaces. Will the position column always be in the "\w\d\d-\d{4}-\d{6}" format?

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Splunkbase Unveils New App Listing Management Public Preview

Splunkbase Unveils New App Listing Management Public PreviewWe're thrilled to announce the public preview of ...

Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Are you leveraging automation to its fullest potential in your threat detection strategy?Our upcoming Security ...

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...