Splunk Search

Convert categorical string to number

zeophlite
New Member

I have a field in my events that is a string (but does not translate to a number directly)

Is there a way to convert this string to an integer consistently (value does not matter), such as using a hash function? The functions available, such as md5 convert strings to strings, but is there a way to convert this back to an integer? An example is as follows:

user     favorite_fruit     fruit_number
bob      Apple                   1
jane     Pear                    2
pete     Apple                   1

Where user and favorite_fruit are known at index-time, and fruit_number is calculated at search-time. The actual value of fruit_number is arbitrary and doesn't need to be sequential.

I can't use a lookup, as the list of favorite_fruit's is arbitrary.

0 Karma
1 Solution

renjith_nair
Legend

Try something similar. You can use different by clause in streamstats and eventstats based on requirement.

 |stats count|eval fruit="apple,orange,apple,apple,cherry"|eval user="bob" | makemv delim="," fruit| makemv delim="," user|mvexpand fruit|streamstats count|eventstats first(count) as fruit_number by fruit|fields - count

Just add |streamstats count|eventstats first(count) as fruit_number by fruit|fields - count to your original search

---
What goes around comes around. If it helps, hit it with Karma 🙂

View solution in original post

0 Karma

renjith_nair
Legend

Try something similar. You can use different by clause in streamstats and eventstats based on requirement.

 |stats count|eval fruit="apple,orange,apple,apple,cherry"|eval user="bob" | makemv delim="," fruit| makemv delim="," user|mvexpand fruit|streamstats count|eventstats first(count) as fruit_number by fruit|fields - count

Just add |streamstats count|eventstats first(count) as fruit_number by fruit|fields - count to your original search

---
What goes around comes around. If it helps, hit it with Karma 🙂
0 Karma

zeophlite
New Member

Hi Renjith, apologies, I've updated my question to give an example

0 Karma

renjith_nair
Legend

Ok got it.

Try something similar. You can use different by clause in streamstats and eventstats based on requirement.

|stats count|eval fruit="apple,orange,apple,apple,cherry"|eval user="bob" | makemv delim="," fruit| makemv delim="," user|mvexpand fruit|streamstats count|eventstats first(count) as fruit_number by fruit|fields - count

Just add |streamstats count|eventstats first(count) as fruit_number by fruit|fields - count to your original search

---
What goes around comes around. If it helps, hit it with Karma 🙂

zeophlite
New Member

Works great, please edit this into your answer

0 Karma
Get Updates on the Splunk Community!

Community Content Calendar, November Edition

Welcome to the November edition of our Community Spotlight! Each month, we dive into the Splunk Community to ...

October Community Champions: A Shoutout to Our Contributors!

As October comes to a close, we want to take a moment to celebrate the people who make the Splunk Community ...

Stay Connected: Your Guide to November Tech Talks, Office Hours, and Webinars!

What are Community Office Hours? Community Office Hours is an interactive 60-minute Zoom series where ...