Splunk Search

Convert categorical string to number

zeophlite
New Member

I have a field in my events that is a string (and it does not translate directly to a number).

Is there a way to consistently convert this string to an integer (the value itself does not matter), for example with a hash function? The available functions, such as md5, convert strings to strings. Is there a way to turn that result into an integer? An example is as follows:

user     favorite_fruit     fruit_number
bob      Apple              1
jane     Pear               2
pete     Apple              1

Where user and favorite_fruit are known at index-time, and fruit_number is calculated at search-time. The actual value of fruit_number is arbitrary and doesn't need to be sequential.

I can't use a lookup, as the set of favorite_fruit values is arbitrary.

renjith_nair
Legend

Try something similar to the search below. You can use a different by clause in streamstats and eventstats, depending on your requirements.

 | stats count | eval fruit="apple,orange,apple,apple,cherry" | eval user="bob" | makemv delim="," fruit | makemv delim="," user | mvexpand fruit | streamstats count | eventstats first(count) as fruit_number by fruit | fields - count

Just append | streamstats count | eventstats first(count) as fruit_number by fruit | fields - count to your original search, using your own field name (favorite_fruit in your example) in the by clause.
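
Since the question asks specifically about a hash function, here is a minimal sketch of that route, assuming the field is named favorite_fruit as in the example. It takes the first 8 hex characters of the md5 digest and parses them as a base-16 number with tonumber. Unlike the streamstats approach, the result does not depend on event order, so the same string should get the same number in every search. Two different strings could in principle collide on the same number, though that is unlikely for a small set of values.

```spl
| eval fruit_number=tonumber(substr(md5(favorite_fruit), 1, 8), 16)
```

The 8-character cutoff is an arbitrary choice to keep the number within 32 bits; a shorter or longer prefix trades readability against collision risk.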

---
What goes around comes around. If it helps, hit it with Karma 🙂

zeophlite
New Member

Hi Renjith, apologies. I've updated my question to include an example.

renjith_nair
Legend

Ok got it.

zeophlite
New Member

Works great. Please edit this into your answer.
