Splunk Search

How to calculate how many times a letter appears consecutively in a string?

bijodev1
Communicator

Hi Team,

I am wondering if there is any command to calculate how many times a character appears consecutively in a string.

For example:

Here I am trying to count the letter "C":

if the data is "ACDEFCCCXYZ" - output should be "3"
if the data is "ACDEFCCXYCCCCZ" - output should be "4"

 

I'm not sure what a possible way to do this would be. Please assist.

Thanks

PickleRick
SplunkTrust

You can use rex with max_match=0 to get all matches of consecutive letters into a multivalued field. If you want a single "known" letter, it's relatively easy. For example, for your C's, you'd do

rex max_match=0 "(?<matched>C+)"

If you want to match contiguous runs of any letter, you'll have to use a backreference in your regex and, if I remember correctly, the backreference must (a bit confusingly) point to the second capturing group, because the named group itself counts as the first.

rex max_match=0 "(?<matched>(\w)\2+)"

This should extract every run of two or more identical word characters from your event (note that \w also matches digits and underscores, not only letters).

Unfortunately, this gives you a multivalued field, and those are tricky to manipulate. It'd probably be easiest to mvexpand it into separate results, calculate the length of each resulting value, use eventstats to find the maximum length, and filter the results with where.
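
The mvexpand/eventstats route described above could be sketched roughly like this. Treat it as an untested run-anywhere sketch: the sample value and the field names run_length and max_length are made up for illustration. It expands each match into its own result, measures it, computes the maximum per original data value, and keeps only the longest match (or matches, if there is a tie).

```spl
| makeresults
| eval data="ACDEFCCXYCCCCZ"
| rex field=data max_match=0 "(?<matched>(\w)\2+)"
| mvexpand matched
| eval run_length=len(matched)
| eventstats max(run_length) AS max_length BY data
| where run_length=max_length
| table data matched run_length
```

For the sample value, the matches should be CC and CCCC, so the surviving result should be CCCC with a run_length of 4. Events with no repeated characters have no matched value and are dropped by the where clause.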

More elegant solution, anyone?


somesoni2
SplunkTrust

Assuming the resulting multivalued field contains only runs of the same character (just with different lengths), mvsort will put the longest run last, so you can get the max length without mvexpand. See this run-anywhere search:

| makeresults
| eval data="ACDEFCCCXYCCXCZ ACDEFCCXYCCCCZCC"
| table data
| makemv data
| mvexpand data
| rex field=data max_match=0 "(?<matched>C+)"
| eval max=len(mvindex(mvsort(matched),-1))

 

bijodev1
Communicator

Thank you @somesoni2, it worked perfectly.

Thank you everyone for the help. Appreciated.


PickleRick
SplunkTrust

But that only works for the single-letter case.

So if we're looking for the longest string of C's, then it's indeed the easier solution.

But if we're looking for the longest consecutive run of the same letter, regardless of which letter it is, it might not work, because mvsort sorts lexicographically and so orders the values by letter before length. You can get around that by replacing every character except the last one in each value with a symbol or a digit, and then sorting.

This way we'll get the longest string first, with the proper letter at the end.

| makeresults 
| eval a="aaa,aaaaa,aa,bbbbbbb,b,bb,xx,zzzzz,c"
| eval b=split(a,",")
| rex mode=sed field=b "s/.(?=.)/_/g"
| eval c=mvsort(b)

Now it's just a matter of taking the first value, reading its last letter, and possibly replacing the placeholders with that letter again.
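
As a rough sketch of that last step, the search above could be extended as follows. The field names longest, letter and run_length are made up for illustration, and this assumes the sort really does put the longest value first, as described.

```spl
| makeresults
| eval a="aaa,aaaaa,aa,bbbbbbb,b,bb,xx,zzzzz,c"
| eval b=split(a,",")
| rex mode=sed field=b "s/.(?=.)/_/g"
| eval c=mvsort(b)
| eval longest=mvindex(c,0)
| eval letter=substr(longest,len(longest),1)
| eval run_length=len(longest)
```

Here mvindex(c,0) takes the first sorted value, substr reads its last character, and len gives the run length. The placeholders never need to be swapped back, because the letter and the length already describe the run. With the sample values, letter should come out as b and run_length as 7.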
