Solved: How to get index of a particular letter in string

ma_anand1984 · ‎11-20-2012

How can i find index of last occurrence of letter in value of a field

string



splunk_user

microsoft_good_task

god_particle

Now i want below as answer (index of last occurrence of underscore)
7
15
4

How can i achieve this
I don't see any function to help

martin_mueller · ‎11-20-2012

Imagine you're looking in the field message:

...  | rex field=message "_(?<rest>[^_]*)$" | eval pos = if(length(rest) > 0, length(message) - length(rest), -1) | chart values(pos) by message

You'll get -1 if there is no occurrence of the char, and the position if there is at least one.

View solution in original post

timpacl · ‎03-31-2017

It has been a while since this thread was active but here is another method to do this:

 len(mvindex(split(lower([string]),"[char]"),0))

Basically, you split [string] at [char] then count the length of the first element in the resulting array to get the 0-based position of [char] in [string]. I add lower around [string] assuming that [string] is a field name and it is unknown what combination of upper-case and lower-case letters might be in it. Because I am showing quotes around [char], it is my assumption that [char] is a fixed text string so you have the opportunity when writing the statement to ensure all characters are in lower case.

You should take note that when [char] does not exist in [string], this will return the length of [string]. You can adjust for that with a more complex formula and while this will return -1 when [char] isn't in [string], it is also more difficult to maintain:

 if(len(mvindex(split(lower([string]),"[char]"),0))=len(lower([string])),-1,len(mvindex(split(lower([string]),"[char]"),0)))

This can be taken a step further. You can use mvcount to get the number of occurrences of [char] in [string]:

 mvcount(split(lower([string]),"[char]"))-1

The array created by split will have 1 more element than the number of occurrences of [char] that it finds in [string]. So if splitting the string "happy" on the char "p", I should get 3 elements "ha", "", "y". mvcount will return the number of elements in the array created by split which is one more that the number of [char]s so we subtract 1 from the result to get the number of occurrences of [char].

I will work on this some more and if I find a more elegant solution, I will post it.

drewski3420 · ‎12-05-2018

You should take note that when [char] does not exist in [string], this will return the length of [string]. You can adjust for that with a more complex formula and while this will return -1 when [char] isn't in [string], it is also more difficult to maintain:

if(len(mvindex(split(lower([string]),"[char]"),0))=len(lower([string])),-1,len(mvindex(split(lower([string]),"char]"),0)))

You could also wrap in some ifnull/nullif to avoid having to repeat the formula.

ifnull(nullif(len(mvindex(split(lower([string])),"[char]"),0)),len([string]))),-1)

malvidin · ‎08-12-2020

To modify @martin_mueller's answer to find where the underscores ("_") are, the "rex" command option, "offset_field", will gather the locations of your match. The "offset_field" option has been available since at least Splunk 6.3.0, but I can't go back farther in the documentation to check when it was introduced. If you only want the first match index, or a limited number of indexed locations, the "max_match" parameters can be changed.

...  
| rex field=message offset_field=message_offsets max_match=0 "(?P<char_match>_)" 
| rex field=message_offsets max_match=0 "char_match=(?P<offset_range>[0-9]+-[0-9]+)"
| chart values(offset_range) by message

The first "rex" command creates a field named "message_offsets" will contain data like the results of these eval statements, if the character(s) are found.

| eval example_message_offsets = "char_match=7"
| eval example_message_offsets = "char_match=12-12&char_match=19-19"

The second "rex" extracts the index from those values into "offset_range". For one character, the values are the same and separated with a "-".

If there is no match, both the char_match and the message_offsets fields are null(), so the chart will not include field values that don't contain the string of interest. Unless fillnull, coalesce, or another eval statement is used to populate null values.

martin_mueller · ‎11-20-2012

Imagine you're looking in the field message:

...  | rex field=message "_(?<rest>[^_]*)$" | eval pos = if(length(rest) > 0, length(message) - length(rest), -1) | chart values(pos) by message

You'll get -1 if there is no occurrence of the char, and the position if there is at least one.

martin_mueller · ‎11-20-2012

No, I'm not aware of a Splunk function that does this in one step.

ma_anand1984 · ‎11-20-2012

@martin_mueller
thanks, are you aware of any function that can do this? for instance substr will get string based on index. we should also be getting index based on value using some other function?

How to get index of a particular letter in string

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25

Are you a member of the Splunk Community?

How to get index of a particular letter in string

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25