Splunk Search

What would be the strategy to extract relevant data from a field with unnecessary data?

nqjpm
Path Finder

The description field I am parsing contains some unnecessary survey data that I would like to ignore and NOT count. That data is marked by ##Survey##. Everything after this marker can be ignored, but I have not been able to find a good way to do this. Can this be done without regex?

Example of what the user input field looks like (since it's free-text input, the length and words change):

description="My outlook wont opena as a i keep getting an error message. ##Survey## Which of the following best describes your needs?: My Outlook is slow or not launching - Please clarify your issue further:: Outlook is slow or unable to launch

OR

description="I tried to login via * a few days ago and I kept getting the message that my password and/or token information was incorrect. I reset my token PIN and it still wouldn't work (PIN first, token key second). Does this happen often? I need to login this weekend and would like to have the issue resolved as soon as possible. Thank you! ##Survey## Please choose the option which best describes your problem.: ASSISTANCE WITH * TOKEN - Do you need assistance with your * token?: yes - Which best describes your request?: Other

My search counts words in the description field to see what issues may be trending. However, the words in the survey are skewing my data.

index=foo
| fields description
| makemv delim=" " description
| mvexpand description
| eval LowerCase=lower(description)
| eval length=len(LowerCase)
| search length > 2
| top limit=20 LowerCase

somesoni2
Revered Legend

Try it like this (line 3 keeps only the part of the description value that comes before ##Survey##):

index=foo
| fields description
| eval description=mvindex(split(description,"##Survey##"),0)
| makemv delim=" " description
| mvexpand description
| eval LowerCase=lower(description)
| eval length=len(LowerCase)
| search length > 2
| top limit=20 LowerCase
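
To check the split/mvindex step on its own before running it against real data, it can be tried on a synthetic event. This is only a sketch: makeresults generates a single throwaway event, the description text is shortened from the sample in the question, and the field name trimmed is made up for illustration.

```spl
| makeresults
| eval description="My outlook wont open. ##Survey## Which of the following best describes your needs?: My Outlook is slow"
| eval trimmed=mvindex(split(description,"##Survey##"),0)
| table description trimmed
```

Here trimmed should contain only the text before ##Survey## (with a trailing space, which the later length > 2 filter makes harmless). If an event has no ##Survey## marker, split returns the whole string as a single value, so mvindex(...,0) leaves that description unchanged.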

elliotproebstel
Champion

This will remove "##Survey##" and everything following it from the field description:

|rex mode=sed field=description "s/##Survey##.*//"

So I'd arrange your search commands like this:

index=foo
| fields description
| rex mode=sed field=description "s/##Survey##.*//"
| makemv delim=" " description
| mvexpand description
| eval LowerCase=lower(description)
| eval length=len(LowerCase) 
| search length > 2
| top limit=20 LowerCase
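
The sed expression can be tried the same way on a synthetic event. Again this is a sketch: makeresults stands in for index=foo, and the description text is shortened from the second sample in the question.

```spl
| makeresults
| eval description="I reset my token PIN and it still wouldn't work. ##Survey## Which best describes your request?: Other"
| rex mode=sed field=description "s/##Survey##.*//"
| table description
```

One caveat, offered as an assumption worth testing rather than a confirmed behavior of your data: by default "." does not match a newline, so if a description can contain line breaks after ##Survey##, only the rest of that line would be removed. Adding the single-line flag, as in "s/(?s)##Survey##.*//", is the usual way to make ".*" run to the end of the field.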

nqjpm
Path Finder

This also works. Two great working answers in less than an hour. I love this community!

nqjpm
Path Finder

That works great! Thanks!
