Splunk Search

What is a good strategy for extracting the relevant data from a field that also contains unnecessary data?

nqjpm
Path Finder

The description field I am parsing contains some unnecessary survey data that I would like to ignore and NOT count. That data is marked by ##Survey##. Everything after this marker can be ignored, but I have not been able to find a good way to do this. Can this be done without regex?

Examples of what the user input field looks like (since it is free-text input, the length and wording vary):

description="My outlook wont opena as a i keep getting an error message. ##Survey## Which of the following best describes your needs?: My Outlook is slow or not launching - Please clarify your issue further:: Outlook is slow or unable to launch

OR

description="I tried to login via * a few days ago and I kept getting the message that my password and/or token information was incorrect. I reset my token PIN and it still wouldn't work (PIN first, token key second). Does this happen often? I need to login this weekend and would like to have the issue resolved as soon as possible. Thank you! ##Survey## Please choose the option which best describes your problem.: ASSISTANCE WITH * TOKEN - Do you need assistance with your * token?: yes - Which best describes your request?: Other

My search counts the words in the description field to see which issues may be trending. However, the words in the survey text are skewing my results.

index=foo
| fields description
| makemv delim=" " description
| mvexpand description
| eval LowerCase=lower(description)
| eval length=len(LowerCase)
| search length > 2
| top limit=20 LowerCase

somesoni2
Revered Legend

Try it like this (line 3 keeps only the part of description that comes before ##Survey##):

index=foo
| fields description
| eval description=mvindex(split(description,"##Survey##"),0)
| makemv delim=" " description
| mvexpand description
| eval LowerCase=lower(description)
| eval length=len(LowerCase)
| search length > 2
| top limit=20 LowerCase
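
To see what line 3 does on its own, here is a minimal sketch that can be run without any indexed data. It uses makeresults to create one event, with a shortened version of the first sample description from the question. The field name before_survey and the added trim() are assumptions for illustration only and are not part of the search above.

```spl
| makeresults
``` sample value shortened from the question; any free text containing the marker will do ```
| eval description="My outlook wont open as I keep getting an error message. ##Survey## Which of the following best describes your needs?: My Outlook is slow or not launching"
``` split() breaks the string at the marker; mvindex(..., 0) keeps the first piece ```
``` trim() (an addition, not in the original search) drops the space left before the marker ```
| eval before_survey=trim(mvindex(split(description, "##Survey##"), 0))
| table description before_survey
```

When the marker is missing, split() should return the whole string as a single value, so mvindex(..., 0) would leave the description unchanged. It is worth checking that against your own data before relying on it.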


elliotproebstel
Champion

This will remove "##Survey##" and everything following it from the field description:

| rex mode=sed field=description "s/##Survey##.*//"

So I'd arrange your search commands like this:

index=foo
| fields description
| rex mode=sed field=description "s/##Survey##.*//"
| makemv delim=" " description
| mvexpand description
| eval LowerCase=lower(description)
| eval length=len(LowerCase) 
| search length > 2
| top limit=20 LowerCase
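
The sed-style replacement can be tried in isolation the same way. This is only a sketch: makeresults creates one event, and the description value is a shortened version of the second sample from the question.

```spl
| makeresults
``` sample value shortened from the question ```
| eval description="I reset my token PIN and it still wouldn't work. ##Survey## Please choose the option which best describes your problem.: Other"
``` replace the marker and everything after it with nothing, in place ```
| rex mode=sed field=description "s/##Survey##.*//"
| table description
```

Unlike the split() approach, this overwrites description directly, and it does use a regular expression. If the survey text can contain line breaks, the .* may stop at the first one, so check that case against real events.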

nqjpm
Path Finder

This also works. Two great working answers in less than an hour. I love this community!

nqjpm
Path Finder

That works great! Thanks!
