Splunk Search

Field Aliases and Extractions -- overlap or order of operations causing issue

oliverj
Communicator

So I have an event:

<164>2019-05-14T22:04:15.161Z hostname Hostd: Rejected password for user myuser from 192.168.1.10

The user field is not extracted automatically, so I created an extraction via the web UI:

[source::VMware:esxlog:source::tcp:1514]
EXTRACT-username-esxi-extraction = (?=[^f]*(?:for user|f.*for user))^(?:[^ \n]* ){7}(?P<username>\w+)

This extraction works great when I do:

mysearch | rex "(?=[^f]*(?:for user|f.*for user))^(?:[^ \n]* ){7}(?P<username>\w+)"

(See sample: https://www.regextester.com/?fam=109334)
Unfortunately, if I just run the search without the REX (the props.conf extraction should handle it fine), I get nothing.
Messing around with it, I found that if I changed the "(?P<username>\w+)" to something like "(?P<xxxx>\w+)", it works!

So, I thought maybe there was some overlap, but I don't know how/why that would be an issue. I don't know what to look for in the btool readout -- it looks fine.

So then I thought, OK, I'll alias "xxxx" over to "username". Hacky, but I'm so tired of this stupid extraction by now that I don't even care.
And that leads me to this clever alias:

[VMware:esxlog:Hostd]
FIELDALIAS-normalize_username_esxi_hostd = Username as username user AS username xxxx AS username

But this does nothing! The other two (Username and user) still seem to work, but "xxxx" is a dud.
I checked the order of things here:
https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Searchtimeoperationssequence
I see no reason my alias would not work on an extracted field.

Any suggestions or some glaring error I am missing?

marycordova
SplunkTrust

A tricky thing that isn't obvious (though I don't think it's your problem if the answer to my question below is yes): the "username-esxi-extraction" part is called a "class" and must be globally unique. If there is another config anywhere that is also named "EXTRACT-username-esxi-extraction", whichever app/sourcetype comes first alphabetically will have "class" precedence.

When I develop my own stuff, I'll start a schema like EXTRACT-custom_esxi_1, EXTRACT-custom_esxi_2, FIELDALIAS-custom_esxi_3, etc. This way I'm nearly 100% sure it won't show up in a TA or app from Splunkbase and cause me problems.

So, to your problem: is this working in props or just in search? If in props, go ahead and set the config and use "username".

> Messing around with it, I found that if I changed the "(?P<username>\w+)" to something like "(?P<xxxx>\w+)", it works!

Then, comment out the existing alias

#FIELDALIAS-normalize_username_esxi_hostd = Username as username user AS username

and do this instead

EVAL-username = mvdedup(lower(mvappend('Username','user','username')))
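
Put together, the suggestions in this reply might look like the following props.conf fragment. This is only a sketch under assumptions: the class name custom_esxi_1 is a hypothetical name borrowed from the naming scheme described earlier in this reply, the two stanza names are copied unchanged from the question, and the thread does not confirm that renaming the class resolves the original symptom.

```ini
[source::VMware:esxlog:source::tcp:1514]
# Hypothetical class name, chosen so it is unlikely to collide with a
# class shipped by a TA or app. The regex is unchanged from the question.
EXTRACT-custom_esxi_1 = (?=[^f]*(?:for user|f.*for user))^(?:[^ \n]* ){7}(?P<username>\w+)

[VMware:esxlog:Hostd]
# Existing alias commented out, as suggested in this reply.
#FIELDALIAS-normalize_username_esxi_hostd = Username as username user AS username
# Merge the candidate fields, lower-case them, and drop duplicate values.
EVAL-username = mvdedup(lower(mvappend('Username','user','username')))
```

Calculated fields (EVAL-) run after inline extractions and field aliases in the search-time sequence, so the extracted fields should already be available when the expression is evaluated.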

@marycordova

oliverj
Communicator

Maybe an additional example now that I have had time to look into this some more.

Username: bob
         pc1
Date: 10-10-2019
Hostname: pc1
usr: bob

I need my end result to be "username" (not Username).
So, I add "Username as username" to my fieldalias, but this won't work because this particular event has multiple values for the Username field (both "bob" and "pc1").
Good thing that the "usr" field exists!
I add "usr as username" to my fieldalias.

But this won't work? Splunk will not allow two fieldaliases with the same destination?
So what would I do in a situation where "Username" had better info, and "usr" didn't exist?
In the actual search field, I can always "rename usr AS username", but I would like to have it done in the background, not in the actual search string.
How would you normalize a field like this, where a sourcetype may have two different fields that need to wind up in the same destination? Maybe not all of my events have "usr", but they all have Username; if "usr" exists, I definitely want to use it instead of Username.

And with all of these questions, I find myself in a different place than when I made my original post. Same problem, but "normalization" just keeps getting harder.
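
One possible way to express the "use usr when it exists, otherwise fall back to Username" rule in the background is a calculated field rather than a second alias. This is a minimal sketch, not a tested config: the stanza name is reused from earlier in the thread and may not match the sourcetype of this example, and taking the first value of a multivalue Username with mvindex is an assumption about which value is wanted.

```ini
[VMware:esxlog:Hostd]
# Assumption: stanza name copied from the alias earlier in the thread.
# coalesce() returns its first non-null argument, so usr wins when present.
# Otherwise fall back to the first value of the (possibly multivalue)
# Username field; mvindex(..., 0) selects that first value.
EVAL-username = coalesce('usr', mvindex('Username', 0))
```

For the sample event above, usr is present, so this would be expected to yield username=bob; for an event without usr, the first Username value would be used instead. If the last Username value is the useful one, mvindex('Username', -1) could be swapped in.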
