How do I do lookups based on many forms of regex/event?

Explorer

I'm having no success making sense of lookups. Some work, some don't, and I can't figure out why. Take an obvious example: sshd writes syslog lines in all sorts of formats, each of which includes the username. I want to extract the username field from those various forms, then look that username up in my external CSV file. I know how to get that working in basic form, and have done it for one form of sshd syslog line.

Specifically, we have sshd events like:

user usernameHere authenticated as blahblah
session opened for usernameHere
session closed for usernameHere
Accepted someAuthMethod for usernameHere

All of those (defined as field extractions) need to trigger a lookup of usernameHere in the CSV file, which is already defined in transforms.conf as 'employee'.

The following does not work completely (only the "authenticated as" part looks up):

[syslog]
LOOKUP-username1 = employee uid AS Username1
EXTRACT-Username1 = (?i) for (?P<Username1>[^ ]+)
LOOKUP-username2 = employee uid AS Username2
EXTRACT-Username2 = (?i) (?P<Username2>[^ ]+) authenticated as

Nor does this ordering (a shot in the dark):

[syslog]
EXTRACT-Username1 = (?i) for (?P<Username1>[^ ]+)
EXTRACT-Username2 = (?i) (?P<Username2>[^ ]+) authenticated as
LOOKUP-username2 = employee uid AS Username2
LOOKUP-username1 = employee uid AS Username1

If I remove the functioning "authenticated as" LOOKUP and EXTRACT, then the other one starts working.

I have also tried the following, fixing the case of my LOOKUP classes:

[syslog]
EXTRACT-Username1 = (?i) for (?P<Username1>[^ ]+)
LOOKUP-Username1 = employee uid AS Username1
EXTRACT-Username2 = (?i) (?P<Username2>[^ ]+) authenticated as
LOOKUP-Username2 = employee uid AS Username2

So clearly I am not understanding the relationship between the field extraction and the lookup.

Really what I want is:

my_sshd_extraction1 to store username
my_sshd_extraction2 to store username
my_sshd_extraction3 to store username
my_sshd_extraction4 to store username
lookup username for any of those!

Any help would be greatly appreciated.

Legend

My suggestion: edit props.conf and change Username1, Username2, etc. to just Username, like this:

[syslog]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username  

Note that each extraction identifier (the part after EXTRACT-) must be unique, but the field name itself can be the same. This is good, because it really is the same field; it just appears in different places in different events.

Now you only need one lookup, on the Username field. Note that I also renamed the lookup to LOOKUP-username, although the lookup identifier really doesn't matter.

I think this solution will make searches and reporting easier overall, as well as simplifying your lookups.
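
To show how the pieces fit together, here is a minimal sketch of both config files. It assumes the lookup table is a CSV named employee.csv with a uid column and a hypothetical full_name column; only the employee stanza name and the uid field come from the question, so adjust the rest to match your file. The order of the lines inside the stanza should not matter, because Splunk applies EXTRACT settings before LOOKUP settings at search time.

```ini
# transforms.conf
# The [employee] stanza name is from the question; the filename is an assumption.
[employee]
filename = employee.csv

# props.conf
[syslog]
# Covers "session opened for X", "session closed for X" and "Accepted ... for X"
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
# Covers "user X authenticated as ..."
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
# Match the event's Username against the uid column of the CSV.
# full_name is a hypothetical output column; replace it with a real one.
LOOKUP-username = employee uid AS Username OUTPUT full_name
```

With the sample events from the question, the two extractions between them should cover all four forms, since three of the four put the username directly after " for ".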

Legend

Stanza names in props.conf aren't normal regexes. Here are the rules:

When setting a [] stanza, you can use the following regex-type syntax:

... recurses through directories

* matches anything but / 0 or more times

| is equivalent to 'or'

( ) are used to limit scope of |

So [syslog|linux_secure] should work. This is either a bug in the code, or an error in the documentation.

Question: where do you set the sourcetypes syslog and linux_secure? In inputs.conf? If it's in props.conf, you need to look at the priority and ordering of stanzas in props.conf.

Explorer

PS: Comment formatting controls here at Answers are greatly needed.

Explorer

Awesome. That works. Now the problem is that [syslog|linux_secure] isn't working. If I break my stuff out (duplicate the field extraction definitions) into [syslog] and also [linux_secure], they all work. Combined with an 'or' pipe, they don't.
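
For reference, the duplicated layout that does work would look roughly like this, reusing the extractions from the accepted answer. It is only a sketch of the workaround described above, not an explanation of why the combined stanza fails.

```ini
# props.conf
# Same settings repeated under each sourcetype instead of [syslog|linux_secure]
[syslog]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username

[linux_secure]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username
```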
