Solved: How to use the regex matched variables from the first search into the other search

vsasdao · ‎08-21-2023

My first search, using a regex, is as follows:

index=bigip "Storefront_v243" | rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) $ST.*$ at VIP 123.45.78.172"

In my second search, I need to reference the two fields matched by the first search:

index=bigip "Storefront_v243" | rex "Common:$sid$: Username '(?<un>.*?)' | stats count as nrs by sid, un, $ip$ | dedup un $ip$

How can I combine these two search queries into one using a pipe?

Thanks a lot in advance!

andrew_nelson · ‎08-21-2023

Is the goal of the search to find the username and IP of each session?

If so, you should be able to do it all in one go.

Something like:

index=bigip "Storefront_v243" 
| rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| stats count latest(un) as user latest(ip) as ip by sid

You can dedup by user and ip after that if needed.

vsasdao · ‎08-21-2023

One silly question here: the results of index=bigip "Storefront_v243" will be piped into this regex:

 rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"

so the results will contain only the log lines matching this pattern:

".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"

When the results matched by this first regex are piped to the next regex:

| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"

it should NOT match anything, should it? IMO, all the log lines matching "Common:(?<sid>.*?): Username '(?<un>.*?)'" have already been filtered out by the first regex above, right?

ITWhisperer · ‎08-21-2023

You may be confusing two similar but different commands.

Remember that SPL works by processing a pipeline of events.

The regex command filters events based on whether they match (or do not match) the regular expression specified for a field. Only the events that satisfy the criteria are passed on to the next (and subsequent) commands in the SPL pipeline.

The rex command can be used to extract information from the specified field (default _raw) into named field(s). It does not remove any events from the pipeline.
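A minimal sketch of the difference, assuming you paste it into a search bar as a standalone search. It uses makeresults to fake the two sample events quoted in this thread; the client IP 192.0.2.10 is a made-up placeholder and the helper field n exists only to build the test data.

```spl
| makeresults count=2
| streamstats count as n
| eval _raw=case(
    n==1, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: New session from client IP 192.0.2.10 (ST=Oslo/CC=NO/C=EU) at VIP 123.45.78.172",
    n==2, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: Username 'john.smitch'")
| regex _raw="New session from client IP"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| table _raw sid un
```

With the regex line in place, only the "New session" event should survive, so the following rex has nothing to extract from. If you delete the regex line, both events pass through and the rex should fill in sid and un on the "Username" event only.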

If you have two rex commands in your SPL pipeline, each one will process the events passed to it, so in your case, the first rex can extract the sid and ip, whereas the second can extract the sid and un.

In order to "join" these pieces of information into a single event, you can use the stats command with the by clause specifying the common field.
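Here is a small, self-contained sketch of that "join" with stats. It fakes the two sample events from this thread with makeresults; the client IP 192.0.2.10 is an invented placeholder and n is just a helper field for building the test data.

```spl
| makeresults count=2
| streamstats count as n
| eval _raw=case(
    n==1, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: New session from client IP 192.0.2.10 (ST=Oslo/CC=NO/C=EU) at VIP 123.45.78.172",
    n==2, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: Username 'john.smitch'")
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| stats count latest(un) as user latest(ip) as ip by sid
```

Each event only matches one of the two rex commands, but both end up with the same sid, so the stats should collapse them into a single row for aa8ccd2c carrying both the user and the ip.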

vsasdao · ‎08-21-2023

index=bigip "Storefront_v243" 
| rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?):   \| Start \(fallback\) .* ending is: Allow"
| where un != "" 
| stats count latest(un) as user latest(ip) as ip by sid

I've tried to collect the usernames and IP addresses associated with all APM sessions whose policy result is Allow. However, the search query above didn't work as expected; for instance, it listed a username from a Denied session.

ITWhisperer · ‎08-22-2023

Try this

index=bigip "Storefront_v243" 
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?):   \| Start \(fallback\) .* ending is: (?<ending>Allow)"
| where un != "" 
| stats count latest(un) as user latest(ip) as ip latest(ending) as ending by sid
| where ending=="Allow"

vsasdao · ‎08-22-2023

Thanks for your suggestion! However, it doesn't generate the expected results.

I've also tweaked it a bit, as follows:

index=bigip "Storefront_v243" 
| rex ".*Common:(?<sid1>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 146.72.254.172"
| rex "Common:(?<sid2>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid3>.*?):   \| Start \(fallback\) .* ending is: Allow"
| where un != "" AND sid1=sid2 AND sid2=sid3
| stats count latest(un) as user latest(ip) as ip by sid1 sid3

It didn't work either.

ITWhisperer · ‎08-22-2023

Try this (note that Ending is now capitalised as this is how you have shown it in your sample data)

| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?): \| Start \(fallback\) .* Ending is: (?<ending>Allow)"
| stats count latest(un) as user latest(ip) as ip latest(ending) as ending by sid
| where user != "" AND ending=="Allow"

vsasdao · ‎08-22-2023

It seems that your suggestion now works much better.

Can you explain in detail why the where clause should be put after the stats clause? When I tried to reverse their order, it didn't work as expected.

ITWhisperer · ‎08-22-2023

The where command applies to all the events currently in the pipeline. Not all events have un matched and extracted from the event, so if you have the where command before the stats, you will remove the events which have the ip and ending fields extracted, so they won't be available to be gathered by the stats command.
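To see the effect in isolation, here is a rough sketch with the where placed before the stats. The two events are the samples from this thread, faked with makeresults; the client IP 192.0.2.10 is a made-up placeholder and n is only a helper for building the data.

```spl
| makeresults count=2
| streamstats count as n
| eval _raw=case(
    n==1, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: New session from client IP 192.0.2.10 (ST=Oslo/CC=NO/C=EU) at VIP 123.45.78.172",
    n==2, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: Username 'john.smitch'")
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| where un != ""
| stats count latest(un) as user latest(ip) as ip by sid
```

The "New session" event has no un field, so the where drops it and the resulting row should come back with an empty ip. If you remove that where line and instead add | where user != "" after the stats, the ip should be populated, because the filter then runs on the already combined row.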

vsasdao · ‎08-21-2023

Thanks for your input. I've tried to take your idea further by also checking that the policy result associated with that particular APM session ID is Allow.

That is, I'd like to get statistics on the usernames and IP addresses associated with Allow (successful) authentications. However, the search query below doesn't work as expected, because the username from a failed APM authentication was also returned.

Aug 21 17:27:04 ::ffff:10.0.49.14 notice apmd[14365]: 01490005:5: /Common/Storefront_v243.app/Storefront_v243:Common:2a904a10: Following rule 'fallback' from item 'AD Authentication' to ending 'Deny'

Aug 21 17:27:04 ::ffff:10.0.49.14 notice apmd[14365]: 01490010:5: /Common/Storefront_v243.app/Storefront_v243:Common:2a904a10: Username 'vsasvospush'

index=bigip "Storefront_v243" 
| rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?):   \| Start \(fallback\) .* ending is: Allow"
| where un != "" 
| stats count latest(un) as user latest(ip) as ip by sid

vsasdao · ‎08-21-2023

The logs in question are generated by an external product, the F5 BigIP/APM module. For instance, this is the event my first search query operates on:

Aug 21 08:46:02 ::ffff:10.0.49.14 notice tmm1[21852]: 01490500:5: /Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: New session from client IP XX.XXX.88.248 (ST=Oslo/CC=NO/C=EU) at VIP 123.45.78.172 Listener /Common/Storefront_v243.app/Storefront_v243_webui_https (Reputation=Unknown)

host = ::ffff:10.0.49.14
source = /var/log/remote/f5_ltm/::ffff:10.0.49.14/2023/08/21/syslog
sourcetype = f5:bigip:apm:syslog

And this is the other BigIP/APM event, which my second search query operates on:

Aug 21 08:46:07 ::ffff:10.0.49.14 notice apmd[14365]: 01490010:5: /Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: Username 'john.smitch'

host = ::ffff:10.0.49.14
source = /var/log/remote/f5_ltm/::ffff:10.0.49.14/2023/08/21/syslog
sourcetype = f5:bigip:apm:syslog

As you can see, the APM session ID aa8ccd2c is the key joining these two BigIP/APM events.

ITWhisperer · ‎08-21-2023

Neither the syntax you have used nor the semantics you are attempting are possible in Splunk. By this I mean that even if you could pass the values retrieved by the first search into the second search (which you might be able to do with the map command, although I wouldn't recommend it), $ip$ would substitute a value where a field name is needed, and you haven't extracted that field in the second search.

Depending on your actual events, you may be able to do this another way. For example, is there only one ip address associated with each (unique) sid?

If so, you could "attach" the ip to every event with the same sid using eventstats, then count your events by sid, un, ip (although are you expecting this to be anything other than 1?)
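A minimal sketch of the eventstats idea, assuming one ip per sid. The two events are the samples posted in this thread, faked with makeresults; the client IP 192.0.2.10 is an invented placeholder, n is a helper for building the data, and session_ip is a field name chosen here only for illustration.

```spl
| makeresults count=2
| streamstats count as n
| eval _raw=case(
    n==1, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: New session from client IP 192.0.2.10 (ST=Oslo/CC=NO/C=EU) at VIP 123.45.78.172",
    n==2, "/Common/Storefront_v243.app/Storefront_v243:Common:aa8ccd2c: Username 'john.smitch'")
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| eventstats values(ip) as session_ip by sid
| where isnotnull(un)
| stats count by sid, un, session_ip
```

Unlike stats, eventstats keeps the individual events and just copies the aggregated value onto each of them, so the "Username" event should pick up the ip from the "New session" event with the same sid before the final count.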
