My first search, using rex, is as follows:
index=bigip "Storefront_v243" | rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
In my second search, I need to reference the two fields matched by the first search:
index=bigip "Storefront_v243" | rex "Common:$sid$: Username '(?<un>.*?)' | stats count as nrs by sid, un, $ip$ | dedup un $ip$
How can I combine these two search queries into one using a pipe?
Thanks a lot in advance!
Is the goal of the search to find the username and IP of each session?
If so, you should be able to do it all in one go.
Something like:
index=bigip "Storefront_v243"
| rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| stats count latest(un) as user latest(ip) as ip by sid
You can dedup by user and ip after that if needed.
One silly question: the results of index=bigip "Storefront_v243" will be piped into this rex:
rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
so its results will contain only the log lines matching this pattern:
".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
When those matched results are piped to the next rex:
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
it should NOT match anything, should it? IMO, all the log lines matching "Common:(?<sid>.*?): Username '(?<un>.*?)'" have already been filtered out by the first rex, right?
You may be confusing two similar but different commands.
Remember that SPL works by processing a pipeline of events.
The regex command filters events according to whether the specified field matches (or does not match) the regular expression. Only the events that satisfy the criteria are passed on to the next (and subsequent) command(s) in the SPL pipeline.
The rex command extracts information from the specified field (default _raw) into named field(s). It does not remove any events from the pipeline.
If you have two rex commands in your SPL pipeline, each one will process the events passed to it, so in your case, the first rex can extract the sid and ip, whereas the second can extract the sid and un.
In order to "join" these pieces of information into a single event, you can use the stats command with the by clause specifying the common field.
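As a rough illustration of the point above, here is a self-contained sketch that builds two made-up events with makeresults. The session ID reuses aa8ccd2c from this thread; the client IP 192.0.2.10, the text inside the parentheses and the username example_user are placeholders I invented, not real log data. Both rex commands see both events, each fills in only the fields it can match, and stats then folds them into one row per sid.

```
| makeresults count=2
| streamstats count as n
| eval _raw=if(n==1,
    "Common:aa8ccd2c: New session from client IP 192.0.2.10 (ST=example) at VIP 123.45.78.172",
    "Common:aa8ccd2c: Username 'example_user'")
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| stats count latest(un) as user latest(ip) as ip by sid
```

If this behaves as I expect, you get a single row for aa8ccd2c carrying both user and ip. Replacing the first rex with something like regex _raw="New session from client IP" would instead drop the Username event before the second rex ever saw it, which is the behaviour you were describing.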
index=bigip "Storefront_v243"
| rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?): \| Start \(fallback\) .* ending is: Allow"
| where un != ""
| stats count latest(un) as user latest(ip) as ip by sid
I've tried to collect the usernames and IP addresses associated with all APM sessions whose policy result is Allow. However, the search query above didn't work as expected; for instance, it listed a username from a Denied session.
Try this
index=bigip "Storefront_v243"
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?): \| Start \(fallback\) .* ending is: (?<ending>Allow)"
| where un != ""
| stats count latest(un) as user latest(ip) as ip latest(ending) as ending by sid
| where ending=="Allow"
Thanks for your suggestion! However, it doesn't generate the expected results.
I've also tweaked it a bit, as follows:
index=bigip "Storefront_v243"
| rex ".*Common:(?<sid1>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 146.72.254.172"
| rex "Common:(?<sid2>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid3>.*?): \| Start \(fallback\) .* ending is: Allow"
| where un != "" AND sid1=sid2 AND sid2=sid3
| stats count latest(un) as user latest(ip) as ip by sid1 sid3
It didn't work either.
Try this (note that Ending is now capitalised as this is how you have shown it in your sample data)
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?): \| Start \(fallback\) .* Ending is: (?<ending>Allow)"
| stats count latest(un) as user latest(ip) as ip latest(ending) as ending by sid
| where user != "" AND ending=="Allow"
It seems that your suggestion now works much better.
Can you explain in detail why the where clause should be put after the stats clause? When I reversed their order, it didn't work as expected.
The where command applies to all the events currently in the pipeline. Not all events have un matched and extracted from the event, so if you have the where command before the stats, you will remove the events which have the ip and ending fields extracted, so they won't be available to be gathered by the stats command.
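To make the ordering concrete, here is a minimal sketch using three made-up events for one session, built with makeresults. The IP 192.0.2.10, the username example_user and the filler text in the first and third events are placeholders I invented; only the parts that the rex patterns match follow the patterns used in this thread.

```
| makeresults count=3
| streamstats count as n
| eval _raw=case(
    n==1, "Common:aa8ccd2c: New session from client IP 192.0.2.10 (ST=example) at VIP 123.45.78.172",
    n==2, "Common:aa8ccd2c: Username 'example_user'",
    n==3, "Common:aa8ccd2c: | Start (fallback) example Ending is: Allow")
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?): \| Start \(fallback\) .* Ending is: (?<ending>Allow)"
| stats count latest(un) as user latest(ip) as ip latest(ending) as ending by sid
| where user != "" AND ending=="Allow"
```

With the where after the stats, all three events contribute to the row for aa8ccd2c. If you insert | where un != "" before the stats instead, only the Username event survives, so ip and ending are never gathered and the final ending=="Allow" test has nothing to match.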
Thanks for your input. I've tried to take your idea further by also checking that the policy result associated with that particular APM session ID is Allow.
That is, I'd like to get statistics on the usernames and IP addresses associated with Allow (successful) authentications. However, the following search query doesn't work as expected, because a username with a failed APM authentication was also returned.
index=bigip "Storefront_v243"
| rex ".*Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| rex "Common:(?<sid>.*?): \| Start \(fallback\) .* ending is: Allow"
| where un != ""
| stats count latest(un) as user latest(ip) as ip by sid
The logs in question are generated by an external product, the F5 BigIP/APM module. For instance,
This is the event my first search query operates on:
and the other event from BigIP/APM my second search query operates on:
As you can see, the APM session ID aa8ccd2c is the joining key between these two events from the BigIP/APM.
The syntax you have used, and indeed the semantics you are attempting, are not possible in Splunk. By this I mean that even if you could pass the values retrieved by the first search into the second search (which you might be able to do with the map command, although I wouldn't recommend it), $ip$ refers to a value where a field name is required, and you haven't extracted that field in the second search.
Depending on your actual events, you may be able to do this another way. For example, is there only one ip address associated with each (unique) sid?
If so, you could "attach" the ip to every event with the same sid using eventstats, then count your events by sid, un, ip (although are you expecting this to be anything other than 1?)
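A minimal sketch of that eventstats idea, assuming each sid really does have exactly one ip (the field name session_ip is just a name I picked for illustration):

```
index=bigip "Storefront_v243"
| rex "Common:(?<sid>.*?): New session from client IP (?<ip>.*?) \(ST.*\) at VIP 123.45.78.172"
| rex "Common:(?<sid>.*?): Username '(?<un>.*?)'"
| eventstats values(ip) as session_ip by sid
| where isnotnull(un)
| stats count as nrs by sid, un, session_ip
```

Here eventstats copies the ip onto every event sharing the same sid without removing any events, so the Username events still carry it when the where narrows the pipeline down to them. If a sid turns out to have more than one ip, session_ip becomes multivalued and the counts would need a second look.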