Splunk Search

Regular expression extraction

Kwip
Contributor

I want to extract new fields from the referer_domain field. I want to rex "google" into one field and "bing" into another; for example, google as Test and bing as Test1.

referer_domain
http://www.bing.com
http://www.buttercupgames.com
http://www.google.com
http://www.yahoo.com
https://www.google.com
https://www.bing.com

I tried the following, which is not working:

index=main sourcetype=access_combined_wcookie  
| rex field=referer_domain "(?<Test>google)"  
| rex field=referer_domain "(?<Test1>bing)"  
| stats count by Test Test1 host

I can see that a single extraction works as expected (the query below), but the double one does not. Is this not allowed in Splunk, or am I missing some syntax?

index=main sourcetype=access_combined_wcookie 
| rex field=referer_domain "(?<Test>google)" 
| stats count by Test host

DalJeanis
Legend

The problem isn't the extraction, it's the stats. stats ignores every record that has a null in any of the by fields.

To make that work, you would have to add a fillnull command before stats.

| fillnull value="" Test Test1
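
To see the effect without the original index, here is a minimal sketch that builds three sample events with makeresults. The field n and the three URLs are just illustrative stand-ins for the real data, and host is left out because makeresults events don't carry one. Every sample event has a null in Test or Test1, so removing the fillnull line should leave stats with nothing to return; with it, each URL gets its own row.

```spl
| makeresults count=3
| streamstats count as n
| eval referer_domain=case(n=1, "http://www.google.com", n=2, "https://www.bing.com", n=3, "http://www.yahoo.com")
| rex field=referer_domain "(?<Test>google)"
| rex field=referer_domain "(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1
```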

However, this is cleaner...

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?i)(?<Test>Google|Bing)"
| fillnull value="Other" Test
| stats count by Test host

Note - I made the test case-insensitive (?i) and capitalized the search engine names to make it pretty. Isn't that just precious?
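
One caveat on the capitalization: a capture group returns the text as it appears in the event, not as it is spelled in the pattern, so with lowercase URLs the values would still come out as google and bing. If the capitalized labels matter, one possible approach (a sketch, not part of the original answer) is to map them with eval afterwards. The sample events below are made up with makeresults, and true() acts as the catch-all, so the separate fillnull is not needed here.

```spl
| makeresults count=3
| streamstats count as n
| eval referer_domain=case(n=1, "http://www.google.com", n=2, "https://www.bing.com", n=3, "http://www.yahoo.com")
| rex field=referer_domain "(?i)(?<Test>google|bing)"
| eval Test=case(lower(Test)="google", "Google", lower(Test)="bing", "Bing", true(), "Other")
| stats count by Test
```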

Edited to mark keywords as code.

Kwip
Contributor

You always hit the target with one hundred percent accuracy, @DalJeanis. Below is what I was expecting. Thank you.

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<Test>google)"
| rex field=referer_domain "(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1 host

sbbadri
Motivator

Try this:

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<test1>google)|(?<test2>bing)"
| stats count by test1 test2 host

Kwip
Contributor

Thank you for the valuable comment, @sbbadri. Below is the one I am looking for:

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<Test>google)|(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1 host
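
For the sample data in this thread, the combined pattern and the two separate rex commands give the same result. They can differ when a value contains both words, because an alternation stops at the leftmost match and fills only one of the two groups. A small sketch to compare them side by side; the second URL and the field names Sep and Sep1 are invented for illustration. On that second event I would expect the alternation to fill only Test1, while the separate extractions fill both Sep and Sep1.

```spl
| makeresults count=2
| streamstats count as n
| eval referer_domain=case(n=1, "http://www.google.com", n=2, "http://www.bing.com/search?q=google")
| rex field=referer_domain "(?<Test>google)|(?<Test1>bing)"
| rex field=referer_domain "(?<Sep>google)"
| rex field=referer_domain "(?<Sep1>bing)"
| fillnull value="-" Test Test1 Sep Sep1
| table referer_domain Test Test1 Sep Sep1
```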