Splunk Search

Regular expression extraction

Kwip
Contributor

I want to do something like this: referer_domain is the field I want to extract from to create new fields. I want to rex google into one field and bing into another, for example google as Test and bing as Test1.

referer_domain
http://www.bing.com
http://www.buttercupgames.com
http://www.google.com
http://www.yahoo.com
https://www.google.com
https://www.bing.com

I tried the following, which is not working:

index=main sourcetype=access_combined_wcookie  
| rex field=referer_domain "(?<Test>google)"  
| rex field=referer_domain "(?<Test1>bing)"  
| stats count by Test Test1 host

A single extraction works as expected (the query below), but the double one does not. Is this not allowed in Splunk, or am I missing some syntax?

index=main sourcetype=access_combined_wcookie 
| rex field=referer_domain "(?<Test>google)" 
| stats count by Test host

DalJeanis
Legend

The problem isn't the extraction, it's the stats. stats ignores every record that has a null in any of the by fields.

To make that work, you would have to add a fillnull command before stats.

| fillnull value="" Test Test1

However, this is cleaner...

 index=main sourcetype=access_combined_wcookie  
| rex field=referer_domain "(?i)(?<Test>Google|Bing)"  
| fillnull value="Other" Test
| stats count by Test host

Note - I made the test case-insensitive with (?i) and capitalized the search engine names in the pattern to make it pretty. The captured value still keeps whatever case appears in the event. Isn't that just precious?
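
For anyone who wants to see the null behavior without the access_combined_wcookie data, here is a minimal sketch that rebuilds the six sample referer_domain values from the question with makeresults. The helper field n and the omission of host are assumptions made only for this illustration; they are not part of the original searches.

```spl
| makeresults count=6
| streamstats count AS n
| eval referer_domain=case(n==1, "http://www.bing.com", n==2, "http://www.buttercupgames.com", n==3, "http://www.google.com", n==4, "http://www.yahoo.com", n==5, "https://www.google.com", n==6, "https://www.bing.com")
| rex field=referer_domain "(?<Test>google)"
| rex field=referer_domain "(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1
```

With the fillnull line in place, this should return three rows (google/-, -/bing and -/-), each with a count of 2. Removing the fillnull line should return no rows at all, because no single event has both Test and Test1 populated.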

Edited to mark keywords as code.

Kwip
Contributor

You always hit the target with one hundred percent accuracy, @DalJeanis. Below is what I was expecting. Thank you.

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<Test>google)"
| rex field=referer_domain "(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1 host

sbbadri
Motivator

Try this:
index=main sourcetype=access_combined_wcookie | rex field=referer_domain "(?<test1>google)|(?<test2>bing)" | stats count by test1 test2 host

Kwip
Contributor

Thank you for the valuable comment, @sbbadri. Below is the one I am looking for:

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<Test>google)|(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1 host