Splunk Search

Regular expression extraction

Kwip
Contributor

I want to create new fields by extracting values from the referer_domain field. Specifically, I want to rex google into one field and bing into another, for example google as Test and bing as Test1.

referer_domain
http://www.bing.com
http://www.buttercupgames.com
http://www.google.com
http://www.yahoo.com
https://www.google.com
https://www.bing.com

I tried the following, which is not working:

index=main sourcetype=access_combined_wcookie  
| rex field=referer_domain "(?<Test>google)"  
| rex field=referer_domain "(?<Test1>bing)"  
| stats count by Test Test1 host

I can see that a single extraction works as expected (the query below), but the double one does not. Is this not allowed in Splunk, or am I missing some syntax?

index=main sourcetype=access_combined_wcookie 
| rex field=referer_domain "(?<Test>google)" 
| stats count by Test Test1 host

DalJeanis
Legend

The problem isn't the extraction, it's the stats. stats ignores all records that have nulls in any of the by fields.

To make that work, you would have to add a fillnull command before stats.

| fillnull value="" Test Test1
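A rough way to see this behaviour without touching the real index is to build the six sample referer_domain values with makeresults. This is only a sketch: the makeresults, split, and mvexpand setup is an assumption added for illustration and is not part of the original search.

```spl
| makeresults
| eval referer_domain=split("http://www.bing.com,http://www.buttercupgames.com,http://www.google.com,http://www.yahoo.com,https://www.google.com,https://www.bing.com", ",")
| mvexpand referer_domain
| rex field=referer_domain "(?<Test>google)"
| rex field=referer_domain "(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1
```

No single referer_domain value contains both google and bing, so every row has a null in at least one of the two by fields. Removing the fillnull line should therefore leave stats with nothing to count, while keeping it gives one row per combination of Test and Test1.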

However, this is cleaner...

 index=main sourcetype=access_combined_wcookie  
| rex field=referer_domain "(?i)(?<Test>Google|Bing)"  
| fillnull value="Other" Test
| stats count by Test host

Note - I made the test case-insensitive (?i) and capitalized the search engine names to make it pretty. Isn't that just precious?
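One caveat that may be worth checking: rex captures the text as it appears in the event, so with lowercase URLs the Test field would likely still hold google and bing rather than the capitalized spelling used in the pattern. If the capitalized labels matter, a sketch along these lines could normalize them. The eval line is an addition for illustration, not part of the original answer.

```spl
index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?i)(?<Test>google|bing)"
| eval Test=upper(substr(Test,1,1)).lower(substr(Test,2))
| fillnull value="Other" Test
| stats count by Test host
```

The eval uppercases the first character and lowercases the rest. Rows where rex found no match keep a null Test, which fillnull then sets to Other.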

Edited to mark keywords as code.

Kwip
Contributor

You always hit the target with one hundred percent accuracy, @DalJeanis. Below is what I was expecting. Thank you.

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<Test>google)"
| rex field=referer_domain "(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1 host

sbbadri
Motivator

Try this:

index=main sourcetype=access_combined_wcookie | rex field=referer_domain "(?<test1>google)|(?<test2>bing)" | stats count by test1 test2 host

Kwip
Contributor

Thank you for the valuable comment, @sbbadri. Below is the one I was looking for:

index=main sourcetype=access_combined_wcookie
| rex field=referer_domain "(?<Test>google)|(?<Test1>bing)"
| fillnull value="-" Test Test1
| stats count by Test Test1 host