Splunk Enterprise Security

zScaler "url" field lacks protocol and doesn't match threat lists in ES

stroud_bc
Path Finder

We use the zScaler proxy product and have it configured with NSS to collect logs in Splunk Enterprise. We also download the PhishTank URL watchlist into the Threat_Intelligence framework in Enterprise Security. We have a problem because the URL field in our zScaler logs is stored differently than the IoCs in the PhishTank list.

A zScaler log for an HTTPS connection might look like either of these:
`url=google.com/images
protocol=HTTPS

url=google.com/images
protocol=SSL`

while PhishTank would provide an IoC like this:
https://google.com/images

Glancing at the Web data model, it seems like the expectation is that the URL field includes the protocol, so it seems like the logs are what need to be fixed, not the threat list (see hxxps://docs.splunk[.]com/Documentation/CIM/4.14.0/User/Web)

At first glance, it seems like the easy fix would just be to alias the "url" field to have the "protocol" field in front of it, ( url_new=protocol."://".url_old or similar) but in cases where the protocol field is SSL, it needs to be slightly more complicated. I could write this as an |eval statement, but I'm not exactly sure where to put it to put it into effect. Would this be a field extraction?

In any case, I need to conditionally add "https://" to the url field for SSL or HTTPS, while adding "http://" in cases where the protocol field is HTTP.

Many thanks!

0 Karma
1 Solution

stroud_bc
Path Finder

Ended up altering the threat_gen search for URL matches to make the url field an mvfield with the three most likely url protocol prefixes. The only line I added to the stock code was | eval url=mvappend("http://".url, "https://".url, "ftp://".url), shown in line below. This allows the lookups and the domain extraction to function properly, and the analyst is able to review the original log to see the actual protocol used. If you have Zscaler and need a cleaner solution, you can probably add "Web.transport" to the |tstats ... by clause and actually build the correct value with an eval case statement, but I went with the mvfield option for its simplicity.

| `tstats` values(sourcetype) as sourcetype,values(Web.src),values(Web.dest) from datamodel=Web.Web by Web.http_referrer 
| eval url='Web.http_referrer' 
| eval threat_match_field="http_referrer" 
| `tstats` append=true values(sourcetype) as sourcetype,values(Web.src),values(Web.dest) from datamodel=Web.Web by Web.url 
| eval url=if(isnull(url),'Web.url',url) 
| eval threat_match_field=if(isnull(threat_match_field),"url",threat_match_field)
| stats values(sourcetype) as sourcetype,values(Web.src) as src,values(Web.dest) as dest by url,threat_match_field 
| eval url=mvappend("http://".url, "https://".url, "ftp://".url)
| extract domain_from_url 
| `threatintel_url_lookup(url)` 
| `threatintel_domain_lookup(url_domain)` 
| search threat_collection_key=* 
| `mvtruncate(src)` 
| `mvtruncate(dest)` 
| `zipexpand_threat_matches`

View solution in original post

0 Karma

stroud_bc
Path Finder

Ended up altering the threat_gen search for URL matches to make the url field an mvfield with the three most likely url protocol prefixes. The only line I added to the stock code was | eval url=mvappend("http://".url, "https://".url, "ftp://".url), shown in line below. This allows the lookups and the domain extraction to function properly, and the analyst is able to review the original log to see the actual protocol used. If you have Zscaler and need a cleaner solution, you can probably add "Web.transport" to the |tstats ... by clause and actually build the correct value with an eval case statement, but I went with the mvfield option for its simplicity.

| `tstats` values(sourcetype) as sourcetype,values(Web.src),values(Web.dest) from datamodel=Web.Web by Web.http_referrer 
| eval url='Web.http_referrer' 
| eval threat_match_field="http_referrer" 
| `tstats` append=true values(sourcetype) as sourcetype,values(Web.src),values(Web.dest) from datamodel=Web.Web by Web.url 
| eval url=if(isnull(url),'Web.url',url) 
| eval threat_match_field=if(isnull(threat_match_field),"url",threat_match_field)
| stats values(sourcetype) as sourcetype,values(Web.src) as src,values(Web.dest) as dest by url,threat_match_field 
| eval url=mvappend("http://".url, "https://".url, "ftp://".url)
| extract domain_from_url 
| `threatintel_url_lookup(url)` 
| `threatintel_domain_lookup(url_domain)` 
| search threat_collection_key=* 
| `mvtruncate(src)` 
| `mvtruncate(dest)` 
| `zipexpand_threat_matches`
0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.


Introducing Unified TDIR with the New Enterprise Security 8.2

Read the blog
Get Updates on the Splunk Community!

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...

Data Persistence in the OpenTelemetry Collector

This blog post is part of an ongoing series on OpenTelemetry. What happens if the OpenTelemetry collector ...

Thanks for the Memories! Splunk University, .conf25, and our Community

Thank you to everyone in the Splunk Community who joined us for .conf25, which kicked off with our iconic ...