Hi all,
We are running the latest version of URL Toolbox (at the time of writing, 1.9.1, released in Dec 2021) on Splunk 8.2.3 with Splunk ES 6.6.2. After the upgrade, we noticed that the mozilla list no longer works properly. To test it:
| makeresults
| eval domain="http://www.example.com/123/123.php",list="mozilla"
| `ut_parse_extended(domain,list)`
Gives:
domain | list | ut_domain | ut_domain_without_tld | ut_fragment | ut_netloc | ut_params | ut_path | ut_port | ut_query | ut_scheme | ut_subdomain | ut_subdomain_count | ut_tld |
http://www.example.com/123/123.php | mozilla | None | None | None | www.example.com | None | /123/123.php | 80 | None | http | None | 0 | None |
With iana there are no problems at all (even though the parsing is a bit different, and mozilla would be the ideal one for our use cases):
| makeresults
| eval domain="http://www.example.com/123/123.php",list="iana"
| `ut_parse_extended(domain,list)`
domain | list | ut_domain | ut_domain_without_tld | ut_fragment | ut_netloc | ut_params | ut_path | ut_port | ut_query | ut_scheme | ut_subdomain | ut_subdomain_count | ut_subdomain_level_1 | ut_tld |
http://www.example.com/123/123.php | iana | example.com | example | None | www.example.com | None | /123/123.php | 80 | None | http | www | 1 | www | com |
Is anyone having the same issue and/or a fix that we might apply?
Thank you and cheers!
Version 1.9.2 has just been made available on Splunkbase: https://splunkbase.splunk.com/app/2734/
It has been addressed here: https://github.com/splunk/utbox/pull/2
This version has a fix in place for the Mozilla list. The others were working fine, but the conversion to Python3 seemed to have impacted the loading of this specific suffix list.
We had a similar issue on our Splunk Cloud instance, which seems to be resolved by setting list="mozila". The typo is on purpose; the latest build of the app on our instance seems to have a typo in the dat file.
I have noticed, though, that the Mozilla list isn't parsing .co.uk domains correctly (it sets the TLD to just .uk).
Just a quick comment on that suggestion. So I think if you're deliberately misspelling "mozilla" you're actually using the fallback which would be the IANA list.
That would explain why the .co.uk domains aren't being parsed correctly because afaik the IANA list only has .uk as a TLD.
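To illustrate why the choice of list changes the result, here is a minimal Python sketch of longest-suffix matching. This is not URL Toolbox's actual code: the two trimmed suffix sets and the helper name split_tld are made up for illustration, and the real Mozilla list also has wildcard and exception rules that are ignored here.

```python
# Hypothetical, heavily trimmed suffix sets; the real lists are far larger.
IANA_LIKE = {"com", "uk"}               # top-level domains only
MOZILLA_LIKE = {"com", "uk", "co.uk"}   # public suffixes, including multi-label ones


def split_tld(host, suffixes):
    """Return (domain, tld) using the longest suffix of host found in suffixes."""
    labels = host.lower().split(".")
    # Walk from the longest candidate suffix to the shortest,
    # never treating the whole host as a suffix.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            # The domain is the matched suffix plus one label to its left.
            return ".".join(labels[i - 1:]), candidate
    return None, None


for name, suffixes in (("iana-like", IANA_LIKE), ("mozilla-like", MOZILLA_LIKE)):
    print(name, split_tld("www.example.co.uk", suffixes))
```

With the TLD-only set the longest match for www.example.co.uk is "uk", so the domain comes out as "co.uk". With the set that also contains "co.uk" the domain comes out as "example.co.uk", which is what you would expect from the mozilla list.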
Tested for the last two hours. Experiencing the same here with v1.9.1 from Splunkbase, and also with the latest from GitHub. The issue did not exist in v1.8, which is a little strange, as there was only a small code change from 1.8 to 1.9.1.