Splunk Enterprise Security

After running a search which uses a lookup file to whitelist certain domains, I'm receiving the following error: "Regex: UTF-8 error: byte 2 top bits not 0x80"

I am running a search that matches specific domain names and uses a lookup file to whitelist certain domains. The search fails with the error: Regex: UTF-8 error: byte 2 top bits not 0x80

My search is as follows:

| tstats `summariesonly` values(Web.dest) as domain min(_time) as firstTime from datamodel=Web by Web.src 
| `drop_dm_object_name("Web")` 
| `ctime(firstTime)`
| mvexpand domain
| search NOT (domain="exampledomain.com")
| rex field=domain "www\.(?<domain>\S+)"
| search NOT 
    [| inputlookup whitelisted_domains.csv 
    | rename "Domain Name" as domain
    | fields domain]

I believe the issue is with the lookup file itself. The search works if I extract other fields from the lookup, but not the Domain Name field. Are there limitations on which characters can be included in a lookup like this? Is there something else that may be causing this issue?

1 Solution

I was able to fix this by re-encoding the lookup file as UTF-8:

  • Export the lookup table from Splunk.
  • Open it in Notepad++ and select Encoding > Encode in UTF-8, then save.
  • Upload the re-encoded lookup to Splunk in place of the original.
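
For anyone who would rather script the check than use Notepad++, here is a minimal Python sketch of the same idea. The error message suggests the lookup contains bytes that are not valid UTF-8, so the script first reports which lines fail to decode and then re-encodes the file. It assumes the exported file was originally saved as Windows-1252 (`cp1252`); that is only a guess, so adjust `source_encoding` to whatever actually produced your file. The function names and the output file name in the usage note are hypothetical.

```python
import sys
from pathlib import Path


def find_invalid_utf8_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines whose bytes are not valid UTF-8."""
    bad_lines = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append(lineno)
    return bad_lines


def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the data as UTF-8 bytes.

    If the input is already valid UTF-8 it is returned unchanged. Otherwise it
    is decoded with source_encoding (an assumption, cp1252 by default) and
    encoded again as UTF-8.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python fix_lookup.py <exported lookup> <output file>
    src, dst = Path(sys.argv[1]), Path(sys.argv[2])
    raw = src.read_bytes()
    print("Lines with invalid UTF-8:", find_invalid_utf8_lines(raw))
    dst.write_bytes(reencode_to_utf8(raw))
```

Run it as, for example, `python fix_lookup.py whitelisted_domains.csv whitelisted_domains_utf8.csv`, then upload the output file to Splunk. The reported line numbers point to the rows of the Domain Name column worth inspecting by hand, since a wrong guess at the source encoding produces valid UTF-8 with the wrong characters.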
