Hello,
Since upgrading to Splunk Enterprise 8.0.5, I've noticed that NLP Text Analytics searches freeze when they encounter accented characters, as well as some additional characters such as:
I am certain there are more, but I just want to know how to make them compatible with the NLP Text Analytics searches. I did not have this problem with Splunk 7.3.2 running Python 2.7.x.
I am loading my data from lookups, but the same thing happens when the data comes from an index. I tried creating and recreating the lookup with various methods to make sure it is UTF-8 encoded, but I could not resolve it. If I put one of the characters mentioned above into the pride_prejudice sample CSV files, it breaks those as well (to try it, use the field "sentence" and the search "| inputlookup pride_prejudice.csv | head 1" on the Counts dashboard).
I have the following components installed:
Does anybody know how to solve this?
Thanks!
Andrew
I figured it out: the issue was with the cleantext custom command packaged with NLP Text Analytics. This command seems to work only with Python 2. I suspected this and added python.version = python2 to the cleantext stanza in local/commands.conf. After restarting, nothing changed.
After further investigation, it seems Splunk 8.0.x ships with both Python 3 and Python 2, with python.version in default/server.conf set to python2. In my case, python.version was set to force_python3 in local/server.conf, which means that any python.version set elsewhere (such as in local/commands.conf for the cleantext command) is ignored. I changed that value to python3, which still allows per-command overrides, restarted, and everything started working.
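For anyone who wants to see the two settings side by side, here is a rough sketch of what ended up working. The [general] stanza name and the exact directories are assumptions on my part (they depend on where your local overrides live), so treat the comments as placeholders rather than exact paths:

```
# local/server.conf (assumed to sit under the [general] stanza)
[general]
# python3 keeps Python 3 as the default but still honors per-command overrides.
# force_python3 would ignore the commands.conf setting below.
python.version = python3

# local/commands.conf in the NLP Text Analytics app
[cleantext]
# Run only this custom command under Python 2.
python.version = python2
```

A restart is needed after changing either file.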
So I think NLP Text Analytics assumes that users leave python.version in default/server.conf as python2. In my case that value was overridden in local/server.conf, which screwed everything up. This might be written in the documentation somewhere, but I'm not going to lie: I didn't even check.
I hope this may clarify some things for others!