Why does NLP Text Analytics on Splunk 8.0.5 (Python 3.7.4) freeze searches when encountering accented characters?

andrew_f_trobec
Explorer

Hello,

Since upgrading to Splunk Enterprise 8.0.5, I've noticed that NLP Text Analytics searches freeze when they encounter accented characters and some other non-ASCII characters, such as:

  • à
  • – (long dash)

I am certain there are more, but I just want to know how to make them compatible with the NLP Text Analytics searches.  I did not have this problem with Splunk 7.3.2 running Python 2.7.x.

I am using lookups to load my data, but the same thing happens when the data comes from an index. I tried creating and recreating the lookup with various methods to ensure that it is UTF-8 encoded, but I could not resolve it. If I put one of the characters mentioned above into the pride_prejudice sample CSV file, it breaks that as well (to try it, use field "sentence" and search "| inputlookup pride_prejudice.csv | head 1" on the Counts dashboard).

I have the following components installed:

  • nlp-text-analytics - 1.1.0
  • Splunk_SA_Scientific_Python_linux_x86_64 - 2.0.2
  • Splunk_ML_Toolkit - 5.2.0

Does anybody know how to solve this?

Thanks!

Andrew

1 Solution

andrew_f_trobec
Explorer

I figured it out: the issue was with the cleantext custom command packaged with NLP Text Analytics. This command seems to work only with Python 2. I suspected this and added python.version = python2 to the cleantext stanza in local/commands.conf, but after restarting nothing changed.

After further investigation, it seems Splunk 8.0.x ships with both Python 3 and Python 2, with python.version in default/server.conf set to python2. In my case, python.version was set to force_python3 in local/server.conf, which means that a python.version set anywhere else (such as in local/commands.conf for the cleantext command) is ignored. I changed that value to python3, which still lets individual commands override it, restarted, and everything started working.
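For anyone who wants to check their own setup, here is a minimal Python sketch of the precedence described in this post. It is only a model of the behaviour I observed, not Splunk's actual resolution logic: the function names read_setting and effective_python are made up for illustration, I am assuming the server.conf setting lives in the [general] stanza, and the sample conf text is inline rather than read from a real $SPLUNK_HOME.

```python
import configparser


def read_setting(conf_text, stanza, key):
    """Return the value of `key` in `stanza`, or None if it is not set."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if parser.has_option(stanza, key):
        return parser.get(stanza, key).strip()
    return None


def effective_python(server_conf_text, commands_conf_text, command):
    """Model which interpreter a custom command would run under.

    Hypothetical helper: it encodes only the rule described in this post,
    namely that force_python3 in server.conf wins over any per-command
    python.version in commands.conf.
    """
    global_version = read_setting(server_conf_text, "general", "python.version")
    command_version = read_setting(commands_conf_text, command, "python.version")

    if global_version == "force_python3":
        # The per-command setting is ignored entirely.
        return "python3"
    if command_version in ("python2", "python3"):
        # The per-command setting is honoured.
        return command_version
    # Fall back to the global value; python2 is the 8.0.x default per this post.
    return global_version or "python2"


# Sample settings mirroring the situation in this thread.
commands_conf = "[cleantext]\npython.version = python2\n"
server_forced = "[general]\npython.version = force_python3\n"
server_py3 = "[general]\npython.version = python3\n"

print(effective_python(server_forced, commands_conf, "cleantext"))  # python3
print(effective_python(server_py3, commands_conf, "cleantext"))     # python2
```

The first call corresponds to my broken configuration (cleantext is forced onto Python 3), the second to the working one after changing local/server.conf to python3.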

So I think NLP Text Analytics assumes that users leave the python.version value in default/server.conf as python2.  In my case that value was updated in local/server.conf which screwed everything up.  This might be written in the documentation somewhere, but I'm not going to lie: I didn't even check it...

I hope this may clarify some things for others!
