
Why does NLP Text Analytics on Splunk 8.0.5 (Python 3.7.4) freeze searches when encountering accented characters?

andrew_f_trobec
Explorer

Hello,

Since upgrading to Splunk Enterprise 8.0.5, I've noticed that NLP Text Analytics searches freeze when they encounter accented characters and certain other characters, such as:

  • à
  • – (long dash)

I am certain there are more, but I just want to know how to make them compatible with the NLP Text Analytics searches.  I did not have this problem with Splunk 7.3.2 running Python 2.7.x.

I am loading my data from lookups, but the same thing happens when the data comes from an index. I tried creating and recreating the lookup with various methods to ensure that it is UTF-8 encoded, but I could not resolve the problem. Putting one of the characters mentioned above into the pride_prejudice sample CSV file breaks that as well (to try it, use field "sentence" and search "| inputlookup pride_prejudice.csv | head 1" on the Counts dashboard).
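
One way to rule out the lookup file's encoding is to check it outside Splunk. The following is a minimal sketch, not part of NLP Text Analytics; the function name find_non_ascii and the sample data are hypothetical and only stand in for a real lookup file:

```python
import csv
import io


def find_non_ascii(raw_bytes, field):
    """Decode bytes as UTF-8 and list non-ASCII characters found in one CSV field.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    text = raw_bytes.decode("utf-8")  # strict decoding: fails loudly on bad bytes
    hits = []
    for row_number, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        for ch in row.get(field) or "":
            if ord(ch) > 127:
                # Record the data row, the character, and its code point
                hits.append((row_number, ch, "U+%04X" % ord(ch)))
    return hits


# Hypothetical sample standing in for a lookup file such as pride_prejudice.csv
sample = "sentence\nVoil\u00e0 \u2013 it is a truth universally acknowledged\n".encode("utf-8")
for hit in find_non_ascii(sample, "sentence"):
    print(hit)
```

To check a real file, pass the result of open(path, "rb").read() instead of the sample. If the file decodes cleanly and the reported characters are the expected ones, the encoding itself is probably not the cause.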

I have the following components installed:

  • nlp-text-analytics - 1.1.0
  • Splunk_SA_Scientific_Python_linux_x86_64 - 2.0.2
  • Splunk_ML_Toolkit - 5.2.0

Does anybody know how to solve this?

Thanks!

Andrew

1 Solution

andrew_f_trobec
Explorer

I figured it out: the issue was with the cleantext custom command packaged with NLP Text Analytics. This command seems to work only with Python 2. I suspected this and added python.version = python2 to the cleantext stanza in local/commands.conf. After restarting, nothing changed.

After further investigation, it seems Splunk 8.0.x comes packaged with both Python 3 and Python 2, with python.version in default/server.conf set to python2. In my case the value was set to force_python3 in local/server.conf, which means that python.version set anywhere else (such as in local/commands.conf for the cleantext command) is ignored. I changed that value to python3, restarted, and everything started working.
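
The precedence described above can be sketched roughly as follows. This is a simplified model based only on the behavior I observed, not Splunk's actual resolution logic, and the function name effective_python is hypothetical:

```python
def effective_python(server_value, command_value=None):
    """Model which interpreter a custom command ends up running under.

    server_value  -- python.version from server.conf
    command_value -- python.version from the command's stanza in
                     commands.conf (None if the stanza does not set it)
    """
    if server_value == "force_python3":
        # Per-command settings are ignored when Python 3 is forced globally.
        return "python3"
    if command_value in ("python2", "python3"):
        # Otherwise an explicit per-command setting wins.
        return command_value
    # No per-command override: fall back to the server-wide setting.
    return server_value


# My broken setup: force_python3 in local/server.conf, python2 on cleantext
print(effective_python("force_python3", "python2"))
# My working setup: python3 in local/server.conf, python2 on cleantext
print(effective_python("python3", "python2"))
```

Under this model the first call yields python3 (the override on cleantext is ignored) and the second yields python2, which matches what I saw before and after the change.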

So I think NLP Text Analytics assumes that users leave the python.version value in default/server.conf as python2.  In my case that value was updated in local/server.conf which screwed everything up.  This might be written in the documentation somewhere, but I'm not going to lie: I didn't even check it...

I hope this may clarify some things for others!
