<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Splunk Python SDK - Causing HTTP 503 (HTTP Too Many Threads) and Socket Errno=110 in Splunk Dev</title>
    <link>https://community.splunk.com/t5/Splunk-Dev/Splunk-Python-SDK-Causing-HTTP-503-HTTP-Too-Many-Threads-and/m-p/372375#M6094</link>
    <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;"heavy forwarder to collect search results from a Splunk Enterprise server, writing them to file, monitoring the file and forwarding to Splunk Cloud."&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;This seems like complete overengineering. We might be able to solve the same problem with native Splunk features rather than creating a new script, with all the complexity and points of failure that entails.&lt;/P&gt;

&lt;P&gt;Let's make sure we understand the problem correctly: are you trying to collect data from an on-prem Splunk instance and selectively forward it to Splunk Cloud?&lt;/P&gt;

&lt;P&gt;Are you searching for &lt;CODE&gt;Org=pitneybowes AND Env=prod AND EndpointName= AND responseStatus=&lt;/CODE&gt; over the minute ending one minute ago?&lt;BR /&gt;
The search itself is interesting to me for the following reasons:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Is &lt;CODE&gt;Org&lt;/CODE&gt; actually a field, with that leading capital &lt;CODE&gt;O&lt;/CODE&gt;?&lt;/LI&gt;
&lt;LI&gt;Is this data always from a particular index?&lt;/LI&gt;
&lt;LI&gt;Are these fields part of the log payload or have you added fields at search time?&lt;/LI&gt;
&lt;LI&gt;Why the explicit &lt;CODE&gt;AND&lt;/CODE&gt; between terms, given that terms in a search are implicitly ANDed together?&lt;/LI&gt;
&lt;LI&gt;Are you looking for events where the literal text &lt;CODE&gt;EndpointName=&lt;/CODE&gt; and &lt;CODE&gt;responseStatus=&lt;/CODE&gt; appears? I ask because you may have intended the search to be &lt;CODE&gt;responseStatus=* EndpointName=*&lt;/CODE&gt;, which matches only events where those fields have a value.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;Anyway, the way I would consider setting this up (from least to most complex) is to:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Have the forwarders send data to both on prem AND cloud  &lt;A href="https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Data_cloning"&gt;https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Data_cloning&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Have the indexers on prem index AND forward &lt;A href="https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Global_stanza"&gt;https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Global_stanza&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Have the Heavy Forwarder run the search as a Summary Indexing activity but set up to just forward that index to Cloud &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usesummaryindexing"&gt;https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usesummaryindexing&lt;/A&gt;&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Let me know if there are features of the SDK you think are needed, and why. I'll use that information to see whether there are still out-of-the-box solutions to your problem that avoid creating a new script.&lt;/P&gt;</description>
    <pubDate>Fri, 06 Oct 2017 22:09:12 GMT</pubDate>
    <dc:creator>sloshburch</dc:creator>
    <dc:date>2017-10-06T22:09:12Z</dc:date>
    <item>
      <title>Splunk Python SDK - Causing HTTP 503 (HTTP Too Many Threads) and Socket Errno=110</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Splunk-Python-SDK-Causing-HTTP-503-HTTP-Too-Many-Threads-and/m-p/372374#M6093</link>
      <description>&lt;P&gt;Suggestions for improvement to the Python SDK script implementation are being requested. Would modifying the EXEC_MODE or OUTPUT_MODE to another value help?&lt;/P&gt;

&lt;P&gt;I am using a script (search.py) from the examples directory of the Splunk Python SDK (splunk-sdk-python-1.6.2) on a heavy forwarder to collect search results from a Splunk Enterprise server, write them to a file, monitor that file, and forward its contents to Splunk Cloud.&lt;/P&gt;

&lt;P&gt;I've wrapped the search.py script in a Bash shell script that executes from the splunk user's crontab every minute, somewhat successfully. Initially, data is collected and everything appears to work fine. However, after a few minutes, I start receiving HTTP Error 503 (too many HTTP threads) and socket timeout errors (errno 110).&lt;/P&gt;

&lt;P&gt;Eventually, the host's memory utilization climbs so high that the host is no longer reachable and needs to be rebooted. I can see a variety of processes spawned, such as kthreadd, ksoftirqd/0, and kworker/0:0H.&lt;/P&gt;

&lt;P&gt;I know that repeated execution every minute is a lot, and I am working with the requestors to change that requirement. I have also asked them to consider forwarding the data directly to Splunk Cloud. In the meantime, I am trying to get a stable implementation working.&lt;/P&gt;
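&lt;P&gt;One thing that may help stability: the 503 response body explicitly says "try again later," so a retry with exponential backoff around the connect call would absorb transient thread-pool spikes instead of failing every minute. A minimal generic sketch (the helper name is my own, not part of splunklib; it only assumes the wrapped call raises on failure):&lt;/P&gt;

```python
import time

def retry_with_backoff(fn, attempts=4, base_delay=1.0, retry_on=(Exception,)):
    """Call fn(); on failure, retry with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage around the SDK call in search.py:
#   service = retry_with_backoff(lambda: client.connect(**kwargs_splunk))
```

&lt;P&gt;Catching only the SDK's HTTPError (rather than Exception) would be the tighter choice in search.py itself.&lt;/P&gt;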

&lt;P&gt;&lt;STRONG&gt;&lt;EM&gt;The Bash wrapper:&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Modify this file if you need to change PYTHONPATH, host, port, username or password

SCRIPT_HOME=/opt/splunk/etc/apps/gcs-shippingapi/bin
source $SCRIPT_HOME/gcs-shippingapi-hostcred.cfg

# The time string can be a UTC time (with fractional seconds), a relative
# time specifier (relative to now), or a formatted time string.

EARLIEST='-2m@m'
LATEST='-1m@m'

# Execution mode valid values: (blocking | oneshot | normal); default=normal
# Refer to the following for more information: http://dev.splunk.com/view/python-sdk/SP-CAAAEE5
EXEC_MODE='oneshot'

# Output mode valid values: (atom | csv | json | json_cols | json_rows | raw | xml); default=xml
OUTPUT_MODE='raw'

SEARCH='search Org=pitneybowes AND Env=prod AND EndpointName= AND responseStatus='

/opt/splunk/bin/python $SCRIPT_HOME/search.py "$SEARCH" \
    --host=$SPLUNK_HOST --port=$PORT \
    --username=$SPLUNK_USERNAME --password=$SPLUNK_PASSWORD \
    --output_mode=$OUTPUT_MODE \
    --earliest_time=$EARLIEST --latest_time=$LATEST
&lt;/CODE&gt;&lt;/PRE&gt;
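&lt;P&gt;A likely contributor to the thread pile-up is overlapping cron runs: once one invocation takes longer than a minute, later invocations stack up, each holding its own connection to splunkd, which is consistent with the thread count climbing to 628. As a sketch (not part of the original wrapper; the lock file path is hypothetical), the wrapper could guard itself with flock(1) so at most one copy runs at a time:&lt;/P&gt;

```shell
#!/bin/bash
LOCKFILE=/tmp/gcs-shippingapi.lock   # hypothetical path

# Open the lock file on fd 9 and try to take an exclusive lock without waiting.
exec 9>"$LOCKFILE"
if ! flock -n 9; then
    echo "previous run still active, skipping this cycle"
    exit 0
fi

# ... run search.py here exactly as in the wrapper, lock held until exit ...
```

&lt;P&gt;Skipping a cycle loses one minute of polling, but the -2m@m/-1m@m window could be widened on the next run if completeness matters.&lt;/P&gt;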

&lt;P&gt;&lt;STRONG&gt;&lt;EM&gt;Cron Error Message #1:&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Traceback (most recent call last):
  File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 115, in 
    main(sys.argv[1:])
  File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 72, in main
    service = client.connect(**kwargs_splunk)
  File "/opt/splunk-sdk-python-1.6.2/splunklib/client.py", line 321, in connect
    s.login()
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 857, in login
    cookie="1") # In Splunk 6.2+, passing "cookie=1" will return the "set-cookie" header
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1201, in post
    return self.request(url, message)
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1221, in request
    raise HTTPError(response)
splunklib.binding.HTTPError: HTTP 503 Too many HTTP threads (628) already running, try again later

Too many HTTP threads (628) already running, try again later
The server can not presently handle the given request.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;&lt;EM&gt;Cron Error Message #2:&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Traceback (most recent call last):
  File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 115, in 
    main(sys.argv[1:])
  File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 72, in main
    service = client.connect(**kwargs_splunk)
  File "/opt/splunk-sdk-python-1.6.2/splunklib/client.py", line 321, in connect
    s.login()
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 857, in login
    cookie="1") # In Splunk 6.2+, passing "cookie=1" will return the "set-cookie" header
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1201, in post
    return self.request(url, message)
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1218, in request
    response = self.handler(url, message, **kwargs)
  File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1357, in request
    connection.request(method, path, body, head)
  File "/opt/splunk/lib/python2.7/httplib.py", line 1042, in request
    self._send_request(method, url, body, headers)
  File "/opt/splunk/lib/python2.7/httplib.py", line 1082, in _send_request
    self.endheaders(body)
  File "/opt/splunk/lib/python2.7/httplib.py", line 1038, in endheaders
    self._send_output(message_body)
  File "/opt/splunk/lib/python2.7/httplib.py", line 882, in _send_output
    self.send(msg)
  File "/opt/splunk/lib/python2.7/httplib.py", line 844, in send
      self.connect()
  File "/opt/splunk/lib/python2.7/httplib.py", line 1255, in connect
    HTTPConnection.connect(self)
  File "/opt/splunk/lib/python2.7/httplib.py", line 821, in connect
    self.timeout, self.source_address)
  File "/opt/splunk/lib/python2.7/socket.py", line 575, in create_connection
    raise err
socket.error: [Errno 110] Connection timed out
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Sep 2020 16:04:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Splunk-Python-SDK-Causing-HTTP-503-HTTP-Too-Many-Threads-and/m-p/372374#M6093</guid>
      <dc:creator>chrismmckenna</dc:creator>
      <dc:date>2020-09-29T16:04:13Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Python SDK - Causing HTTP 503 (HTTP Too Many Threads) and Socket Errno=110</title>
      <link>https://community.splunk.com/t5/Splunk-Dev/Splunk-Python-SDK-Causing-HTTP-503-HTTP-Too-Many-Threads-and/m-p/372375#M6094</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;"heavy forwarder to collect search results from a Splunk Enterprise server, writing them to file, monitoring the file and forwarding to Splunk Cloud."&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;This seems like complete overengineering. We might be able to solve the same problem with native Splunk features rather than creating a new script, with all the complexity and points of failure that entails.&lt;/P&gt;

&lt;P&gt;Let's make sure we understand the problem correctly: are you trying to collect data from an on-prem Splunk instance and selectively forward it to Splunk Cloud?&lt;/P&gt;

&lt;P&gt;Are you searching for &lt;CODE&gt;Org=pitneybowes AND Env=prod AND EndpointName= AND responseStatus=&lt;/CODE&gt; over the minute ending one minute ago?&lt;BR /&gt;
The search itself is interesting to me for the following reasons:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Is &lt;CODE&gt;Org&lt;/CODE&gt; actually a field, with that leading capital &lt;CODE&gt;O&lt;/CODE&gt;?&lt;/LI&gt;
&lt;LI&gt;Is this data always from a particular index?&lt;/LI&gt;
&lt;LI&gt;Are these fields part of the log payload or have you added fields at search time?&lt;/LI&gt;
&lt;LI&gt;Why the explicit &lt;CODE&gt;AND&lt;/CODE&gt; between terms, given that terms in a search are implicitly ANDed together?&lt;/LI&gt;
&lt;LI&gt;Are you looking for events where the literal text &lt;CODE&gt;EndpointName=&lt;/CODE&gt; and &lt;CODE&gt;responseStatus=&lt;/CODE&gt; appears? I ask because you may have intended the search to be &lt;CODE&gt;responseStatus=* EndpointName=*&lt;/CODE&gt;, which matches only events where those fields have a value.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;Anyway, the way I would consider setting this up (from least to most complex) is to:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Have the forwarders send data to both on prem AND cloud  &lt;A href="https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Data_cloning"&gt;https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Data_cloning&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Have the indexers on prem index AND forward &lt;A href="https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Global_stanza"&gt;https://docs.splunk.com/Documentation/SplunkCloud/latest/Forwarding/Configureforwarderswithoutputs.confd#Global_stanza&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;Have the Heavy Forwarder run the search as a Summary Indexing activity but set up to just forward that index to Cloud &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usesummaryindexing"&gt;https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usesummaryindexing&lt;/A&gt;&lt;/LI&gt;
&lt;/OL&gt;
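&lt;P&gt;For option 1, data cloning amounts to listing more than one target group in outputs.conf on the forwarders; data sent to multiple groups in &lt;CODE&gt;defaultGroup&lt;/CODE&gt; is cloned to each. A minimal sketch with hypothetical group names and hosts (in practice the Splunk Cloud credentials app supplies the cloud stanza, including its SSL settings):&lt;/P&gt;

```ini
# outputs.conf on the forwarders (hypothetical names and hosts)
[tcpout]
defaultGroup = onprem_indexers, splunkcloud

[tcpout:onprem_indexers]
server = idx1.example.com:9997

[tcpout:splunkcloud]
server = inputs.example.splunkcloud.com:9997
```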

&lt;P&gt;Let me know if there are features of the SDK you think are needed, and why. I'll use that information to see whether there are still out-of-the-box solutions to your problem that avoid creating a new script.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Oct 2017 22:09:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Dev/Splunk-Python-SDK-Causing-HTTP-503-HTTP-Too-Many-Threads-and/m-p/372375#M6094</guid>
      <dc:creator>sloshburch</dc:creator>
      <dc:date>2017-10-06T22:09:12Z</dc:date>
    </item>
  </channel>
</rss>

