Developing for Splunk Enterprise

Custom streaming search command error

GindiKhangura
Explorer

Created a custom streaming command that concatenates an event's fields and field values into one field (since the events we're dealing with have an unpredictable set of fields, I couldn't figure out a way to do it in SPL).

When run in a stand-alone Splunk Enterprise instance, it works fine. However, when run in a clustered environment, it results in an error (one message per indexer node):

 

[<indexer hostname>] Streamed search execute failed because: Error in 'condensefields' command: External search command exited unexpectedly with non-zero error code 1..

 

I have installed the app that contains the custom command on both the search heads and the indexers.

 

Setup:

  • Oracle Linux Server 7.8
  • Splunk Enterprise 7.2.6

Search Example:

 

index=_audit 
| condensefields _time, user, action, info, _raw
| table _time, user, action, info, details

 

 

App structure (I was not able to upload the compressed folder):

  • <app>
    • bin
      • condensefields.py
        #!/usr/bin/env python

        import sys
        import os

        sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "lib"))
        from splunklib.searchcommands import \
            dispatch, StreamingCommand, Configuration, Option, validators

        @Configuration()
        class CondenseFields(StreamingCommand):
            """ Condense fields of an event into one field.
            ##Syntax
            | condensefields <fields>
            ##Description
            Condenses all of the fields, except ignored fields, from the event into one field in a key-value format.
            """

            def stream(self, events):
                for event in events:
                    fields_to_condense = filter(lambda key: key not in self.fieldnames, event.keys())

                    condensed_str = ''

                    is_first = True
                    for key in fields_to_condense:
                        value = event[key]

                        if not value or len(value) == 0:
                            continue

                        if not is_first:
                            condensed_str += '|'
                        else:
                            is_first = False

                        if isinstance(value, list):
                            value = '[\'' + '\', \''.join(value) + '\']'

                        condensed_str += key + '=' + value

                    event['details'] = condensed_str

                    yield event

        dispatch(CondenseFields, sys.argv, sys.stdin, sys.stdout, __name__)
    • default
      • app.conf
        [install]
        is_configured = false
        build = 1

        [ui]
        is_visible = false
        label = commands

        [launcher]
        author = Some Rando
        description = Provides custom commands.
        version = 1.0.0
      • commands.conf
        # [commands.conf]($SPLUNK_HOME/etc/system/README/commands.conf.spec)

        [condensefields]
        chunked = true
      • searchbnf.conf

        [condensefields-command]
        syntax = condensefields
        shortdesc = Condense fields of an event into one field.
        description = Condenses all of the fields, except ignored fields, from the event into one field in a key-value format.
        content1 = A typical use-case where all of the fields, except for a defined subset, are condensed into a single field with the specified format.
        example1 = | condensefields _time, event_name, application
        category = streaming
        tags = format
    • lib
    • metadata
      • default.meta

        []
        access = read : [ * ], write : [ admin, power ]
        export = system


christopherrobe
Splunk Employee

Hi @GindiKhangura ,

 

I was able to replicate this in my environment, with the app as attached. It worked fine on the standalone node, or if I used a non-streaming command in the search before the streaming command (head 1000), but as soon as the search streamed across the indexers, I received the same error.

 

I looked into the search.log from the search on one of the indexers and found this trace:

 

 

10-08-2020 17:53:52.554 ERROR ChunkedExternProcessor - stderr: Traceback (most recent call last):
10-08-2020 17:53:52.554 ERROR ChunkedExternProcessor - stderr: File "/opt/splunk/var/run/searchpeers/E138C522-1210-4D6F-AD0B-C91FDB3E95D8-1602178084/apps/testapp/bin/condensefields.py", line 7, in <module>
10-08-2020 17:53:52.554 ERROR ChunkedExternProcessor - stderr: from splunklib.searchcommands import \
10-08-2020 17:53:52.555 ERROR ChunkedExternProcessor - stderr: ImportError: No module named splunklib.searchcommands
10-08-2020 17:53:52.556 ERROR ChunkedExternProcessor - EOF while attempting to read transport header
10-08-2020 17:53:52.556 ERROR ChunkedExternProcessor - Error in 'condensefields' command: External search command exited unexpectedly with non-zero error code 1.
10-08-2020 17:53:52.556 ERROR SearchPipelineExecutor - sid:remote_sh1_1602179632.20_86CF40A2-33DE-4558-8481-6CFF1E8B36D4 Streamed search execute failed because: Error in 'condensefields' command: External search command exited unexpectedly with non-zero error code 1..

 

 

Look at the error "10-08-2020 17:53:52.555 ERROR ChunkedExternProcessor - stderr: ImportError: No module named splunklib.searchcommands". You'll also notice that the app itself is passed to the indexer in the knowledge bundle and run from /opt/splunk/var/run/searchpeers/E138C522-1210-4D6F-AD0B-C91FDB3E95D8-1602178084/apps/testapp/bin/condensefields.py and NOT from /opt/splunk/etc/slave-apps/testapp/bin/condensefields.py.

I looked into that /var/run/searchpeers/<SH_GUID>-<SID>/apps/testapp/ directory: /lib/ does not get streamed to the indexer, but /bin/ does.

 

To resolve this issue, I moved splunklib from /lib to /bin and changed the sys.path insertion to point at the script's own directory.

 

After this change, the command was able to stream properly.
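A minimal sketch of what that header change could look like, assuming the splunklib package directory has been copied into bin/ next to the script (the variable name is illustrative):

```python
import os
import sys

# With splunklib moved into bin/ alongside this script, point sys.path at
# the script's own directory, which IS shipped in the knowledge bundle,
# instead of ../lib, which is not streamed to the indexers.
script_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, script_dir)
```

After this, `from splunklib.searchcommands import ...` resolves from bin/ on both the search head and the search peers.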


eurban
Explorer

@christopherrobe, I want to start by thanking you for this info!  I spent a lot of time trying to figure out why my app, which was completely fine in my local standalone instance, was failing in our production (clustered) instance.  Like you, I noticed that using a non-streaming command, like `noop` or `head`, before my CSC made it work fine in the clustered instance.

There are a few things however that make me concerned about putting splunklib under /bin.

First, AppInspect throws a warning if you move the splunklib folder under /bin:

{
    "description": "Check splunklib dependency should not be placed under app's bin folder. Please refer to\n https://dev.splunk.com/view/SP-CAAAER3 and https://dev.splunk.com/view/SP-CAAAEU2 for more details/examples.",
    "messages": [
        {
            "code": "\"splunklib is found under `bin` folder, this may cause some dependency management \"",
            "filename": "check_application_structure.py",
            "line": 228,
            "message": "splunklib is found under `bin` folder, this may cause some dependency management errors with other apps, and it is not recommended. Please follow examples in Splunk documentation to include splunklib. You can find more details here: https://dev.splunk.com/view/SP-CAAAEU2 and https://dev.splunk.com/view/SP-CAAAER3",
            "result": "warning",
            "message_filename": null,
            "message_line": null
        }
    ],
    "name": "check_splunklib_dependency_under_bin_folder",
    "tags": [
        "splunk_appinspect",
        "cloud",
        "private_app"
    ],
    "result": "warning"
},

 

The second concern is that the guide at https://dev.splunk.com/enterprise/docs/developapps/appanatomy/#Considerations-for-Python-code-files also instructs the user to put splunklib under /lib.

Since the AppInspect message tells you there may be dependency issues putting splunklib in /bin, yet it appears this is necessary for streaming CSCs to work in clustered environments, is there some underlying issue that needs to be addressed?


mjz
Explorer

What's said here is spot on. The best practices on SPL2 and AppInspect warn against doing this, yet it's the only way to get things working.


GindiKhangura
Explorer

Thank you, that worked.

I didn't check the search log on the indexer, was only looking at the one on the search head from the job inspector.


christopherrobe
Splunk Employee

Awesome!

 

I'd also like to note that you can get the search.logs from the indexers in the job inspector as well, under "Search Job Properties". Towards the bottom, under "Additional Info", there should be a link to each search.log from its respective indexer.

richgalloway
SplunkTrust

Have you looked at the search log to see the reason for the error?

---
If this reply helps you, an upvote would be appreciated.

GindiKhangura
Explorer

@richgalloway,

Yes, I have looked at the search.log file from the Job Inspector.

I searched for the terms "error", "warn", and "fatal", but none of them were present. I read through the whole log and did not see any messages that would suggest that anything failed.


richgalloway
SplunkTrust

When debugging my external command, I found it most useful to search for the command name rather than "error" and the like.  IIRC, tracebacks were not reported as errors.

Also, consider using debug logging

| noop log_DEBUG=*
---
If this reply helps you, an upvote would be appreciated.

GindiKhangura
Explorer

Thank you for the tip.

I tried it out, but no luck in finding the actual cause of the error. The message that I see in the search.log is the same as the one that is seen in the Search app. It appears to just fail with that one message:

[screenshot: search.log excerpt showing the same error message]

(That message is a part of a much larger debug message from the SearchResultParser)

The error code is confusing and strange in that it shows up as "1.."; I'm wondering if that's supposed to be equivalent to "1xx"?


richgalloway
SplunkTrust

I'm not sure why the command is failing.

Have you tried doing the job using SPL?

| eval details = "_time="._time." user=".user." action=".action." info=".info." _raw="._raw
---
If this reply helps you, an upvote would be appreciated.

GindiKhangura
Explorer

You mean completely getting rid of the command and doing it via SPL?

The query I provided in this thread was just meant to be a short, simple example that reproduces the issue. The actual use case for the command is to take all fields in an event, excluding the ones specified in the command's field arguments, and concatenate them into one field. Since the events I'm dealing with can have an unpredictable set of fields (I cannot keep a list of them), I was not able to figure out a way to do it via SPL (including macros).
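The per-event logic described here can be sketched outside Splunk with a plain dict standing in for an event (field names and values below are illustrative, not from a real search):

```python
def condense(event, ignore_fields):
    """Concatenate all non-ignored, non-empty fields into key=value pairs."""
    parts = []
    for key, value in event.items():
        if key in ignore_fields:
            continue  # skip the fields named in the command's arguments
        if not value:
            continue  # skip empty values, as the command does
        if isinstance(value, list):
            # render multivalue fields as ['a', 'b']
            value = "['" + "', '".join(value) + "']"
        parts.append(key + "=" + value)
    return "|".join(parts)

event = {"_time": "1601997940", "user": "admin", "action": "search",
         "clientip": "10.0.0.1", "roles": ["admin", "power"]}
details = condense(event, ignore_fields={"_time", "user", "action"})
# details == "clientip=10.0.0.1|roles=['admin', 'power']"
```

This is exactly what resists a pure-SPL rewrite: the loop over `event.items()` has no fixed field list, whereas an `eval` expression must name every field up front.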


richgalloway
SplunkTrust

Thanks for clarifying that.  Yes, I meant to use SPL instead of the external command, but nothing I can come up with does what you want.

I'm out of ideas.  Sorry.

---
If this reply helps you, an upvote would be appreciated.

GindiKhangura
Explorer

Thank you for trying!

