Splunk Search

Splunk Python SDK - How to call my custom search command only once?

bojanjanisch
New Member

Hi there,

I'm developing a custom search command to call a custom rest endpoint using Splunk's Python SDK and the command protocol version 2. The SDK supports the command development pretty well, however my command is executed at least twice each time and I don't really know why.

I've created a simple test case which could be installed using the following files in a new app folder under etc/apps/:

./bin/testexecution.py:

import sys
from splunklib.searchcommands import dispatch, ReportingCommand, Configuration, Option, validators

@Configuration()
class testexecutionCommand(ReportingCommand):

    @Configuration(local=True)
    def map(self, records):
        for record in records:
            yield record
        pass


    @Configuration(requires_preop=False, run_in_preview=False, local=True, streaming=False)
    def reduce(self, records):
        self.logger.warn("This should only be seen once.")

        for record in records:
            yield record
        pass  

try:          
  dispatch(testexecutionCommand, sys.argv, sys.stdin, sys.stdout, __name__)
except Exception as e:
  print(str(e))

./default/commands.conf:

[testexecution]
filename = testexecution.py 
chunked = true

./default/authorize.conf:

[capability::run_script_print]
[role_admin]
run_script_print = enabled

./metadata/local.meta:

[commands/testexecution]
access = read : [ * ], write : [ * ]
export = system

The generated search.log contains this information, please note that the marker "This should only be seen once." is logged twice:

09-13-2018 11:33:08.356 INFO  dispatchRunner - Search process mode: preforked (first search in process) (build a0c72a66db66).
09-13-2018 11:33:08.356 INFO  dispatchRunner - initing LicenseMgr in search process: nonPro=0
09-13-2018 11:33:08.356 INFO  LicenseMgr - Initing LicenseMgr
09-13-2018 11:33:08.356 INFO  LMConfig - serverName=test-splunk-d guid=68A1ADDC-9631-4A57-83C5-77403FAD4514
09-13-2018 11:33:08.358 INFO  LMConfig - connection_timeout=30
09-13-2018 11:33:08.358 INFO  LMConfig - send_timeout=30
09-13-2018 11:33:08.358 INFO  LMConfig - receive_timeout=30
09-13-2018 11:33:08.358 INFO  LMConfig - squash_threshold=2000
09-13-2018 11:33:08.358 INFO  LMConfig - strict_pool_quota=1
09-13-2018 11:33:08.358 INFO  LMConfig - key=pool_suggestion not found in licenser stanza of server.conf, defaulting=''
09-13-2018 11:33:08.358 INFO  LMConfig - key=test_aws_metering not found in licenser stanza of server.conf, defaulting=0
09-13-2018 11:33:08.358 INFO  LMConfig - key=test_aws_product_code not found in licenser stanza of server.conf, defaulting=0
09-13-2018 11:33:08.359 INFO  LicenseMgr - Initing LicenseMgr runContext_splunkd=false
09-13-2018 11:33:08.359 INFO  LMStackMgr - closing stack mgr
09-13-2018 11:33:08.359 INFO  LMSlaveInfo - all slaves cleared
09-13-2018 11:33:08.359 INFO  LMStackMgr - Skipping trial license as alternative license type in use
09-13-2018 11:33:08.360 INFO  LMStack - Added type=enterprise license, from file=splunk.license, to stack=enterprise of group=Enterprise
09-13-2018 11:33:08.360 INFO  LMStackMgr - created stack='enterprise'
09-13-2018 11:33:08.360 INFO  LMStackMgr - Skipping trial pool stanza as alternative license in use
09-13-2018 11:33:08.360 INFO  LMStackMgr - added pool auto_generated_pool_enterprise to stack enterprise
09-13-2018 11:33:08.360 INFO  LMStackMgr - added pool auto_generated_pool_forwarder to stack forwarder
09-13-2018 11:33:08.360 INFO  LMStackMgr - added pool auto_generated_pool_free to stack free
09-13-2018 11:33:08.360 INFO  LMStackMgr - Initialized hideQuotaWarning = "0"
09-13-2018 11:33:08.361 INFO  LMStackMgr - init completed [68A1ADDC-9631-4A57-83C5-77403FAD4514,Enterprise,runContext_splunkd=false]
09-13-2018 11:33:08.361 INFO  LicenseMgr - StackMgr init complete...
09-13-2018 11:33:08.361 INFO  LMTracker - Setting default product type='enterprise'
09-13-2018 11:33:08.361 INFO  LMTracker - this is not splunkd, will perform partial init
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=Acceleration state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=AdvancedSearchCommands state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=AdvancedXML state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=Alerting state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=ArchiveToHdfs state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=Auth state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=CustomRoles state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=DeployClient state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=DeployServer state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=DistSearch state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=FwdData state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=GuestPass state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=KVStore state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=LDAPAuth state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=LocalSearch state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=MultifactorAuth state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=MultisiteClustering state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=NontableLookups state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=RcvData state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=RcvSearch state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=RollingWindowAlerts state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=SAMLAuth state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=ScheduledAlerts state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=ScheduledReports state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=ScheduledSearch state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=ScriptedAuth state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=SearchheadPooling state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=SigningProcessor state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=SplunkWeb state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=SubgroupId state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=SyslogOutputProcessor state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LMTracker - Setting feature=UnisiteClustering state=ENABLED (featureStatus=1)
09-13-2018 11:33:08.361 INFO  LicenseMgr - Tracker init complete...
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'licenses'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'pools'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'stacks'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'groups'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'slaves'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'localslave'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'licensermessages'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'scriptedwarning'
09-13-2018 11:33:08.365 INFO  AdminManagerDispatch - added factory for admin handler: 'licenseusage'
09-13-2018 11:33:08.365 INFO  dispatchRunner - registering build time modules, count=1
09-13-2018 11:33:08.365 INFO  dispatchRunner - registering search time components of build time module name=vix
09-13-2018 11:33:08.365 INFO  dispatchRunner - Getting search configuration data from: /opt/splunk/etc/modules/parsing/config.xml
09-13-2018 11:33:08.367 INFO  BundlesSetup - Setup stats for /opt/splunk/etc: wallclock_elapsed_msec=20, cpu_time_used=0.016, shared_services_generation=2, shared_services_population=1
09-13-2018 11:33:08.375 INFO  UserManagerPro - Load authentication: forcing roles="admin, can_delete, power, user"
09-13-2018 11:33:08.376 INFO  UserManager - Setting user context: splunk-system-user
09-13-2018 11:33:08.376 INFO  UserManager - Done setting user context: NULL -> splunk-system-user
09-13-2018 11:33:08.376 INFO  UserManager - Unwound user context: splunk-system-user -> NULL
09-13-2018 11:33:08.376 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.376 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.377 INFO  dispatchRunner - search context: user="admin", app="search", bs-pathname="/opt/splunk/etc"
09-13-2018 11:33:08.380 INFO  dispatchRunner - Executing the Search orchestrator and iterator model.
09-13-2018 11:33:08.381 INFO  SearchOrchestrator - SearchOrchestrator getting constructed
09-13-2018 11:33:08.381 INFO  SearchOrchestrator -  Initialized the SRI
09-13-2018 11:33:08.381 INFO  ISplunkDispatch - Not running in splunkd. Bundle replication not triggered.
09-13-2018 11:33:08.381 INFO  SearchOrchestrator - Initialzing the run time settings for the orchestrator.
09-13-2018 11:33:08.381 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.381 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.381 INFO  SearchOrchestrator - Creating the search DAG.
09-13-2018 11:33:08.381 INFO  SearchPhaseGenerator - Disabling summaries in verbose mode
09-13-2018 11:33:08.381 INFO  SearchParser - PARSING: | testexecution
09-13-2018 11:33:08.383 INFO  ChunkedExternProcessor - Running process: /opt/splunk/bin/python /opt/splunk/etc/apps/testCommand/bin/testexecution.py
09-13-2018 11:33:08.446 INFO  AstOptimizer - SrchOptMetrics optimize_toJson=0.062923079
09-13-2018 11:33:08.447 INFO  SearchParser - PARSING: | testexecution
09-13-2018 11:33:08.447 INFO  ProjElim - Black listed processors=[addinfo]
09-13-2018 11:33:08.462 INFO  AstOptimizer - SrchOptMetrics optimization=0.015610200
09-13-2018 11:33:08.462 INFO  SearchPhaseGenerator - Optimized Search =| testexecution
09-13-2018 11:33:08.462 INFO  ParallelReducePolicy - Current Search Head doesn't have any usable peers to use.
09-13-2018 11:33:08.462 INFO  PhaseNodeGenerationVisitor - User lacking run_multi_phased_searches, rolling back to 2-phase mode.
09-13-2018 11:33:08.462 INFO  PhaseToPipelineVisitor - Phase Search = | testexecution
09-13-2018 11:33:08.462 INFO  SearchParser - PARSING: | testexecution
09-13-2018 11:33:08.462 INFO  ChunkedExternProcessor - Running process: /opt/splunk/bin/python /opt/splunk/etc/apps/testCommand/bin/testexecution.py
09-13-2018 11:33:08.520 INFO  SearchPipeline - ReportSearch=1 AllowBatchMode=0
09-13-2018 11:33:08.520 INFO  SearchParser - PARSING: testexecution phase="map" 
09-13-2018 11:33:08.522 INFO  ChunkedExternProcessor - Running process: /opt/splunk/bin/python /opt/splunk/etc/apps/testCommand/bin/testexecution.py
09-13-2018 11:33:08.522 ERROR ChunkedExternProcessor - stderr: This should only be seen once.
09-13-2018 11:33:08.582 INFO  SearchPhaseGenerator - Disabling timeliner since source is not an event index
09-13-2018 11:33:08.582 INFO  SearchPhaseGenerator - Disabling timeliner since event search is empty
09-13-2018 11:33:08.582 INFO  SearchPhaseGenerator - No need for RTWindowProcessor
09-13-2018 11:33:08.582 INFO  SearchPhaseGenerator - Timeliner will not be executed
09-13-2018 11:33:08.582 INFO  SearchPhaseGenerator - required fields list to add to different pipelines = 
09-13-2018 11:33:08.582 INFO  SearchPhaseGenerator - Search Phases created.
09-13-2018 11:33:08.583 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.583 INFO  UserManager - Done setting user context: admin -> admin
09-13-2018 11:33:08.583 INFO  UserManager - Unwound user context: admin -> admin
09-13-2018 11:33:08.583 INFO  DistributedSearchResultCollectionManager - Stream search: 
09-13-2018 11:33:08.583 INFO  SearchPhaseGenerator - Result collation will occur at proc=testexecution
09-13-2018 11:33:08.583 INFO  SearchParser - PARSING: | streamnoop
09-13-2018 11:33:08.583 INFO  SearchParser - PARSING: streamnoop  | testexecution phase="map"  | testexecution 
09-13-2018 11:33:08.583 INFO  ChunkedExternProcessor - Running process: /opt/splunk/bin/python /opt/splunk/etc/apps/testCommand/bin/testexecution.py
09-13-2018 11:33:08.584 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.584 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.584 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.584 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.585 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.585 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.585 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.585 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.585 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.585 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.652 INFO  ChunkedExternProcessor - Running process: /opt/splunk/bin/python /opt/splunk/etc/apps/testCommand/bin/testexecution.py
09-13-2018 11:33:08.767 INFO  SearchOrchestrator - Starting the status control thread.
09-13-2018 11:33:08.767 INFO  SearchOrchestrator - Starting phase=1
09-13-2018 11:33:08.767 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.767 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.767 INFO  SearchStatusEnforcer - Enforcing disk quota = 10485760000
09-13-2018 11:33:08.767 INFO  PreviewExecutor - Preview Enforcing initialization done
09-13-2018 11:33:08.767 INFO  UserManager - Setting user context: admin
09-13-2018 11:33:08.767 INFO  UserManager - Done setting user context: NULL -> admin
09-13-2018 11:33:08.767 INFO  ReducePhaseExecutor - Stating phase_1
09-13-2018 11:33:08.767 INFO  DispatchExecutor - BEGIN OPEN: Processor=testexecution
09-13-2018 11:33:08.771 INFO  DispatchExecutor - END OPEN: Processor=testexecution
09-13-2018 11:33:08.776 ERROR ChunkedExternProcessor - stderr: This should only be seen once.
09-13-2018 11:33:08.779 INFO  ReducePhaseExecutor - Ending phase_1
09-13-2018 11:33:08.779 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.779 INFO  DispatchManager - DispatchManager::dispatchHasFinished(id='1536831188.107', username='admin')
09-13-2018 11:33:08.786 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.786 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.787 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.787 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.787 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.787 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.787 INFO  UserManager - Unwound user context: admin -> NULL
09-13-2018 11:33:08.787 INFO  UserManager - Unwound user context: admin -> NULL

I appreciate your help understanding Splunk better.

0 Karma

rasad4468
Explorer

 

It seems like your custom search command is being executed twice because of the way Splunk handles the map/reduce phases in a distributed search environment. This behavior is expected in certain scenarios, and it's often related to how Splunk distributes and processes data across different search peers.

In your specific case, the key information is found in the search.log:

09-13-2018 11:33:08.462 INFO ParallelReducePolicy - Current Search Head doesn't have any usable peers to use.
09-13-2018 11:33:08.462 INFO PhaseNodeGenerationVisitor - User lacking run_multi_phased_searches, rolling back to 2-phase mode.

Splunk is running the map phase twice because it's not able to parallelize the search across multiple search peers. As a result, it falls back to a two-phase execution (map and reduce) on the search head itself.

To avoid this behavior, you might want to consider the following:

  1. Parallelization: Ensure that your search environment has multiple search peers that can be utilized for parallel processing. This may involve configuring search peer clustering or making adjustments to your distributed environment.

  2. run_multi_phased_searches: The log mentions that the user lacks the run_multi_phased_searches capability. This capability allows Splunk to run multiple phases on different search peers simultaneously. You might want to check the user's role and capabilities to ensure it has the necessary permissions.

  3. Debugging: Consider adding more logging statements to your custom search command to trace the execution flow and see if it provides additional insights into why it's being called twice.

  4. Splunk Documentation: Check the Splunk documentation for custom search commands and distributed search to see if there are any specific guidelines or recommendations for handling distributed environments.

  5. Splunk Answers: Search the Splunk Answers community for similar issues or ask a question there. The community is active, and you might find someone who has encountered and resolved a similar problem.

Keep in mind that Splunk's distributed search behavior can be complex, and understanding how it distributes and processes data is crucial for developing efficient custom search commands.

 

I hope this will help

Thanks

rasad4468

 

0 Karma

spunk_enthusias
Path Finder

Throughout the custom search command process, splunkd and the Python script exchange metadata through a series of getinfo and execute commands.

  1. splunkd sends the getinfo command to request information, including the command type and required fields, from the Python script.
  2. splunkd sends a separate execute command for each chunk of search results in the pipeline

So what you're seeing is the getinfo step. As far as I know there's no way to disable it or easily shorten it. This also means that any latency that your script has will be doubled and given to the next person.

Great stuff. It looks to me like Splunk doesn't care about custom search commands and the feature is pretty much unmaintained. 

This is understandable of course, since it's only a small, young company, run by amateurs with only few resources to spare.

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...