Why doesn't this custom search command call its class method?

plucas_splunk
Splunk Employee

Given this excerpt from a custom search command:

logger = logging.getLogger( 'nbclosest' )
logger.setLevel( logging.DEBUG )

K_STAG  = 'stop_tag'
K_TIME  = '_time'
K_VDIST = 'vehicle_distance'
K_VID   = 'vehicle_id'

@Configuration()
class NextBusClosestStop( EventingCommand ):
    class ConfigurationSettings( EventingCommand.ConfigurationSettings ):
        required_fields = ConfigurationSetting(value=[K_TIME, K_VID, K_VDIST, K_STAG])

    def __init__( self ):
        super( NextBusClosestStop, self ).__init__()
        # ...

    def drain( self ):
        logger.debug( 'enter drain()' )
        # do drain code

    def transform( self, records ):
        logger.debug( 'enter transform()' )
        for rec in records:
            # ...
            yield rec

        logger.debug( 'exit transform()' )
        self.drain()

The transform() function is called, and both enter transform() and exit transform() appear in search.log, but enter drain() is never logged. The code in drain() really is never executed, because the results produced are wrong.

However, if I copy & paste the code from drain() and put it "inline" in place of self.drain(), then the code executes.

How can it be the case that self.drain() isn't called?

1 Solution

plucas_splunk
Splunk Employee

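After doing some more reading on yield, it turns out that putting yield into a sub-function turns that function into a generator function. Calling it executes none of its body; it only returns a generator object, which is why enter drain() was never logged. To run the body and get results out of it, one has to iterate over that generator: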

    def drain( self ):
        recs = self.vdict.values()
        for rec in sorted( recs, key=operator.itemgetter( K_TIME ) ):
            yield rec
        self.vdict.clear()

Then to call it:

            for rec in self.drain():
                yield rec

In Python >= 3.3, one can instead do:

            yield from self.drain()

but Splunk currently ships with Python 2.7.11.
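
The behavior can be reproduced outside Splunk. The following is a minimal sketch, assuming a hypothetical stand-in class (FakeCommand, not part of splunklib) whose transform logic is invented for illustration: it passes every record through and remembers the latest record per vehicle in vdict. The calls list is also an assumption; it stands in for the logger so that it is visible which method bodies actually ran.

```python
import operator

K_TIME = '_time'
K_VID = 'vehicle_id'


class FakeCommand(object):
    """Hypothetical stand-in for the search command; splunklib is not needed."""

    def __init__(self):
        self.vdict = {}   # latest record seen per vehicle (illustrative logic)
        self.calls = []   # names of method bodies that actually started running

    def drain(self):
        # Because of the yield below, this body runs only when iterated.
        self.calls.append('drain')
        recs = self.vdict.values()
        for rec in sorted(recs, key=operator.itemgetter(K_TIME)):
            yield rec
        self.vdict.clear()

    def transform_broken(self, records):
        for rec in records:
            self.vdict[rec[K_VID]] = rec
            yield rec
        # Creates a generator object and discards it; drain()'s body never runs.
        self.drain()

    def transform_fixed(self, records):
        for rec in records:
            self.vdict[rec[K_VID]] = rec
            yield rec
        # Iterating over the generator is what executes drain()'s body.
        for rec in self.drain():
            yield rec
```

Consuming transform_broken() with list() leaves calls empty and vdict still populated, whereas consuming transform_fixed() records 'drain' in calls, emits the buffered records sorted by _time, and empties vdict. This also fits the "inline" observation from the question: pasted into transform(), the yield statements belong to transform()'s own generator, which is being iterated.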

jkat54
SplunkTrust

looks like self.drain() is called from within self.transform()?

I see you yield results in a for loop too. Doesn't that cause the transform function to yield and exit when records exist?

I think you can fix by calling self.drain() after you use "for x in self.transform():" instead of wrapping it up inside of self.transform()

plucas_splunk
Splunk Employee

Yes, that's the code.

jkat54
SplunkTrust

I went for help and had it carefully pointed out to me that the exit transform() log happens even though it's outside of the loop too. This is very interesting indeed.

plucas_splunk
Splunk Employee

Yes, that's the part that is most confusing.

jkat54
SplunkTrust

http://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do

I'm not sure what you mean by inline, but my statement still holds some water. Yield returns a generator object and Python will iterate through the function until it hits the yield. As long as there is a records array to be had, you'll never see self.drain called. When you call self.transform without a records array, self.drain will happen because the "for rec in records:" doesn't execute and thus the yield doesn't take place.
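
Whether the code after the loop is reached when records exist can be checked directly. This is a minimal sketch, assuming a plain function in place of the command method and a hypothetical trace list in place of search.log:

```python
def transform(records, trace):
    trace.append('enter')
    for rec in records:
        yield rec
    # Reached when the consumer asks for another value after the last record.
    trace.append('exit')


trace = []
gen = transform(['r1', 'r2'], trace)   # calling the function runs nothing yet
state_after_call = list(trace)

first = next(gen)                      # runs the body up to the first yield
state_after_first = list(trace)

rest = list(gen)                       # resumes the loop, then runs the code after it
state_at_end = list(trace)
```

In this sketch the trace is empty right after the call, contains only 'enter' after the first next(), and contains 'exit' once the generator is exhausted, even though records were present. That is consistent with exit transform() showing up in search.log.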

plucas_splunk
Splunk Employee

By "inline" I mean that instead of calling drain(), I copy the code from drain() and paste a copy of it to where I call drain().

plucas_splunk
Splunk Employee

> looks like self.drain() is called from within self.transform()?

Yes.

> I see you yield results in a for loop too.

Yes, as does every example I've ever seen.

> Doesn't that cause the transform function to yield and exit when records exist?

AFAIK, transform() is called with multiple records that Splunk sends in "chunks" (which is why this is called a "Chunked External Processor" in version 2 of the Python SDK). The transform() function then iterates over the records doing something with them. For those it wishes to return to Splunk, it calls yield. However, despite its name, I doubt yield actually yields control because then, somehow, the for loop would have to pick up from where it left off the next time transform() is called. Hence, I believe yield is probably closer to "print."

> I think you can fix by calling self.drain() after you use "for x in self.transform():" instead of wrapping it up inside of self.transform()

But the way the API works is that transform() is called by Splunk; you do not call it yourself.

And this doesn't explain why "inline" code would be executed while the call to the function would not.
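
How yield hands control back and forth can be traced with a small experiment outside Splunk. This is a minimal sketch, assuming the caller consumes the generator by iterating over it; the producer function and the log list are hypothetical names used only for illustration:

```python
def producer(items, log):
    for item in items:
        log.append('produce %s' % item)
        yield item                          # suspends here until the next value is requested
        log.append('resume after %s' % item)


log = []
for item in producer(['a', 'b'], log):
    log.append('consume %s' % item)

# log is now:
# ['produce a', 'consume a', 'resume after a',
#  'produce b', 'consume b', 'resume after b']
```

The interleaved entries suggest that yield does suspend the function and that the for loop picks up where it left off, all within a single call rather than across repeated calls.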
