Splunk Search

Showing >100% Complete status

kcull997
Observer

I'm using Python in Jupyter notebooks to run searches through the Splunk API. The queries run fine from both Python and Splunk itself. However, when running in Python, my status messages show completeness over 100%, sometimes as high as 49434736695157.4%, which shouldn't be mathematically possible. Again, the actual stats end up being correct. None of this code was originally written by me, but I've been tasked with trying to find the flaw. I'll add the non-query code below, where I believe the problem lies. I'm brand new to Splunk, though I'm not bad at Python, so it is a little tricky for me to see the issue. I'd appreciate any help on this. Thanks!

 

import sys
from time import sleep

import pandas as pd
import splunklib.results as results

# Create a search job, print progress until it finishes, and return results as dataframe
def export_to_dataframe(_service, _query):
    # NOTE: jobs.create() starts a normal (asynchronous) job; export mode would be jobs.export()
    job = _service.jobs.create(_query)
    while True:
        while not job.is_ready():  # is_ready() re-fetches the job state on each call
            pass
        stats = {
            "isDone": job["isDone"],
            "doneProgress": float(job["doneProgress"]) * 100,
            "scanCount": int(job["scanCount"]),
            "eventCount": int(job["eventCount"]),
            "resultCount": int(job["resultCount"]),
        }

        status = ("\r%(doneProgress)03.1f%% %(scanCount)d scanned "
                  "%(eventCount)d matched %(resultCount)d results") % stats

        sys.stdout.write(status)
        sys.stdout.flush()
        if stats["isDone"] == "1":
            sys.stdout.write("\n\nDone!\n\n")
            break
        sleep(2)  # wait before polling again
    # job is done; without count=0 the results endpoint returns only its default first page
    jobResults = results.ResultsReader(job.results())  # read job results
    # keep result rows (dicts) and skip diagnostic Message objects
    rows = [dict(result) for result in jobResults if isinstance(result, dict)]
    return pd.DataFrame(rows)

# Execute search in normal mode (non-blocking) and return results as dataframe
def normal_to_dataframe(_service, _query):
    from time import sleep
    job = _service.jobs.create(_query)  # execute search query in normal mode
    while not job.is_done():  # poll for Splunk search job completion
        sleep(1)
    jobResults = results.ResultsReader(job.results(count=0))  # count=0 returns all results
    # keep result rows (dicts) and skip diagnostic Message objects
    rows = [dict(result) for result in jobResults if isinstance(result, dict)]
    return pd.DataFrame(rows)
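
To narrow down where the oversized percentage comes from, it may help to pull the status formatting out of the polling loop so the raw doneProgress value can be inspected. This is only a minimal sketch, assuming doneProgress arrives as a 0-to-1 fraction (the loop above multiplies it by 100); format_status and its width parameter are hypothetical names for illustration, not part of splunklib.

```python
def format_status(done_progress, scan_count, event_count, result_count, width=80):
    """Build one fixed-width progress line from raw job fields (hypothetical helper)."""
    fraction = float(done_progress)
    # Clamp to [0, 1] so the printed percentage can never exceed 100%.
    clamped = min(max(fraction, 0.0), 1.0)
    line = "%5.1f%% %d scanned %d matched %d results" % (
        clamped * 100, int(scan_count), int(event_count), int(result_count))
    if clamped != fraction:
        # Surface the unexpected raw value instead of hiding it.
        line += " (raw doneProgress=%r)" % (done_progress,)
    # Pad to a fixed width so a shorter line fully overwrites a longer one after "\r".
    return "\r" + line.ljust(width)
```

Inside the loop this would replace the status string, for example sys.stdout.write(format_status(job["doneProgress"], job["scanCount"], job["eventCount"], job["resultCount"])). If the "raw doneProgress" note never shows up, that would suggest the oversized figure comes from how the notebook renders overwritten lines rather than from the job itself.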
