Splunk Search

How to export the results of a Splunk search that contains transforming commands?

CarbonCriterium
Path Finder

I am looking to export the results of a Splunk search that contains transforming commands. When I run the same search in the web GUI, the live results "hang" at 50,000 statistics, but once the search completes it shows more than 300,000 (screenshots below).

  • Using the Splunk API, I want to export all results in a .json format, and
  • I only want to view the final results; I do not want to view the results as they are streamed
  • In essence I want to avoid the API returning any row where:
"preview":true

What am I missing?

 

[Screenshot 1: While performing search]
[Screenshot 2: Finished results]

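Using Python 3.9's requests library, my script contains the following:

from requests import post

# sessionKey, baseurl, output_type and search_query are defined earlier in the script
headers = {'Authorization': 'Splunk %s' % sessionKey}
parameters = {'exec_mode': 'oneshot', 'output_mode': output_type, 'adhoc_search_level': 'fast', 'count': 0}

with post(url=baseurl + '/services/search/jobs/export', params=parameters, data={'search': search_query}, timeout=60, headers=headers, verify=False, stream=True) as response:
    ...  # further into the script, the returned lines are written to an output file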

 

 

 


ITWhisperer
SplunkTrust

How long does your search take? You have a timeout of 60 (seconds?) - could this be increased?


CarbonCriterium
Path Finder

The trouble is not the timeout. When the Splunk search is complete, the results are successfully written to an output file.

I am interested in learning how to return only the final results of ~300,000 rows from Splunk. The current URI path and parameters produce an output of more than 1,924,900 rows. I take this as an indication that Splunk is streaming "live" results, and I only want the final results.
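One client-side way to get only the final rows is to parse each returned line and drop any row flagged as a preview. The sketch below is a minimal illustration, not a confirmed fix: it assumes the export endpoint emits one JSON object per line when output_mode is json, and that each object carries a "preview" flag and a "result" payload. Only the "preview":true key is confirmed in this thread; the helper name final_results and the sample rows (including "offset" and "lastrow") are hypothetical.

```python
import json


def final_results(lines):
    """Yield the `result` payload of every row that is not a preview row.

    `lines` is any iterable of JSON lines (bytes or str), for example the
    output of `response.iter_lines()` from the requests library.
    """
    for raw in lines:
        if not raw:
            continue  # skip blank keep-alive lines
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        row = json.loads(raw)
        if row.get("preview"):
            continue  # interim result streamed while the search was still running
        if "result" in row:
            yield row["result"]


# Hypothetical sample rows; the exact row shape is an assumption.
sample = [
    b'{"preview":true,"offset":0,"result":{"host":"a","count":"1"}}',
    b'',
    b'{"preview":false,"offset":0,"result":{"host":"a","count":"3"}}',
    b'{"preview":false,"offset":1,"lastrow":true,"result":{"host":"b","count":"2"}}',
]
rows = list(final_results(sample))
```

In the script from the question, this would be called as final_results(response.iter_lines()) inside the with block. It may also be worth checking the Splunk REST API reference for a request parameter that suppresses previews on the server side; nothing in this thread confirms one.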


ITWhisperer
SplunkTrust

I don't know how this works but I did notice that you have stream=True - could this be the issue? Have you tried with stream=False?


CarbonCriterium
Path Finder

The instance of "stream=True" that you are referring to is not part of the request to Splunk's REST API.

It is an option of the Python requests library that, if the response is available as a stream, reads the response as a stream. Further into the script this allows the results to be written "live" as they are returned, instead of as one massive 2,924,899 line file. In essence, this is a band-aid for the issue I am actually asking about in my question.
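For readers unfamiliar with this option, here is a rough sketch of what writing results "live" can look like. It is not the script from the question: write_lines and FakeResponse are hypothetical names, and FakeResponse only stands in for a requests.Response opened with stream=True so that the example runs without a Splunk server.

```python
import io


def write_lines(response, out):
    """Copy each non-empty line of a streamed response to a text file object.

    `response` is expected to behave like a requests.Response opened with
    stream=True, whose iter_lines() yields bytes as data arrives.
    Returns the number of lines written.
    """
    written = 0
    for raw in response.iter_lines():
        if not raw:
            continue  # skip blank keep-alive lines
        out.write(raw.decode("utf-8") + "\n")
        written += 1
    return written


class FakeResponse:
    """Offline stand-in for requests.Response (illustration only)."""

    def __init__(self, lines):
        self._lines = lines

    def iter_lines(self):
        return iter(self._lines)


buffer = io.StringIO()
count = write_lines(FakeResponse([b'{"a":1}', b'', b'{"a":2}']), buffer)
```

Because each line is written as soon as it is read, memory use stays flat regardless of how many rows come back. This only changes how the client consumes the response; it has no effect on which rows Splunk sends.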
