Splunk Search

How to export the results of a Splunk search that contains transforming commands?

CarbonCriterium
Path Finder

I am looking to export the results of a Splunk search that contains transforming commands. When I run the same search in the web GUI, the live results "hang" at 50,000 statistics, but once the search is complete it shows more than 300,000 (screenshots provided below).

  • Using the Splunk API, I want to export all results in a .json format, and
  • I only want to view the final results; I do not want to view the results as they are streamed
  • In essence I want to avoid the API returning any row where:
"preview":true

What am I missing?

 

[Screenshots: "While performing search" and "Finished results"]

Using Python 3.9's requests, my script contains the following:

from requests import post

headers = {'Authorization': 'Splunk %s' % sessionKey}
parameters = {'exec_mode': 'oneshot', 'output_mode': output_type, 'adhoc_search_level': 'fast', 'count': 0}

with post(url=baseurl + '/services/search/jobs/export', params=parameters, data={'search': search_query},
          timeout=60, headers=headers, verify=False, stream=True) as response:
    ...  # the rest of the script writes the results to an output file as they arrive

ITWhisperer
SplunkTrust

How long does your search take? You have a timeout of 60 (seconds?) - could this be increased?


CarbonCriterium
Path Finder

The trouble is not the timeout. When the Splunk search is complete, the results are successfully written to an output file.

I am interested in learning how to return only the final results of ~300,000 rows from Splunk.  The current URI path and parameters are leading to an output of more than 1,924,900 rows.  I think this is some indication that Splunk is streaming "live" results and I only want the final results.
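
For readers with the same problem, one way to keep only the final rows is to filter on the client side. The following is a minimal sketch, not a confirmed fix. It assumes output_mode is json, that each line of the export response is one JSON object carrying the "preview" flag quoted in the question, and that the row data sits under a "result" key (an assumption about the payload shape). The name final_rows and the sample lines are hypothetical and exist only for illustration.

```python
import json


def final_rows(lines):
    """Yield the row payload of every line that is not flagged as a preview.

    `lines` is any iterable of str or bytes, for example response.iter_lines().
    Assumption: each non-empty line is a JSON object with a "preview" flag
    and the row data under a "result" key.
    """
    for raw in lines:
        if not raw:
            continue  # skip empty keep-alive lines
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        row = json.loads(raw)
        if row.get("preview", False):
            continue  # drop intermediate (preview) rows
        if "result" in row:
            yield row["result"]


# Hypothetical sample of a streamed response: one preview row, then final rows.
sample = [
    b'{"preview":true,"offset":0,"result":{"host":"a","count":"1"}}',
    b'',
    b'{"preview":false,"offset":0,"result":{"host":"a","count":"3"}}',
    b'{"preview":false,"offset":1,"result":{"host":"b","count":"2"}}',
]
rows = list(final_rows(sample))
```

In the script above this would be called as final_rows(response.iter_lines()). The export endpoint is also commonly reported to accept preview=false as a request parameter, which would stop preview rows at the source; treat that as an assumption and check it against the REST API reference for your Splunk version.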


ITWhisperer
SplunkTrust

I don't know how this works but I did notice that you have stream=True - could this be the issue? Have you tried with stream=False?


CarbonCriterium
Path Finder

The "stream=True" that you are referring to is not part of the request to Splunk's REST API.

It is an option of the Python requests library that makes the client read the response body as a stream instead of downloading it all at once. Further into the script, this allows the results to be written "live" as they are returned, instead of as one massive 2,924,899-line file. In essence, it is a band-aid for the issue I am actually asking about in my question.
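
To illustrate the point about stream=True, here is a minimal sketch of writing a response to a file line by line as it arrives. It is not the asker's actual script: write_as_received and FakeResponse are hypothetical names, and FakeResponse only stands in for a requests response opened with stream=True so the example runs without a Splunk instance. The one real requests call it relies on is Response.iter_lines(), which yields bytes by default.

```python
import io


def write_as_received(response, out_file):
    """Copy the response body to out_file one line at a time.

    Returns the number of lines written. With stream=True, requests does not
    download the body up front, so iter_lines() reads it incrementally.
    """
    written = 0
    for line in response.iter_lines():
        if not line:
            continue  # skip empty keep-alive lines
        out_file.write(line.decode("utf-8") + "\n")
        written += 1
    return written


class FakeResponse:
    """Hypothetical stand-in exposing only iter_lines(), for illustration."""

    def __init__(self, lines):
        self._lines = lines

    def iter_lines(self):
        return iter(self._lines)


buffer = io.StringIO()
count = write_as_received(FakeResponse([b'{"a": 1}', b'', b'{"a": 2}']), buffer)
```

Either way, the option only changes how the client consumes the body. It does not change which rows the server sends, so it has no effect on whether preview rows appear in the output.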
