<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to submit a Splunk Python SDK query with a restricted time range and return more than 50000 rows? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116980#M31138</link>
    <description>&lt;P&gt;Adapting from this  solution: &lt;A href="http://answers.splunk.com/answers/124848/python-sdk-paginate-result-set.html#answer-227017"&gt;http://answers.splunk.com/answers/124848/python-sdk-paginate-result-set.html#answer-227017&lt;/A&gt; (thanks @paramagurukarthikeyan for the pointer and the answer), the following seems to work:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import sys
import io
import splunklib.results as results
import splunklib.client as client

service = client.connect(host=HOST,port=PORT,username=USERNAME,password=PASSWORD)

job = service.jobs.create(search, **{"exec_mode": "blocking", 
                                 "earliest_time": start_time, 
                                 "latest_time": end_time,
                                 "output_mode": "xml",
                                 "maxEvents": 30000000})

resultCount = int(job["resultCount"])
offset = 0          # Start at result 0
count = 50000       # Fetch this many results per request
thru_counter = 0    # Total results seen across all pages
wrt = sys.stdout.write

while offset &amp;lt; resultCount:
    kwargs_paginate = {"count": count, "offset": offset}

    # Fetch one page of results and parse it
    rs = job.results(**kwargs_paginate)
    reader = results.ResultsReader(io.BufferedReader(rs))

    for item in reader:
        if not isinstance(item, dict):
            continue  # skip diagnostic results.Message objects
        if thru_counter % 50000 == 0:  # print only one in 50000 results as a sanity check
            wrt(",".join(str(val) for val in item.values()) + "\n")
        thru_counter += 1
    # Increase the offset to get the next set of results
    offset += count
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;One issue remains: parsing is relatively slow. I am getting ~1300 rows/sec at about 100 bytes per row, i.e. ~130 kB/s (roughly 1 Mbit/s). The reason is hinted at in @ineeman's answer of March 10, 2014 to this question: &lt;A href="http://answers.splunk.com/answers/114045/python-sdk-results-resultsreader-extremely-slow.html"&gt;http://answers.splunk.com/answers/114045/python-sdk-results-resultsreader-extremely-slow.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;I am posting a separate question to see if I can improve the speed of fetching the query results. &lt;/P&gt;</description>
    <pubDate>Wed, 27 May 2015 00:37:05 GMT</pubDate>
    <dc:creator>nikos_d</dc:creator>
    <dc:date>2015-05-27T00:37:05Z</dc:date>
    <item>
      <title>How to submit a Splunk Python SDK query with a restricted time range and return more than 50000 rows?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116978#M31136</link>
      <description>&lt;P&gt;I am trying to submit a query which is limited to a restricted time window AND returns more than 50000 rows in Python. &lt;/P&gt;

&lt;P&gt;I saw an answer on exceeding the 50000 row limit &lt;A href="http://answers.splunk.com/answers/39243/python-sdk-results-limited-to-50-000.html"&gt;here&lt;/A&gt; but I cannot figure out how to add a custom time range to the query. &lt;/P&gt;

&lt;P&gt;The only way I know to submit a query with a limited time range is via the oneshot search of the Python SDK:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;    import splunklib.client as client
    import splunklib.results as results

    service = client.connect(host=HOST, port=PORT, username=USERNAME, password=PASSWORD)

    kwargs_oneshot = {"earliest_time": earliest_time, 
                      "latest_time": latest_time,
                      "output_mode": "xml",
                      "count": 0}

    searchquery_oneshot = basequery

    oneshotsearch_results = service.jobs.oneshot(searchquery_oneshot, **kwargs_oneshot)

    reader = results.ResultsReader(oneshotsearch_results)

    for item in reader:
        if isinstance(item, dict):  # skip diagnostic results.Message objects
            for val in item.values():
                print(val)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, querying this way limits my results to 50000 rows. Are there any workarounds?&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 21:48:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116978#M31136</guid>
      <dc:creator>nikos_d</dc:creator>
      <dc:date>2015-05-21T21:48:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to submit a Splunk Python SDK query with a restricted time range and return more than 50000 rows?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116979#M31137</link>
      <description>&lt;P&gt;This is the link which did not show up above due to my low number of points: &lt;A href="http://answers.splunk.com/answers/39243/python-sdk-results-limited-to-50-000.html"&gt;http://answers.splunk.com/answers/39243/python-sdk-results-limited-to-50-000.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 21:49:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116979#M31137</guid>
      <dc:creator>nikos_d</dc:creator>
      <dc:date>2015-05-21T21:49:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to submit a Splunk Python SDK query with a restricted time range and return more than 50000 rows?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116980#M31138</link>
      <description>&lt;P&gt;Adapting from this  solution: &lt;A href="http://answers.splunk.com/answers/124848/python-sdk-paginate-result-set.html#answer-227017"&gt;http://answers.splunk.com/answers/124848/python-sdk-paginate-result-set.html#answer-227017&lt;/A&gt; (thanks @paramagurukarthikeyan for the pointer and the answer), the following seems to work:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import sys
import io
import splunklib.results as results
import splunklib.client as client

service = client.connect(host=HOST,port=PORT,username=USERNAME,password=PASSWORD)

job = service.jobs.create(search, **{"exec_mode": "blocking", 
                                 "earliest_time": start_time, 
                                 "latest_time": end_time,
                                 "output_mode": "xml",
                                 "maxEvents": 30000000})

resultCount = int(job["resultCount"])
offset = 0          # Start at result 0
count = 50000       # Fetch this many results per request
thru_counter = 0    # Total results seen across all pages
wrt = sys.stdout.write

while offset &amp;lt; resultCount:
    kwargs_paginate = {"count": count, "offset": offset}

    # Fetch one page of results and parse it
    rs = job.results(**kwargs_paginate)
    reader = results.ResultsReader(io.BufferedReader(rs))

    for item in reader:
        if not isinstance(item, dict):
            continue  # skip diagnostic results.Message objects
        if thru_counter % 50000 == 0:  # print only one in 50000 results as a sanity check
            wrt(",".join(str(val) for val in item.values()) + "\n")
        thru_counter += 1
    # Increase the offset to get the next set of results
    offset += count
&lt;/CODE&gt;&lt;/PRE&gt;
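
&lt;P&gt;To make the paging arithmetic in the loop above easier to check without a Splunk connection, here is a minimal sketch of the offset/count windows it walks through. The helper name page_windows and the figure of 120000 results are hypothetical, chosen only for illustration; unlike the loop above, this sketch also trims the final page to the number of results actually left.&lt;/P&gt;

```python
def page_windows(total, page_size):
    """Yield (offset, count) pairs that cover `total` results page by page."""
    for offset in range(0, total, page_size):
        # The final page may hold fewer than page_size results.
        yield offset, min(page_size, total - offset)


# Hypothetical example: 120000 results fetched 50000 at a time.
windows = list(page_windows(120000, 50000))
print(windows)  # [(0, 50000), (50000, 50000), (100000, 20000)]
```

&lt;P&gt;Each pair could be passed on as the "offset" and "count" arguments of job.results, in the same way as kwargs_paginate above.&lt;/P&gt;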

&lt;P&gt;One issue remains: parsing is relatively slow. I am getting ~1300 rows/sec at about 100 bytes per row, i.e. ~130 kB/s (roughly 1 Mbit/s). The reason is hinted at in @ineeman's answer of March 10, 2014 to this question: &lt;A href="http://answers.splunk.com/answers/114045/python-sdk-results-resultsreader-extremely-slow.html"&gt;http://answers.splunk.com/answers/114045/python-sdk-results-resultsreader-extremely-slow.html&lt;/A&gt;&lt;/P&gt;
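
&lt;P&gt;For reference, the throughput figures quoted above can be worked through as follows. This is only a back-of-the-envelope sketch: the two inputs are the approximate numbers from this post, and 1000-based units are assumed.&lt;/P&gt;

```python
rows_per_sec = 1300    # approximate parsing rate quoted in this post
bytes_per_row = 100    # approximate row size quoted in this post

bytes_per_sec = rows_per_sec * bytes_per_row       # 130000 bytes per second
kilobytes_per_sec = bytes_per_sec / 1000.0         # 130.0 kB/s
kilobits_per_sec = bytes_per_sec * 8 / 1000.0      # 1040.0 kbit/s

print(kilobytes_per_sec, kilobits_per_sec)
```

&lt;P&gt;So the rate works out to roughly 130 kilobytes per second, or about 1 megabit per second.&lt;/P&gt;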

&lt;P&gt;I am posting a separate question to see if I can improve the speed of fetching the query results. &lt;/P&gt;</description>
      <pubDate>Wed, 27 May 2015 00:37:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-submit-a-Splunk-Python-SDK-query-with-a-restricted-time/m-p/116980#M31138</guid>
      <dc:creator>nikos_d</dc:creator>
      <dc:date>2015-05-27T00:37:05Z</dc:date>
    </item>
  </channel>
</rss>

