Splunk SDK Python error handling on quota error

anbutech17 · ‎05-27-2020

Hi Team,
Could you please advise on the error below, which I get while testing Splunk SDK for Python code that searches a given index in Splunk?

HTTPError: HTTP 503 Service Unavailable -- Search not executed: This search could not be dispatched because the role-based disk usage quota of search artifacts for user "test01" has been reached (usage=497MB, quota=100MB). Use the [[/app/search/]] to delete some of your search artifacts, or ask your Splunk administrator to increase the disk quota of search artifacts for your role in authorize.conf., usage=497MB, quota=100MB, user=test01, concurrency_category="historical", concurrency_context="user_instance-wide".

This error comes up often, but a while later I am able to search the index again. I want to handle this error properly in the Splunk SDK so that the code runs meaningfully instead of failing.
We want to get past the quota error above and refresh or kill the jobs that are stuck or running too long.
Since we see this error all the time, we should catch it and display the list of running jobs/search queries that are queued for the user.
Also, once we have cancelled all queries and know that nothing is currently running, how can we tell that the quota has been reached, and what is currently counting against it? We don't want to submit more queries once we know the quota is exhausted.
Please advise how to handle the quota error in the code snippet below.

Splunk SDK code

The inputs are as follows:

'--search_query', 'index=some index | table field1,field2,field3 | head 1000',
'--earliest_time', ' -24h',
'--latest_time', 'now'

import json

service = splunk_connect()
splunk_search_kwargs = {"exec_mode": "blocking",
                        "earliest_time": args.earliest_time.strip(),
                        "latest_time": args.latest_time.strip(),
                        "enable_lookups": "true"}
jobs = service.jobs
# create() and results() take keyword arguments, so the option dicts are unpacked with **
splunk_search_job = jobs.create("search " + args.search_query, **splunk_search_kwargs)
result_count = int(splunk_search_job['resultCount'])
print(f'{get_dt()} - No. of rows returned from search query... {result_count}')
if result_count <= 100:
    r = splunk_search_job.results(**{"count": 100, "output_mode": "json"})
    obj = json.loads(r.read())
    sample_results = json.dumps(obj['results'], indent=4)
    print(f'{get_dt()} - displaying first 100 rows {sample_results}')
else:
    try:
        r = splunk_search_job.results(**{"output_mode": "json"})
        obj = json.loads(r.read())
        fl_nm = f'{args.save_file}/{get_dt()}.json'
        with open(fl_nm, 'w') as f:
            f.write(json.dumps(obj['results']))
    except Exception as error:
        print(error)

Thanks

Patrick_Peeters · ‎05-28-2020

Case 1
Well to catch that particular error, you'll have to do something like below. However the biggest issue you have is that you're running over the disk quota for that search. Even if you catch the error (and you can continue or retry for example) you're probably better off addressing the root cause in the first place.

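from splunklib.binding import HTTPError

try:
    job = service.jobs.create("search " + args.search_query, **splunk_search_kwargs)
except HTTPError as e:
    # try something else, or wait and try again
    print(e)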
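
The "wait and try again" branch could look roughly like the sketch below. It is not taken from the SDK documentation: `is_quota_error` and `run_with_retry` are hypothetical helpers, and the check assumes that the exception carries a `status` attribute and that its message contains the word "quota", as in the error quoted above. Retrying only helps if old search artifacts expire or are deleted in the meantime.

```python
import time


def is_quota_error(exc):
    """Return True if exc looks like the HTTP 503 disk-quota error."""
    # Assumption: the exception exposes the HTTP status code as `.status`.
    return getattr(exc, "status", None) == 503 and "quota" in str(exc).lower()


def run_with_retry(dispatch, attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call dispatch(); on a quota error, wait and retry up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return dispatch()
        except Exception as exc:  # narrow this to HTTPError in real code
            if not is_quota_error(exc) or attempt == attempts:
                raise  # not a quota problem, or out of retries
            print(f"Quota reached (attempt {attempt}/{attempts}); "
                  f"retrying in {wait_seconds}s")
            sleep(wait_seconds)
```

With the code from the question this might be called as `job = run_with_retry(lambda: service.jobs.create("search " + args.search_query, **splunk_search_kwargs))`.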

Case 2
This is a separate issue but it's pretty straightforward: you have to serialize the OrderedDict to a JSON string, which can be done with json.dumps.

import json
from collections import OrderedDict
json_payload = OrderedDict([('id', '1'), ('name', 'my_name')])
json.dumps(json_payload, default=str)
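
For the rows printed by ResultsReader, a minimal sketch of writing them out as JSON might look like this. `rows_to_json_file` is a hypothetical helper name. It keeps only dict-like items, on the assumption that the reader can also yield non-result items such as diagnostic messages.

```python
import json
from collections import OrderedDict


def rows_to_json_file(rows, path):
    """Write dict-like result rows to `path` as a JSON array; return the row count."""
    # OrderedDict is a dict subclass, so json can serialize it directly.
    records = [row for row in rows if isinstance(row, dict)]
    with open(path, "w") as f:
        json.dump(records, f, indent=4, default=str)
    return len(records)
```

In the oneshot snippet this would presumably be called as `rows_to_json_file(results.ResultsReader(oneshotsearch_results), fl_nm)`, with `output_mode` left at its default.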

Please try to use code blocks whenever you're pasting code, it's much easier to read and to assist.

anbutech17 · ‎05-28-2020

Thank you so much ppeeters

anbutech17 · ‎05-28-2020

One clarification on the except handling: you mentioned trying something else, or waiting and trying again. Can you give me some more explanation? It would help my understanding.

Could you please help me with the questions below?

1) How do we get past the quota error and refresh or kill the jobs that are stuck or running too long?
2) Is there any way to display the list of running jobs/search queries that are queued for the user?
We want to kill the long-running queries and any query that is already stuck.

Patrick_Peeters · ‎05-28-2020

I'm not too familiar with the exact syntax to refresh or kill jobs, so I can't help you with that, apart from advising you to look at try, except, continue blocks. Keep in mind though that your search artifacts are taking up ~500MB of disk space (at times, seemingly not always) against a 100MB quota, so unless you do something about the query or get the quota increased there is little you can do. Worst case, you catch the error, ignore it, and continue with any other operations the script needs to do. You can't 'undo' the quota, and restarting the same query will likely result in the same error.
I recommend looking at the API documentation or at the SDK examples; I don't know this off the top of my head.
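
A rough sketch of listing and cancelling jobs follows. It assumes that `service.jobs` can be iterated, and that each job exposes `sid`, dictionary-style access to fields such as `isDone`, `dispatchState` and `diskUsage`, and a `cancel()` method; check the SDK reference before relying on any of this. `summarize_jobs` and `cancel_unfinished` are hypothetical helper names, and `diskUsage` is assumed to be a byte count returned as a string.

```python
def summarize_jobs(jobs):
    """Return (sid, state, disk_usage_mb) for each job, plus the total MB used."""
    rows = []
    total_mb = 0.0
    for job in jobs:
        # Assumption: diskUsage is a string holding a byte count.
        usage_mb = int(job["diskUsage"]) / (1024 * 1024)
        rows.append((job.sid, job["dispatchState"], round(usage_mb, 1)))
        total_mb += usage_mb
    return rows, round(total_mb, 1)


def cancel_unfinished(jobs):
    """Cancel every job that has not finished; return the cancelled sids."""
    cancelled = []
    # Copy to a list first so cancelling does not disturb the iteration.
    for job in list(jobs):
        if job["isDone"] != "1":  # field values are assumed to be strings
            job.cancel()
            cancelled.append(job.sid)
    return cancelled
```

Usage would be along the lines of `rows, total_mb = summarize_jobs(service.jobs)` followed by `cancel_unfinished(service.jobs)`. Finished jobs keep their artifacts until they expire, so they may still count towards the total even when nothing is running.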

anbutech17 · ‎05-28-2020

Thank you sir

anbutech17 · ‎05-28-2020

Do you think the search fails when two people try to execute it simultaneously?
Will it support parallel execution by different users?
For example, I have a 100 MB quota. How does searching indexes in parallel work in this case?

Patrick_Peeters · ‎05-28-2020

It depends on how you set up and run the scripts (e.g. inside Splunk or from outside Splunk) and on who is executing them. There are too many variables to discuss on the forum; I recommend reading the documentation to see what's possible and how to do it.

anbutech17 · ‎05-28-2020

Thanks ppeeters.

Please help me with the following error from a Splunk SDK search using the code below. How do we handle it?
print(error) is not showing anything now. Sometimes I get the following error, but if I search again a few minutes later it works fine. I want to get past the quota error in the code.

Case 1:

HTTPError: HTTP 503 Service Unavailable -- Search not executed: This search could not be dispatched because the role-based disk usage quota of search artifacts for user "test01" has been reached (usage=497MB, quota=100MB). Use the [[/app/search/]] to delete some of your search artifacts, or ask your Splunk administrator to increase the disk quota of search artifacts for your role in authorize.conf., usage=497MB, quota=100MB, user=test01, concurrency_category="historical", concurrency_context="user_instance-wide".

formatted code:

service = splunk_connect()

splunk_search_kwargs = {
    "exec_mode": "blocking",
    "earliest_time": args.earliest_time,
    "latest_time": args.latest_time,
    "enable_lookups": "true"
}
# job creation as in the first post
jobs = service.jobs
splunk_search_job = jobs.create("search " + args.search_query, **splunk_search_kwargs)
result_count = int(splunk_search_job['resultCount'])
try:
    if result_count <= 100:
        r = splunk_search_job.results(**{"count": 100, "output_mode": "json"})
        obj = json.loads(r.read())
        sample_results = json.dumps(obj['results'], indent=4)
        print(f'{get_dt()} - displaying first 100 rows {sample_results}')
    else:
        r = splunk_search_job.results(**{"output_mode": "json"})
        obj = json.loads(r.read())
        fl_nm = f'{args.save_file}/{get_dt()}.json'
        with open(fl_nm, 'w') as f:
            f.write(json.dumps(obj['results']))
except Exception as error:
    print(error)

Case 2:
oneshot search options:

The oneshot search does not work with "output_mode": "json". If I remove output_mode, each result is returned as an OrderedDict, in the following format:

output:
OrderedDict([('field1', '1.2.3.3'), ('field2', '8.7.1.0'), ('field3', 'sample text msg')])
OrderedDict([('field1', '1.2.3.3'), ('field2', '8.7.1.0'), ('field3', 'sample text msg.')])
OrderedDict([('field1', '1.2.3.3'), ('field2', '8.7.1.0'), ('field3', 'sample text msg')])

How do we get the output in JSON format? I want to write the JSON data to a file, but it is not working properly. It would be helpful if you could give me some advice on this.

sample code snippet:

service = splunk_connect()
kwargs_oneshot = {"search_mode": "normal",
                  "count": 0,
                  "output_mode": "json",
                  "earliest_time": args.earliest_time.strip(),
                  "latest_time": args.latest_time.strip()
                  }
searchquery_oneshot = "search " + args.search_query
try:
    oneshotsearch_results = service.jobs.oneshot(searchquery_oneshot, **kwargs_oneshot)
    # Get the results and display them using the ResultsReader
    reader = results.ResultsReader(oneshotsearch_results)
    for item in reader:
        print(item)
except Exception as error:
    print(error)

Thanks

Patrick_Peeters · ‎05-28-2020

The code is hard to read without formatting, but you might have to move the try part up a bit. Also, what is print(error) showing?
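
Since the message says the search "could not be dispatched", the error is presumably raised by `jobs.create` itself, so the `try` has to cover that call. A minimal sketch under that assumption is below. `parse_quota` and `create_job_safely` are hypothetical names, and the regular expression is based only on the message text quoted in this thread.

```python
import re

# Matches the "usage=497MB, quota=100MB" part of the error message.
QUOTA_RE = re.compile(r"usage=(\d+)MB, quota=(\d+)MB")


def parse_quota(message):
    """Return (usage_mb, quota_mb) from a quota error message, or None."""
    match = QUOTA_RE.search(message)
    return (int(match.group(1)), int(match.group(2))) if match else None


def create_job_safely(jobs, query, **kwargs):
    """Dispatch a search; return None instead of raising on a quota error."""
    try:
        return jobs.create(query, **kwargs)
    except Exception as error:  # narrow this to HTTPError in real code
        quota = parse_quota(str(error))
        if quota is None:
            raise  # some other failure: let it propagate
        usage_mb, quota_mb = quota
        print(f"Search not dispatched: {usage_mb}MB used of {quota_mb}MB quota")
        return None
```

The caller can then check for `None` and skip the result handling, rather than submitting more searches while the quota is exhausted.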

anbutech17 · ‎05-28-2020

Thanks ppeeters for looking into this. Please advise on the following two cases.

Case 1:

How do we handle the following error in the Splunk SDK code? The error comes up often, and a few minutes later the search works fine. print(error) is not printing anything now, but sometimes I get the following error.

HTTPError: HTTP 503 Service Unavailable -- Search not executed: This search could not be dispatched because the role-based disk usage quota of search artifacts for user "test01" has been reached (usage=497MB, quota=100MB). Use the [[/app/search/]] to delete some of your search artifacts, or ask your Splunk administrator to increase the disk quota of search artifacts for your role in authorize.conf., usage=497MB, quota=100MB, user=test01, concurrency_category="historical", concurrency_context="user_instance-wide".

splunk_search_kwargs = {
    "exec_mode": "blocking",
    "earliest_time": args.earliest_time.strip(),
    "latest_time": args.latest_time.strip(),
    "enable_lookups": "true"
}

jobs = service.jobs
splunk_search_job = jobs.create("search " + args.search_query, **splunk_search_kwargs)
try:
    r = splunk_search_job.results(**{"output_mode": "json"})
    obj = json.loads(r.read())
    fl_nm = "some path"
    with open(fl_nm, 'w') as f:
        f.write(json.dumps(obj['results']))
except Exception as error:
    print(error)

Case 2:

Also, with the oneshot search, output_mode json is not working properly. If I remove "output_mode": "json", the output is returned as OrderedDicts. How do we convert it to JSON? I want to write the output as JSON data to a file.

OrderedDict([('dvc', '1.2.3.3'), ('dest_ip', '8.7.1.0'), ('message', 'sample text msg')])
OrderedDict([('dvc', '1.2.3.3'), ('dest_ip', '8.7.1.0'), ('message', 'sample text msg.')])
OrderedDict([('dvc', '1.2.3.3'), ('dest_ip', '8.7.1.0'), ('message', 'sample text msg')])

kwargs_oneshot = {"search_mode": "normal",
                  "count": 0,
                  "output_mode": "json",
                  "earliest_time": args.earliest_time.strip(),
                  "latest_time": args.latest_time.strip()
                  }
try:
    oneshotsearch_results = service.jobs.oneshot(searchquery_oneshot, **kwargs_oneshot)
    # Get the results and display them using the ResultsReader
    reader = results.ResultsReader(oneshotsearch_results)
    for item in reader:
        print(item)
except Exception as error:
    print(error)

Thanks
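
On the JSON question in Case 2: `results.ResultsReader` parses XML, so it is not expected to work when `output_mode` is set to `json`. One option, sketched below, is to decode the response body directly. This assumes that the stream returned by `oneshot` has a `read()` method and that the JSON body carries the rows under a `results` key, as the `obj['results']` access in Case 1 suggests. `read_json_results` and `save_json_results` are hypothetical helper names.

```python
import json


def read_json_results(stream):
    """Decode a JSON search response and return its list of result rows."""
    payload = json.loads(stream.read())
    return payload.get("results", [])


def save_json_results(stream, path):
    """Write the result rows from `stream` to `path`; return the row count."""
    rows = read_json_results(stream)
    with open(path, "w") as f:
        json.dump(rows, f, indent=4)
    return len(rows)
```

With the snippet above this might be used as `save_json_results(oneshotsearch_results, fl_nm)` in place of the ResultsReader loop.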
