
Python SDK: How to retrieve search results by saved search name?

vickypandya
Engager

Hi folks,
I am new to Python and Splunk. I have been trying to get saved search results via the Splunk SDK for Python. I have tried job.py (an example in the SDK), which outputs the SID for every search job; the SID can then be used to find the search name and fetch the results.

I have also tried a GET against /services/search/jobs, which returns a list of all the jobs, but that is a ton of XML to parse just to find the desired search name.

Are there any other approaches to get saved search results by search name rather than by search ID? If not, what options are available through the SDK?

Any help is much appreciated.

Thanks

sklass
Path Finder

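Here is a loose example of how to do this.

    import datetime
    import json
    import logging

    from splunklib import client
    from splunklib.binding import HTTPError

    log = logging.getLogger(__name__)

    search_params = {'name': "Some lame search",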
                     'search': "<FILL ME IN>",
                     'dispatch.ttl': 60 * 60 * 24 * 7 }

    search_params_update = {
        'description': 'Some description',
        'is_scheduled': True,
        'cron_schedule': '0 1 * * *',      # Daily at 1am
        'schedule_window': 120,
    }

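    # SplunkAuth is assumed to be a namedtuple instance holding the keyword
    # arguments for client.connect (host, port, username, password).
    credentials = SplunkAuth._asdict()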
    service = client.connect(**credentials)

    try:
        saved_search = service.saved_searches.create(**search_params)
    except HTTPError as err:
        if "A saved search with that name already exists." not in "{}".format(err):
            log.warning("Unable to set off search - {}".format(" :: ".join("{}".format(err).split("\n"))))
            raise
        else:
            saved_search = service.saved_searches[search_params.get('name')]
            update_required = False
            for k, v in search_params_update.items():
                if saved_search.content.get(k) != v:
                    update_required = True
                    break
            if update_required:
                saved_search.update(**search_params_update).refresh()
    else:
        saved_search.update(**search_params_update).refresh()

    # Do we have a job that is ready to go..
    job_data = json.load(service.jobs.get(output_mode='json').get('body'))
    completed_jobs = [x for x in job_data.get('entry') if x.get('content', {}).get('label') == search_params['name']
                      and x.get('content', {}).get('isDone')]
    try:
        latest = completed_jobs[0]
        last_update = datetime.datetime.strptime(latest.get('published').rpartition("-")[0], "%Y-%m-%dT%H:%M:%S.%f")
        if (datetime.datetime.now() - last_update).total_seconds() > 60 * 60 * 12:
            log.info("Launching new job it's pretty old. {}".format(last_update))
            saved_search.dispatch()
        log.info("Getting latest completed job {}".format(latest.get('updated')))
        job = service.jobs[latest.get('content').get('sid')]
    except (IndexError, KeyError):
        # No completed job was found (an empty list raises IndexError),
        # so check what is in progress.
        in_process_jobs = [x for x in job_data.get('entry') if
                          x.get('content', {}).get('label') == search_params['name']
                          and not x.get('content', {}).get('isDone')]
        if not in_process_jobs:
            saved_search.dispatch()
            log.info("New Job has been dispatched")
            return {'message': "Job has been dispatched"}
        else:
            in_process_job = in_process_jobs[-1]
            log.info("Job previously dispatched and is at {:.2%}".format(
                in_process_job.get('content', {}).get('doneProgress')))
            return {'message': "Job previously dispatched and is at {:.2%}".format(
                in_process_job.get('content', {}).get('doneProgress'))}

hexx
Splunk Employee

To add to Andrea's answer, search results can only be retrieved by referencing the search ID of your search from the /services/search/jobs/{search_id} endpoint and its sub-nodes such as /services/search/jobs/{search_id}/results.

For more detailed information, take a look at the endpoints listed for /services/search/jobs.

You should be able to achieve this goal with this sort of pseudo-code:

  • List all search jobs with a GET against /services/search/jobs/
  • Identify the search jobs that match the saved search name that you are looking for (isSaved=1 AND label={saved search name})
  • Pick the most recent search job for your saved search. It will be the one with the most recent epoch time embedded in its search ID. Example: sid=admin__admin__search_dGVzdCA0_1343881451.4909
  • Use that SID to access the results of your search with a GET against /services/search/jobs/{search_id}/results
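
The steps above can be sketched in Python roughly as follows. This is only a sketch under a few assumptions: the job listing was requested with output_mode=json and parsed into a list of entry dicts whose content holds label and sid, and each SID ends in _<epoch> like the example above. The function names are made up for illustration, other SID layouts are not handled, and the isSaved check from the list can be added to the filter if you need it.

```python
def epoch_from_sid(sid):
    """Return the epoch time at the end of a SID (assumed layout: ..._<epoch>)."""
    return float(sid.rsplit("_", 1)[-1])


def latest_job_sid(entries, saved_search_name):
    """Return the SID of the newest job whose label matches, or None."""
    sids = [
        entry["content"]["sid"]
        for entry in entries
        if entry.get("content", {}).get("label") == saved_search_name
    ]
    if not sids:
        return None
    # The newest job is the one with the largest embedded epoch time.
    return max(sids, key=epoch_from_sid)


def results_path(sid):
    """Build the REST path that serves the results of a given job."""
    return "/services/search/jobs/{}/results".format(sid)


# Hypothetical sample data shaped like entries from the jobs listing.
sample_entries = [
    {"content": {"label": "My search",
                 "sid": "admin__admin__search_dGVzdCA0_1343881451.4909"}},
    {"content": {"label": "My search",
                 "sid": "admin__admin__search_dGVzdCA0_1343885051.5001"}},
    {"content": {"label": "Other search",
                 "sid": "admin__admin__search_b3RoZXI_1343889999.1000"}},
]
newest = latest_job_sid(sample_entries, "My search")
print(results_path(newest))
```

The filtering is kept separate from the HTTP call so it can be reused whether the listing comes from a raw GET or from the SDK.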

Note that these tasks can be made easier by using one of our SDKs such as the Python SDK.

You'll probably want to read more about the "job" and "jobs" classes, along with their methods, in the Python SDK reference documentation.

apruneda_splunk
Splunk Employee

Check out the topic: "How to search your data using the Python SDK".

There are code examples that show how to run a saved search and see the results, and how to list your search jobs and get those results. The beginning of the topic explains the difference between a saved search and a search job.
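
As a rough illustration of the "run a saved search and see the results" case, here is a minimal sketch. It assumes service is the object returned by splunklib.client.connect(...); the function name run_saved_search and the poll_seconds parameter are hypothetical. It looks the saved search up by name, dispatches it, waits for the job to finish, and parses the results as JSON.

```python
import json
import time


def run_saved_search(service, name, poll_seconds=2):
    """Dispatch the saved search called `name` and return its result rows.

    `service` is expected to be what splunklib.client.connect(...) returns.
    """
    saved_search = service.saved_searches[name]  # KeyError if the name is unknown
    job = saved_search.dispatch()                # starts a new search job
    while not job.is_done():                     # poll until the job has finished
        time.sleep(poll_seconds)
    # Request JSON so the body can be parsed with the standard library;
    # count=0 asks for all results instead of the default page size.
    body = job.results(output_mode="json", count=0).read()
    return json.loads(body).get("results", [])
```

With a connected service this could be called as run_saved_search(service, "My saved search"). Note that it always starts a new job, so you only ever deal with the saved search name and never have to look up a SID yourself.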

However, for a job, the SID is very important. You could have many jobs resulting from one saved search, so the name of the saved search is not a unique identifier. But if you want to see the names of the search for each search job, you could modify the code sample for listing the search jobs (which lists each job.sid) and have it display the job's name (job.name).
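
A small sketch of that kind of listing is below. It assumes service comes from splunklib.client.connect(...) and that the saved search name is exposed as the label field of each job's content, as in the other code in this thread; the function name jobs_by_label is hypothetical.

```python
def jobs_by_label(service):
    """Group the SIDs of the current search jobs by their label."""
    grouped = {}
    for job in service.jobs:
        # Jobs without a label (for example ad hoc searches) share one bucket.
        label = job.content.get("label") or "(no label)"
        grouped.setdefault(label, []).append(job.sid)
    return grouped
```

Because one saved search can map to several SIDs, the result is a dict of lists rather than a single SID per name.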
