Splunk Search

Creating a New Search on Data Gathered by a Previous Search

landen99
Motivator

Is it possible to take the search results from a report that ran the night before and pipe them into a new search? The old report included a fields clause to ensure that the fields needed for later post-processing were present in the results. I read about combining a new search with append, but when I paste the sid from the report's Job Inspector page into oldsearchid (below) and run over "All Time", there are no matches, and the search seems to take as long as if the old search were running again.

newsearch | append [ loadjob oldsearchid ] | post_processing_search_terms

How does the append command work, and how is it used correctly with reports to build various post-processing views of the data?

Follow-on points:

| loadjob oldsearchid | post_processing_search_terms

This search does the trick, with some unexpected twists: the results are not displayed like those of a normal search, and not all of them are included. In one loadjob search, the Events tab contained no results, while the Statistics tab contained 260 of the expected 529 events.

1 Solution

landen99
Motivator

Based on somesoni2's comments, to take the saved results of a completed job with sid=1396375083.9938 and post-process only those results, use one of the following searches:

| loadjob 1396375083.9938 | post_processing_search_terms

| loadjob savedsearch="me99:search:My Saved Search" | post_processing_search_terms

depending on whether or not the search is saved. By contrast, the following search simply executes the saved search again, exactly as if you had gone to the saved search under Settings > Searches and reports and clicked "Run":

| savedsearch "My Saved Search"

The key is to start the search with the pipe symbol | and nothing before it. The second key, for saved searches, is to use the saved search's exact name, as it appears above the search bar or at the start of its row in the saved-search listing.

The first search in this answer ran extremely fast because the results had already been pulled from the data and were ready to be filtered and manipulated. The third search behaved as if it were starting the search from scratch, so I really don't know what to make of that.

somesoni2
Revered Legend

Search results of a completed (non-expired) job are saved locally under $SPLUNK_HOME/var/run/splunk/dispatch/, in a folder named after the job id. Inside that folder are various files with information about the job; results.csv.gz holds the search results in CSV format, with the field names as the header row. The loadjob command simply reads the results from this file and makes them available, with field names, so they can be used for post-processing.
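As a rough illustration of the file layout somesoni2 describes (not Splunk code; the sid, fields, and values below are made up), a job's results file is just a gzipped CSV whose first row is the field names. This sketch fabricates a tiny results.csv.gz in a temporary dispatch-style folder and reads it back with standard-library tooling:

```python
import csv
import gzip
import os
import tempfile

# Hypothetical dispatch layout: $SPLUNK_HOME/var/run/splunk/dispatch/<sid>/results.csv.gz
# We fabricate a tiny results file to show the format, then read it back.
dispatch = tempfile.mkdtemp()
job_dir = os.path.join(dispatch, "1396375083.9938")  # sid from the thread, as the folder name
os.makedirs(job_dir)
results_path = os.path.join(job_dir, "results.csv.gz")

# First row is the header of field names, as in Splunk's results.csv.gz.
with gzip.open(results_path, "wt", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["_time", "host", "status"])
    writer.writerow(["2014-04-01T12:00:00", "web01", "200"])
    writer.writerow(["2014-04-01T12:00:05", "web02", "404"])

# loadjob effectively re-exposes these rows, with field names, for post-processing.
with gzip.open(results_path, "rt", newline="") as f:
    rows = list(csv.DictReader(f))

print(rows[0]["host"])  # web01
print(len(rows))        # 2
```

This also makes it clear why loadjob is fast: it only re-reads an already-computed file rather than re-running the search.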

somesoni2
Revered Legend

I tested it, and it seems you can't use an eventtype on the output of the loadjob command, even with events=t.

landen99
Motivator

Let's say that you have an eventtype defined with several search terms, but there is no field called eventtype. Does the loadjob command prevent use of the eventtype and require that those search terms be entered manually?

somesoni2
Revered Legend

The syntax is correct here, but it will work only when there is a field named eventtype in the search results of the specified job.

landen99
Motivator

The big question is, "How do we search/process only one particular saved search results file?"

Let's say that I want to apply the following filter to the following saved job only: eventtype=misc

Let's say that when I use the job inspector to locate a job identifier on a saved job, I find the following value: scheduler_c2dsNzQ4X2E_search_RMD528b25b5bc01cf29d_at_1396306800_5538

What would the search look like? Would it look like the following?

loadjob scheduler_c2dsNzQ4X2E_search_RMD528b25b5bc01cf29d_at_1396306800_5538 | search eventtype=misc

landen99
Motivator

I took the liberty of putting your comments into the form of an answer to this question. Please let me know if the answer needs any corrections.

landen99
Motivator

I am still struggling to use this command with consistent success. The documentation is helpful, but it is not enough: http://docs.splunk.com/Documentation/Splunk/6.0.1/SearchReference/Loadjob

I think that this function needs a lot more development.

somesoni2
Revered Legend

The savedsearch command simply executes a saved search's query. One of the reasons to create a saved search is to reuse a search query, and the savedsearch command enables that reuse.

somesoni2
Revered Legend

Yes... as long as you get results from the base search (in this case, loadjob), you can post-process those results.
Also, if the old job is a scheduled job and you want its most recent run, you can use '| loadjob savedsearch="owner:app:YourSavedSearch"'.

landen99
Motivator

Let's say that I just do the following:

loadjob oldsearchid | post_processing_search_terms

Would that allow me to take an old search and do something just to its results? In other words, to search on data already gathered by a report?

somesoni2
Revered Legend

The post-processing will apply to the whole result set (newsearch plus the old job). You can change the order of your searches, but you have to use append or some other means to combine them.
new search | append [loadjob oldsearchid] | post process
OR
| loadjob oldsearchid | append [new search] | post process
OR
| multisearch [new search] [loadjob oldsearchid] | post process

landen99
Motivator

What if the search was changed to something like the following?

replace [ loadjob oldsearchid ] | newsearch |  post_processing_search_terms

Would that work?

landen99
Motivator

So is append simply saying: add the oldsearchid results to the newsearch results? For example: are the new search results, from parsing all of the data in the database, joined with the old search results, with post-processing applied to both? Or does it limit the new search to filtering and processing only the old search's results?

somesoni2
Revered Legend

The append command will just append the results of the subsearch (loadjob) onto the current results (from newsearch). You can then do any post-processing based on the fields available.
new search:
fieldA1, fieldB1, fieldC1
fieldA2, fieldB2, fieldC2

sub search:
fieldA1, fieldB1, fieldD3
fieldA3, fieldB3, fieldD3

new search | append [sub search]:
fieldA1, fieldB1, fieldC1
fieldA2, fieldB2, fieldC2
fieldA1, fieldB1, NULL, fieldD3
fieldA3, fieldB3, NULL, fieldD3
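The behavior sketched above can be mimicked outside Splunk. This is a rough Python model (not Splunk code; the field names and values are the made-up ones from the example) of how append concatenates two result sets, leaving a field effectively NULL in any row that never had it:

```python
# Rough model of Splunk's append: the subsearch rows are concatenated after
# the base rows; each row keeps only the fields it actually has.
def append(base, subsearch):
    return base + subsearch

new_search = [
    {"fieldA": "A1", "fieldB": "B1", "fieldC": "C1"},
    {"fieldA": "A2", "fieldB": "B2", "fieldC": "C2"},
]
sub_search = [
    {"fieldA": "A1", "fieldB": "B1", "fieldD": "D3"},
    {"fieldA": "A3", "fieldB": "B3", "fieldD": "D3"},
]

combined = append(new_search, sub_search)

# Post-processing sees the union of fields; rows fill in None (NULL)
# for the fields they lack.
all_fields = ["fieldA", "fieldB", "fieldC", "fieldD"]
table = [[row.get(f) for f in all_fields] for row in combined]
print(table[2])  # ['A1', 'B1', None, 'D3']
```

Note that, unlike join, nothing is matched up between the two result sets; the rows are simply stacked.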

0 Karma

landen99
Motivator

Is this more appropriate as a feature request, or are there already tools for running sub-searches on existing results? Ideally, taking search results and running separate sub-searches on them would be a simple, easy, and fast process.
