Splunk Search

How to use a large search result set to do quick, ad hoc transforming searches from the Search App?

neiljpeterson
Communicator

I am not sure if I am even wording this question correctly (which is probably why I didn't find any good results).

What is the best practice for working with large result sets in Splunk without rerunning the search every time you make a small change?

We all know how to use the HiddenPostProcess module in dashboards and do multiple transforming searches off a base search, but how do I do the same thing with ad hoc searches?

I have a very large base search that takes tens of seconds (or even minutes) to run, but I want to be able to adjust (experiment with, fiddle with, etc.) the transforming searches faster than that. I don't care if the data is fresh, but I do need a significant chunk of the large result set to produce meaningful results.

TLDR: When using the Search App, can I "cache" the base search, and rerun ad hoc transforming searches against it?

I am sure there is a way to do what I am describing, but I can't find it.

Can anyone help me out?

Thanks!

0 Karma

woodcock
Esteemed Legend

I have 2 suggestions.

The free answer: Instant Pivot. Make sure your base search has everything that you need and do all of the post-pipe work with pivot (this should work, but it is untested; IMHO this is the main reason this brand-new feature was created):

http://docs.splunk.com/Documentation/Splunk/6.2.5/ReleaseNotes/MeetSplunk#Instant_pivot

The expensive answer: Tableau. Tableau has a Splunk connector that will allow you to run a base search and pull the raw events back into memory (you can save this to a local file, too!) and do your post-pipe stuff inside Tableau.

0 Karma

David
Splunk Employee
Splunk Employee

In my tests, instant pivot doesn't give you those benefits. When I open in search, it actually reverts to a stats command, which doesn't leverage any of the caching of pivot. But pivot on an accelerated data model would definitely give a lot of this benefit -- check out my .conf preso from last year for an example of how to do that: http://conf.splunk.com/sessions/2014/conf2014_DavidVeuve_Splunk_UsingTrack_SecurityNinjutsu.pdf

0 Karma

woodcock
Esteemed Legend

Bummer; now I have absolutely no reason to try instant-pivot. Why in the world would they not implement it with any caching?!?!?!?!

0 Karma

David
Splunk Employee
Splunk Employee

Well, a lot of the goal of Instant Pivot is to give you an easy on-ramp to using pivot. Once you save the data model, you'll have a formal data model that does the caching, can be accelerated, etc. Instant Pivot lets users who might not normally play with pivot create data models easily.

0 Karma

MuS
SplunkTrust
SplunkTrust

Hi neiljpeterson,

You can create a scheduled saved search from your base search and load its latest results like this:

| loadjob savedsearch="<user>:<app>:NameOfYourSavedSearch" | ...

See docs for more details http://docs.splunk.com/Documentation/Splunk/6.2.6/SearchReference/Loadjob
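For example, assuming a scheduled saved search named BigBaseSearch owned by admin in the search app (hypothetical names), a quick ad hoc transform against its cached results could look like this:

| loadjob savedsearch="admin:search:BigBaseSearch" | stats count BY host | sort - count

Each time you tweak the part after loadjob, Splunk only reruns that cheap post-processing, not the expensive base search.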

Hope this helps ...

cheers, MuS

David
Splunk Employee
Splunk Employee

You can also use loadjob without a scheduled search. If you run your base search, you will get a search id (you can grab it from the URL or search inspector). You can reference that search id in loadjob, which I've done a bunch of times. You can even click the Share (or Send Job to Background) buttons in your original search, which will up the TTL of your search to 7 days. That means you can continue to manipulate those results over a period of many days.
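As a sketch, supposing the job inspector shows a search id like 1436311536.1234 (a hypothetical SID), you could keep rerunning different transforms against those cached results:

| loadjob 1436311536.1234 | timechart span=1h count BY status

Only the timechart portion is recomputed; the events come straight from the job artifact on the search head.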

Keep in mind that it usually makes sense to have your original search return a first level of base statistics (rather than the raw events) so that the amount of data written to, and read from, disk on the search head stays manageable -- your loadjob will be faster if it doesn't have to read 4 GB of search results in order to do your processing.
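To illustrate that pattern (hypothetical index and field names), the expensive base search could emit hourly pre-aggregates once:

index=web sourcetype=access_combined | bin _time span=1h | stats count BY _time, host, status

and subsequent ad hoc searches can then roll those small summary rows up cheaply:

| loadjob <sid_of_base_search> | stats sum(count) AS count BY status

The second search reads a few thousand summary rows instead of millions of raw events.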

MuS
SplunkTrust
SplunkTrust

Thanks for pointing that out! But for the sake of simplicity, a scheduled saved search referenced by name with loadjob would be easier in a dashboard.

0 Karma