Splunk Search

Is there a way to save the results for parts of a search so when I modify the tail end, I don't have to run the whole search?

CREVITCH
Path Finder

I am executing the following search and it is taking a long time to run. Is there a way to save the results of part of a search so that when I modify the tail end I don't have to run the whole search again? I.e. can I save the results of user=* | dedup _raw and then run those saved results through subsequent searches?

user=* | dedup _raw | transaction user date_minute date_second
1 Solution

jeffland
SplunkTrust

To save an intermediate result, you could also use

some search | outputlookup temp.csv

and from here on start a new search with

| inputlookup temp.csv | continue search

If "some search" is complex and time-consuming and you just want to experiment with different versions of "continue search", this method lets you do so without re-running the expensive part. The one thing to watch for is an intermediate result set that is too large for a .csv file (say, several hundred thousand rows).
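
Applied to the search in the question, this might look like the sketch below. It is untested, and temp.csv is just a scratch lookup name. The first line is run once; the second is a new, separate search that you can modify freely. It is worth checking that the fields the later steps need (for example _time, which transaction uses to order events) actually come back from inputlookup.

```spl
user=* | dedup _raw | outputlookup temp.csv

| inputlookup temp.csv | transaction user date_minute date_second
```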

woodcock
Esteemed Legend

Use | outputcsv to send to disk and then use | inputcsv to pull back in. You can also use Tableau which has a Splunk connector so you can pull in your raw data and save to disk and then do all of the "stuff" to it from the disk image.
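
A minimal sketch of the outputcsv / inputcsv round trip for the search in the question (untested; the file name temp is an arbitrary placeholder). Run the first search once, then start each follow-up search from the second one:

```spl
user=* | dedup _raw | outputcsv temp

| inputcsv temp | transaction user date_minute date_second
```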

BernardEAI
Communicator

Thanks for this interesting suggestion. 

I have tried applying this, but I'm getting strange results: consecutive identical searches return different results. My suspicion is that different parts of the search are performed asynchronously, so that data from an earlier version of temp.csv is read before the new version is written.

Could this be possible?

Note: I'm using "| inputlookup temp.csv" inside a subsearch. Maybe the subsearch is executed asynchronously with the main search?

UPDATE: after looking at the Splunk documentation on subsearches, I read this: "The subsearch is in square brackets and is run first." This explains the strange behaviour.
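
For illustration, the pattern that can trigger this looks roughly like the following. This is a hypothetical search; join and the user field are chosen only as an example. Because the bracketed subsearch runs first, inputlookup reads whatever temp.csv held before this run, and only afterwards does outputlookup overwrite it:

```spl
some search
| outputlookup temp.csv
| join user [| inputlookup temp.csv]
```

Writing the lookup in one search and reading it in a separate, later search avoids the ordering problem.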

javiergn
SplunkTrust

Apply filtering as soon as possible and do not use transaction unless you have to.
Specify your index name and sourcetype because it will speed things up.
Also restrict your search by time using earliest and latest.

If you post the whole query I can try to be more specific:

index=foo sourcetype=bar user=* 
| fields user date_minute date_second
| stats list(user) by date_minute, date_second
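
The time restriction mentioned above could be added to the same search with the earliest and latest time modifiers. The 24-hour window here is only an example value, and index=foo and sourcetype=bar remain placeholders:

```spl
index=foo sourcetype=bar user=* earliest=-24h latest=now
| fields user date_minute date_second
| stats list(user) by date_minute, date_second
```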

Let me know if that helps

CREVITCH
Path Finder

If I only have one index and one sourcetype, will this speed things up? I want to look at all events, and not just within a time window.

Is there a way to reuse the results of a search?

javiergn
SplunkTrust

Even if there's only one index and one sourcetype it's always better to be as specific as possible and apply that filter as early as possible in your query.

You can reuse the results of a search in several ways, but which one fits depends on what you are trying to achieve. If you give us more details, we might be able to help.

For instance, you can use subsearches, output and inputcsv, collect, etc.
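
As one example of the collect route, the deduplicated events could be written to a summary index once and searched from there afterwards. This is an untested sketch: my_summary is a hypothetical index name that would have to exist already, foo and bar are placeholders, and "continue search" stands for whatever you want to run next. Check which fields are still available on the stored events before relying on them.

```spl
index=foo sourcetype=bar user=* | dedup _raw | collect index=my_summary

index=my_summary | continue search
```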

CREVITCH
Path Finder

The dedup _raw takes so long that I am hoping to store its result and pipe it to subsequent searches. I need this step because I have many duplicate events for some reason.

javiergn
SplunkTrust

But why do you need to dedup the whole RAW event if you are then only using the following three fields: user date_minute date_second?

Doesn't the following query work for you?

index=foo sourcetype=bar user=* 
 | fields user date_minute date_second
 | stats list(user) by date_minute, date_second

Or the alternative that uses values instead of list to remove duplicates:

index=foo sourcetype=bar user=* 
 | fields user date_minute date_second
 | stats values(user) by date_minute, date_second

somesoni2
SplunkTrust

You'd probably achieve the same result by using just the stats command, which will be much faster. What is the search requirement here?

CREVITCH
Path Finder

I am looking to group events by transaction. Will the stats command do this for me?

I have a lot of events. By doing user=*, I narrow it to login events, since they have a user field. I end up with duplicate events, so I run them through dedup. Finally, I am left with events, some of which group together (e.g. password accepted and session opened). This is why I want to group them as transactions: I want to preserve the individual events, but also know the number of independent transactions.

It would be nice to know if there is a way to re-use the results of previous searches. Is there a way to do this?

somesoni2
SplunkTrust

Which fields are you interested in: all of them, or just _raw?

As @javiergn mentioned, restrict your base search by specifying index/sourcetype/source etc. To remove duplicates and group events by user, date_minute, and date_second, try this stats approach:

index=blah sourcetype=blah user=*
| stats latest(user) as user latest(date_minute) as date_minute latest(date_second) as date_second by _raw
| stats list(_raw) as _raw by user date_minute date_second

If you want to preserve more fields, add them to both stats commands in a similar way.
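
For example, carrying one more field through both stats commands might look like this. host is used purely as an illustration; substitute whichever field you need:

```spl
index=blah sourcetype=blah user=*
| stats latest(user) as user latest(date_minute) as date_minute latest(date_second) as date_second latest(host) as host by _raw
| stats list(_raw) as _raw list(host) as host by user date_minute date_second
```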
