Splunk Search

Best way to merge results? (ver 5.0.5)

yuwtennis
Communicator

Hi!

I would like some advice on how to merge two results.

I have a search as below.

index=A [
search index=A
.....
| fields a b
]

The parent search takes the fields a and b from the subsearch and searches index A again.
However, this is a bit slow when the subsearch returns thousands of results.

As a workaround, I believe you can merge the results in either of the following ways.

  1. Combination of a lookup table and an inner join
    index=A [
    search index=A
    .....
    | fields a b
    | outputlookup hoge.csv
    | return ""
    ]
    | join type=inner a b [| inputlookup hoge.csv]

  2. Use map
    index=A
    ......
    | fields a b
    | map search="search index=A a=$a$ b=$b$" maxsearches=xxxxxx

Since the map command depends heavily on the number of results it is given, I prefer the combination of join and a lookup table.

What would be the best way to merge the results?

Thanks,
Yu


martin_mueller
SplunkTrust

For filtering a search based on a different search's results, your first approach is usually best.

Let's make up a realistic example: You have events that form a transaction with some transaction_id... somewhere down the line of that transaction there is a user field, and you want to grab the transactions for user=yuwtennis.
A slow search would go like this:

sourcetype=transactions | transaction transaction_id | search user=yuwtennis

That'll build ALL the transactions and then throw out most of them.

Pre-filtering like this doesn't work if the user field isn't present in every event:

sourcetype=transactions user=yuwtennis | transaction transaction_id

So you'll have to pick out the transaction_id values you need before you build the transaction:

sourcetype=transactions [search sourcetype=transactions user=yuwtennis | dedup transaction_id | fields transaction_id] | transaction transaction_id

That will take a bit more time due to running two searches, but will almost always be miles faster than the first naïve search.

Your workaround #1 looks slow because joining will always be very slow compared to filtering before loading events.
Your workaround #2 is probably going to be worse: as you say, the subsearch may return thousands of values, so map would have to run thousands of searches, and that can't be fast.
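
Applied to the original question, the same pre-filtering idea might look like the sketch below. The ..... stands for the filter terms that were left out of the question and is not filled in here. The dedup a b step is an assumption, added only to keep the subsearch output small.

```spl
index=A
    [ search index=A .....
      | dedup a b
      | fields a b ]
```

The subsearch hands its rows back to the outer search as an implicit filter of the form ( ( a="..." AND b="..." ) OR ( a="..." AND b="..." ) ... ). Running the inner search on its own with | format appended should show the exact expression. Subsearches are capped by limits.conf (as far as I know, 10,000 results and 60 seconds by default), so with many thousands of a/b pairs it is worth checking that the filter isn't being silently truncated.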


lguinn2
Legend

I am unclear about why you want to "merge results".

I can't figure out why you can't simply do the search on index=A and be done. More details are needed to figure out the best approach.
