Splunk Search

Subsearch for multiple sourcetypes and fieldnames

DerBastler
New Member

I need to do a search in two different sourcetypes and use the result to do additional searches in these queries.
But I have the problem that, while both sourcetypes have similar values, they use different prefixes. So in sourcetype=A the ip is called aIP and in sourcetype=B the ip is called bIP respectively.

So you could search with
aIP=”192.168.0.1” OR bIp=”192.168.0.1”

However if you want to use these ips from a subsearch of both of these tables it becomes problematic and I am not sure what the best solution is.
So let’s assume I want to find the IPs used on a specific page called “MAINPAGE”. And use these IPs to search for other pages visited by them on both sources.
I tried to minimize the code as far as possible. It might not make any sense anymore, but I hope it’s enough to bring across my point.

Index=web (sourcetype=a OR sourcetype=b)
[search index=web sourcetype=a apage=MAINPAGE | table aIP]

OR
[search index=web sourcetype=a apage=MAINPAGE | rename aIP as bIP | table bIP]
OR
[search index=web sourcetype=b bpage=MAINPAGE | table bIP]
OR
[search index=web sourcetype=b bpage=MAINPAGE | rename bIP as aIP | table aIP]

| eval page = coalesce (apage, bpage)
| eval ip = coalesce (aIP, bIP)

| table page, ip

So because the table of the subsearch is automatically the search parameters for the parent searches and I need to search for both results, I don’t see a better way than doing both searches twice and just rename the field name of the output table.

Is there any way to reduce it to two subsearches in this case? Eg. renaming the fields without doing the search an additional time?

0 Karma
1 Solution

miteshvohra
Contributor

Have you tried creating 'Field Alias' for both aIP and bIP?

If not, create one as src_ip and try this:

index=web (sourcetype=a OR sourcetype=b) | table page, src_ip

HTH, Mitesh.

View solution in original post

0 Karma

miteshvohra
Contributor

Have you tried creating 'Field Alias' for both aIP and bIP?

If not, create one as src_ip and try this:

index=web (sourcetype=a OR sourcetype=b) | table page, src_ip

HTH, Mitesh.

0 Karma

DerBastler
New Member

Thank you! This works perfectly!

0 Karma

micahkemp
Champion

It sounds like you may be running into limitations of subsearch. Subsearches finalize after 60 seconds (or 10k rows, though my previous answer worked around that specific limitation), so if your inside search takes too long to run you will get potentially incomplete results.

Instead it's recommended you find a non-subsearch means to fetch your needed data, potentially like this:

index=_internal sourcetype=a OR sourcetype=b
| eval ip=if(sourcetype="a", aIP, bIP)
| stats values(apage) AS apage values(otherfield) AS otherfield BY ip
| search apage=MAINPAGE
0 Karma

DerBastler
New Member

Unfortunately that's not possible in my case. The initial data is too big. So the idea was to have rather "specific" subsearches, just leaving some thousand search values for the main search. Using the main search, grab everything and filter later on will use up too much data that the job gets stuck (and times out).

0 Karma

micahkemp
Champion

Try:

index=web (sourcetype=a OR sourcetype=b)
[search index=web sourcetype=a apage=MAINPAGE | stats values(aIP) AS aIP | eval search="aIP=".mvjoin(aIP, " OR aIP=")." OR bIP=".mvjoin(aIP, " OR bIP=")]

When a subsearch returns a field named search, that field's value will be dropped inline into your search. This results in a search that looks like:

index=web (sourcetype=a OR sourcetype=b) aIP=1.2.3.4 OR aIP=4.3.2.1 OR bIP=1.2.3.4 OR bIP=4.3.2.1

Or maybe:

index=web
[search index=web sourcetype=a apage=MAINPAGE | stats values(aIP) AS aIP | eval search="(sourcetype=a (aIP=".mvjoin(aIP, " OR aIP=").")) OR (sourcetype=b (bIP=".mvjoin(aIP, " OR bIP=")."))"]

Which results in a search like:

index=web (sourcetype=a (aIP=1.2.3.4 OR aIP=4.3.2.1)) OR (sourcetype=b (bIP=1.2.3.4 OR bIP=4.3.2.1))

Which might be a tad more efficient.

0 Karma

DerBastler
New Member

I just ran it, but it doesn't work.
I tried the first and simpler variation, and fiddled around with it. Maybe there is some limit on how much can be in any field or something similar.
It works for testing purpose if you insert an | eval aIP=specific single entry from valid result into it.
But otherwise I guess that the following commands break down. Maybe there is some kind of limitation as the search "field" would contain thousands of numbers, likely almost a million characters.

0 Karma

micahkemp
Champion

How long do these searches take to run:

index=web sourcetype=a apage=MAINPAGE | stats values(aIP) AS aIP | eval search="aIP=".mvjoin(aIP, " OR aIP=")." OR bIP=".mvjoin(aIP, " OR bIP=")

index=web sourcetype=a apage=MAINPAGE | stats values(aIP) AS aIP | eval search="(sourcetype=a (aIP=".mvjoin(aIP, " OR aIP=").")) OR (sourcetype=b (bIP=".mvjoin(aIP, " OR bIP=")."))"
0 Karma

DerBastler
New Member

We moved away from tackling the problem from this end now. I appreciate your help though.

0 Karma

DerBastler
New Member

Thanks! Can't say right now if it works or not, but that's definitely something I didn't come across and didn't test yet. Will try it asap!

0 Karma

cmerriman
Super Champion

why not something like this:

Index=web (sourcetype=a OR sourcetype=b)
[search index=web sourcetype=a apage=MAINPAGE | eval bIP=aIP|table aIP bIP|format]
OR
[search index=web sourcetype=b bpage=MAINPAGE | eval aIP=bIP| table aIP bIP|format]
| eval page = coalesce (apage, bpage)
| eval ip = coalesce (aIP, bIP)
| table page, ip
0 Karma

DerBastler
New Member

The problem in this case is, that both of the results are passed on with an AND.
So if a subsearch returns
|table aIP bIP
that means that the main search now searches for aIP AND bIP.
What I need in this case is an OR though as some of these fields might be null.

0 Karma

cmerriman
Super Champion

what about adding this after format

|rex mode=sed field=search "s/AND bIP/OR bIP/g"
0 Karma

DerBastler
New Member

Actually the "format" itself already breaks the search. While the splunk documentations states that it's implicitly used (https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Search/Changetheformatofsubsearchresults)
It's actually not. Well, the functionality is. But the search isn't generated the same way. It generates a big step in between which makes a 5 minute search into a 20 minute search, which then leads then to breaking subsearches.
But I guess you can only use |rex if the format statement was used before. But as this breaks the search...
(can be verified by increasing search time span. While it works with tiny timespans, at some point it takes so long that splunk starts to ignore subsearches)

0 Karma
Get Updates on the Splunk Community!

Splunk Enterprise Security 8.0.2 Availability: On cloud and On-premise!

A few months ago, we released Splunk Enterprise Security 8.0 for our cloud customers. Today, we are excited to ...

Logs to Metrics

Logs and Metrics Logs are generally unstructured text or structured events emitted by applications and written ...

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...