Splunk Search

help on subsearch wich truncate events

jip31
Motivator

Hi

I use the search below but I lose some events because I have the following message :

[subsearch]: Subsearch produced 124329 results, truncating to maxout 50000.

`software` earliest=-90d latest=now 
| fields MachineID ProductVersion00 ProductName00 
| stats last(ProductVersion00) as ProductVersion00 by MachineID ProductName00 
| join max=0 type=inner MachineID 
    [| search `machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID] 
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 
| rename ProductVersion00 as "Product version", ProductName00 as "Product name

Is there a workaround concerning this issue please?

Labels (1)
Tags (1)
0 Karma

jip31
Motivator

I have another thing strange in the search below  concerning | search Hostname=302013154

 

`software` earliest=-30d latest=now 
| fields MachineID ProductVersion00 ProductName00 
| stats last(ProductVersion00) as ProductVersion00 by MachineID ProductName00 
| join max=0 type=inner MachineID 
    [| search `machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID] 
| search Hostname=302013154
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 
| rename ProductVersion00 as "Product version", ProductName00 as "Product name"

If I am doing the | search Hostname=302013154 before the stats command it works

But if I do it after the stats I havent results

`software` earliest=-30d latest=now 
| fields MachineID ProductVersion00 ProductName00 
| stats last(ProductVersion00) as ProductVersion00 by MachineID ProductName00 
| join max=0 type=inner MachineID 
    [| search `machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID] 
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 
| rename ProductVersion00 as "Product version", ProductName00 as "Product name" 
| search Hostname=302013154

 

what is the problem please?

0 Karma

tread_splunk
Splunk Employee
Splunk Employee
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 

...is removing all Hostname except the LAST one for each combination of ProductName00 and ProductVersion00.  I'd assume Hostname=302013154 isn't the LAST one for any combination of ProductName/ProductVersion and so is being eliminated from your results.

0 Karma

jip31
Motivator

I am going to be clearer

If I am doing :

 

`machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID
| search Hostname=302013154

 

 I have a MachineID for this Hostname

If I am doing the global search :

 

`software` earliest=-90d latest=now 
| fields MachineID ProductVersion00 ProductName00 
| stats last(ProductVersion00) as ProductVersion00 by MachineID ProductName00 
| join max=0 type=inner MachineID 
    [| search `machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID] 
| search Hostname=302013154
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 
| rename ProductVersion00 as "Product version", ProductName00 as "Product name"

 

I have the ProductName00 and the ProductVersion00 for this Hostname

But I need to put the | search Hostname=302013154 after the stats command because I use a loadjob command

So if I put this :

`software` earliest=-90d latest=now 
| fields MachineID ProductVersion00 ProductName00 
| stats last(ProductVersion00) as ProductVersion00 by MachineID ProductName00 
| join max=0 type=inner MachineID 
    [| search `machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID] 
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 
| rename ProductVersion00 as "Product version", ProductName00 as "Product name"
| search Hostname=302013154

I have no results

0 Karma

tread_splunk
Splunk Employee
Splunk Employee

I've making some assumptions about your data here.

1. Query 1 returns fields called ProductName00, ProductVersion00, MachineId.

2. Query 2 returns fields called MachineId, Name0 and DOESN'T return a field called ProductName00.

 

(`software` earliest=-90d latest=now) OR (search `machineID`)
| fillnull value="Dummy" ProductName00 
| stats latest(ProductVersion00) as ProductVersion00 latest(Name0) as Hostname by MachineId, ProductName00 
| eventstats last(Hostname) as Hostname by MachineId 
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00

I've also experimented with / interchanged LAST and LATEST.  You definitely need to test this against your data set.

 

0 Karma

jip31
Motivator

Sorry but i have the same problem....

I cant retrieve some hostname I need with your search

In my search, when I just execute the subsearch I can find the hostname I need

`machineID`
| fields MachineID Name0
| search Name0=302005408
| stats last(Name0) as Hostname by MachineID

But when I do the jointure, I lost this hostname

And I repeat with your search, I lose this hostname too

0 Karma

tread_splunk
Splunk Employee
Splunk Employee

What do you get if you try ...?

(`software` earliest=-90d latest=now) OR (search `machineID`)
| fillnull value="Dummy" ProductName00 
| stats last(ProductVersion00) as ProductVersion00 last(Name0) as Hostname by MachineId, ProductName00 

 Do you see the hostname you are looking for?

0 Karma

jip31
Motivator

when an hostname exists I have no ProductVersion00 and no ProductName00, just a dummy results

It's normal because there is no jointure between the 2 sourcetype

conversely when I havent an hostname, I have results for MachineID ProductName00 and ProductVersion00

The only field in common in the 2 sourcetype is MachineID resaon why I am doing the join command with this field

Tags (1)
0 Karma

tread_splunk
Splunk Employee
Splunk Employee

Precisely.  You should see what you've just outlined when you run this search...

(`software` earliest=-90d latest=now) OR (search `machineID`)
| fillnull value="Dummy" ProductName00 
| stats last(ProductVersion00) as ProductVersion00 last(Name0) as Hostname by MachineId, ProductName00 

You should see events where a hostname exists, in which case you will have no ProductVersion00, "dummy" in  ProductName00 and a value in MachineId.  This is effectively your subsearch. 

You'll also have results where you haven't got a hostname, and you will have values for MachineId ProductName00 and ProductVersion00.  This is effectively your main search.

These 2 queries produce the common value in MachineId which you can use in the subsequent EVENTSTATS command in the solution I provided to distribute the hostname.

(If the search for 'MachineID' produces fields called ProductVersion00 and/or ProductName00, or the search for 'software' produces a field called Name0, this might require a different approach).

0 Karma

tread_splunk
Splunk Employee
Splunk Employee

Are you specifying time ranges in your main search and subsearch?  Or are they defaulting to the time picker?

0 Karma

jip31
Motivator

this global search (main search + sbsearch) is a scheduled search on the last 90 days

0 Karma

tread_splunk
Splunk Employee
Splunk Employee

How many results do each of your 2 aggregation searches produce?

0 Karma

jip31
Motivator

The main search and the subsearch use the same index with 2 different sourcetype

0 Karma

jip31
Motivator

14 millions in the main search and 500000 in the subsearch

Tags (1)
0 Karma

tread_splunk
Splunk Employee
Splunk Employee

Is there a way (e.g a combination of interesting fields) to distinguish between events returned by the first search vs the second search?  Perhaps source, index, sourcetype?

0 Karma

me74fhfd
Path Finder

You may want to apply workaround for this since join should be used as command of last resort since its very expensive. Refer to this post here: https://community.splunk.com/t5/Splunk-Search/How-to-overcome-sub-search-limitation-only-10k-records.... Alternatively you can try to change subsearch limit in limits.conf file:
maxout = <integer>
* Maximum number of results to return from a subsearch.

0 Karma

jip31
Motivator

Yes, but it doesn't seems to be linked to the join subsearch :

why when I run this search for a specific hostname I have results

 

`software` earliest=-90d latest=now 
| fields MachineID ProductVersion00 ProductName00 
| stats last(ProductVersion00) as ProductVersion00 by MachineID ProductName00 
| join max=0 type=inner MachineID 
    [| search `machineID` 
    | fields MachineID Name0 
    | stats last(Name0) as Hostname by MachineID] 
| search Hostname=302005408 
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00 
| rename ProductVersion00 as "Product version", ProductName00 as "Product name" 
| fields - Hostname

and why when I use a scheduled search for the same hostname I havent results?

| loadjob savedsearch="admin:FOAna_sh:Identity - Installed weather"
| search Hostname=302005408
| stats last(Hostname) as Hostname by ProductName00 ProductVersion00
| rename ProductVersion00 as "Product version", ProductName00 as "Product name"
| fields - Hostname
0 Karma
Get Updates on the Splunk Community!

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...

.conf24 | Learning Tracks for Security, Observability, Platform, and Developers!

.conf24 is taking place at The Venetian in Las Vegas from June 11 - 14. Continue reading to learn about the ...

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...