Hi,
I have 2 indexes and i am performing a join between both indexes to get the top 10 categories per region. Categories come from one index and region comes from the other index. I am able to perform the join but I am unable to incorporate the top function to get the top 10 categories per region.
Here is my query:
Can you please help?
Many thanks,
Patrick
Hi @POR160893,
at first avoid join as soon as possible (Splunk sin't a DB) and use a different approach, something like this:
(index=msg sourcetype=MWAccess3) OR index=solarwind
| lookup cybersecurity_infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| search Asset_State=Production
| stats values(Region) AS Region values(legacy_environment) AS legacy_environment BY NodeName
Use my hint as an approach.
in addition, please, for the next search put the search as readable text so I can copy it without errors.
Ciao.
Giuseppe
Sure, here is the query:
index=mwg sourcetype=MWGaccess3 | rename host as NodeName | table NodeName, category | dedup NodeName
| join NodeName [ search index=solarwinds | lookup Cybersecurity_Infrastructure NodeID as NodeID | search Asset_State="Production" NodeName="*mwgproxy*" | eval NodeName=lower(NodeName) | table NodeName, Region, Legacy_Environment | dedup NodeName ]
Your query generated the following error for me though:
Hi @POR160893,
please check the names of fields of the lookup, probably one of them isn't correct.
Ciao.
Giuseppe
I still get the same error.
So, my current WORKING query is still: index=mwg sourcetype=MWGaccess3 | rename host as NodeName | table NodeName, category | dedup NodeName
| join NodeName [ search index=solarwinds | lookup Cybersecurity_Infrastructure NodeID as NodeID | search Asset_State="Production" NodeName="*mwgproxy*" | eval NodeName=lower(NodeName) | table NodeName, Region, Legacy_Environment | dedup NodeName ]
Conceptually, I have 2 tables in the form of indexes. I am using these tables to find the top 5 or 10 categories per Region. Now, the total number of Regions is 3 and the total number of categories is 10. The common field across both indexes is NodeName and there are 41 different HostNames.
So, my question is .... what is the BEST way to present the top N Categories per Region in this case?
Like, for each of the 3 Regions, there would need to be N Categories, so the final output table would be
Region: N Top Categories for that Region
Is this correct?
Thank you btw for all your help as I am new to Splunk and come from a SQL background, so I really, really appreciate ALL your help 😀
Hi @POR160893
when you say "top" you're meaning the regions with the most count value, is it correct?
if this is your need, try to follow my approach, don't join:
(index=msg sourcetype=MWAccess3) OR index=solarwind
| lookup cybersecurity_infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| search Asset_State=Production
| stats values(Region) AS Region BY NodeName
| top 5 count BY Region
In few words:
If you want to exclude NodeNames present in only one index, you could add a condition to exclude these NodeNames.
If you have more regions for each NodeName, you can separate them.
See the Splunk Search Tutorial to learn how to use SPL: https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchTutorial/WelcometotheSearchTutorial
Ciao.
Giuseppe
Yes, I mean the top N categories per Region (based on count).
Your query gives the following error though:
Here is the query in writing too:
(index=mwg sourcetype=MWGaccess3) OR index=solarwinds
| lookup Cybersecurity_Infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| search Asset_State="Production"
| stats values(Region) AS Region BY NodeName
| top 5 count BY Region
Hi @POR160893,
sorry: too fast answer, please try this:
(index=mwg sourcetype=MWGaccess3) OR index=solarwinds
| lookup Cybersecurity_Infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| search Asset_State="Production"
| stats values(Region) AS Region BY NodeName
| top 5 Region
Ciao.
Giuseppe
Yes, this query gives output:
BUT it doesn't even touch the category field from "mwg" index. nodeName was just the primary index between both indexes.
For the 3 Regions (EMEA, AMER, APJ), I simply need the top 10 categories for that region based on count.
So, the line "| stats values(Region) AS Region BY NodeName" makes it seem I want Region by NodeName ......but I want category by region.
So, 3 columns:
Region | Categories |count of each category
And there should be N categories per Region.
When I tried to change your current query by incorporating category, I get no output though:
Hi @POR160893,
the last search should be correct: what about the search without the last row (the top command)?
Is the category field valued or empty?
where does the category field come from (index or lookup)?
if from lookup, add it to the lookup OUTPUT.
Ciao.
Giuseppe
Without the Top command, I receive the following with no category values:
This is strange as if I run the query just as a search, I get data with both the Region and category fields I need:
and:
It is when I add extra commands like top or stats onto this data that it seems to not read the Region or category fields. Like in this case, where it gives no category values:
Hi @POR160893,
as I said, category is from index or from lookup?
please try this:
(index=mwg sourcetype=MWGaccess3) OR index=solarwinds
| lookup Cybersecurity_Infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| search Asset_State="Production"
| stats values(category) AS category BY NodeName Region
| top 5 category BY Region
if you have a multivalue in category field,
(index=mwg sourcetype=MWGaccess3) OR index=solarwinds
| lookup Cybersecurity_Infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| search Asset_State="Production"
| stats values(category) AS category BY NodeName Region
| mvexpand category
| top 5 category BY Region
Ciao.
Giuseppe
Category is from index, i.e. (index=mwg sourcetype=MWGaccess3) OR index=solarwinds
Hi @POR160893,
probably the position of the filter for Asset_State is the error:
(index=mwg sourcetype=MWGaccess3) OR index=solarwinds
| lookup Cybersecurity_Infrastructure NodeID OUTPUT Asset_State NodeName
| eval NodeName=lower(coalesce(NodeName,host))
| stats values(category) AS category values(Asset_State) AS Asset_State BY NodeName Region
| search Asset_State="Production"
| top 5 category BY Region
Ciao.
Giuseppe
This is what FINALLY worked foe me btw 😀 :
Hi @POR160893,
good for you, please accept an answer for the other peopel of Community.
Ciao and happy splunking.
Giuseppe
P.S.: Karma Points are appreciated by all the Contributors 😉
Karma given 😉