Splunk Search

Join Join Help

matthewcanty
Communicator

Hello, I feel like I am close to figuring this out. If there was a way to just pump out all the fields you have available after a join I'd like to know.

Anyway. I am working on the following query:

Action=NormInsert Source | join GroupId [search Action=PublishedGtpRule] | dedup Source GroupId | stats count by GroupId | fields Description count

What I am trying to figure out is how to show the results of the stats query, with the Description found from the join??

*************MORE INFO*************

Example NormInsert Data:

2013-01-28 14:45:13.5131 Action="NormInsert" Id="1942454" GroupId="47201bcf-2079-401e-855a-4f2ec14c1370" SportId="9aa265c0-1b55-4044-8548-757a8cb247eb" Source="965c6b51-e6be-4e60-aa08-dbb42c076ebe"

Example Published Data:

2013-01-28 14:47:32.0531 Action="PublishedGtpRule" Id="" GroupId="226e67be-5e58-4136-984d-d1df1fafe197" SportId="ab3af3be-0827-4311-b271-0ae568a13726" Source="GtpRule" Description="West Virginia v Kansas" Date="29/01/2013 02:00:00" ........

What I am aiming to discover is for each GroupId how many unique Sources have been received. The count of unique sources can be found using the NormInserts. However the Published data is an aggregate of data so individual sources are not available here.

So I want to count, and then add descriptive data for a content team to be able to analyse.

I do have alternatives:

  • Just use the GroupIds and let them work out what is missing, but is likely to be a tiresome process for them.
  • Add the extra data straight onto the Published log.
Tags (1)
0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

Stats does what you ask of it, and computes a table of counts by GroupId. Stats cannot know that you want to keep Descriptions as well.

Assuming there is a lookup GroupId -> Description you could call that lookup after the stats.
If the Description is in the events you're putting in to the stats you need to tell the stats to include them in its output.

View solution in original post

sideview
SplunkTrust
SplunkTrust

Instead of using a join at all, I would look into using a disjunction (fancy term for "OR") and then a simple stats command.

( Action=NormInsert Source ) OR (Action=PublishedGtpRule) | stats dc(Source) as sourceCount values(Source) as Source values(Description) as Description by GroupId

The stats command will take each unique value of GroupId that it sees (on either side of the disjunction), and make each one a row in its output. Then along the way it will count the distinct values of Source seen (dc(Source)), the distinct values of Source (values(Source)) and the distinct values of Description (values(Description))

This will perform a great deal better than your join, the raw data wont have to be pulled back to your search head, which means most of the work will be done out at your indexers, and the search wont be subject to any of the pesky limits in number of rows or execution time, that you're subject to with join.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Stats does what you ask of it, and computes a table of counts by GroupId. Stats cannot know that you want to keep Descriptions as well.

Assuming there is a lookup GroupId -> Description you could call that lookup after the stats.
If the Description is in the events you're putting in to the stats you need to tell the stats to include them in its output.

matthewcanty
Communicator

That did the trick! Thanks so much.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

I understand you want to do something like this (pseudosplunk)?

search for source, action | stats count by GroupId | add description

That add description part can be done by a join along these lines:

... | stats ... | join GroupId [search for published data or whatever | fields + GroupId Description]

I've cooked up an example that should do the basics without real data:

| gentimes start=-1 increment=5m | eval gid = (starttime % 1200) / 300 | eval sid = (starttime % 900) / 300 | join gid [gentimes start=-1 increment=5m | eval gid = (starttime % 1200) / 300 | eval d = gid * 10]

matthewcanty
Communicator

I've added some example data. When telepathy is here there will be no need for these forums.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

In this context stats and chart behave the same, both would only keep a count and the GroupId.

Give this a try:

... | stats count by GroupId Description

Depending on the way your data is composed it'll either produce what you want or some garbage, I can't telepathically guess how your data looks.

0 Karma

matthewcanty
Communicator

How do I tell stats I want to see the Description? Could I use chart?

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...