Splunk Search

Problem using Join function

JHFRDANALYSIS
Engager

I'm a novice working in fraud prevention; I appreciate your help. When running the following search, I get a failure error, and the job inspector shows excessive time (106.46) on dispatch.evaluate.join. Can you help identify what needs to change to output a chart of Condition_Attrib_17 by Treatment Group?

index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id="UPDATE" "*"
| dedup data.condition_attrib_22
| rename data.condition_attrib_22 AS data.params.policy
| fields data.params.policy
| eval join_key=data.params.policy
| fields join_key, data.treatment_group
| join type=inner join_key
    [search index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id="UPDATE" "*"]
    | stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 by data.request.condition_attrib_22
    | rename data.request.condition_attrib_22 as join_key
    | fields join_key, Condition_Attrib_17
    | chart count by Condition_Attrib_17 by data.treatment_group


MuS
Legend

If possible, share sanitised sample events; otherwise we will not be able to actually help 😉

Cheers, MuS


livehybrid
SplunkTrust

Hi @JHFRDANALYSIS 

I would try to avoid using join unless absolutely necessary; you can get the chart in a single pass with stats, then chart. Also, it looks like the chart syntax is wrong: it should be "chart count over X by Y", not "chart count by X by Y".

Something like this should work:
 
 
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=UPDATE 
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 latest(data.treatment_group) as treatment_group by data.request.condition_attrib_22 
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group) 
| chart count over Condition_Attrib_17 by treatment_group

If your key is data.condition_attrib_22 (not data.request.condition_attrib_22), change the stats “by” field accordingly. Also, if multiple treatment_group values can exist per key and you want one, replace latest(...) with values(...) and then mvexpand treatment_group before charting.
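
For the multi-value case, a minimal sketch might look like the following. It assumes the same field names as the search above, which have not been confirmed against the real events:

```spl
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=UPDATE
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 values(data.treatment_group) as treatment_group by data.request.condition_attrib_22
| mvexpand treatment_group
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group)
| chart count over Condition_Attrib_17 by treatment_group
```

With this variant, each key contributes one count per distinct treatment_group value it was seen with.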


JHFRDANALYSIS
Engager

This was helpful and gave me new techniques. But it didn't return any data, even though the data is there in the JSON. One thing I noted: the policy_id for Condition_Attrib_17 is UPDATE, but the policy_id for Treatment_Group is SRF. I modified the search to policy_id IN (UPDATE,SRF), but it still didn't return any data, although I can see it in the JSON. I'm thankful that you volunteer your thoughts to help me learn.

 


yuanliu
SplunkTrust

Everybody has already told you that you shouldn't use join in the first place. @MuS asked you to illustrate your data, which is always the best recommendation. Now that you mention your dataset is in JSON, you really have to share or mock data. Sanitize any sensitive information, but make sure to maintain the structures that matter.

Also, instead of telling volunteers "error when I run this complex SPL snippet", follow these golden rules; nay, call them the four commandments:

  • Illustrate data input (in raw text, anonymize as needed), whether they are raw events or output from a search (SPL that volunteers here do not have to look at).
  • Illustrate the desired output from illustrated data.
  • Explain the logic between illustrated data and desired output without SPL.
  • If you also illustrate attempted SPL, illustrate actual output and compare with desired output, explain why they look different to you if that is not painfully obvious.
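
As a sketch of what the first three rules could look like in practice, the run-anywhere search below mocks four JSON events with makeresults and spath, then charts them without join. Every field path and value in it is a guess for illustration only: it assumes the UPDATE events carry data.request.condition_attrib_22 and data.request.condition_attrib_17, while the SRF events carry data.condition_attrib_22 and data.treatment_group. Replace the mock with the real structure before drawing any conclusions.

```spl
| makeresults count=4
| streamstats count as n
| eval _raw=case(
    n=1, "{\"policy_id\":\"UPDATE\",\"data\":{\"request\":{\"condition_attrib_22\":\"P-001\",\"condition_attrib_17\":\"A\"}}}",
    n=2, "{\"policy_id\":\"SRF\",\"data\":{\"condition_attrib_22\":\"P-001\",\"treatment_group\":\"control\"}}",
    n=3, "{\"policy_id\":\"UPDATE\",\"data\":{\"request\":{\"condition_attrib_22\":\"P-002\",\"condition_attrib_17\":\"B\"}}}",
    n=4, "{\"policy_id\":\"SRF\",\"data\":{\"condition_attrib_22\":\"P-002\",\"treatment_group\":\"test\"}}")
| spath
| eval key=coalesce('data.request.condition_attrib_22', 'data.condition_attrib_22')
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 latest(data.treatment_group) as treatment_group by key
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group)
| chart count over Condition_Attrib_17 by treatment_group
```

If the two policy types really do store the key under different paths, the coalesce step is what lets a single stats replace the join. If they do not, a mock like this makes the mismatch visible to everyone reading the thread.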