Splunk Search

Problem using Join function

JHFRDANALYSIS
Engager

I'm a novice working in fraud prevention and would appreciate your help. When I run the following search, it fails with an error, and the Job Inspector shows excessive time (106.46) on dispatch.evaluate.join. Can you help identify what needs to change to output a chart of Condition_Attrib_17 by Treatment Group?

index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=" UPDATE" "*"

 | dedup data.condition_attrib_22

 | rename data. condition_attrib_22 AS data.params.policy

 | fields data.params.policy

| eval join_key=data.params.policy

| fields join_key, data.treatment_group

| join type=inner join_key

 [search index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id="UPDATE" "*"]

    | stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 by data. request.condition_attrib_22

    | rename data.request.condition_attrib_22 as join_key

    | fields join_key, Condition_Attrib_17

    | chart count by Condition_Attrib_17 by data.treatment_group


MuS
SplunkTrust

If possible, share sanitised sample events; otherwise we will not be able to actually help 😉

Cheers, MuS


livehybrid
SplunkTrust

Hi @JHFRDANALYSIS 

I would try to avoid using join unless absolutely necessary; you can get the chart in a single pass with stats, then chart. Also, the chart syntax looks wrong: it should be “chart count over X by Y”, not “chart count by X by Y”.

Something like this should work:
 
 
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=UPDATE 
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 latest(data.treatment_group) as treatment_group by data.request.condition_attrib_22 
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group) 
| chart count over Condition_Attrib_17 by treatment_group

If your key is data.condition_attrib_22 (not data.request.condition_attrib_22), change the stats “by” field accordingly. Also, if multiple treatment_group values can exist per key and you want to count each one, replace latest(data.treatment_group) with values(data.treatment_group) and then mvexpand treatment_group before charting.
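
A sketch of the values/mvexpand variant, assuming the same field paths as the search above (adjust them if your events differ); it has not been run against real data:

```spl
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=UPDATE
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 values(data.treatment_group) as treatment_group by data.request.condition_attrib_22
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group)
| mvexpand treatment_group
| chart count over Condition_Attrib_17 by treatment_group
```

Here mvexpand turns each value of a multivalue treatment_group into its own row, so a key that has two treatment groups is counted once under each of them.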


JHFRDANALYSIS
Engager

This was helpful and taught me new techniques, but it didn't return any data, even though the data is there in the JSON. One thing I noted: the policy_id for Condition_Attrib_17 is UPDATE, but the policy_id for Treatment_Group is SRF. I modified the search to policy_id IN (UPDATE,SRF), but it still didn't return the data that I can see in the JSON. I'm thankful that you volunteer your thoughts to help me learn.

 


yuanliu
SplunkTrust

Everybody has already told you that you shouldn't use join in the first place.  @MuS asked you to illustrate your data, which is always the best recommendation.  Now that you mention your dataset is in JSON, you really have to share/mock data.  Sanitize any sensitive information, but make sure to maintain the structures that matter.

Also, instead of telling volunteers "error when I run this complex SPL snippet", follow these golden rules; nay, call them the four commandments:

  • Illustrate data input (in raw text, anonymize as needed), whether they are raw events or output from a search (SPL that volunteers here do not have to look at).
  • Illustrate the desired output from illustrated data.
  • Explain the logic between illustrated data and desired output without SPL.
  • If you also illustrate attempted SPL, illustrate actual output and compare with desired output, explain why they look different to you if that is not painfully obvious.
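
As a sketch of the first rule, mock input can be shared as a self-contained search that others can paste and run. Everything in this example is hypothetical: the field paths are guessed from the searches earlier in this thread, and the values (mock_key_1, mock_attr_A, mock_group_1) are invented placeholders, so replace them with the real, sanitized structure.

```spl
| makeresults count=2
| streamstats count as n
| eval _raw=case(
    n=1, "{\"policy_id\":\"UPDATE\",\"data\":{\"request\":{\"condition_attrib_22\":\"mock_key_1\",\"condition_attrib_17\":\"mock_attr_A\"}}}",
    n=2, "{\"policy_id\":\"SRF\",\"data\":{\"condition_attrib_22\":\"mock_key_1\",\"treatment_group\":\"mock_group_1\"}}")
| spath
| fields - n
```

The spath command with no arguments extracts the JSON in _raw into dotted field names such as data.request.condition_attrib_22, so anyone can append their suggested SPL after it and compare the result with your desired output.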

JHFRDANALYSIS
Engager

I need to know proper syntax of a Splunk query with appended secondary query to be used for outputting table.  The file attached has been sanitized and provides exemplary data table information, the logic we want to use to create an output table and a mockup of the output table that accurately reflects what we would expect to see.  I've tried many different search queries without success.  Appreciate any assistance you can provide.


PickleRick
SplunkTrust

Come on. PDF?

1. It's not indexed and searchable, so anyone who has a similar problem in the future won't be able to find this thread.

2. It's not easy to read and copy/paste from. Especially on mobile devices.

3. Opening untrusted files from the internet isn't many people's idea of fun.

 


ITWhisperer
SplunkTrust

Your mock data doesn't appear to be consistent. For example, REFERENCE_ID = 6 is REGION = NORTHSIDE, yet INPUT_REFERENCE_ID = 6 is LOCATION = WESTSIDE. Can you please clarify and/or supply some consistent mock data?
