I'm a novice working in fraud prevention and would appreciate your help. When I run the following search, it fails with an error, and the job inspector shows excessive time (106.46) on dispatch.evaluate.join. Can you help identify what needs to change to output a chart of Condition_Attrib_17 by Treatment Group?
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=" UPDATE" "*"
| dedup data.condition_attrib_22
| rename data. condition_attrib_22 AS data.params.policy
| fields data.params.policy
| eval join_key=data.params.policy
| fields join_key, data.treatment_group
| join type=inner join_key
[search index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id="UPDATE" "*"]
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 by data. request.condition_attrib_22
| rename data.request.condition_attrib_22 as join_key
| fields join_key, Condition_Attrib_17
| chart count by Condition_Attrib_17 by data.treatment_group
If possible, share sanitised sample events; otherwise we will not be able to actually help 😉
Cheers, MuS
I would try to avoid join unless absolutely necessary; you can get the chart in a single pass with stats, then chart. Also, the chart syntax looks wrong: it should be “chart count over X by Y”, not “chart count by X by Y”.
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=UPDATE
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 latest(data.treatment_group) as treatment_group by data.request.condition_attrib_22
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group)
| chart count over Condition_Attrib_17 by treatment_group
If your key is data.condition_attrib_22 (not data.request.condition_attrib_22), change the stats “by” field accordingly. Also, if multiple treatment_group values can exist per key and you want each value counted, replace latest(data.treatment_group) with values(data.treatment_group) and then mvexpand treatment_group before charting.
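For reference, the values/mvexpand variant might look like the sketch below. It reuses the field names from the search above and assumes the key really is data.request.condition_attrib_22; adjust the "by" field if it is not.

```spl
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id=UPDATE
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 values(data.treatment_group) as treatment_group by data.request.condition_attrib_22
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group)
| mvexpand treatment_group
| chart count over Condition_Attrib_17 by treatment_group
```

After mvexpand, each key contributes one row per distinct treatment_group value, so a key seen in two groups is counted once in each column of the chart.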
This was helpful and taught me new techniques, but it didn't return any results even though the data is there in the JSON. One thing I note: the policy_id for Condition_Attrib_17 is UPDATE, but the policy_id for Treatment_Group is SRF. I modified the search to policy_id IN (UPDATE,SRF), but it still returned nothing, although I can see the values in the JSON data. I'm thankful that you voluntarily share your thoughts to help me learn.
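If the two values live in different event types, one possible way to combine them without join is to build a common key first and let stats merge the rows. This is only a sketch under unconfirmed assumptions: that UPDATE events carry the key in data.request.condition_attrib_22, that SRF events carry the same key value in data.condition_attrib_22, and that data.treatment_group is extracted under that name. join_key is an illustrative field name.

```spl
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id IN (UPDATE, SRF)
| eval join_key=coalesce('data.request.condition_attrib_22', 'data.condition_attrib_22')
| stats latest(data.request.condition_attrib_17) as Condition_Attrib_17 latest(data.treatment_group) as treatment_group by join_key
| where isnotnull(Condition_Attrib_17) AND isnotnull(treatment_group)
| chart count over Condition_Attrib_17 by treatment_group
```

Field names containing dots need single quotes inside eval, which is why the two key fields are quoted in the coalesce call. If the where clause still filters everything out, the key values in the two event types probably do not match.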
Everybody has already told you that you shouldn't use join in the first place. @MuS asked you to illustrate your data, which is always the best recommendation. Now that you mention your dataset is in JSON, you really have to share or mock some data. Sanitize any sensitive information, but make sure to maintain the structures that matter.
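While preparing sample events, a quick field-presence check may also narrow things down. The sketch below uses only field names already quoted in this thread and assumes they are extracted at search time; the output column names are illustrative.

```spl
index=TEST sourcetype="TEST:user_activity" application_id=ABC123 policy_id IN (UPDATE, SRF)
| stats count as events count(data.request.condition_attrib_22) as has_request_key count(data.condition_attrib_22) as has_key count(data.request.condition_attrib_17) as has_attrib_17 count(data.treatment_group) as has_treatment_group by policy_id
```

A column of zeros for a given policy_id suggests that field is not extracted under that name for those events, which would explain an empty result after the isnotnull filter.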
Also, instead of telling volunteers "error when I run this complex SPL snippet", follow these golden rules; nay, call them the four commandments: