Need an SPL to find the threshold for the domain

AL3Z · ‎10-27-2023

Hi,

I need an SPL query to find the threshold for the respective domains.
index=ss group="Threat Intelligence"
| stats values(attacker_score) as attacker_score by domain
e.g.

domain	attacker_score
admin.com	110 120 135 145 160 170 185 195 210 220 235 245 270 345 360 370 395 410 420 435 445 45 470 495 520 570 60 645 70 85 920 95

Thanks..

inventsekar · ‎10-29-2023

Hi @AL3Z, by "threshold", do you mean the average, so that you can create alerts when the average is crossed?

If so, please check the avg function of the stats command:

| stats avg(attacker_score) as avg_attacker_score by domain

We may need more details from you to give a better suggestion. Thanks.

AL3Z · ‎10-31-2023

@inventsekar @yuanliu ,

We need a query that can help us identify the threshold at which a single source IP address hits the domain the most times.

Thanks

yuanliu · ‎10-31-2023

Still unclear. Your original sample code mentions no IP address. What is the field name for IP? How are volunteers going to know what data you have? And what is attacker_score? Is that the number of hits from a single IP? Are you simply looking for maximum of attacker_score?

To ask a data analytics question that other people can help with, illustrate relevant data (anonymize as needed), explain any characteristics others need to know, illustrate desired results, then explain the logical relationship between illustrated data and results.

AL3Z · ‎10-31-2023

Hi @yuanliu ,

The field name for the IP is attackerip, and yes, attacker_score is the number of hits from a single IP. I'm trying to find threshold values based on attacker_score. For example, if the maximum score for an attackerip is 2000, the threshold to raise an alert should be something like 1500.

domain	min_score	max_score	attackerip	hits
xyz.com	110	1985	191.168.1.1	2135
abc.com	520	1760	192.153.1.1	2165

Thanks

yuanliu · ‎11-01-2023

Let me see if I understand this correctly:

Fields domain, min_score, max_score, attackerip, and hits are all available to you either from index search or some other SPL manipulation.
The "threshold" mentioned in the title and initial description, and subsequent comments, is to be calculated by a mathematical formula based on max_score, e.g., 3/4 * max_score.
You want to set up an alert based on hits per attackerip per domain.

The initial description and subsequent comments until now mentioned only one and a half relevant field names (domain + attacker_score) and nothing else. How are volunteers to know?

If these are the conditions, I will assume that max_score is derived from attacker_score per attackerip, that hits is derived from event count, and that min_score is of no consequence in your formula. In other words,

index=ss group="Threat Intelligence"
| stats max(attacker_score) as max_score count as hits by domain attackerip

If this speculation is correct, setting up an alert is a matter of applying your formula.

| where 4 * hits > 3 * max_score

AL3Z · ‎11-01-2023

Hi @inventsekar @yuanliu

To be more precise, here is my SPL:

index=ss group="Threat Intelligence"
``` here I'm grouping the domain names into a single group by their naming convention ```
| eval domain_group=case(
like(domain_name, "%cisco%"), "cisco",
like(domain_name, "%wipro%"), "wipro",
like(domain_name, "%IBM%"), "IBM",
true(), "other"
)
| stats count as hits, min(attacker_score) as min_score, max(attacker_score) as max_score by domain_group, attackerip
| sort -hits
| eval range = max_score - min_score
| eval threshold = round(min_score + (2 * (range/3)), 0)
| streamstats max(hits) as max_hits by domain_group
| where hits >= max_hits
| table domain_group, min_score, max_score, attackerip, hits, threshold
| dedup domain_group

Output:

domain_group	min_score	max_score	attackerip	hits	threshold
cisco	510	1635	XXXXXX	2174	1260
other	960	1760	YYYYYY	2173	1493
wipro	1985	1985	ZZZZZZ	2169	1985
IBM	335	1910	PPPPPP	2153	1385

Note: for wipro, min_score and max_score are the same, so the threshold comes out equal to max_score. We need to fix this!

Thanks..

yuanliu · ‎11-01-2023

So, your formula uses min_score as the base and sets "threshold" at 2/3 of the way between min and max. If your data has no range between min and max, this formula returns that same number (threshold == min == max). Only people with intimate knowledge of the data and this particular use case can determine what the best alternative formula would be.
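To make the arithmetic concrete, here is a minimal Python sketch (not SPL) that applies the same formula, min_score + 2 * (range / 3) rounded to a whole number, to the four rows of the posted output. The function name `threshold` and the `rows` list are only illustrative; the numbers are copied from the table above.

```python
# (domain_group, min_score, max_score) copied from the posted output table.
rows = [
    ("cisco", 510, 1635),
    ("other", 960, 1760),
    ("wipro", 1985, 1985),
    ("IBM", 335, 1910),
]


def threshold(min_score, max_score):
    """Place the threshold 2/3 of the way from min_score to max_score."""
    score_range = max_score - min_score
    # None of these rows lands exactly on .5, so Python's tie-breaking
    # rule for round() does not come into play here.
    return round(min_score + 2 * (score_range / 3))


thresholds = {group: threshold(lo, hi) for group, lo, hi in rows}
print(thresholds)
```

For wipro the range is 0, so the result can only be 1985, which matches the value in the posted table.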

Say, for example, that instead of using min_score + 2/3 * range for all groups, you decide to keep the existing formula when range is greater than 1/10 of min_score, but to use 4/5 * max_score when range is too narrow. You can express this directly in SPL.

index=ss group="Threat Intelligence"
``` here I'm grouping the domain names into a single group by their naming convention ```
| eval domain_group=case(
like(domain_name, "%cisco%"), "cisco",
like(domain_name, "%wipro%"), "wipro",
like(domain_name, "%IBM%"), "IBM",
true(), "other"
)
| stats count as hits, min(attacker_score) as min_score, max(attacker_score) as max_score by domain_group, attackerip
| sort -hits
| eval range = max_score - min_score
| eval threshold = round(if(range > min_score / 10, min_score + (2 * (range/3)), max_score * 4 / 5), 0)
| eventstats max(hits) as max_hits by domain_group ``` eventstats instead of streamstats ```
| where hits >= threshold ``` threshold is used in place of max_hits ```
| table domain_group, min_score, max_score, attackerip, hits, threshold
| dedup domain_group
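
As a rough check of the conditional formula described above (the 2/3 rule when range exceeds a tenth of min_score, otherwise 4/5 of max_score), here is a small Python sketch (not SPL) run against the min/max values from the earlier output table. The function name `threshold_with_fallback` is illustrative, and the 1/10 and 4/5 constants are only the example values suggested in this reply, not recommended settings.

```python
# (domain_group, min_score, max_score) copied from the earlier output table.
rows = [
    ("cisco", 510, 1635),
    ("other", 960, 1760),
    ("wipro", 1985, 1985),
    ("IBM", 335, 1910),
]


def threshold_with_fallback(min_score, max_score):
    """Use the 2/3-of-range rule unless the range is too narrow."""
    score_range = max_score - min_score
    if score_range > min_score / 10:
        value = min_score + 2 * (score_range / 3)
    else:
        # Range too narrow (e.g. min == max): fall back to 4/5 of max_score.
        value = max_score * 4 / 5
    return round(value)


thresholds = {group: threshold_with_fallback(lo, hi) for group, lo, hi in rows}
print(thresholds)
```

With these inputs only wipro takes the fallback branch, so its threshold drops from 1985 to 1588 while the other three groups keep their previous values.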

This said, I notice the streamstats and dedup in your code, and the criterion hits >= max_hits. Maybe you have a different use case in mind?

threshold is not used at all. Why calculate it? The condition hits >= max_hits combined with streamstats (as opposed to eventstats as I illustrated above) will result in alerts for every IP that has larger hits than all previous ones (instead of the largest one, or ones that exceed the calculated threshold). Is this what you wanted?
Your table retains attackerip, but dedup domain_group will lose all except the highest in the group.
Maybe your use case is simpler, that you want every domain group to alert, but alert only on the IP address with largest hits?
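
The difference between a running maximum and a whole-group maximum can be sketched in Python (not SPL). The rows below are hypothetical, and the two helper functions only mimic the filtering idea: a maximum over the rows seen so far (streamstats-like) versus a maximum over all rows in the group (eventstats-like).

```python
# Hypothetical (domain_group, attackerip, hits) rows, deliberately unsorted.
rows = [
    ("cisco", "ip-a", 5),
    ("cisco", "ip-b", 9),
    ("cisco", "ip-c", 7),
    ("cisco", "ip-d", 12),
]


def running_max_filter(rows):
    """Keep rows whose hits reach the per-group maximum seen so far."""
    seen = {}
    kept = []
    for group, ip, hits in rows:
        seen[group] = max(seen.get(group, hits), hits)
        if hits >= seen[group]:
            kept.append(ip)
    return kept


def group_max_filter(rows):
    """Keep rows whose hits reach the per-group maximum over all rows."""
    best = {}
    for group, _, hits in rows:
        best[group] = max(best.get(group, hits), hits)
    return [ip for group, ip, hits in rows if hits >= best[group]]


unsorted_running = running_max_filter(rows)
unsorted_group = group_max_filter(rows)

# Sorting by hits in descending order first makes both filters agree.
by_hits_desc = sorted(rows, key=lambda r: -r[2])
sorted_running = running_max_filter(by_hits_desc)

print(unsorted_running, unsorted_group, sorted_running)
```

On unsorted rows the running-maximum filter keeps every row that sets a new record, while the group-maximum filter keeps only the top row. When the rows are first sorted by hits in descending order, the running maximum is reached on the first row of each group, so in this sketch both filters keep the same rows.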

This use case is still very unclear.

AL3Z · ‎11-01-2023

Hi @yuanliu ,

The use case is to find the threshold for the maximum attacker_score of the domain group, and its attackerip count for the maximum attacker_score from a single IP.

Thanks

yuanliu · ‎11-01-2023

"The use case is to find the threshold for the maximum attacker_score of the domain group, and its attackerip count for the maximum attacker_score from a single IP."

Do you mean that the threshold calculation is not to be used in the alert, and that you want to select every IP with the highest count in the group and send it as an alert?

Now, back to the discussion about min and max. Assuming you still want a different formula when range is too small as I speculated, you can do

index=ss group="Threat Intelligence"
``` here I'm grouping the domain names into a single group by their naming convention ```
| eval domain_group=case(
like(domain_name, "%cisco%"), "cisco",
like(domain_name, "%wipro%"), "wipro",
like(domain_name, "%IBM%"), "IBM",
true(), "other"
)
| stats count as hits, min(attacker_score) as min_score, max(attacker_score) as max_score by domain_group, attackerip
| sort -hits
| eval range = max_score - min_score
| eval threshold = round(if(range > min_score / 10, min_score + (2 * (range/3)), max_score * 4 / 5), 0)
| eventstats max(hits) as max_hits by domain_group ``` eventstats instead of streamstats ```
| where hits == max_hits
| table domain_group, min_score, max_score, attackerip, hits, threshold

If this does not give you the desired output, you will need to illustrate the input, the actual output, and the desired output (anonymize as needed), and explain the logic between input and desired output without using SPL.

yuanliu · ‎10-28-2023

Are you going to tell us what "threshold" means in your mock data?
