Splunk Search

Help with decoding data exfiltration- What time calculation is it doing?

Basavaraj
Engager

Reference : https://zpettry.com/cybersecurity/splunk-queries-data-exfiltration/

| bucket _time span=1d
| stats sum(bytes*) as bytes* by user _time src_ip
| eventstats max(_time) as maxtime avg(bytes_out) as avg_bytes_out stdev(bytes_out) as stdev_bytes_out 
| eventstats count as num_data_samples avg(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_avg_bytes_out stdev(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_stdev_bytes_out by src_ip | where num_data_samples >=4 AND bytes_out > avg_bytes_out + 3 * stdev_bytes_out AND bytes_out > per_source_avg_bytes_out + 3 * per_source_stdev_bytes_out AND _time >= relative_time(maxtime, "@h") | eval num_standard_deviations_away_from_org_average = round(abs(bytes_out - avg_bytes_out) / stdev_bytes_out,2), num_standard_deviations_away_from_per_source_average = round(abs(bytes_out - per_source_avg_bytes_out) / per_source_stdev_bytes_out,2) | fields - maxtime per_source* avg* stdev*


if you guys can decode this and let me know what is going on in this, especially with time what calculation is it doing with time could be helpful.

Labels (2)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

@Basavaraj wrote:

Reference : https://zpettry.com/cybersecurity/splunk-queries-data-exfiltration/

```Group timestamps by day.  Changes the time for all events to 00:00:00```
|
bucket _time span=1d
```Total each field with a name beginning with "bytes". Group the results by user, day, and source address.``` | stats sum(bytes*) as bytes* by user _time src_ip
```Find the most recent timestamp. Compute the average value and standard deviation of the bytes_out field.``` | eventstats max(_time) as maxtime avg(bytes_out) as avg_bytes_out stdev(bytes_out) as stdev_bytes_out
```Count the number of results. Compute the average number and standard deviation of bytes sent at least an hour before the most recent timestamp for each source address```.
|
eventstats count as num_data_samples avg(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_avg_bytes_out stdev(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_stdev_bytes_out by src_ip ```Keep only the results with at least 4 values for a source address. Also, the number of bytes sent must be more than 3 standard deviations above the average for that source and they had to be sent within the last hour.```
|
where num_data_samples >=4 AND bytes_out > avg_bytes_out + 3 * stdev_bytes_out AND bytes_out > per_source_avg_bytes_out + 3 * per_source_stdev_bytes_out AND _time >= relative_time(maxtime, "@h") ```Compute how far the number of bytes sent is from average```
|
eval num_standard_deviations_away_from_org_average = round(abs(bytes_out - avg_bytes_out) / stdev_bytes_out,2), num_standard_deviations_away_from_per_source_average = round(abs(bytes_out - per_source_avg_bytes_out) / per_source_stdev_bytes_out,2) ```Discard fields we don't need anymore```
|
fields - maxtime per_source* avg* stdev*

 

---
If this reply helps you, Karma would be appreciated.
Get Updates on the Splunk Community!

See just what you’ve been missing | Observability tracks at Splunk University

Looking to sharpen your observability skills so you can better understand how to collect and analyze data from ...

Weezer at .conf25? Say it ain’t so!

Hello Splunkers, The countdown to .conf25 is on-and we've just turned up the volume! We're thrilled to ...

How SC4S Makes Suricata Logs Ingestion Simple

Network security monitoring has become increasingly critical for organizations of all sizes. Splunk has ...