Splunk Search

Help with decoding data exfiltration- What time calculation is it doing?

Basavaraj
Engager

Reference : https://zpettry.com/cybersecurity/splunk-queries-data-exfiltration/

| bucket _time span=1d
| stats sum(bytes*) as bytes* by user _time src_ip
| eventstats max(_time) as maxtime avg(bytes_out) as avg_bytes_out stdev(bytes_out) as stdev_bytes_out 
| eventstats count as num_data_samples avg(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_avg_bytes_out stdev(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_stdev_bytes_out by src_ip | where num_data_samples >=4 AND bytes_out > avg_bytes_out + 3 * stdev_bytes_out AND bytes_out > per_source_avg_bytes_out + 3 * per_source_stdev_bytes_out AND _time >= relative_time(maxtime, "@h") | eval num_standard_deviations_away_from_org_average = round(abs(bytes_out - avg_bytes_out) / stdev_bytes_out,2), num_standard_deviations_away_from_per_source_average = round(abs(bytes_out - per_source_avg_bytes_out) / per_source_stdev_bytes_out,2) | fields - maxtime per_source* avg* stdev*


if you guys can decode this and let me know what is going on in this, especially with time what calculation is it doing with time could be helpful.

Labels (2)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

@Basavaraj wrote:

Reference : https://zpettry.com/cybersecurity/splunk-queries-data-exfiltration/

```Group timestamps by day.  Changes the time for all events to 00:00:00```
|
bucket _time span=1d
```Total each field with a name beginning with "bytes". Group the results by user, day, and source address.``` | stats sum(bytes*) as bytes* by user _time src_ip
```Find the most recent timestamp. Compute the average value and standard deviation of the bytes_out field.``` | eventstats max(_time) as maxtime avg(bytes_out) as avg_bytes_out stdev(bytes_out) as stdev_bytes_out
```Count the number of results. Compute the average number and standard deviation of bytes sent at least an hour before the most recent timestamp for each source address```.
|
eventstats count as num_data_samples avg(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_avg_bytes_out stdev(eval(if(_time < relative_time(maxtime, "@h"),bytes_out,null))) as per_source_stdev_bytes_out by src_ip ```Keep only the results with at least 4 values for a source address. Also, the number of bytes sent must be more than 3 standard deviations above the average for that source and they had to be sent within the last hour.```
|
where num_data_samples >=4 AND bytes_out > avg_bytes_out + 3 * stdev_bytes_out AND bytes_out > per_source_avg_bytes_out + 3 * per_source_stdev_bytes_out AND _time >= relative_time(maxtime, "@h") ```Compute how far the number of bytes sent is from average```
|
eval num_standard_deviations_away_from_org_average = round(abs(bytes_out - avg_bytes_out) / stdev_bytes_out,2), num_standard_deviations_away_from_per_source_average = round(abs(bytes_out - per_source_avg_bytes_out) / per_source_stdev_bytes_out,2) ```Discard fields we don't need anymore```
|
fields - maxtime per_source* avg* stdev*

 

---
If this reply helps you, Karma would be appreciated.
Get Updates on the Splunk Community!

Data Management Digest – December 2025

Welcome to the December edition of Data Management Digest! As we continue our journey of data innovation, the ...

Index This | What is broken 80% of the time by February?

December 2025 Edition   Hayyy Splunk Education Enthusiasts and the Eternally Curious!    We’re back with this ...

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Hello Splunk Community,   We're thrilled to share an exciting update that will help you manage your data more ...