<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Data Exfiltration via E-Mail in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742172#M240789</link>
    <description>&lt;P&gt;Hey everyone,&lt;/P&gt;&lt;P&gt;I am currently trying to write a search that monitors outgoing E-Mail traffic. The goal is to see if business-relevant information is being exfiltrated via E-Mail. Since I am new to writing SPL I tried the following:&lt;/P&gt;&lt;P&gt;First, I wanted to write a simple search that would show me all E-Mails where the size of the E-Mail is exceeding a set threshold. That's what I came up with:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;| datamodel Email search&lt;BR /&gt;| search All_Email.src_user="SOMETHING I USE TO MAKE SURE THE TRAFFIC IS GOING FROM INTERNAL TO EXTERNAL" AND sourcetype="fml:*"&lt;BR /&gt;| stats&lt;BR /&gt;&amp;nbsp;values(_time) as _time&lt;BR /&gt;&amp;nbsp;values(All_Email.src_user) as src_user&lt;BR /&gt;&amp;nbsp;values(All_Email.recipient) as recipient&lt;BR /&gt;&amp;nbsp;values(All_Email.file_name) as file_name&lt;BR /&gt;&amp;nbsp;values(All_Email.subject) as subject&lt;BR /&gt;&amp;nbsp;values(All_Email.size) as size&lt;BR /&gt;&amp;nbsp;by All_Email.message_id&lt;BR /&gt;| eval size_MB=round(size/1000000,3)&lt;BR /&gt;| `ctime(alert_time)`&lt;BR /&gt;| where 'size_MB'&amp;gt;X&lt;BR /&gt;| fields - size&lt;/P&gt;&lt;P&gt;As far as I can see, it does what I initially wanted it to do.&lt;/P&gt;&lt;P&gt;Upon further testing and thinking, I noticed a flaw. If Data is exfiltrated over a given time through many different E-Mails, that search would not trigger since the threshold X would not be exceeded in one E-Mail. That's why I wanted to write a new Search using tstats (since the above search was pretty slow) where the traffic from A to the same recurring recipient is being added up in a given time period. 
If the accumulated traffic would exceed a given threshold, the search would trigger.&lt;/P&gt;&lt;P&gt;I then came up with this:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;| tstats&lt;BR /&gt;min(_time) as alert_time&lt;BR /&gt;max(_time) as end_time&lt;BR /&gt;values(All_Email.file_name) as file_name&lt;BR /&gt;values(All_Email.subject) as subject&lt;BR /&gt;values(All_Email.size) as size&lt;BR /&gt;from datamodel=Email&lt;BR /&gt;WHERE All_Email.src_user="SOMETHING I USE TO MAKE SURE THE TRAFFIC IS GOING FROM INTERNAL TO EXTERNAL" AND sourcetype="fml:*"&lt;BR /&gt;by All_Email.src_user, All_Email.recipient&lt;BR /&gt;| eval size_MB=round(size/1000000,3)&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;This search is not finished (threshold missing, etc.) since I noticed that the size of an E-Mail with multiple attachments is not calculated correctly. The search lists all the sizes of the different attachments but does not calculate a sum. I think the "by All_Email.src_user, All_Email.recipient" statement does not work as I intended it to.&lt;/P&gt;&lt;P&gt;I would be happy to get some feedback on how to improve it. Maybe the code I wrote is way too complicated or does not work as it's supposed to.&lt;/P&gt;&lt;P&gt;Since I am new to writing SPL, are there any standards on how to write clean SPL, or any resources where I can study many different (good) searches so that I can improve at writing my own? I would appreciate any form of help!&lt;/P&gt;&lt;P&gt;Thank you very much!&lt;/P&gt;</description>
    <pubDate>Wed, 19 Mar 2025 08:40:17 GMT</pubDate>
    <dc:creator>Skinny</dc:creator>
    <dc:date>2025-03-19T08:40:17Z</dc:date>
    <item>
      <title>Data Exfiltration via E-Mail</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742172#M240789</link>
      <description>&lt;P&gt;Hey everyone,&lt;/P&gt;&lt;P&gt;I am currently trying to write a search that monitors outgoing E-Mail traffic. The goal is to see if business-relevant information is being exfiltrated via E-Mail. Since I am new to writing SPL I tried the following:&lt;/P&gt;&lt;P&gt;First, I wanted to write a simple search that would show me all E-Mails where the size of the E-Mail is exceeding a set threshold. That's what I came up with:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;| datamodel Email search&lt;BR /&gt;| search All_Email.src_user="SOMETHING I USE TO MAKE SURE THE TRAFFIC IS GOING FROM INTERNAL TO EXTERNAL" AND sourcetype="fml:*"&lt;BR /&gt;| stats&lt;BR /&gt;&amp;nbsp;values(_time) as _time&lt;BR /&gt;&amp;nbsp;values(All_Email.src_user) as src_user&lt;BR /&gt;&amp;nbsp;values(All_Email.recipient) as recipient&lt;BR /&gt;&amp;nbsp;values(All_Email.file_name) as file_name&lt;BR /&gt;&amp;nbsp;values(All_Email.subject) as subject&lt;BR /&gt;&amp;nbsp;values(All_Email.size) as size&lt;BR /&gt;&amp;nbsp;by All_Email.message_id&lt;BR /&gt;| eval size_MB=round(size/1000000,3)&lt;BR /&gt;| `ctime(alert_time)`&lt;BR /&gt;| where 'size_MB'&amp;gt;X&lt;BR /&gt;| fields - size&lt;/P&gt;&lt;P&gt;As far as I can see, it does what I initially wanted it to do.&lt;/P&gt;&lt;P&gt;Upon further testing and thinking, I noticed a flaw. If Data is exfiltrated over a given time through many different E-Mails, that search would not trigger since the threshold X would not be exceeded in one E-Mail. That's why I wanted to write a new Search using tstats (since the above search was pretty slow) where the traffic from A to the same recurring recipient is being added up in a given time period. 
If the accumulated traffic would exceed a given threshold, the search would trigger.&lt;/P&gt;&lt;P&gt;I then came up with this:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;| tstats&lt;BR /&gt;min(_time) as alert_time&lt;BR /&gt;max(_time) as end_time&lt;BR /&gt;values(All_Email.file_name) as file_name&lt;BR /&gt;values(All_Email.subject) as subject&lt;BR /&gt;values(All_Email.size) as size&lt;BR /&gt;from datamodel=Email&lt;BR /&gt;WHERE All_Email.src_user="SOMETHING I USE TO MAKE SURE THE TRAFFIC IS GOING FROM INTERNAL TO EXTERNAL" AND sourcetype="fml:*"&lt;BR /&gt;by All_Email.src_user, All_Email.recipient&lt;BR /&gt;| eval size_MB=round(size/1000000,3)&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;This search is not finished (threshold missing, etc.) since I noticed that the size of an E-Mail with multiple attachments is not calculated correctly. The search lists all the sizes of the different attachments but does not calculate a sum. I think the "by All_Email.src_user, All_Email.recipient" statement does not work as I intended it to.&lt;/P&gt;&lt;P&gt;I would be happy to get some feedback on how to improve it. Maybe the code I wrote is way too complicated or does not work as it's supposed to.&lt;/P&gt;&lt;P&gt;Since I am new to writing SPL, are there any standards on how to write clean SPL, or any resources where I can study many different (good) searches so that I can improve at writing my own? I would appreciate any form of help!&lt;/P&gt;&lt;P&gt;Thank you very much!&lt;/P&gt;</description>
      <pubDate>Wed, 19 Mar 2025 08:40:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742172#M240789</guid>
      <dc:creator>Skinny</dc:creator>
      <dc:date>2025-03-19T08:40:17Z</dc:date>
    </item>
    <item>
      <title>Re: Data Exfiltration via E-Mail</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742173#M240790</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/308629"&gt;@Skinny&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I think you probably meant to use&amp;nbsp;&lt;SPAN&gt;&lt;STRONG&gt;sum(All_Email.size) as size&lt;/STRONG&gt; instead of&amp;nbsp;&lt;STRONG&gt;values(All_Email.size) as size&lt;/STRONG&gt;? Then it should sum the sizes rather than return a list.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Please let me know how you get on and consider adding karma to this or any other answer if it has helped.&lt;BR /&gt;Regards&lt;BR /&gt;&lt;BR /&gt;Will&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Mar 2025 08:50:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742173#M240790</guid>
      <dc:creator>livehybrid</dc:creator>
      <dc:date>2025-03-19T08:50:21Z</dc:date>
    </item>
    <item>
      <title>Re: Data Exfiltration via E-Mail</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742180#M240793</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/170906"&gt;@livehybrid&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Thank you very much; that solved the problem!&lt;/P&gt;&lt;P&gt;Now that it can calculate the sum of the attachments, how do I make sure that the search accumulates every event where User A sends to the same recipient and calculates the sum of the overall traffic generated? Since I don't know how to put it into words properly, here's an example:&lt;/P&gt;&lt;P&gt;E-Mail 1: from User A -&amp;gt; to User B with size=10MB - Was sent at 11:10&lt;BR /&gt;E-Mail 2: from User A -&amp;gt; to User B with size=8MB - Was sent at 12:14&lt;BR /&gt;E-Mail 3: from User A -&amp;gt; to User C with size=20MB - Was sent at 13:41&lt;BR /&gt;E-Mail 4: from User A -&amp;gt; to User B with size=23MB - Was sent at 13:55&lt;/P&gt;&lt;P&gt;As shown above, User A sent to two different recipients (B and C). I now want the search to sum up the overall traffic from A to each recipient over a span of 4 hours, like so:&lt;/P&gt;&lt;P&gt;Traffic of A to B = 41MB&lt;BR /&gt;Traffic of A to C = 20MB&lt;/P&gt;&lt;P&gt;Let's say the threshold of my search is 40MB over a span of 4 hours. Could you also help me with that?&lt;/P&gt;&lt;P&gt;Thank you very much so far!&lt;/P&gt;</description>
      <pubDate>Wed, 19 Mar 2025 10:19:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742180#M240793</guid>
      <dc:creator>Skinny</dc:creator>
      <dc:date>2025-03-19T10:19:49Z</dc:date>
    </item>
    <item>
      <title>Re: Data Exfiltration via E-Mail</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742195#M240796</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/308629"&gt;@Skinny&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What does your search look like so far? If you're doing&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| stats sum(All_Email.size) as size by All_Email.src_user, All_Email.recipient&lt;/LI-CODE&gt;&lt;P&gt;Then I think it should already be grouping it like this?&lt;/P&gt;&lt;P&gt;Please let me know how you get on and consider adding karma to this or any other answer if it has helped.&lt;BR /&gt;Regards&lt;BR /&gt;&lt;BR /&gt;Will&lt;/P&gt;</description>
      <pubDate>Wed, 19 Mar 2025 13:28:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Data-Exfiltration-via-E-Mail/m-p/742195#M240796</guid>
      <dc:creator>livehybrid</dc:creator>
      <dc:date>2025-03-19T13:28:15Z</dc:date>
    </item>
  </channel>
</rss>

