Detecting Beaconing Using Fourier Transform (FFT)

dokaas_2
Communicator

Does anyone know of an add-on or other script that would allow one to analyze network traffic to detect beaconing using a Fourier transform (FFT)?


tscroggins
Influencer

@dokaas_2 

You can extend Splunk Machine Learning Toolkit to include the FFT algorithm. The following is an example adapted from https://www.ritchievink.com/blog/2017/04/23/understanding-the-fourier-transform-by-example/.

First, let's generate the sample data:

| makeresults count=500 
| streamstats count as t 
| eval t=exact(t/1000)-0.001, s=sin(40*2*pi()*t)+0.5*sin(90*2*pi()*t) 
| table t s

The signal should contain two components, at 40 Hz and 90 Hz (t is in seconds, sampled every 1 ms).

dokaas_2_samples.png

Next, let's add our algorithm stanza to $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/algos.conf:

[FFT]

Restart Splunk to enable the algorithm.

Next, let's write the algorithm interface in $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/FFT.py. This is just an example with no input validation:

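#!/usr/bin/env python

import numpy as np
import pandas as pd

from base import BaseAlgo


class FFT(BaseAlgo):
    def __init__(self, options):
        # Option checking & initializations here
        pass

    def fit(self, df, options):
        # df is a pandas DataFrame of the search results.
        # With "fit FFT s from t", s is the target field and t is the feature field.
        s = df[self.target_variable]
        t = df[self.feature_variables]

        fft = np.fft.fft(s)
        # Sample spacing, taken from the first two rows of the first feature field
        T = t.iloc[1, 0] - t.iloc[0, 0]
        N = fft.size
        # Bin k corresponds to k / (N * T); np.linspace(0, 1 / T, N) would include
        # the endpoint and stretch the axis slightly, so use fftfreq instead.
        freq = np.fft.fftfreq(N, d=T)[:N // 2]
        # Keep the non-negative half of the spectrum, scaled by 1 / N
        amp = np.abs(fft)[:N // 2] / N

        df = pd.DataFrame({'Frequency': freq, 'Amplitude': amp}, columns=['Frequency', 'Amplitude'])

        return df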

Finally, let's try the algorithm with the fit command:

| makeresults count=500 
| streamstats count as t 
| eval t=exact(t/1000)-0.001, s=sin(40*2*pi()*t)+0.5*sin(90*2*pi()*t) 
| table t s
| fit FFT s from t

dokaas_2_fft.png

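Signals were detected at 40 Hz and 90 Hz. The amplitudes shown are half the originals (about 0.5 and 0.25 instead of 1 and 0.5) because each sine's energy is split between the positive and negative frequencies, and only the positive half is kept.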
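If you want to sanity-check those numbers outside Splunk, here is a minimal NumPy sketch of the same calculation. It assumes the same 500 samples at 1 ms spacing as the makeresults search; the function name `one_sided_spectrum` is made up for illustration and is not part of MLTK.

```python
import numpy as np


def one_sided_spectrum(s, dt):
    """Return (freq, amp) for the non-negative half of the FFT of s.

    dt is the sample spacing in seconds. Amplitudes are scaled by 1/N,
    so a pure sine of amplitude A shows up as roughly A/2.
    """
    s = np.asarray(s, dtype=float)
    n = s.size
    spectrum = np.fft.fft(s)
    freq = np.fft.fftfreq(n, d=dt)[: n // 2]  # bin k -> k / (n * dt) Hz
    amp = np.abs(spectrum)[: n // 2] / n
    return freq, amp


# Same signal as the makeresults search: 500 samples, 1 ms apart, starting at t=0.
t = np.arange(500) / 1000.0
s = np.sin(40 * 2 * np.pi * t) + 0.5 * np.sin(90 * 2 * np.pi * t)

freq, amp = one_sided_spectrum(s, dt=0.001)
top = np.argsort(amp)[-2:][::-1]  # indices of the two largest bins
for i in top:
    print(f"{freq[i]:.1f} Hz  amplitude {amp[i]:.3f}")
```

Both components complete a whole number of cycles in the 0.5 s window, so each lands on a single bin with no leakage. Real traffic will rarely be that clean.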

If you have a sample data set, we can test it directly.

dokaas_2
Communicator

Drop the mic and let me buy you a drink at the next .CONF!

 

dokaas_2
Communicator

Here's a scatter chart of the resulting magnitudes. I find the dominant frequencies (the ones that show up as stacked columns) a little easier to see on a scatter chart. There is clearly a strong beacon at 1 Hz and an even stronger one at 0.5 Hz (every 2 seconds). There are probably others to inspect.
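Reading peaks off a chart gets tedious, so here is a rough sketch of turning the Frequency/Amplitude output into a short list of candidate beacon periods (period = 1 / frequency). The function name `dominant_periods` and the sample numbers are invented for illustration; they are not the Corelight results.

```python
import numpy as np


def dominant_periods(freq, amp, top_n=3):
    """Return [(frequency_hz, period_s, amplitude)] for the top_n largest bins.

    The zero-frequency (DC) bin is skipped because it only reflects the
    average event rate, not any periodicity.
    """
    freq = np.asarray(freq, dtype=float)
    amp = np.asarray(amp, dtype=float)
    mask = freq > 0
    f, a = freq[mask], amp[mask]
    order = np.argsort(a)[::-1][:top_n]  # largest amplitudes first
    return [(float(f[i]), float(1.0 / f[i]), float(a[i])) for i in order]


# Made-up spectrum for illustration only.
freq = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
amp = np.array([9.0, 0.1, 3.0, 0.2, 2.0])
for f_hz, period_s, a in dominant_periods(freq, amp, top_n=2):
    print(f"{f_hz} Hz -> one event every {period_s} s (amplitude {a})")
```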

The data came from DNS traffic in Corelight logs. It could just as easily have come from Splunk Stream, but we already have a Corelight infrastructure. The query excludes internal DNS traffic and includes only A, AAAA, and TXT records. Of course, there are a lot of other factors to consider, such as DNS caching and rotating ads.

Now on to some additional hunting to find and exclude benign sources and, hopefully, find nothing! As an aside, if anyone wants to see a fun use of the Fourier series, look up "Fourier" and "Homer Simpson" on YouTube to see how a Fourier series can draw Homer.

 

Splunk_Beacon_Analysis.PNG

tscroggins
Influencer

@dokaas_2 

You've adapted this better than I have! I was looking for ways to define and group FFT output by specific features, e.g. src-dest tuples.

What general form did your base search take?
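To make the grouping idea concrete, here is a rough pandas sketch of running a separate FFT per src-dest pair. Everything in it is an assumption for illustration: the column names `src`, `dest`, and `time`, the function name `dominant_frequency_by_pair`, and the synthetic events. It runs outside Splunk and is not an MLTK algorithm.

```python
import numpy as np
import pandas as pd


def dominant_frequency_by_pair(events, start, end, bin_seconds=0.5, tolerance=0.9):
    """Run a separate FFT per (src, dest) pair and report its strongest frequency.

    events: DataFrame with columns 'src', 'dest', 'time' (seconds).
    start, end: search window in seconds; events outside it are ignored.
    A perfectly regular beacon has harmonics as strong as its fundamental,
    so the lowest bin within `tolerance` of the peak is reported.
    """
    n_bins = int(round((end - start) / bin_seconds))
    freq = np.fft.rfftfreq(n_bins, d=bin_seconds)
    in_window = events[(events["time"] >= start) & (events["time"] < end)]
    rows = []
    for (src, dest), grp in in_window.groupby(["src", "dest"]):
        # Count events per time bin for this pair
        idx = ((grp["time"].to_numpy() - start) / bin_seconds).astype(int)
        counts = np.bincount(np.minimum(idx, n_bins - 1), minlength=n_bins).astype(float)
        counts -= counts.mean()  # drop the DC (average rate) term
        amp = np.abs(np.fft.rfft(counts)) / n_bins
        # First non-DC bin whose amplitude is within tolerance of the maximum
        k = 1 + int(np.argmax(amp[1:] >= tolerance * amp[1:].max()))
        rows.append({"src": src, "dest": dest,
                     "frequency_hz": freq[k], "period_s": 1.0 / freq[k],
                     "amplitude": amp[k]})
    return pd.DataFrame(rows)


# Synthetic events for illustration: one pair beacons every 5 s, another every 8 s.
events = pd.DataFrame({
    "src": ["10.0.0.5"] * 48 + ["10.0.0.7"] * 30,
    "dest": ["198.51.100.1"] * 48 + ["203.0.113.9"] * 30,
    "time": np.concatenate([np.arange(0, 240, 5), np.arange(0, 240, 8)]).astype(float),
})
print(dominant_frequency_by_pair(events, start=0, end=240, bin_seconds=1.0))
```

Each pair gets its own row, so the output can be sorted by amplitude to surface the most regular talkers. The `tolerance` trick is one design choice among several; with jittery real-world beacons a proper peak finder may work better.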

dokaas_2
Communicator

The query is something like this:

  1. The query uses Corelight data and excludes local and well-known sources.
  2. | timechart count as s span=5ds
  3. | fillnull value=0
  4. | eval time_interval = 0.5
  5. | eval sequence_number = 1
  6. | streamstats current=f sum(sequence_number) as seq
  7. | streamstats sum(time_interval) as time
  8. | eval time = time - time_interval
  9. | head 4096
  10. | table time, s
  11. | fit FFT s from time
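One thing worth checking with a search like this is what the time axis implies for the spectrum. The sketch below only does the arithmetic for 4096 samples spaced 0.5 s apart (the values from the steps above); the function name `fft_axis_summary` is made up for illustration.

```python
def fft_axis_summary(n_samples, dt):
    """Describe the frequency axis of an n_samples-point FFT with spacing dt seconds."""
    return {
        "sample_rate_hz": 1.0 / dt,              # samples per second
        "nyquist_hz": 0.5 / dt,                  # highest frequency that can be represented
        "bin_width_hz": 1.0 / (n_samples * dt),  # spacing between frequency bins
        "window_s": n_samples * dt,              # total time covered by the samples
        "shortest_period_s": 2.0 * dt,           # period at the Nyquist frequency
    }


summary = fft_axis_summary(n_samples=4096, dt=0.5)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these settings the highest frequency the FFT can represent is 1 Hz, so anything repeating faster than once per second would alias to a lower frequency. The bins are 1/2048 Hz apart, and the lowest non-zero bin corresponds to a period of 2048 s.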

Still working with it... Do you have any suggestions or improvements?

 


dokaas_2
Communicator

So here's a sample dashboard.
DNS_Beacon_Detectin_Using_FFT.PNG
