How do I filter out 'return' traffic in AWS VPC Flow Logs?

Path Finder

I am trying to build a report for AWS VPC Flow Logs that can be used to analyze security group (SG) usage. Specifically, I want a list of incoming traffic (by 'dest_ip') which shows all IP/port combinations. Unfortunately, a simple 'stats count by dest_ip,dest_port,protocol,src_ip,src_port' does not produce a usable report, because all the stateful return traffic is listed too. There are tens of thousands of incoming packets with dest_port in the 1024-65535 range, i.e., where that particular 'dest' server had initiated a connection using an ephemeral local port and the return traffic then went to the same port. So 99% of the 'incoming' ports are not actual listeners that we need to include in our SGs.

I have spent hours testing various combinations of filters, e.g. count<5, or dest_port>18000, or (dest_port>1024 AND src_port<1024), or even a 'where NOT IN(src_port,22,53,80,3389, etc)'. But we have a lot of services which use high port numbers, so all these methods accidentally remove valid traffic.

Instead, I think the only accurate method would be one where each connection is evaluated for:
- is the incoming 'dest_port' above 1024?
- if so, is there a corresponding packet in the preceding 1000 ms, i.e., identical-but-reversed dest and src IP/ports?
- if so, assume this later packet is the return from a stateful request sent on an ephemeral port -- remove it from the results!
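
The three checks above can be sketched outside Splunk to see how the matching would behave. This is a minimal Python sketch, not SPL and not a tested Splunk solution; `Flow`, `drop_return_traffic`, and the `start_ms` field are hypothetical names chosen for illustration. It assumes per-record timestamps in milliseconds, whereas VPC Flow Logs records are aggregated over a capture window with start/end times in seconds, so the 1000 ms window would probably need widening in practice.

```python
from collections import namedtuple

# Hypothetical record shape; field names mirror the Splunk fields used in this thread.
Flow = namedtuple("Flow", "start_ms src_ip src_port dest_ip dest_port protocol")


def drop_return_traffic(flows, window_ms=1000, ephemeral_min=1024):
    """Keep flows that do not look like replies to an earlier, reversed flow."""
    last_request = {}  # 5-tuple of a kept flow -> its most recent start_ms
    kept = []
    for f in sorted(flows, key=lambda f: f.start_ms):
        key = (f.src_ip, f.src_port, f.dest_ip, f.dest_port, f.protocol)
        reverse = (f.dest_ip, f.dest_port, f.src_ip, f.src_port, f.protocol)
        earlier = last_request.get(reverse)
        is_return = (
            f.dest_port >= ephemeral_min          # lands on a possible ephemeral port
            and earlier is not None               # a reversed flow was seen before
            and f.start_ms - earlier <= window_ms  # and it was recent enough
        )
        if is_return:
            continue  # treated as the reply to a stateful request: drop it
        kept.append(f)
        # Only kept flows are remembered, so a reply is never mistaken for a request.
        last_request[key] = f.start_ms
    return kept


sample = [
    Flow(0, "10.0.0.5", 50000, "10.0.0.9", 15672, 6),    # request to a high-port service
    Flow(100, "10.0.0.9", 15672, "10.0.0.5", 50000, 6),  # its reply
    Flow(200, "10.0.0.5", 50000, "10.0.0.9", 15672, 6),  # next request, same connection
    Flow(5000, "10.0.0.7", 443, "10.0.0.5", 40000, 6),   # high dest_port, no earlier match
]
kept = drop_return_traffic(sample)
```

Remembering only the kept flows matters when the service itself listens on a high port: otherwise the second request in `sample` would be matched against the reply and dropped as well.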

Has anyone else run into this situation, and what was your solution? Thank you for any suggestions!

1 Solution

Path Finder

For the moment, this is the approach I am taking:

  • I created an unfiltered report of 'stats count by dest_ip,dest_port,protocol,src_ip'
  • I temporarily sorted this report by ‘top limit=100 dest_port’
  • One at a time, I added the most common ‘dest_ports’ to a filter on **src_port**, i.e., I removed what I knew to be return traffic from those servers. This looked like 'where NOT IN(src_port,22,53,80)'
  • I also learned the hard way that I had to refresh the report after each addition and double-check that the port I’d just added hadn’t disappeared from the results, e.g. when I added port 0 (ICMP) to the filter it disappeared from the report, because ICMP is a service which does not use ephemeral ports
  • Finally, once I felt I had a good list, I added a catch-all filter for ‘dest_port<16000’ because the highest valid port I’d found was 15672.

This process allowed me to identify 25 incoming ports. I don't know if this is all of them, but ignoring the return traffic from those 'src_ports' reduced the list of src/dest combinations from 300K to 10K.
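
For anyone who wants to reason about this filter outside Splunk, here is a rough Python sketch of the same logic, under the assumption that each record is a plain tuple of (src_ip, src_port, dest_ip, dest_port, protocol). The function name `summarize_incoming` and the sample addresses are made up for illustration; the port list and the 16000 cut-off come from the steps above.

```python
from collections import Counter


def summarize_incoming(flows, service_ports, max_dest_port=16000):
    """Count flows per (dest_ip, dest_port, protocol, src_ip), skipping likely replies."""
    counts = Counter()
    for src_ip, src_port, dest_ip, dest_port, protocol in flows:
        if src_port in service_ports:
            continue  # sent from a known service port, so treated as return traffic
        if dest_port >= max_dest_port:
            continue  # catch-all: above the highest listener found so far
        counts[(dest_ip, dest_port, protocol, src_ip)] += 1
    return counts


flows = [
    ("10.0.0.5", 50000, "10.0.0.9", 22, 6),     # SSH request: kept
    ("10.0.0.9", 22, "10.0.0.5", 50000, 6),     # SSH reply: dropped (src_port 22)
    ("10.0.0.5", 0, "10.0.0.9", 0, 1),          # ICMP: kept, because 0 is not in the list
    ("10.0.0.8", 51000, "10.0.0.9", 15672, 6),  # high-port listener: kept
    ("10.0.0.8", 40000, "10.0.0.9", 45000, 6),  # dropped by the catch-all
]
report = summarize_incoming(flows, service_ports={22, 53, 80})
```

Leaving port 0 out of `service_ports` is deliberate: adding it would drop the ICMP row too, which is the pitfall described in the fourth bullet.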


Contributor

You can create another field, an application port (app_port), which takes whichever of the two ports is lower than the random client port, as follows:

| eval app_port=if(((src_port<dest_port AND src_port!=0) OR dest_port==0),src_port,dest_port)

Then use app_port in your stats and/or WHERE.
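
To make the behaviour of that eval easier to check, here is the same condition written as a small Python function. It is only an illustration of the expression above, not something Splunk runs.

```python
def app_port(src_port, dest_port):
    """Mirror of the eval: pick the lower non-zero port as the application port."""
    if (src_port < dest_port and src_port != 0) or dest_port == 0:
        return src_port
    return dest_port


request = app_port(50000, 443)       # client -> HTTPS listener
reply = app_port(443, 50000)         # the return traffic maps to the same port
high_service = app_port(1025, 15672) # ephemeral client port below a high listener
```

`request` and `reply` both come out as 443, so the two directions collapse into one row. `high_service` comes out as 1025 rather than 15672, which shows the limit of the heuristic when a listener sits above the client's ephemeral port.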

Path Finder

Unfortunately, we have app-ports all the way into the 50,000 port-range, and the ephemeral ports start at 1024. So 'src_port

Contributor

In our product, NetFlow Optimizer, there is a rule/module that stitches request/reply flows. It is based on a list of known application ports (https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers), but the list is configurable (you can upload your own list into the rule). Please contact us directly if you have any questions or would like to try it: trials@netflowlogic.com

Path Finder

I would also have liked to filter on 'WHERE src_port>1024', since that's always true for ephemeral ports, but I didn't know how to combine the 2 filters in 1 WHERE. When I tried it I got unexpected results.
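
In SPL the two conditions can be joined with AND inside a single where clause. What the combined condition actually keeps is easier to see in a small Python sketch; `keep_row` is a hypothetical helper, and the port list is the one quoted earlier in the thread.

```python
SERVICE_PORTS = {22, 53, 80, 3389}


def keep_row(src_port, service_ports=SERVICE_PORTS):
    """Keep a row only if src_port looks ephemeral and is not a known service port."""
    return src_port > 1024 and src_port not in service_ports


kept_ports = [p for p in (0, 22, 1024, 3389, 50000) if keep_row(p)]
```

Here `kept_ports` ends up as `[50000]`. Every service port below 1024 is already excluded by the first condition, so the list only matters for high entries such as 3389, and ICMP rows (port 0) are dropped as well. That may be one source of the unexpected results, though I can't say for sure without seeing the search.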