Splunk Search

Need to optimize firewall log search - auto-finalized after disk usage limit reached.

ejwade
Contributor

I need to create a table of all unique firewall connections over the last 90 days.

Our FortiGate firewall is configured to log all sessions. Here is the search I've created:

search index=firewalls sourcetype=fgt_traffic policyid!=0 earliest=-90d@m latest=now() 
| table _time, vd, policyid, srcip, dstip, dstport, proto 
| sort -_time, vd, policyid, srcip, dstip, dstport, proto 
| dedup vd, policyid, srcip, dstip, dstport, proto 

Talking it through:

I limit the search by the firewalls index, and FortiGate Traffic source type (excluding default deny) and specify the time modifiers. Then, I table the fields I need to output. Then, I sort by time in descending order and then all other fields in ascending order. Sorting this way allows connections to be grouped, with the most recent connection on top. Then, I run the dedup command on all fields, except time, to remove duplicates while retaining the most recent connection.

After the search completes, I receive the following messages:

  • Search auto-finalized after disk usage limit (10000MB) reached.
  • The sort command is truncating output to 10000 rows

Any ideas on how to optimize this search so I don't need to increase the disk usage quota for admin?

0 Karma
1 Solution

valiquet
Contributor

Use fields instead of table to discard data instead of hiding it. This will reduce the size.
Use stats instead of dedup

If you only need to keep the latest connection:
| stats last(_time) by vd, policyid, srcip, dstip, dstport, proto

Otherwise group them by time to save space:
| stats list(_time) by vd, policyid, srcip, dstip, dstport, proto

If you still have issues use summary or data model

View solution in original post

valiquet
Contributor

Use fields instead of table to discard data instead of hiding it. This will reduce the size.
Use stats instead of dedup

If you only need to keep the latest connection:
| stats last(_time) by vd, policyid, srcip, dstip, dstport, proto

Otherwise group them by time to save space:
| stats list(_time) by vd, policyid, srcip, dstip, dstport, proto

If you still have issues use summary or data model

ejwade
Contributor

This works well. Thank you, valiquet!

0 Karma

adonio
Ultra Champion

this is a huge search, why looking at 90 days? there are many ways to go about it, you can accelerate the Network traffic data model, and search it. you can use summary index, you can use stats instead of tabling sorting and deduping.
what is the problem you are trying to solve, what report are you looking to get?

0 Karma

ejwade
Contributor

I'm wanting to find the unique connections created over the last 90 days. We're creating a script to run this search via the Splunk API, and compare the results against the allowed connections, queried from the FortiGate API. We'll then have the ability to disable rules that have not been seen in the last 90 days.

I believe the issue is the size of the results. When I run a similar search, excluding _time and using stats, I don't have this issue. However, I need to retain the most recent connection, and its date/time stamp.

The report needs to output unique connections, including the last connection time, vdom, policy id, source ip, destination ip, destination port, and protocol.

0 Karma
Get Updates on the Splunk Community!

Cloud Platform | Customer Change Announcement: Email Notification Will Be Available ...

The Notification Team is migrating our email service provider since currently there’s no support ...

Mastering Synthetic Browser Testing: Pro Tips to Keep Your Web App Running Smoothly

To start, if you're new to synthetic monitoring, I recommend exploring this synthetic monitoring overview. In ...

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...