Splunk Search

Need to optimize firewall log search - auto-finalized after disk usage limit reached.

ejwade
Contributor

I need to create a table of all unique firewall connections over the last 90 days.

Our FortiGate firewall is configured to log all sessions. Here is the search I've created:

search index=firewalls sourcetype=fgt_traffic policyid!=0 earliest=-90d@m latest=now() 
| table _time, vd, policyid, srcip, dstip, dstport, proto 
| sort -_time, vd, policyid, srcip, dstip, dstport, proto 
| dedup vd, policyid, srcip, dstip, dstport, proto 

Talking it through:

I limit the search to the firewalls index and the FortiGate traffic source type (excluding the default deny policy, policyid 0), and specify the time modifiers. Next, I table the fields I need to output. I then sort by time in descending order and by all other fields in ascending order, so that connections are grouped with the most recent connection on top. Finally, I run the dedup command on all fields except _time to remove duplicates while retaining the most recent connection.

After the search completes, I receive the following messages:

  • Search auto-finalized after disk usage limit (10000MB) reached.
  • The sort command is truncating output to 10000 rows

Any ideas on how to optimize this search so I don't need to increase the disk usage quota for admin?

1 Solution

valiquet
Contributor

Use fields instead of table so that unneeded data is discarded early rather than just hidden from the output. This reduces the size of the intermediate results.
Use stats instead of sort and dedup.

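If you only need to keep the latest connection, take the maximum timestamp per connection (last(_time) returns the last value seen in search order, which is normally the oldest event):
| stats max(_time) AS last_seen by vd, policyid, srcip, dstip, dstport, proto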

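Otherwise, collect all timestamps for each connection into a single row to save space (note that list() returns at most 100 values per group by default):
| stats list(_time) by vd, policyid, srcip, dstip, dstport, proto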

If you still have issues, use a summary index or an accelerated data model.
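
Putting these suggestions together, the full search might look like the sketch below. It assumes the index, source type, field names, and time modifiers from the original question; last_seen is an illustrative field name. sort 0 lifts the default 10,000-row limit that caused the truncation warning, and the timestamp is formatted only after sorting so the sort runs on the numeric epoch value.

```spl
index=firewalls sourcetype=fgt_traffic policyid!=0 earliest=-90d@m latest=now()
| fields _time, vd, policyid, srcip, dstip, dstport, proto
| stats max(_time) AS last_seen by vd, policyid, srcip, dstip, dstport, proto
| sort 0 -last_seen
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
```

Because stats keeps a single row per unique combination of the by fields, the intermediate result set should stay much smaller than with table, sort, and dedup.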

ejwade
Contributor

This works well. Thank you, valiquet!

adonio
Ultra Champion

This is a huge search. Why are you looking at 90 days? There are many ways to go about it: you can accelerate the Network Traffic data model and search that, you can use a summary index, or you can use stats instead of table, sort, and dedup.
What problem are you trying to solve, and what report are you looking to get?
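
As one illustration of the summary index route, a scheduled search could roll up each day's connections, and the 90-day report could then read the much smaller summary. This is only a sketch: the summary index name fw_summary and the field name last_seen are hypothetical, the summary index is assumed to exist already, and the first search is assumed to be scheduled once per day.

```spl
index=firewalls sourcetype=fgt_traffic policyid!=0 earliest=-1d@d latest=@d
| fields _time, vd, policyid, srcip, dstip, dstport, proto
| stats max(_time) AS last_seen by vd, policyid, srcip, dstip, dstport, proto
| collect index=fw_summary
```

The report search then aggregates the daily rollups instead of the raw events:

```spl
index=fw_summary earliest=-90d@d
| stats max(last_seen) AS last_seen by vd, policyid, srcip, dstip, dstport, proto
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
```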

ejwade
Contributor

I want to find the unique connections seen over the last 90 days. We're creating a script that runs this search via the Splunk API and compares the results against the allowed connections queried from the FortiGate API. We'll then be able to disable rules that have not been seen in the last 90 days.

I believe the issue is the size of the results. When I run a similar search, excluding _time and using stats, I don't have this issue. However, I need to retain the most recent connection, and its date/time stamp.

The report needs to output unique connections, including the last connection time, vdom, policy id, source ip, destination ip, destination port, and protocol.
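
If the comparison script ultimately only needs to know which policies carried traffic, a coarser rollup per VDOM and policy ID may be enough and would return far fewer rows. A sketch, assuming the same index, source type, and fields as the original search; last_seen and sessions are illustrative field names:

```spl
index=firewalls sourcetype=fgt_traffic policyid!=0 earliest=-90d@m latest=now()
| fields _time, vd, policyid
| stats max(_time) AS last_seen, count AS sessions by vd, policyid
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
```

Any policy returned by the FortiGate API that does not appear in this output had no logged sessions in the search window.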
