SonicWall Analytics: Why is the size of this lookup so huge?

Hi,
I can see in the saved search [http session id lookup] from the Dell SonicWall Analytics App (1867) that the executed query looks every 60 minutes for the session IDs that have been logged, matches them against the ones already present in the lookup file sonicwall_http_session_id.csv, appends the new ones, and stores everything again:

index=sonicwall tid=257 app_name="General HTTPS" OR app_name="General HTTP" OR app_name="HTTP" | inputlookup sonicwall_http_session_id.csv append=t | dedup session_id | fields session_id, src_ip, dest_ip, app_name | fields - _* | outputlookup sonicwall_http_session_id.csv

This has resulted in a very large lookup file, currently more than 4 GB in size with over 70 million records. Running the above search now takes more than 15 minutes, and memory consumption is significant during that time.

Is anyone else facing the same problem? Or is there a recommendation for keeping this file from growing without limit?

Many thanks,
Martin


Re: SonicWall Analytics: Why is the size of this lookup so huge?

Yeah, this is a problem I've been putting off investigating until today. Are we the only two people using dsa?

I just set up a cron job to delete the file every day, lest it grow too big and cause the distributed bundle replication manager to start choking.
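
The crontab entry is something along these lines (the app directory is a placeholder and 03:00 is arbitrary; point it at wherever sonicwall_http_session_id.csv actually lives on your search head):

0 3 * * * rm -f /opt/splunk/etc/apps/<dsa_app>/lookups/sonicwall_http_session_id.csv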


Re: SonicWall Analytics: Why is the size of this lookup so huge?

Many thanks for the answer. Maintaining an app's files from outside Splunk doesn't seem like the right thing to do, though, since you then have to deal with documentation, handover, and other operational questions to keep everyone informed.

I have instead handled this within the dsa app itself: I wrote two little queries using inputlookup and outputlookup, saved them as reports, and configured a monthly schedule.

The first query takes the content of the sonicwall_http_session_id.csv file and writes it to sonicwall_http_session_id.bak.csv:

| inputlookup sonicwall_http_session_id.csv | outputlookup sonicwall_http_session_id.bak.csv

The second query then reads sonicwall_http_session_id.csv, keeps the first 15,000,000 entries (roughly a month's worth for us), and writes the result back into the same file:

| inputlookup sonicwall_http_session_id.csv | head 15000000 | outputlookup sonicwall_http_session_id.csv

With this setup, we retain the information in this file for two months. Optionally, you can replace the .bak in the first query with today's date and keep the rotated files on the filesystem for as long as you like.
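
For the date-stamped variant, a subsearch can build the target filename, since a subsearch field named "search" is substituted literally into the outer query. A minimal sketch, assuming outputlookup accepts a subsearch-generated argument on your version (worth verifying before scheduling it):

| inputlookup sonicwall_http_session_id.csv | outputlookup [| makeresults | eval search="sonicwall_http_session_id_".strftime(now(), "%Y-%m-%d").".csv" | fields search]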

Both reports are scheduled to run once a month, with the second set up to run 15 minutes after the first.
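
In savedsearches.conf, the pair then looks roughly like this (stanza names and run times are just examples, not something the app ships with):

[rotate sonicwall http session id - backup]
search = | inputlookup sonicwall_http_session_id.csv | outputlookup sonicwall_http_session_id.bak.csv
cron_schedule = 0 2 1 * *
enableSched = 1

[rotate sonicwall http session id - trim]
search = | inputlookup sonicwall_http_session_id.csv | head 15000000 | outputlookup sonicwall_http_session_id.csv
cron_schedule = 15 2 1 * *
enableSched = 1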

This setup has worked fine for us so far; any better idea for dealing with this issue is most welcome. Currently we only do the above rotation for the *session_id.csv lookup because it's the fastest-growing one, but a more 'standardized' way of doing it would be preferable.


Re: SonicWall Analytics: Why is the size of this lookup so huge?

I don't see how this app is workable with this csv file.

*Sorry, I "answered" when I should have commented.
