
/opt /syslogs/ file system space issue in heavyforwarder

Hemnaath
Motivator

Currently I am facing a file system space issue on /opt on the Splunk heavy forwarder server. This server is used to monitor and forward syslog information to the indexer clusters.
File sizes keep increasing under /opt/syslogs/generic/... Under the generic folder there are many subdirectories, and each subdirectory contains some .log files. When I validated splunkd.log I could see the info below:
04-22-2016 12:18:07.201 -0400 INFO AggregatorMiningProcessor - Got done message for: source::/opt/syslogs/generic/xxxx/2838552.log|host::xxxx|syslog|9947901
04-22-2016 12:18:12.904 -0400 INFO AggregatorMiningProcessor - Setting up line merging apparatus for: source::/opt/syslogs/generic/xxxopt/sport.log|host::xxxopt|syslog|9946037
I have tried to execute logrotate, but when it runs it consumes swap memory because many logrotate processes are running, and if I kill the processes the space keeps increasing. As a temporary solution I am trying to add space to the /opt file system. Since all the data is critical (network data), deleting the files would create a problem during auditing.
My question: what would be a permanent solution to fix this issue?
1) Do I need to change any configuration inside /opt/splunk/etc/apps/local/inputs.conf?
2) Can I move the files to some other location on the same server?


Hemnaath
Motivator

Thanks. We have built an additional Splunk heavy forwarder server in order to manage the load. Now the entire syslog load is pointed towards the newly built heavy forwarder server, but file sizes still keep increasing in /opt/syslogs/generic. Since there are many subdirectories under the generic folder, it is difficult to run logrotate, as it consumes more swap memory.

Question
1) How can I find the volume of logs being ingested every hour from the syslog source on both heavy forwarders?
I tried to execute this query, but I did not get the correct result:

index=* source="syslogs" sourcetype="/opt/syslogs/generic" | eval indextime=strftime(_indextime, "%Y-%m-%d %H:%M:%S") | eval length=len(_raw)/1024 | stats sum(length) count by source indextime index host

I need to get the volume of logs being ingested on both servers (Heavy Forwarder 1 & Heavy Forwarder 2). If I can find the exact volume of load on both servers, I can suggest keeping a separate server for syslog.
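
As a sketch (not verified against your environment), the hourly ingest volume per forwarder can usually be read from Splunk's internal metrics rather than by measuring the raw events. The series filter below assumes your monitored files live under /opt/syslogs/generic/; adjust it to match your actual source paths:

index=_internal source=*metrics.log* group=per_source_thruput series="/opt/syslogs/generic/*"
| timechart span=1h sum(kb) AS KB_per_hour by host

Since both heavy forwarders send their _internal data to the indexers, splitting by host should give you one column per forwarder in a single search.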


lakshman239
SplunkTrust

I assume you are indexing the syslogs from the heavy forwarder into an index. If you just want to know the amount of indexed data from your syslog source on a specific index, please run |dbinspect index=

This will give you the size on disk, raw size, etc. Alternatively, if you have access to the GUI, go to Settings -> Indexes and see the size.

| dbinspect index=os | eval date=strftime(endEpoch, "%d/%m/%Y %H") | stats sum(rawSize) AS raw sum(sizeOnDiskMB) AS dskSize by date | addcoltotals


Hemnaath
Motivator

Thanks, lakshman, for looking into this issue. Currently I am facing a swap memory issue on the HF servers; can you guide me on this?


lakshman239
SplunkTrust

Is that happening only on this server, or also on other Splunk servers of the same specification?

If the server is shared with other applications, see if they are consuming the resources. Did you try restarting the server (OS)?
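
As a quick sketch using standard Linux tools (nothing Splunk-specific), you can check which processes are actually holding memory and swap on the affected host:

```shell
# List the ten processes using the most resident memory
ps aux --sort=-%mem | head -n 10

# Per-process swap usage on Linux: read the VmSwap field from each /proc/<pid>/status
for f in /proc/[0-9]*/status; do
    awk '/^VmSwap/ {print FILENAME, $2, $3}' "$f" 2>/dev/null
done | sort -k2 -rn | head -n 10
```

If splunkd sits at the top of both lists, the problem is Splunk's own workload; if another application dominates, the HF is simply being starved.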


Hemnaath
Motivator

Thanks, lakshman. We have two heavy forwarder/syslog instances running on the same server. The HF is used to forward the data (syslog) to 5 individual indexer instances, and we have an F5 load balancer placed before the two HF servers to route the traffic.

Problem - Most of the time we get an alert from the Unix team stating that the splunkd process is consuming more CPU/swap memory. Sometimes the swap memory drops almost to zero and the splunkd process gets killed. In a month we receive almost 20 alerts for this issue. Kindly guide us in overcoming this problem.

System details:
Splunk version 6.2.1
OS - RedHat 6.6
Memory - 6GB
CPU - 3
VMware

Current status of swap memory:

free -m
             total       used       free     shared    buffers     cached
Mem:         15947      15664        282          0        357       6361
-/+ buffers/cache:       8945       7001
Swap:         3323        304       3019

Kindly let me know how to fix this issue.


Jeremiah
Motivator

My question: what would be a permanent solution to fix this issue?

It sounds like the rate of data coming in to your syslog server is higher than what your /opt filesystem can support. You've got three levers to work with: the rate of syslog data coming in, the amount of time you keep that syslog data around, and the size of the partition you store it on. Let's put aside the incoming-data option, since that's probably out of your control. You just need to measure it; in other words, how much data is coming into your syslog server every day? If you don't know, look at the _internal metric data for your forwarder and do some calculations. Once you know that, you can look at how long you need to keep the data on your syslog server. Splunk typically needs only seconds to see the data, but most people reasonably keep the data on a syslog server for a few days as a precaution, or you may have an explicit requirement to keep the raw syslog data for much longer. Whatever that length of time is tells you how much storage you need:

incoming data rate * days of retention = the absolute minimum amount of storage you need

the absolute minimum amount of storage you need * 120% or more = what you should provision
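
The two rules of thumb above are easy to turn into a back-of-the-envelope calculation; a minimal sketch, where all the input numbers are invented placeholders rather than measurements from this thread:

```python
# Back-of-the-envelope sizing for the syslog partition.
# All input values are assumptions for illustration; substitute your measured rate.
incoming_gb_per_day = 50   # measured daily syslog volume (placeholder)
retention_days = 7         # how long raw syslog files must stay on disk (placeholder)
headroom = 1.2             # provision at least 120% of the minimum

minimum_gb = incoming_gb_per_day * retention_days
provisioned_gb = minimum_gb * headroom

print(f"absolute minimum: {minimum_gb} GB")
print(f"provision at least: {provisioned_gb:.0f} GB")
```

With these placeholder figures that works out to a 350 GB minimum and a 420 GB provisioned partition; compression (below) shrinks the real requirement further.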

Grow your partition to that size, or reconfigure syslog to use another, larger partition on the system and set up logrotate to rotate the logs accordingly. logrotate can also compress the logs, which is probably a good idea and will save you quite a bit of storage, so you can factor that into your calculation. But the bottom line is: you need to keep that partition large enough to handle the incoming data rate and to meet your log retention requirements.
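
For the logrotate side, a minimal sketch of what such a policy could look like (the path, rotation count, and schedule are assumptions; adapt them to your syslog daemon and retention requirement):

# /etc/logrotate.d/syslog-generic  -- illustrative only
/opt/syslogs/generic/*/*.log {
    daily
    rotate 7          # keep 7 rotated copies, matching the retention decision above
    compress          # gzip rotated files to save space
    delaycompress     # leave the newest rotation uncompressed so Splunk can finish reading it
    missingok
    notifempty
    sharedscripts
}

delaycompress matters here: it avoids compressing a file that Splunk may still be tailing.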

1) Do I need to change any configuration inside /opt/splunk/etc/apps/local/inputs.conf?

That is not the right path to your inputs.conf file. It's probably either in /opt/splunk/etc/apps/someapp/local/inputs.conf or /opt/splunk/etc/system/local/inputs.conf

If you change the location you are writing the syslog files, then yes you'll need to update this config. If you just grow the partition, you probably don't.

2) Can I move the files to some other location on the same server?

If Splunk has already indexed the files, then yes, move them however you like; the rotated files could be moved off to save yourself the space. I would not move the active log files. If you are talking about sending the log data to some other partition, that's fine: you could reconfigure syslog to write to a larger partition. Splunk doesn't really care where on the local system the files sit, as long as you tell it where to find them (in the inputs.conf file).
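
If you do point syslog at a larger partition, the matching monitor stanza is straightforward. A sketch, assuming a hypothetical new location of /data/syslogs/generic (the index and host_segment values here are illustrative, not taken from this thread):

# e.g. /opt/splunk/etc/system/local/inputs.conf  -- illustrative only
[monitor:///data/syslogs/generic]
sourcetype = syslog
host_segment = 4        # take the host name from the 4th path segment: /data/syslogs/generic/<host>/...
index = main
disabled = false

After changing the monitored path, restart splunkd (or reload the input) so the new stanza takes effect.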


martin_mueller
SplunkTrust

If /opt/syslogs/ is running out of space, you need to talk to the person in charge of filling up /opt/syslogs/; that's rarely Splunk.
