Solved: Re: Universal Forwarder best setting in Linux

zksvc · ‎09-22-2024

Yesterday I started ingesting new servers into my Splunk.

In my case, in /opt/splunkforwarder/etc/system/local/inputs.conf, I use settings like this:

[monitor:///var/log/]
disabled = false
index = <NewIndex>

[monitor:///home/*/.bash_history]
disabled = false
index = <NewIndex>
sourcetype = bash_history

I ingest from 6 Ubuntu servers. In the first 4 hours I got a lot of data, about 1 GB (I was shocked because it was only 4 hours), but after 2 days the total was only 4.88 GB.

My understanding is that in the first 4 hours it maybe read all the old data from .bash_history and /var/log, because when I check on the indexer it says Earliest Event = 15 years ago.

My question: is this normal, or do I need to change something in my inputs.conf?

~Thanks

PickleRick · ‎09-22-2024

Yes, it is perfectly normal. By default Splunk reads all it can from the specified input file(s), then keeps track of how much it has already read and only reads newly written entries. Nothing to worry about.

I have two issues with your inputs. One is that monitoring .bash_history alone makes relatively little sense (the things you want to find are usually pretty easy to keep from being written to bash history).

Another is that ingesting all of /var/log with a single sourcetype will end in a horrible mess, since you have many different kinds of logs there.
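
If the initial backfill of old files ever does become a concern, one option is the ignoreOlderThan setting on the monitor input. A minimal sketch, assuming a 7-day cutoff (the value is only an illustration; the index name is the placeholder from the original post). The cutoff is evaluated against each file's modification time, not the timestamps of the events inside it, so a recently modified file that contains old events is still read. Check the inputs.conf spec for how a skipped file is treated if it is updated later.

```ini
[monitor:///var/log/]
disabled = false
index = <NewIndex>
# Assumption: skip files whose modification time is older than 7 days.
# This looks at file mtime only; old events inside a recently
# modified file are still indexed.
ignoreOlderThan = 7d
```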

zksvc · ‎09-22-2024

Thank you for the information. The reason I monitor /var/log is that I want to ingest everything in that directory, and Splunk does it perfectly. The sourcetypes will get Splunk's default names, but that is okay.

As for .bash_history, I added it because it was requested.

Once again, thank you, sir. Now I no longer need to worry about the size of this new index.

PickleRick · ‎09-23-2024

Yes, but in /var/log there are many different kinds of files (and typically even many different kinds of events within some files) and each of them should be parsed differently. If you just ingest all of them into one big "sack", you will most definitely lose at least some info (like properly parsed timestamps on some events) and you will not have properly parsed fields for many of those events.

So if you have, for example, /var/log/exim/main.log, you should ingest it separately with the exim_main sourcetype (and reject.log should have its own input stanza with the exim_reject sourcetype). Apache httpd access logs should be ingested separately with one of the access_* sourcetypes, depending on your Apache configuration.

And so on.

If you just pull everything with one generic sourcetype... well, you can do a full-text search but not much more. You're losing a lot of functionality.
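
The per-file approach described above could look roughly like this. It is a sketch, not a drop-in config: only the main.log path comes from the post, the reject.log and Apache paths are assumptions, and access_combined is only appropriate if Apache writes the combined log format.

```ini
# Path taken from the post above.
[monitor:///var/log/exim/main.log]
disabled = false
index = <NewIndex>
sourcetype = exim_main

# Assumed path: reject.log in the same directory as main.log.
[monitor:///var/log/exim/reject.log]
disabled = false
index = <NewIndex>
sourcetype = exim_reject

# Assumed path (Debian/Ubuntu default) and assumed combined log format.
[monitor:///var/log/apache2/access.log]
disabled = false
index = <NewIndex>
sourcetype = access_combined
```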

zksvc · ‎09-23-2024

Hi @PickleRick
Following your information yesterday that my inputs.conf would make a mess of the sourcetypes, I assessed all the sourcetypes that were generated, as seen on my search head.

Could you please correct my inputs.conf? Here it is:

[monitor:///var/log/audit/audit.log]
disabled = false
index = NewIndex
sourcetype = linux_audit

[monitor:///var/log/auth.log]
disabled = false
index = NewIndex
sourcetype = auth-too_small

[monitor:///var/log/cron]
disabled = false
index = NewIndex
sourcetype = kern-too_small

[monitor:///var/log/kern.log]
disabled = false
index = NewIndex
sourcetype = kern-too_small

[monitor:///var/log/messages]
disabled = false
index = NewIndex
sourcetype = syslog

[monitor:///var/log/mongodb/mongod.log]
disabled = false
index = NewIndex
sourcetype = mongod-2

[monitor:///var/log/nginx/access.log]
disabled = false
index = NewIndex
sourcetype = access_combined

[monitor:///var/log/nginx/error-NewIndex-fe.log]
disabled = false
index = NewIndex
sourcetype = error-NewIndex-fe-too_small

[monitor:///var/log/nginx/jm-click-fe.log]
disabled = false
index = NewIndex
sourcetype = jm-click-fe-too_small

[monitor:///var/log/nginx/NewIndex-ess-http-3001.log]
disabled = false
index = NewIndex
sourcetype = NewIndex-ess-http-too_small

[monitor:///var/log/nginx/NewIndex-ess-pakta-http.log.1]
disabled = false
index = NewIndex
sourcetype = NewIndex-ess-http-too_small

[monitor:///var/log/nginx/NewIndex-jmpd-http.log]
disabled = false
index = NewIndex
sourcetype = access_combined

[monitor:///var/log/nginx/NewIndex-be.log]
disabled = false
index = NewIndex
sourcetype = access_combined

[monitor:///var/log/nginx/NewIndex-cms-be.log]
disabled = false
index = NewIndex
sourcetype = NewIndex-cms-be-too_small

[monitor:///var/log/redis/redis-server.log]
disabled = false
index = NewIndex
sourcetype = redis-server-too_small

[monitor:///var/log/sssd/sssd_NewIndex.co.id.log]
disabled = false
index = NewIndex
sourcetype = sssd_NewIndex.co.id-too_small

[monitor:///var/log/syslog]
disabled = false
index = NewIndex
sourcetype = syslog

[monitor:///var/log/ubuntu-advantage-timer.log]
disabled = false
index = NewIndex
sourcetype = ubuntu-advantage-timer.log-3

[monitor:///var/log/ubuntu-advantage.log]
disabled = false
index = NewIndex
sourcetype = ubuntu-advantage-6

[monitor:///var/log/ufw.log]
disabled = false
index = NewIndex
sourcetype = syslog

[monitor:///var/log/unattended-upgrades/unattended-upgrades.log]
disabled = false
index = NewIndex
sourcetype = unattended-upgrades

[monitor:///var/log/vmware-vmtoolsd-root.log]
disabled = false
index = NewIndex
sourcetype = vmware-vmtoolsd-root

[monitor:///home/*/.bash_history]
disabled = false
index = NewIndex
sourcetype = bash_history

Or maybe you have a best-practice setting for my case?
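
One possible simplification, offered only as a sketch: the three nginx stanzas above that share sourcetype = access_combined could be collapsed into a single directory monitor with a whitelist. The whitelist is a regular expression matched against the full file path. This assumes those three files really are in a format access_combined can parse, and it would replace the three per-file stanzas rather than sit alongside them.

```ini
# Replaces the separate stanzas for access.log,
# NewIndex-jmpd-http.log and NewIndex-be.log.
[monitor:///var/log/nginx]
disabled = false
index = NewIndex
sourcetype = access_combined
# Only files whose full path matches this regex are monitored.
whitelist = /(access|NewIndex-jmpd-http|NewIndex-be)\.log$
```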

isoutamo · ‎09-28-2024

Hi

Have you looked at e.g. the Splunk Add-on for Unix and Linux https://splunkbase.splunk.com/app/833 to ingest those logs into Splunk? Usually it's best to use a TA, as those do a lot of the needed work, such as making the inputs CIM compliant https://splunkbase.splunk.com/app/1621. Then you can easily use e.g. the InfoSec app https://splunkbase.splunk.com/app/4240 to monitor what is happening in your environment.

The sourcetypes with the suffix -too_small are ones that have no sourcetype definition on the Splunk side. Splunk just generates that name for them. You should do real data onboarding for those files/sources.
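
As a rough illustration of what onboarding one of those -too_small sources might involve, here is a sketch for the Redis log. The sourcetype name redis_server is hypothetical, and the timestamp settings assume lines shaped like "<pid>:<role> 22 Sep 2024 10:15:30.123 * message", so verify them against your actual log lines. The props.conf part belongs where parsing happens (indexer or heavy forwarder), not on the Universal Forwarder.

```ini
# inputs.conf on the forwarder
[monitor:///var/log/redis/redis-server.log]
disabled = false
index = NewIndex
# Hypothetical sourcetype name, replacing redis-server-too_small.
sourcetype = redis_server

# props.conf on the indexer (or heavy forwarder)
[redis_server]
# Treat every line as one event.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
# Assumed layout: "<pid>:<role> 22 Sep 2024 10:15:30.123 * message"
TIME_PREFIX = ^\d+:[A-Z]\s
TIME_FORMAT = %d %b %Y %H:%M:%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 30
```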

One other thing you should check and change if needed: you should never run the UF on those nodes as root. Use some other user such as splunk or splunkfwd. The issue then is that this user doesn't have access to all those logs, which you also need to fix.

r. Ismo

zksvc · ‎09-28-2024

Hi @isoutamo

Thanks for your information. After checking:

- Splunk Add-on for Unix and Linux [Installed]

- Splunk Common Information Model (CIM) [Installed]

- InfoSec App for Splunk [Not Installed]

For the UF issue, there is no problem at all; I can get all the logs I need. It's just that the data I get has messy fields, as in this picture.

I think that's not okay, which is why I created this topic to ask about the problem.

isoutamo · ‎09-29-2024

I agree with @PickleRick that this is quite probably ok. It’s totally dependent on your data.

PickleRick · ‎09-28-2024

This actually looks OK-ish. You probably have some JSON data which gets parsed into those "multilevel" fields.

zksvc · ‎09-29-2024

OK, thank you @isoutamo @PickleRick. At least Splunk can ingest everything. If analysts want specific data, they can extract it with regex.

zksvc · ‎09-23-2024

Thank you for your information. Maybe I will check the latest sourcetypes that Splunk generated by default yesterday, so I can validate the directory paths for inputs.conf.
