Yesterday I onboarded new servers into my Splunk deployment.
In my case, in /opt/splunkforwarder/etc/system/local/inputs.conf, I use settings like this:
[monitor:///var/log/]
disabled = false
index = <NewIndex>
[monitor:///home/*/.bash_history]
disabled = false
index = <NewIndex>
sourcetype = bash_history
I onboarded 6 Ubuntu servers. In the first 4 hours I got about 1 GB of data (I was shocked, because it was only 4 hours), but after 2 days the total was only 4.88 GB.
My understanding is that in the first 4 hours it probably read all the old data already sitting in .bash_history and /var/log, because when I check on the indexer it says Earliest Event = 15 years ago.
My question: is this normal, or do I need to change something in my inputs.conf?
~Thanks
Yes, it is perfectly normal. By default Splunk reads everything it can from the specified input file(s), then keeps track of how much it has already read and only reads newly written entries. Nothing to worry about.
I have two issues with your inputs. One is that monitoring .bash_history alone makes relatively little sense (it is usually pretty easy to keep the things you want to find from being written to bash history).
Another is that ingesting all of /var/log with a single sourcetype will end in a horrible mess, since you have many different kinds of logs there.
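For anyone who does want to limit that initial backfill, inputs.conf has an ignoreOlderThan setting for monitor stanzas. A minimal sketch, assuming a 7-day cutoff (the value is only an illustration, not something recommended in this thread):

```ini
# Sketch only: the 7d threshold is an assumed example value.
[monitor:///var/log/]
disabled = false
index = <NewIndex>
# Skip files whose modification time is older than 7 days.
# This checks the file's modification time, not event timestamps,
# so a file that is still being written to is read from its beginning.
ignoreOlderThan = 7d
```

Because the check is per file, this mainly keeps long-untouched rotated logs out of the index. It would not trim old entries from a .bash_history file that is still in use.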
Thank you for the information. The reason I monitor /var/log is that I want to ingest everything under it, and Splunk does that perfectly. The sourcetypes get default names assigned by Splunk, but that's okay.
As for .bash_history, I added that input because it was requested.
Once again, thanks sir. Now I no longer need to worry about the size of this new index.
Yes, but in /var/log there are many different kinds of files (and typically even many different kinds of events within some files) and each of them should be parsed differently. If you just ingest all of them into one big "sack", you will most definitely lose at least some info (like properly parsed timestamps on some events) and you will not have properly parsed fields for many of those events.
So if you have - for example - /var/log/exim/main.log, you should ingest it separately with the exim_main sourcetype (and reject.log should have its own input stanza with the exim_reject sourcetype). Apache httpd access logs should be ingested separately with one of the access_* sourcetypes, depending on your Apache configuration.
And so on.
If you just pull everything with one generic sourcetype... well, you can do a full-text search but not much more. You're losing a lot of functionality.
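As a rough sketch of what that looks like in inputs.conf: the exim sourcetypes are the ones named above, while the reject.log directory, the Apache log path, and the choice of access_combined are assumptions for illustration.

```ini
# Sketch only: one stanza per log type, each with an explicit sourcetype.
[monitor:///var/log/exim/main.log]
disabled = false
index = <NewIndex>
sourcetype = exim_main

# Assumes reject.log lives in the same directory as main.log.
[monitor:///var/log/exim/reject.log]
disabled = false
index = <NewIndex>
sourcetype = exim_reject

# Assumed Debian/Ubuntu path; use the access_* sourcetype that matches
# the LogFormat configured in Apache.
[monitor:///var/log/apache2/access.log]
disabled = false
index = <NewIndex>
sourcetype = access_combined
```

With an explicit sourcetype on each stanza, timestamp recognition and field extraction follow the rules for that log format instead of whatever Splunk guesses per file.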
Hi @PickleRick
Following your point yesterday that my inputs.conf would make a mess of the sourcetypes, I went through all the sourcetypes that were generated, as seen on my search head.
Could you please correct my inputs.conf? Here it is:
[monitor:///var/log/audit/audit.log]
disabled = false
index = NewIndex
sourcetype = linux_audit
[monitor:///var/log/auth.log]
disabled = false
index = NewIndex
sourcetype = auth-too_small
[monitor:///var/log/cron]
disabled = false
index = NewIndex
sourcetype = kern-too_small
[monitor:///var/log/kern.log]
disabled = false
index = NewIndex
sourcetype = kern-too_small
[monitor:///var/log/messages]
disabled = false
index = NewIndex
sourcetype = syslog
[monitor:///var/log/mongodb/mongod.log]
disabled = false
index = NewIndex
sourcetype = mongod-2
[monitor:///var/log/nginx/access.log]
disabled = false
index = NewIndex
sourcetype = access_combined
[monitor:///var/log/nginx/error-NewIndex-fe.log]
disabled = false
index = NewIndex
sourcetype = error-NewIndex-fe-too_small
[monitor:///var/log/nginx/jm-click-fe.log]
disabled = false
index = NewIndex
sourcetype = jm-click-fe-too_small
[monitor:///var/log/nginx/NewIndex-ess-http-3001.log]
disabled = false
index = NewIndex
sourcetype = NewIndex-ess-http-too_small
[monitor:///var/log/nginx/NewIndex-ess-pakta-http.log.1]
disabled = false
index = NewIndex
sourcetype = NewIndex-ess-http-too_small
[monitor:///var/log/nginx/NewIndex-jmpd-http.log]
disabled = false
index = NewIndex
sourcetype = access_combined
[monitor:///var/log/nginx/NewIndex-be.log]
disabled = false
index = NewIndex
sourcetype = access_combined
[monitor:///var/log/nginx/NewIndex-cms-be.log]
disabled = false
index = NewIndex
sourcetype = NewIndex-cms-be-too_small
[monitor:///var/log/redis/redis-server.log]
disabled = false
index = NewIndex
sourcetype = redis-server-too_small
[monitor:///var/log/sssd/sssd_NewIndex.co.id.log]
disabled = false
index = NewIndex
sourcetype = sssd_NewIndex.co.id-too_small
[monitor:///var/log/syslog]
disabled = false
index = NewIndex
sourcetype = syslog
[monitor:///var/log/ubuntu-advantage-timer.log]
disabled = false
index = NewIndex
sourcetype = ubuntu-advantage-timer.log-3
[monitor:///var/log/ubuntu-advantage.log]
disabled = false
index = NewIndex
sourcetype = ubuntu-advantage-6
[monitor:///var/log/ufw.log]
disabled = false
index = NewIndex
sourcetype = syslog
[monitor:///var/log/unattended-upgrades/unattended-upgrades.log]
disabled = false
index = NewIndex
sourcetype = unattended-upgrades
[monitor:///var/log/vmware-vmtoolsd-root.log]
disabled = false
index = NewIndex
sourcetype = vmware-vmtoolsd-root
[monitor:///home/*/.bash_history]
disabled = false
index = NewIndex
sourcetype = bash_history
Or maybe you have a best-practice configuration for my case?
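One thing that stands out in the stanzas above is the set of sourcetypes ending in -too_small. Names like these are typically what Splunk assigns on its own when a file has too few lines to classify, so they are placeholders rather than real parsing rules. A minimal sketch of replacing two of them with explicit sourcetypes follows; the specific choices are assumptions and should be checked against the actual log formats:

```ini
# Sketch only: the sourcetype choices below are assumptions, not a reviewed config.
# linux_secure is a Splunk pretrained sourcetype commonly used for auth logs.
[monitor:///var/log/auth.log]
disabled = false
index = NewIndex
sourcetype = linux_secure

# kern.log on Ubuntu is normally written in syslog format, so the generic
# syslog sourcetype is assumed here in place of kern-too_small.
[monitor:///var/log/kern.log]
disabled = false
index = NewIndex
sourcetype = syslog
```

The same review applies to the application-specific nginx, redis, and sssd logs: each needs a sourcetype that matches its real format, which may mean defining a custom one in props.conf.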
Hi @isoutamo
Thanks for the information. After checking, this is what I have:
- Splunk Add-on for Unix and Linux [Installed]
- Splunk Common Information Model (CIM) [Installed]
- InfoSec App for Splunk [Not Installed]
There is no problem at all on the UF side; I can get all the logs I need. It's just that the data I get has messy fields, as in this picture.
I think that's not okay, which is why I created this topic to ask about the problem.
This actually looks OK-ish. You probably have some json data which gets parsed into those "multilevel" fields.
Okay, thank you @isoutamo @PickleRick. At least Splunk can ingest everything. If analysts want specific data, they can extract it with regex.
Thank you for the information. I will probably check the latest sourcetypes that Splunk generated by default yesterday, so I can validate the directory paths for inputs.conf.