I have an odd issue that seems to be resolved, but I would like to understand its root cause. I inherited a Splunk configuration with the following stanza in inputs.conf:

[monitor:///var/log/messages*]
sourcetype = syslog
index = os
disabled = 0
When I run ls -l on /var/log/messages* I get the following:

-rw-------. 1 root root 7520499 Sep 23 07:15 messages
-rw-------. 1 root root 4795535 Aug 28 01:45 messages-20220828
-rw-------. 1 root root 6636499 Sep  4 01:42 messages-20220904
...
When I run an SPL search on any of the possible sources (since the stanza uses "*"), I get no results except for source=messages. I get no results for source=messages-20220828, even if I extend the search window to earliest=-365d.
When rsyslog rotated the messages log file this past week, at about 2 AM on Saturday, Splunk stopped indexing the messages log file. The messages file kept being written to by Linux, so that side seems to be working as expected. The last log entry Splunk recorded was:

_time = 2022-09-18 01:46:40
_raw = Sep 18 01:46:40 ba-dev-web rsyslogd: [origin software="rsyslogd" swVersion="8.24.0-57.el7_9.3" x-pid="1899" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
I restarted the splunkforwarder service on the affected server; this fixed the issue and Splunk started indexing the messages log entries again.
To attempt a permanent solution, because manually restarting the forwarder is not an adequate fix, I created the stanza below:

[monitor:///var/log/messages]
index = test
disabled = 0
I do not believe I need the "*" because: 1) the messages* sources are not being indexed by Splunk anyway (only source=messages), so why use "*"; and 2) we do not need to index the rotated messages backup files.
When I came to work today, 18 hours after the "fix" (the forwarder restart), my stanza is still working and indexing log entries as expected, but the previous one, [monitor:///var/log/messages*], no longer indexes anything.
Using the working stanza, I determined that the last entries before Splunk stopped indexing were (first column is _time, second column is _raw):

2022-09-22 14:03:38  Sep 22 14:03:38 ba-prod-web audisp-remote: queue is full - dropping event
2022-09-22 14:03:38  Sep 22 14:03:38 ba-prod-web systemd: Stopped Systemd service file for Splunk, generated by 'splunk enable boot-start'.
2022-09-22 14:03:38  Sep 22 14:03:38 ba-prod-web systemd: Stopping Systemd service file for Splunk, generated by 'splunk enable boot-start'...
2022-09-22 14:03:38  Sep 22 14:03:38 ba-prod-web splunk: Dying on signal #15 (si_code=0), sent by PID 1 (UID 0)
2022-09-22 14:03:38  Sep 22 14:03:38 ba-qa-web audisp-remote: queue is full - dropping event
2022-09-22 14:03:37  Sep 22 14:03:37 ba-qa-web audisp-remote: queue is full - dropping event
2022-09-22 14:03:36  Sep 22 14:03:36 ba-qa-web audisp-remote: queue is full - dropping event
The last entry for the stanza that stopped working was: 2022-09-22 14:03:37 Sep 22 14:03:37 ba-qa-web audisp-remote: queue is full - dropping event
All the other monitor and scripted inputs on that server are working, except the one above. The forwarder version is 7.2.3. I run other forwarders on this version that index messages log entries and work as expected. The stanza I used was copied and pasted from the Splunk_TA_nix add-on (except I removed the other log files and kept just messages), so IMO it should follow best practices.
I have a few questions: 1. What might be the reason the stanza with "*" no longer works while the one without it does? 2. Am I correct to believe that we do not need the stanza with "*", and what consequences of dropping it might I not be aware of? 3. Why would PID 1 (systemd, running as UID 0, i.e. root) kill Splunk? (I believe this is why Splunk stopped indexing the messages log file the second time.) 4. Any insights into this issue would be greatly appreciated. As far as I know right now, using my stanza should be good practice if we do not need the rotated messages files, but I am concerned I am missing something.
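For reference, here is a minimal sketch of the two inputs.conf options discussed above, using the index names from the post (adjust to your environment). The blacklist regex is an assumption about the rotation naming (messages-YYYYMMDD); blacklist itself is a standard monitor-stanza setting. Use one stanza or the other, not both:

```
# Option 1: monitor only the live file (no wildcard), as in the new stanza.
[monitor:///var/log/messages]
sourcetype = syslog
index = os
disabled = 0

# Option 2: keep the wildcard but explicitly skip the rotated copies.
# The regex assumes rotated files are named messages-YYYYMMDD.
[monitor:///var/log/messages*]
sourcetype = syslog
index = os
disabled = 0
blacklist = messages-\d{8}$
```

With either option the forwarder no longer tails the rotated backups, which matches the stated requirement that only the live messages file needs to be indexed.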
How would I be able to use those results with an append or join to filter out the alert timeframes? This is mainly what I want: to iterate through the start and end _time values (similar to what you suggested), but I do not know how to filter by those results once I have them. For example, if I do not know how many "start of maintenance" and "end of maintenance" entries are in the search results, how will I be able to use append or join? I will not know, because the query will be in a dashboard and the user might select a time frame that contains several starts/ends of maintenance, or none at all. Any help is appreciated, as I am breaking my head over this.
Hello, I have a log file that admins write to when they start or stop server maintenance. It is then used to silence email alerts so admins do not get alert emails while they are doing server maintenance. When an admin starts server maintenance, they write "start of maintenance..." into a specific log file (the source). When they finish, they write "end of maintenance..." to that same file.
However, since the email alerts reset themselves after a period (4 hours) following the "start of maintenance..." entry that Splunk reads, some admins will "forget" to write the "end of maintenance..." entry to this file.
Task: I need each "start of maintenance..." entry to have a corresponding "end of maintenance..." entry. If I only have a "start of maintenance...", then I must use SPL to insert an event that has "end of maintenance..." and whose _time (or another time-related field) is the "start of maintenance..." _time plus 4 hours. For example, if the "start of maintenance..." _time is 2022/08/05 16:00:00, then I must create an event with a _time (or time field) of 2022/08/05 20:00:00. If there is a corresponding "end of maintenance..." within 4 hours of the "start of maintenance...", then I should do nothing.
My ultimate goal is to create a dashboard with results filtered by the "start of maintenance..." _time and the "end of maintenance..." _time, but to do this I first have to make sure I have both "start of maintenance..." and "end of maintenance..." time values.
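One possible sketch of the pairing step, assuming the index and source names from this thread (index=bizapps, /var/log/bizapps_maintenance.log): transaction with keepevicted=true groups each start with its end when one arrives within 4 hours, and marks unclosed groups with closed_txn=0 so a synthetic end time of start + 4 hours (14400 seconds) can be computed:

```
index=bizapps source="/var/log/bizapps_maintenance.log"
    ("start of maintenance" OR "end of maintenance")
| transaction startswith="start of maintenance" endswith="end of maintenance"
    maxspan=4h keepevicted=true
| eval start_time=_time
| eval end_time=if(closed_txn=1, _time + duration, _time + 14400)
| table start_time end_time
```

This is a sketch, not a drop-in answer: transaction sets _time to the start of each group and duration to its span, so closed groups get their real end time while evicted (end-less) groups fall back to the 4-hour cutoff.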
Hello, this is the first time I have posted here, but I have learned a lot from this website just through Google searches.
Situation: At work, server admins asked if I could "silence" Splunk email alerts while they were doing maintenance, so that they do not get error emails during server maintenance. I did this by creating a maintenance.log in the /var/log/ folder that Splunk monitors. If the admins write "start of maintenance...", then any alert that monitors these logs stops sending emails. When the admins write "end of maintenance...", Splunk knows it can start sending emails again, since the maintenance period is complete. This was useful for silencing Apache access log alerts that fired during maintenance: the admins did not get alerts for entries written to the Apache access log between the _time of "start of maintenance..." and the _time of "end of maintenance...".
Task: I have to show search results in a dashboard that exclude anything reported during a maintenance period. This means any results between the _time of "start of maintenance..." and the _time of "end of maintenance..." should not be included. Moreover, maintenance may happen several times within the search window: for example, maintenance might be done twice in one day, or a user searching over, say, one month might see 3 "start of maintenance..." entries and 3 corresponding "end of maintenance..." entries.
Action: I have written SPL that gets all the results:

earliest=-1d (host="Server-web" source="/var/log/httpd24/error_log") OR (host="Server-Web" index=bizapps source=/var/log/bizapps_maintenance.log)
I am not sure if Splunk SPL can pull this off, but I am confident someone can help me out. If you need more info, let me know.
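One sketch of a state-machine approach that handles any number of maintenance windows without append or join, building on the search above (host/source/index names taken from this post; searchmatch, streamstats, and filldown-style carry-forward via last() are standard SPL): sort everything into time order, mark maintenance boundary events with 1/0, carry the most recent marker forward onto every event, then keep only error_log events that are not inside a window:

```
earliest=-1d (host="Server-web" source="/var/log/httpd24/error_log")
    OR (host="Server-Web" index=bizapps source="/var/log/bizapps_maintenance.log")
| sort 0 _time
| eval marker=case(searchmatch("start of maintenance"), 1,
                   searchmatch("end of maintenance"), 0)
| streamstats last(marker) as in_maintenance
| where (isnull(in_maintenance) OR in_maintenance=0)
      AND source="/var/log/httpd24/error_log"
```

Because the marker state is carried event by event, this works whether the selected time range contains zero, one, or many start/end pairs; it does assume the boundary entries and the error_log entries interleave correctly in time order.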
When I type the following at the command line (cmd, not PowerShell): splunk search "*" -maxout 0 | find /c /v "" I get about 195k records back. However, when I filter by one of the sourcetypes from the Splunk free tutorials by typing: splunk search "*" 'sourcetype=e' -maxout 0 | find /c /v "" I still get the same 195k records. The | find /c /v "" part should return only the line count, so filtering by that sourcetype should return around 165k records, which is the number of records for that sourcetype currently in my Splunk db. Can someone help me correct my syntax? I tried Google searches, but none gave me an example, which I think is because I do not know how the syntax should work in the first place. I am using the Windows cmd, and I plan to keep using the cmd rather than the web interface, thanks for understanding.
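A likely cause, sketched here as an assumption: on Windows cmd, single quotes are not quote characters, so 'sourcetype=e' is passed as a separate literal argument rather than being merged into the search, and the CLI effectively still runs the "*" search. The whole SPL query needs to be inside one pair of double quotes (the sourcetype value "e" here is taken from the post; substitute your own):

```
:: Put the entire query in one double-quoted argument.
splunk search "sourcetype=e" -maxout 0 | find /c /v ""

:: Alternative: let Splunk count instead of counting output lines.
splunk search "sourcetype=e | stats count"
```

The second form avoids the line-count pipe entirely by having SPL's stats count return the number of matching events directly.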