We have the following search:
index="app_foo_internal" source="*Log-Srv-1*" | rex ",(?<TransactionTime>\d+)$" | Where TransactionTime > 90000
This index contains data from only the 2 hosts that make up "Log-Srv-1".
We are now getting alerts per event, but the problem is that sometimes a single email alert contains a bunch of events, and other times just the one event (as expected). When we get a clump of events in one email alert, most of them are below the transaction timeout of 90000.
In the event where we get a clump of events within one email, the event we want is always at the bottom of the list of events.
This issue is sporadic.
Do we have some sort of contention with both hosts having similar timestamps? There are a lot of logs on both hosts, so perhaps it's too much volume?
It should be just the one event per email. Not between 1 & 100.
Any ideas?
Do you have SHOULD_LINEMERGE = false
and a line breaker defined on the sourcetype?
My guess is you're getting a multiline event from time to time.
Can you post the props.conf and transforms.conf settings related to the sourcetype if the above doesn't help?
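For reference, a minimal props.conf sketch of what I mean (I'm assuming a sourcetype stanza name of [AppInternal] here, and the LINE_BREAKER regex is just the common single-line pattern, so adjust both to your setup):

[AppInternal]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

With SHOULD_LINEMERGE = false and a line breaker that matches your event boundaries, each log line is indexed as its own event instead of being merged into a multiline event.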
Sorry. Stupid question.
Here is the props.conf:
[AppInternal]
TRANSFORMS-null= Appsetnull
transforms.conf (I just added the new line you recommended):
#Discard all events between 6pm - 6am
[Appsetnull]
REGEX = (?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(18|19|20|21|22|23|00|01|02|03|04|05):
DEST_KEY = queue
FORMAT = nullQueue
SHOULD_LINEMERGE = false
When I do this, I get the following when starting up Splunk:
Checking conf files for problems...
Invalid key in stanza [Appsetnull] in C:\Program Files\Splunk\etc\system\local\transforms.conf, line 128: SHOULD_LINEMERGE (value: false).
Your indexes and inputs configurations are not internally consistent. For more information, run 'splunk btool check --debug'
Done
Wow. Brain fart evening for me.
I read the article and this should be configured in props.conf. Not transforms.conf.
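For anyone hitting the same thing, the corrected layout looks like this (a sketch based on the stanzas posted above — SHOULD_LINEMERGE belongs in props.conf under the sourcetype stanza, and transforms.conf keeps only the nullQueue transform):

props.conf:

[AppInternal]
TRANSFORMS-null = Appsetnull
SHOULD_LINEMERGE = false

transforms.conf:

#Discard all events between 6pm - 6am
[Appsetnull]
REGEX = (?:\d+\/\d+\/\d+|\d+-\d+-\d+)\s(18|19|20|21|22|23|00|01|02|03|04|05):
DEST_KEY = queue
FORMAT = nullQueue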
Now Splunk starts up ok. We'll see how things go on Monday when business jumps back into the application and we start seeing logs again.
Thanks!
So far everything looks perfect!
You da man jkat54! 🙂
As is often the case, it just takes one person looking from another angle. Happy to help.
Cheers,
jkat54
Hi @agoktas
Don't forget to resolve this post by clicking "Accept" directly below @jkat54's answer, and upvote the answer/comments that helped you with your issue.
Should this be configured in the inputs.conf on the Splunk UF where the sourcetype is defined?
Or on the indexer?
Thanks!
Hi @agoktas,
Can you go into the savedsearches.conf file and check what your "alert.digest_mode" settings are? It sounds like it might be set to "true", meaning that you get one alert per result set. "True" is the default setting. You can switch it to "false" to change this.
See: http://docs.splunk.com/Documentation/Splunk/6.3.1/Admin/Savedsearchesconf
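For example, in savedsearches.conf (the stanza name here is just a placeholder — use whatever your saved search is called):

[Your_Alert_Name]
alert.digest_mode = false

With digest mode off, each matching result triggers its own alert email instead of all results being batched into one digest email.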
Another couple of things to check: how is your alert configured to trigger? Do you have any throttling set up?
Let me know if this helps! We can continue discussing the alert configuration.
Best,
@frobinson_splunk
Does it matter if both hosts define the same sourcetype for the same log?
I'm looking into this. Could you let me know more about your alert configuration? Do you have any trigger condition set up?
From savedsearches.conf
[Job_TransactionTime]
action.email = 1
action.email.inline = 1
action.email.sendresults = 1
action.email.to = email1@company.com,email2@company.com
alert.digest_mode = False
alert.expires = 7d
alert.severity = 5
alert.suppress = 0
alert.track = 0
auto_summarize.dispatch.earliest_time = -1d@h
cron_schedule = */15 * * * *
description = High Transaction Time (>90000ms) in environment for AppInternal
dispatch.earliest_time = -15m
dispatch.latest_time = now
enableSched = 1
search = index="app_foo_internal" source="*Log-Srv-1*" | rex ",(?<TransactionTime>\d+)$" | Where TransactionTime > 90000
Thanks for posting this! I'm taking a look.
I think that it's possible that your query is trying to do the work of indicating a trigger condition (if I understand correctly). It sounds like the condition for which you want to be alerted is when transaction time > 90000.
If that is correct, I would suggest making the last portion or phrase of your query into a custom trigger condition for the alert, rather than including it in the query. When you put the trigger condition into the query, it affects the way search results are evaluated.
It looks like your alert is configured in savedsearches.conf. See:
http://docs.splunk.com/Documentation/Splunk/6.3.1/Admin/Savedsearchesconf
and look for this setting:
"alert_condition =
* Contains a conditional search that is evaluated against the results of the
saved search. Alerts are triggered if the specified search yields a
non-empty search result list.
* NOTE: If you specify an alert_condition, do not set counttype, relation, or
quantity.
* Defaults to an empty string."
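As a sketch of what that separation might look like in savedsearches.conf (reusing the Job_TransactionTime stanza and search from this thread; the exact alert_condition syntax is my assumption based on the docs excerpt above, so double-check it against the spec):

[Job_TransactionTime]
search = index="app_foo_internal" source="*Log-Srv-1*" | rex ",(?<TransactionTime>\d+)$"
alert_condition = search TransactionTime > 90000

The base search extracts TransactionTime for every event, and the conditional search is then evaluated against those results to decide whether the alert fires.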
Can you let me know if establishing the trigger condition separately from the query helps?
Thanks,
@frobinson_splunk
Nothing fancy.
Simple search: index="app_foo_internal" source="*Log-Srv-1*" | rex ",(?<TransactionTime>\d+)$" | Where TransactionTime > 90000
Start time: -15m
Finish time: now
Schedule type: cron: */15 * * * *
Schedule Window: 0
Alert condition: Always
Alert actions: Email
Hi frobinson,
Yes, the following is already set:
alert.digest_mode = False
For the particular saved search.
No throttling is set up. "After triggering the alert, don't trigger it again for" is unchecked.