Good day,
I'm having an issue with an email dashboard I'm attempting to create in Splunk. This dashboard filters on the various email header fields such as sender, recipient, subject, etc. One of these fields is the attachments field. The issue is that there is *always* a sender, recipient, and subject, but not all emails have attachments, nor do I always want to filter by them.
In the dashboard, I'm using a text field with a default value of '*'. The problem with this is shown in the extract below.
index=email source=/var/email_0.log attachments=$file$ OR sha256=$hash$
This search will find all emails with attachments, but filter out emails without any. However, what if I want to search for an email just by its subject while ignoring attachments? I'd love to be able to change the dashboard so that filtering by these fields could be turned on and off, but I haven't found a way to do that. I thought I could use isnotnull(attachments) inside a case() or if() function to test if the field exists, but those expressions don't appear to work in the base search.
Does anyone have any insight into how I could change the search (or dashboard) so that I'm not always filtering by attachments? Perhaps by changing the default values? Or perhaps with the regex command?
You can add a prefix and suffix to the token and use a change handler on your input to reset the token to be empty if the user has not put anything in the text field.
<input type="text" token="attachment">
<label>Attachment search $attachment$</label>
<default></default>
<change>
<eval token="attachment">if($form.attachment$ == "","",$attachment$)</eval>
</change>
<prefix>attachments="</prefix>
<suffix>"</suffix>
</input>
You then use the token in your search
index=email source=/var/email_0.log $attachment$ sha256=$hash$
Thank you, I'll see if I can do this. However, because the other responder had issues understanding my issue, I've placed my response to them below as well in case you had trouble understanding my request even though you've already given me a suggestion on how to rectify it.
--------------------------------------------------------------------------------------------
Got it, I'll try to explain better. This is the actual base search:
(index=email source=/var/logs/esa_0.log attachments=$file$ sha256=$hash$) OR (index=cyber source=/varlogs/fe01.log) (suser="$sender$" OR sender="$sender$") (duser="$recipient$" OR recipient="$recipient$") (subject="$subject$" OR msg="$subject$") (id="'<$email_id$>'" OR message-id="$email_id$") (ReplyAddress="$reply_add$" OR from-header="$reply_add$")
If you look, you'll see the various fields I have set up to filter by: sender, recipient, subject, etc. Part of what I'm doing is actually consolidating email information from two different sourcetypes. That's why I have the various filters being matched against the equivalent field in the other sourcetype. For instance, in this part 'suser="$sender$" OR sender="$sender$" ', it'll filter emails by sender, keeping only the events in both sourcetypes where the sender is somebody@gmail.com, for example. However, the default value for this field (and the rest) is a wildcard * to match everything, so even if I don't fill in a value to filter by, it'll default to that. As a result, the search becomes this ' suser="*" OR sender="*" ' at search time.
You see the problem? With this kind of filter, it *requires* the suser or sender field to be present in the events lest they get filtered out, even though I'm not trying to filter by that. Now, in the case of fields like sender, recipient, subject, and even email_id, this is okay because *every* email has to have these fields. They're not optional.
In the case of email attachments, however, that isn't the case. Not all emails have attachments, therefore not all events have an 'attachments' field. However, because the search ultimately defaults to this ' attachments=* ', it requires them. This is the problem. It makes it impossible to search for emails without attachments. Ideally, I'd love to be able to simply tell Splunk not to filter by that field at all unless I fill it with something that isn't a wildcard, but that doesn't appear to be possible.
Does this clear up any confusion?
Your issue as you have expanded is as I had interpreted it in the first place. Having said that, there could still be issues if you attempt to use my solution with logic operators such as OR. For example
( $choice_token$ OR field="*" )
might give you a parsing error if the choice_token is an empty string, so you might want to consider a superfluous condition such as 1==1 or NOT 1==1 depending on how your search logic should work in this case. Another possibility is that the OR is included in the choice_token. But you would have to work this out depending on what your search is trying to do under the various different scenarios of tokens being used to filter the events.
Of course. I'll experiment and see what I can figure out. I hadn't even considered editing the HTML to achieve my goal, so it's certainly progress. Thank you.
It's not really clear how you want your dashboard to behave. If I have to guess, do you mean that if a user specifies the $file$ value as "thisfile", you want to return emails with an attachment named "thisfile" as well as emails with no attachment? If so, you can tell the search command to do so using OR logic.
index=email source=/var/email_0.log (attachments=$file$ OR NOT attachments=*) OR sha256=$hash$
I think I understand the essence of the challenge. A data analytics solution depends on the characteristics of the data. Can you describe the data further? For example, do the alternative field names each appear in only one of the two sources? In other words, is there a relationship like this?
index=email source=/var/logs/esa_0.log | index=cyber source=/varlogs/fe01.log |
sender, recipient, subject, ... | suser, duser, msg, ... |
Such a relationship can improve the search by avoiding too many ORs, which usually decrease efficiency. On the other hand, even if such a relationship exists, if suser, duser, subject, ... do not always exist in the same event, your search will not satisfy all filters. As @PickleRick says, in that case you will have to sacrifice efficiency and fetch all events, then filter.
However, you have already clarified that, except for attachments, the fields sender, recipient, subject, etc., always exist, and so do suser, duser, msg, and so on. This means you can take advantage of those always-on fields.
Now, to the bottom of the challenge. Yes, you can do that. But you need to change your token strategy a little. For this, we will single out the token for attachments from the rest.
Just to distinguish this token, I call it attachments_tok, and set up Name-Value pairs (Label-Value in Dashboard Studio parlance) like these:
Name | Value |
Any | * |
filename1 | attachments = filename1 |
filename2 | attachments = filename2 |
... |
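In a Simple XML dashboard, the name-value pairs above could be expressed as a dropdown input along these lines. This is only a sketch: the label text and the quoting of the filename values are assumptions, and filename1 and filename2 are the placeholders from the table, not real attachment names.

```xml
<!-- Hypothetical dropdown for attachments_tok. Each choice carries a complete
     search term, so the "Any" choice can drop the attachments filter entirely. -->
<input type="dropdown" token="attachments_tok">
  <label>Attachment</label>
  <!-- "Any": a bare wildcard matches every event, with or without attachments -->
  <choice value="*">Any</choice>
  <!-- Placeholder filenames; the value includes the field name, not just the filename -->
  <choice value="attachments=&quot;filename1&quot;">filename1</choice>
  <choice value="attachments=&quot;filename2&quot;">filename2</choice>
  <default>*</default>
</input>
```

The point of the design is that the field name lives inside the token value rather than in the search string, so selecting Any leaves a bare * in the search instead of attachments=*.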
Once attachments_tok is set up, reorganize the search like this:
(index=email source=/var/logs/esa_0.log ($attachments_tok$) sha256=$hash$
sender="$sender$" recipient="$recipient$" subject="$subject$" message-id="$email_id$" from-header="$reply_add$")
OR (index=cyber source=/varlogs/fe01.log suser="$sender$" duser="$recipient$" msg="$subject$" id="'<$email_id$>'" ReplyAddress="$reply_add$")
Hope this helps.
The very ugly solution would be to search for the "initial" results, then do fillnull, and then search for particular values.
But.
That would be hopelessly inefficient because you'd need to dig through all events each time you run your search.
If the search is meant to be run relatively often, you could think about summary indexing and transforming your data so that it contains some default "non-present" entry.
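For illustration, the fillnull approach might look roughly like the sketch below, using the simplified search from the original question. The placeholder value "none" is an assumption, and the sha256 and other filters are left out for brevity.

```xml
<!-- Sketch only: fetch the events first, give events without attachments a
     placeholder value, then apply the attachments filter afterwards. -->
<search>
  <query>
    index=email source=/var/email_0.log
    | fillnull value="none" attachments
    | search attachments="$file$"
  </query>
</search>
```

With the default of * for $file$, every event now has an attachments value and passes the filter. The cost is the one described above: the filter runs after all events have been retrieved.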