In the past, I have used SEDCMD statements in my props.conf to remove text and whole lines from events so they would not clog our indexes and clutter the search results with extraneous text.
However, today I tried to set up a SEDCMD string to remove part of a line in our IIS logs. It appeared to work, but the actual effect is that it only removed the text from the regular search result. If you expand the event, where you can see the 'Event Actions' button and all the fields, the string we were trying to remove is still available and searchable (it still shows in the Interesting Fields).
Here is my script:
SEDCMD-Auth = s/Basic+.*//
I am trying to remove the content of the Authorization field that has been added to our IIS logs, along with everything else to the end of the line after that field.
This is a script I used earlier to remove a line from a Windows Event Log, and it is working fine:
SEDCMD-EventType = s/EventType=4\r\n//
What is the difference (aside from not removing a CR/LF)? What did I do wrong in the first one to cause Splunk to not actually prevent the data from being indexed? These are both in the etc/system/local/props.conf on our indexer, and we only use light forwarders.
You probably have an inputs.conf on your Universal Forwarder like this:
[monitor://C:\inetpub\logs\LogFiles\W3SVC1]
sourcetype=iis
This works because iis is a pre-trained sourcetype, and if you look in $SPLUNK_HOME/etc/system/default/props.conf you will see something that looks like this:
[iis]
...
INDEXED_EXTRACTIONS = w3c
Previous versions of Splunk used CHECK_FOR_HEADER = True, but because the order and number of fields in IIS logs can be modified at any time, Splunk replaced this with the setting INDEXED_EXTRACTIONS = w3c. However, this new feature works very differently and, unfortunately for you, this "easy button" creates index-time fields BEFORE the SEDCMD code runs. The only things you can do are to pre-process the logs before the forwarder gets them, or to revert to the older method of handling sourcetype iis, which happens after SEDCMD runs. Just download a version 6.* Splunk and do it that way (I showed you which file to check earlier in this answer).
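To make the ordering problem concrete, here is a toy model in Python. It is only a sketch and not how Splunk is implemented: the field names, the sample event, and the two helper functions are all made up for illustration, and Python's re module stands in for the SEDCMD regex engine. It shows why a field captured before the substitution still holds the value even though the raw text has been cleaned.

```python
import re

# Hypothetical W3C-style field list and event; names and values are made up.
FIELDS = ["date", "time", "cs_method", "cs_uri_stem", "cs_Authorization"]
RAW = "2019-01-01 12:00:00 GET /index.html Basic+dXNlcjpwYXNz"


def extract_fields(raw):
    """Split a space-delimited event into named fields (stand-in for structured extraction)."""
    return dict(zip(FIELDS, raw.split(" ")))


def apply_sedcmd(raw):
    """Apply the equivalent of s/Basic+.*// to the raw text only."""
    return re.sub(r"Basic+.*", "", raw)


# Order in this toy model: fields are captured first, then the raw text is rewritten.
fields = extract_fields(RAW)
cleaned_raw = apply_sedcmd(RAW)

print(repr(cleaned_raw))           # '2019-01-01 12:00:00 GET /index.html '
print(fields["cs_Authorization"])  # Basic+dXNlcjpwYXNz
```

The raw text no longer contains the credential string, but the field captured earlier still does, which matches the symptom described in the question.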
That is indeed how we are set up. I will test the old way and see what happens.
Thanks for coming back to Accept. How did you end up handling it? There is another new feature called INGEST_EVAL that might be able to help you here. I believe that it can modify index-time values in the manner that you require. Be aware of the difference between INGEST_EVAL = and INGEST_EVAL :=.
On my test system, we went with the older method, which worked, but ultimately I was able to convince my superiors that including the extra data in the log in the first place was a security risk, and we were able to just stop sending the information to the log (it was literally a hashed version of the credentials, and was only being logged because someone felt they had to check a box in a security checklist). As a result, the problem is no more.
But now I am curious to see what INGEST_EVAL can do for us.
Hi @DaClyde,
Have you found an answer to your question yet? Are you using index-time extractions? That could be why the text no longer shows in your raw data but the fields are still saved.
I have not, and yes, we are using index-time extractions. Do you have any suggestions? My confusion is that one SED script is working fine, but the other is not, and they are in the same props.conf on the indexer.
My short-term, security-by-obscurity solution was to create a bogus field alias for the Authorization field. That at least prevents the field from being searchable, but the data still appears in the raw event in the index.
Are you looking for a literal "Basic+" (with the plus sign as a literal character)? If so, try this:
SEDCMD-Auth = s/Basic\+.*//
Ah, that could be it. There is a literal plus sign in the string. I will add that and see what happens.
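For anyone who wants to compare the two patterns before touching props.conf, here is a quick check in Python. It is a sketch under assumptions: Python's re module stands in for the regex engine SEDCMD uses (these basic constructs should behave the same way), and both sample lines are made up. Unescaped, the + makes the preceding c match one or more times and the .* then swallows the plus sign, so on a line that contains a literal "Basic+" both patterns appear to strip the same text. The escaped form is simply stricter about what it will match.

```python
import re

# Hypothetical sample line; the token after "Basic+" is made up.
line = "GET /index.html 200 Basic+dXNlcjpwYXNz Mozilla/5.0"

unescaped = re.sub(r"Basic+.*", "", line)  # "Basi" + one or more "c", then the rest
escaped = re.sub(r"Basic\+.*", "", line)   # "Basic" + a literal "+", then the rest

print(repr(unescaped))  # 'GET /index.html 200 '
print(repr(escaped))    # 'GET /index.html 200 '

# The patterns only differ where "Basic" is not followed by a plus sign.
other = "AuthType=Basic realm"
print(repr(re.sub(r"Basic+.*", "", other)))   # 'AuthType='
print(repr(re.sub(r"Basic\+.*", "", other)))  # 'AuthType=Basic realm'
```

If that holds inside Splunk as well, escaping the plus sign is still worth doing for precision, but it may not be what keeps the value searchable.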