An index receives events that are reviewed by an internal team. Some events need a new status. I thought I could handle this by adding a new field with the eval command and writing the result back to the index as a new event (in order to keep the history) with the collect command:
index=source | ... | eval new_status="a new status" | collect index=source
but the new field is not kept and saved. Is there any workaround for this?
Experiments in a test environment show that I can't create new events based on existing ones by adding new fields and saving them to the same index. It is, however, possible to use collect to "copy" events to another index.
So the question still remains: how do I create a new event based on an existing one by changing a field value (not adding a new field, just changing the value) and save the new event to the same index?
It seems I'll answer my own question: the problem is that I didn't take the _raw field into consideration. All required changes need to be made within it.
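A minimal sketch of that idea, assuming the new status should simply be appended to the raw event text (the field name and index name are taken from the original question):

```
index=source
| eval _raw = _raw . " new_status=\"a new status\""
| collect index=source
```

Because collect writes _raw back into the index, the appended key=value pair is persisted, and new_status can then be extracted automatically at search time.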
Why do you want to index a new field rather than just compute that field at search time? It is MUCH better to do this at search time.
You could create an "app" and add an entry for your source type in props.conf:
[yourSourceType]
EVAL-new_status = "a new status"
and the field should appear automatically when your users search in Splunk. All I'm trying to say is that you don't need to manipulate the source data to do this.
If you need summary indexing, it is NOT good practice to index the _raw events again. Just write the key fields into the summary index.
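For example, a summary-indexing search along these lines keeps only the key fields instead of re-indexing _raw (the summary index name and the fields are illustrative):

```
index=source
| stats count AS event_count BY severity_id, host
| collect index=my_summary
```

Because the search ends in a transforming command, collect stores the aggregated rows rather than copies of the original raw events.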
I found a way to "edit" the events by modifying the _raw field. I agree it's not elegant at all, but it works and, more importantly, the bosses are happy:
index=[index_name] | ... | eval _raw = replace(_raw,"severity_id=\"".$severity_id$."\"", "severity_id=\"".$new_severity_id$."\"") | table _time, _raw | collect index=[index_name]
The method you used works because collect only grabs the selected fields. If you don't use any stats commands, it grabs whatever is in _raw, which is why in your case you had to run a replace command on the _raw field.
Your solution works, but you should try to avoid collect whenever possible. I see two other solutions that could also help:
1- Using calculated fields and keeping the existing field:
With this approach you define a new field that has the value you want; it is computed at search time. This lets you keep both fields: the original one and the new one, which is applied at search time.
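A possible props.conf sketch for this, assuming the original field is severity_id and the replacement value is "ABC" (the stanza name, field names, and values here are all hypothetical):

```
[yourSourceType]
EVAL-new_severity_id = if(severity_id == "abc", "ABC", severity_id)
```

Both severity_id and new_severity_id are then available in searches, and the indexed data is left untouched.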
2- Changing the value of severity_id at index time. As data comes into Splunk, you can apply a SEDCMD to replace the value of severity_id and have the event indexed with the new value. This won't apply to existing data, but it will apply to all newly indexed data. In your case the configuration should look something like this:
[yourSourceType]
SEDCMD-abc = s/severity_id="abc"/severity_id="ABC"/g
You can check the Splunk documentation for more info on how SEDCMD is used.
In that case, go for calculated fields. You can replace the field at search time without modifying the raw data, which means you will be able to run stats and get the new results... collect isn't very reliable, as you might end up with missing data if your scheduled collect search doesn't run 🙂
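For instance, assuming a search-time field named new_severity_id has been defined as a calculated field (the field name is hypothetical), you can report on the corrected value directly without touching the indexed data:

```
index=[index_name]
| stats count BY new_severity_id
```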