I have a heavy search on multiple sources that I want to schedule to populate a summary index. I am basically interested in certain events so I want to populate the summary index with only those events. That way I can run searches on the summary index quickly as opposed to the normal index that contains hundreds of millions of events.
I can populate the summary index like this:
index=windows OR index=linux OR index=something my search | addinfo | collect index=mysummaryindex
This works fine; however, the host field is not saved, so I don't know which host generated the event.
Is there a way to add the host field into the summary index as well? The marker option for collect just adds a certain string field which is not useful in this case.
After further testing, this is my favorite solution.
Just add the following after your base search, and orig_host, orig_sourcetype, orig_source and orig_index will all be in your summary index :-)
| rename _raw as orig_raw
# A much simpler solution that I got from Splunk guru "D" :-)
# It turns out that renaming the _raw field corrects the issue of missing some of the "orig" fields, e.g. orig_sourcetype.
# This approach is probably not as relevant to Splunk 6, which has many automatic acceleration features.
# Note: the "| collect" command is optional; it is not needed if you are using the summary index checkbox in a saved search.
index=other | rename _time as time | rename _raw as raw | stats count by time raw index host sourcetype source | collect index=collect
# I was having trouble recording the raw event and the original host, sourcetype and source fields in a summary index, as they were always overridden with the values of the host that runs the search populating the summary index. Here's one solution.
# Step 1 - populate the summary index
# Search events from an index named "other", prepend the _time, host, sourcetype and source fields to the _raw field with "|" as a delimiter, and put the result into a summary index named "collect".
index=other | eval _raw=_time."|".host."|".sourcetype."|".source."|"._raw | collect index=collect
# Step 2 - read from the summary index named "collect"
# Extract the time, host, sourcetype and source fields that are stashed in the _raw field in the summary index named "collect".
# raw1 takes the rest of the event, so original events that contain "|" are not cut off at the first pipe.
index=collect | rex "(?s)^(?<time1>[^|]+)\|(?<host1>[^|]+)\|(?<sourcetype1>[^|]+)\|(?<source1>[^|]+)\|(?<raw1>.*)"
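The stash-and-extract round trip in the two steps above can be checked outside Splunk. Below is a minimal Python sketch of the same idea, not Splunk code: the helper names `stash` and `unstash` and the sample event are made up for illustration, and it assumes that host, sourcetype and source never contain a "|" themselves. It captures the last field with `.*` so that a raw event containing "|" comes back whole.

```python
import re

# Hypothetical helpers that mirror the eval (step 1) and rex (step 2) above.
# Assumption: time, host, sourcetype and source contain no "|" characters.
STASH_PATTERN = re.compile(
    r"^(?P<time1>[^|]+)\|(?P<host1>[^|]+)\|(?P<sourcetype1>[^|]+)"
    r"\|(?P<source1>[^|]+)\|(?P<raw1>.*)",
    re.DOTALL,  # let raw1 span multi-line events
)


def stash(time, host, sourcetype, source, raw):
    """Prepend the metadata fields to the raw event, "|"-delimited."""
    return "|".join([str(time), host, sourcetype, source, raw])


def unstash(stashed):
    """Split a stashed event back into its named fields."""
    match = STASH_PATTERN.match(stashed)
    if match is None:
        raise ValueError("event does not look like a stashed record")
    return match.groupdict()


if __name__ == "__main__":
    # Made-up sample event whose raw text itself contains a pipe.
    event = stash(1700000000, "web01", "syslog", "/var/log/messages",
                  "user=bob | action=login")
    fields = unstash(event)
    print(fields["host1"])
    print(fields["raw1"])
```

If the last group were `[^|]+` instead of `.*`, `raw1` for this sample would stop at the first pipe inside the raw text, which is why the remainder of the line is captured as a whole.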
Oh, yeah. I see what you're saying. I was using the stats command.
I'm trying to do something similar, but I additionally want to eliminate unwanted fields when I write to the summary index. No answer for me so far: