I have a little problem with summary indexing seemingly ignoring some fields.
My logfile looks like this:
# /home/splunk/foo.log
2011-06-29 12:00:00,000 tx=12345 Starting to process order. orderId=31415
2011-06-29 12:00:01,500 tx=12345 Done processing order. outcome=SUCCESS, info=[orderId=31415, execution_time_ms=1500]
2011-06-29 12:05:00,000 tx=98765 Starting to process order. orderId=67890
2011-06-29 12:05:01,200 tx=98765 Done processing order. outcome=FAILURE, info=[orderId=67890, execution_time_ms=1200]
I've scheduled an index-populating query called "index-populating-query" that runs every 15 minutes and saves its results to the summary index:
When I run this query from search, Splunk correctly shows all the discovered fields on the left hand side: tx, orderId, outcome, execution_time_ms.
But when I run queries against the summary index, it seems that the fields tx and outcome aren't contained in the index:
index=summary source="index-populating-query" outcome=*
produces an empty result set, and
index=summary source="index-populating-query" *
shows the fields orderId and execution_time_ms on the left hand side, but no outcome or tx.
Does anyone have an explanation for this behaviour?
I noticed that the missing fields are the ones that aren't following a comma in the log file.
The outcome field could probably be extracted during my queries against the summary index using a regex (e.g. rex "(?i) outcome=(?P
In general, summary index generating searches need to use a transforming/reporting command such as timechart, stats, chart, etc.
So, you could change your search to be something like:
source="/home/splunk/foo.log" outcome=* | stats count(eval(outcome="FAILURE")) as failures count(eval(outcome="SUCCESS")) as successes
Then, your search against the summary index becomes something like:
index=summary source="index-populating-query" | timechart span=5m sum(successes) as successes sum(failures) as failures | fillnull value=0
And would yield a result like:
_time                successes  failures
12/05/11 00:00:00    4          0
12/05/11 00:05:00    7          1
12/05/11 00:10:00    6          3
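To make the aggregation concrete, here is a rough Python sketch of what the stats/timechart combination above computes: success and failure counts per 5-minute bucket. It is only an illustration of the idea, not how Splunk works internally. The function name summarize and the regex are my own, and the sample lines are the ones from the log in the question.

```python
import re
from collections import defaultdict
from datetime import datetime

# Sample lines copied from the log in the question.
LOG_LINES = [
    "2011-06-29 12:00:00,000 tx=12345 Starting to process order. orderId=31415",
    "2011-06-29 12:00:01,500 tx=12345 Done processing order. outcome=SUCCESS, "
    "info=[orderId=31415, execution_time_ms=1500]",
    "2011-06-29 12:05:00,000 tx=98765 Starting to process order. orderId=67890",
    "2011-06-29 12:05:01,200 tx=98765 Done processing order. outcome=FAILURE, "
    "info=[orderId=67890, execution_time_ms=1200]",
]

# Hypothetical extraction pattern for the outcome field (an assumption,
# not Splunk's own key=value extraction).
OUTCOME_RE = re.compile(r"\boutcome=(\w+)")


def summarize(lines, span_minutes=5):
    """Count SUCCESS and FAILURE outcomes per time bucket.

    span_minutes is assumed to divide 60 evenly.
    """
    buckets = defaultdict(lambda: {"successes": 0, "failures": 0})
    for line in lines:
        match = OUTCOME_RE.search(line)
        if match is None:
            continue  # no outcome field, similar to filtering on outcome=*
        # The first 19 characters hold the timestamp without milliseconds.
        ts = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        # Round the timestamp down to the start of its bucket.
        bucket = ts.replace(minute=ts.minute - ts.minute % span_minutes, second=0)
        if match.group(1) == "SUCCESS":
            buckets[bucket]["successes"] += 1
        elif match.group(1) == "FAILURE":
            buckets[bucket]["failures"] += 1
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    for bucket, counts in summarize(LOG_LINES).items():
        print(bucket, counts["successes"], counts["failures"])
```

The point is the same as with the summary index: only the small per-bucket counts are stored, not the raw events, so later reports only have to add up a few numbers.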
If you insist on storing each message id in the summary index, you can try:
eventtype=cisco_esa [search | fields mid] | stats values(to) as to values(from) as from values(subject) as subject by mid | collect index=summary
However, this seems like a heavy-handed and probably wrong approach. You would probably be better off using a Python script or building some additional business logic in via the Splunk view system, depending on how you want to represent the data.
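If you go the Python script route, the grouping itself is simple. The following is a minimal sketch, assuming the events have already been parsed into dictionaries with the mid, to, from and subject fields. The sample events, the addresses and the function name group_by_mid are made up for illustration and do not come from real IronPort logs.

```python
from collections import defaultdict

# Hypothetical, already-parsed events: each attribute of a mail arrives
# as its own event, tied together only by the message id (mid).
EVENTS = [
    {"mid": "1001", "from": "alice@example.com"},
    {"mid": "1001", "to": "bob@example.com"},
    {"mid": "1001", "subject": "Hello"},
    {"mid": "1002", "from": "carol@example.com"},
    {"mid": "1002", "to": "erin@example.com"},
    {"mid": "1002", "to": "dave@example.com"},
]

FIELDS = ("to", "from", "subject")


def group_by_mid(events):
    """Collect the distinct values of each field per message id."""
    grouped = defaultdict(lambda: {field: set() for field in FIELDS})
    for event in events:
        mid = event.get("mid")
        if mid is None:
            continue  # events without a message id cannot be grouped
        for field in FIELDS:
            if field in event:
                grouped[mid][field].add(event[field])
    # Convert the sets to sorted lists so the result is stable and printable.
    return {
        mid: {field: sorted(values) for field, values in fields.items()}
        for mid, fields in grouped.items()
    }


if __name__ == "__main__":
    for mid, fields in group_by_mid(EVENTS).items():
        print(mid, fields)
```

This produces one record per mid with multi-valued to, from and subject fields, which is roughly the shape that stats values(...) by mid gives you.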
Here is the command used to populate the summary index:
eventtype=cisco_esa [search | fields mid] | transaction fields=mid mvlist=t | collect index=summary
There is more context in this answer, where folks were recommending the summary index for this use case, even though it isn't the standard usage:
Is there a way to preserve the fields while still populating the index with the full set of events, as the requestor has?
I'm running into the same issue, but I need to run ad hoc search commands, not just preset timecharts.
For example, IronPort logs all of the "from", "to", "subject", etc. attributes as separate events on new lines. Most searches I run need to return the full results associated with that mail.
So I am using a summary index to store the events in transaction form, and then running my searches against that.