Knowledge Management

Missing fields in summary index


I have a little problem with summary indexing seemingly ignoring some fields.

My logfile looks like this:

# /home/splunk/foo.log
2011-06-29 12:00:00,000 tx=12345 Starting to process order. orderId=31415
2011-06-29 12:00:01,500 tx=12345 Done processing order. outcome=SUCCESS, info=[orderId=31415, execution_time_ms=1500]
2011-06-29 12:05:00,000 tx=98765 Starting to process order. orderId=67890
2011-06-29 12:05:01,200 tx=98765 Done processing order. outcome=FAILURE, info=[orderId=67890, execution_time_ms=1200]

I've scheduled an index-populating query called "index-populating-query" that runs every 15 minutes and saves its results to the summary index:

source="/home/splunk/foo.log" outcome=*

When I run this query from search, Splunk correctly shows all the discovered fields on the left hand side: tx, orderId, outcome, execution_time_ms.

But when I run queries against the summary index, it seems that the fields tx and outcome aren't contained in the index:

 index=summary source="index-populating-query" outcome=*

produces an empty result set, and

index=summary source="index-populating-query" *

shows the fields orderId and execution_time_ms on the left hand side, but no outcome or tx.

Does anyone have an explanation for this behaviour?

I noticed that the missing fields are the ones that don't follow a comma in the log file.

The outcome field could probably be extracted during my queries against the summary index using a regex (e.g. rex "(?i) outcome=(?P<outcome>[^,]+)"), but doesn't that somehow defeat the purpose of summary indexing?
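For reference, here is that extraction exercised in a minimal Python sketch — the named capture group and the `[^,]+` value class mirror the rex pattern, and the sample line is copied from the log above (Python is just for illustration; the Splunk rex syntax is analogous):

```python
import re

# Named capture group; the value runs until the next comma, as in the rex pattern.
pattern = re.compile(r"outcome=(?P<outcome>[^,]+)", re.IGNORECASE)

line = ("2011-06-29 12:00:01,500 tx=12345 Done processing order. "
        "outcome=SUCCESS, info=[orderId=31415, execution_time_ms=1500]")

match = pattern.search(line)
if match:
    print(match.group("outcome"))  # SUCCESS
```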


Splunk Employee

In general, summary-index-generating searches need to use a transforming/reporting command such as timechart, stats, or chart.

So, you could change your search to be something like:

 source="/home/splunk/foo.log" outcome=* | stats count(eval(outcome="FAILURE")) as failures count(eval(outcome="SUCCESS")) as successes

Then, your search against the summary index becomes something like:

 index=summary source="index-populating-query" | timechart span=5m sum(successes) as successes sum(failures) as failures | fillnull value=0

And would yield a result like:

 _time               successes   failures
 12/05/11 00:00:00   4           0
 12/05/11 00:05:00   7           1
 12/05/11 00:10:00   6           3
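For anyone who wants to sanity-check the roll-up logic outside Splunk, here is a rough Python sketch of what the stats/timechart pair computes — events bucketed into 5-minute spans with per-outcome counts (the sample timestamps and outcomes are made up):

```python
from collections import Counter
from datetime import datetime

# Made-up (timestamp, outcome) pairs standing in for summary-index rows.
events = [
    ("2011-12-05 00:02:10", "SUCCESS"),
    ("2011-12-05 00:03:45", "SUCCESS"),
    ("2011-12-05 00:07:30", "FAILURE"),
    ("2011-12-05 00:08:00", "SUCCESS"),
]

buckets = Counter()
for ts, outcome in events:
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    # Truncate to the start of the 5-minute span, like timechart span=5m.
    span = t.replace(minute=t.minute - t.minute % 5, second=0)
    buckets[(span, outcome)] += 1

for (span, outcome), count in sorted(buckets.items()):
    print(span, outcome, count)
```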

Splunk Employee

If you insist on storing each message id in the summary index, you can try:

 eventtype=cisco_esa [search | fields mid] | stats values(to) as to values(from) as from values(subject) as subject by mid | collect index=summary

However, this seems like a heavy-handed and probably wrong approach. You would probably be better off using a python script or building some additional business logic in via the Splunk view system, depending on how you want to represent the data.
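As a very rough sketch of the python-script route mentioned above — stitching the per-line attributes back together by message id — something like the following could work (the field names and sample lines here are assumptions modeled on the thread, not taken from a real ESA log):

```python
from collections import defaultdict
import re

# Each log line carries a mid plus one attribute (to/from/subject),
# as in ESA-style logs where message attributes arrive as separate events.
lines = [
    "mid=100 from=alice@example.com",
    "mid=100 to=bob@example.com",
    "mid=100 subject=hello",
    "mid=200 from=carol@example.com",
]

messages = defaultdict(dict)
for line in lines:
    fields = dict(re.findall(r"(\w+)=(\S+)", line))
    mid = fields.pop("mid")
    messages[mid].update(fields)

print(messages["100"])
```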



Here is the command used to populate the summary index:

eventtype=cisco_esa [search | fields mid] | transaction fields=mid mvlist=t | collect index=summary

There is more context in this answer, where folks were recommending the summary index for this use case, even though it isn't the standard usage:



Is there a way to preserve the fields while still populating the index with the full set of events, like the requestor has?

I'm running into the same issue, but I need to run ad hoc search commands, not just preset timecharts.

For example, IronPort logs all of the "from, to, subject, etc." attributes as separate events on new lines. Most searches I run need to return the full results associated with that mail.

So I am using a summary index to store the events in transaction form, and then run my searches against that.
