Solved: Why do I get different results when I search my ex...

Simeon · ‎11-11-2010

I have an extracted field called ruby_completed_call, that extracts the completion time from a ruby log:

Processing PostsController#create (for 127.0.0.1 at 2010-09-08 11:52:56) [POST]
  Session ID: BAh7BzoMY3NyZl9pZCIlMDY5MWU1M2I1ZDRjODBlMzkyMWI1OTg2zViZjYiCmZsYXNoSUM6J0FjdGl
vbkNvbnRyb2xsZXI6OkZsYXNoOjpGbGFzaEhhc2h7AAY6CkB1c2VkewA=--b18cd92fba90eacf8137e5f6b3b06c4d724596a4
  Parameters: {"commit"=>"Create", "post"=>{"title"=>"Debugging Rails",
 "body"=>"I'm learning how to print in logs!!!", "published"=>"0"},
 "authenticity_token"=>"2059c1286e93402e389127b1153204e0d1e275dd", "action"=>"create", "controller"=>"posts"}
New post: {"updated_at"=>nil, "title"=>"Debugging Rails", "body"=>"I'm learning how to print in logs!!!", "published"=>false, "created_at"=>nil}
Post should be valid: true
  Post Create (0.000443)   INSERT INTO "posts" ("updated_at", "title", "body", "published", "created_at") VALUES('2008-09-08 14:52:54', 'Debugging Rails',
 'I''m learning how to print in logs!!!', 'f', '2008-09-08 14:52:54')
The post was saved and now is the user is going to be redirected...
Redirected to #<Post:0x20af760>
Completed in 12ms (81 reqs/sec) | DB: 0.00044 (3%) | 302 Found [http://localhost/posts]

My extraction is as follows:

REGEX = Completed in (?<ruby_call_completed>\d+)ms

When I run a search that looks for all calls completed in 12ms, I don't get any results:

sourcetype=ruby ruby_call_completed=12

However, I can run the search in this manner and get the correct number of results:

sourcetype=ruby | search ruby_call_completed=12

I have checked that my search runs over all-time and these particular fields are not indexed.

araitz · ‎11-11-2010

The problem is that the term you are extracting as your field's value (in the case of the sample event, "12") doesn't actually exist in Splunk's index. The term that is actually indexed is "12ms".

There are at least three ways to solve this problem, presented in order of preference:

1) Alter your field extraction to make the value the entire indexed term.

For example:

REGEX = Completed in (?<ruby_call_completed>\d+ms)

Of course, your users would rather not have to deal with the unit of measurement. As such, you can either hide the unit from them by creating a form search that accepts a numeric value and then automatically appends the "ms" suffix, or in the case of reporting use

... | convert rmunit(ruby_call_completed)...

2) Create a fields.conf entry to let Splunk know that this field is not indexed and that the value does not exist in the index.

So in $SPLUNK_HOME/etc/apps//local/fields.conf:

[ruby_call_completed]
INDEXED=false
INDEXED_VALUE=false

The problem with this approach is that, since the field or the value is not indexed, the search will be very inefficient over a large amount of data and/or over a wide span of time, as Splunk will have to run a regex against the raw data to discover matching events.

3) Extract the field at index time, and let Splunk know the field is indexed.

In $SPLUNK_HOME/etc/apps//local/props.conf:

[ruby]
TRANSFORMS-ruby=ruby_call_completed

In $SPLUNK_HOME/etc/apps//local/transforms.conf:

[ruby_call_completed]
REGEX=\sCompleted\sin\s(\d+ms)
FORMAT=ruby_call_completed::$1
WRITE_META=true

In $SPLUNK_HOME/etc/apps//local/fields.conf:

[ruby_call_completed]
INDEXED=true
INDEXED_VALUE=false

This approach has at least two drawbacks:

Index-time field extractions are not at all malleable - in other words, you can't go back and alter them later if you change your mind.
Index-time field extractions take up more space in your index, potentially a lot more space if the field(s) occur frequently

View solution in original post

Paolo_Prigione · ‎11-11-2010

I think that's due to Splunk trying to match the 12 against the indexed words rather than the raw event: the _raw contains 12ms which is not segmented in two blocks, it has been indexed as a single term, being it a single word without any major/minor breaking character into it (ref: segmenters.conf)

This instead could work (but would return more results than expected):

sourcetype=ruby ruby_call_completed=12*

because Splunk shoud try to find indexed tokens starting with 12 (so 12ms, 123ms, ... would be found in the index) On the other side, your second example:

sourcetype=ruby | search ruby_call_completed=12

first acts on the indexed data matching sourcetype=ruby, then fields are extracted, THEN the secondary search is executed.

I think this is due to the map-reduce paradigm:

"map" is executed on the distributed servers, and it is just a search for matching events based on the precomputed index of the logs
"reduce" extracts fields and applyes the secondary search, but this is only executed on the node where the search was first launched.

However, that's only my two cents... Paolo

araitz · ‎11-11-2010