Splunk Search

Why does adding a field filter (by clicking on the field's value in the fields listing on the left) eliminate all search results?

triest
Communicator

When I run a search (sourcetype="fieldtest"), I see two events with a field called third and a value of other-value. If I filter on that field value by clicking it in the field list on the left, no events are returned. The third field is extracted via a transform (included in props as a REPORT), and the value is hard-coded in the FORMAT line (see below).

When I run the following search:

sourcetype="fieldtest"  third=other-value | stats count
| append [ search sourcetype="fieldtest" third=* | stats count ]
| append [ search sourcetype="fieldtest" | search  third=other-value | stats count ]

I would expect all three counts to be the same, since third only has the value other-value for this data.
Instead I get three rows with 0, 2, and 2.

I originally discovered this issue in March. I opened a support ticket, but it has since been closed and marked unresolved. Since support was unable to reproduce the problem, I decided to see how simple a scenario I could create that would still result in the error.

Originally I was running this in a distributed search environment, but today confirmed it on a single instance test on my laptop.

I changed the field name away from action since that was causing quite a bit of stuff to get added to the search (based upon inspecting the job and looking at lit search). Today's result is very interesting because the second row indicates that Splunk knows there's a value for third but doesn't see it as other-value.

To re-create this locally:

transforms.conf:

[field_test_transform]
REGEX=(quick)
FORMAT = third::other-value

props.conf:

[fieldtest]
REPORT-field-test-transform = field_test_transform

I then indexed some data with a source type of fieldtest:

Wed Jul 1 12:54:21.030 This is just some text
Wed Jul 1 12:54:21.055 The quick brown fox.
Wed Jul 1 12:54:21.078 No really this is just random test.
Wed Jul 1 12:54:21.100 I need to add dates.
Wed Jul 1 12:54:21.196 Anything else I should do?
Wed Jul 1 12:54:22.154 I should probably have another one with The quick brown fox in it just so I have more than one.
Wed Jul 1 12:54:23.196  Hopefully this is a fairly simple test that fails so the problem can be reproduced. 
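
As a rough illustration of what this transform is expected to produce, here is a minimal Python sketch (not Splunk code) that applies the same REGEX and the hard-coded FORMAT to the sample events above. The function name extract_fields and the way FORMAT is split are assumptions made for the sketch; only the regex, the format string, and the events come from the post.

```python
import re

# Sample events copied from the post.
EVENTS = [
    "Wed Jul 1 12:54:21.030 This is just some text",
    "Wed Jul 1 12:54:21.055 The quick brown fox.",
    "Wed Jul 1 12:54:21.078 No really this is just random test.",
    "Wed Jul 1 12:54:21.100 I need to add dates.",
    "Wed Jul 1 12:54:21.196 Anything else I should do?",
    "Wed Jul 1 12:54:22.154 I should probably have another one with "
    "The quick brown fox in it just so I have more than one.",
    "Wed Jul 1 12:54:23.196  Hopefully this is a fairly simple test "
    "that fails so the problem can be reproduced. ",
]

REGEX = re.compile(r"(quick)")   # REGEX from transforms.conf
FORMAT = "third::other-value"    # FORMAT from transforms.conf


def extract_fields(raw):
    """Hypothetical stand-in for a search-time extraction:
    if REGEX matches, emit the hard-coded field/value pair from FORMAT."""
    if REGEX.search(raw) is None:
        return {}
    name, value = FORMAT.split("::", 1)
    return {name: value}


extracted = [extract_fields(e) for e in EVENTS]
matching = [f for f in extracted if f.get("third") == "other-value"]
print(len(matching))  # 2
```

The point worth noticing is that the string other-value never appears in the raw text of any event; it exists only in the extracted field.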

I used the third=other-value since that's what the documentation uses in the example where they create a value directly.

* Search-time extraction examples:
    * 1. FORMAT = first::$1 second::$2 third::other-value

http://docs.splunk.com/Documentation/Splunk/6.2.3/admin/Transformsconf

That indicates it should work for search-time extractions, so I would assume using REPORT- in props is correct (vs using TRANSFORMS). I'm fairly confident that there is not another configuration causing issues as I have changed the name of the field three times.

When we first saw the issue we were running 6.1.3 (I think; it was 6.1.x). We are currently running 6.2.2 in production and still see the issue. In my test on my laptop today I upgraded to 6.2.3 just to confirm it is still an issue.

I'd greatly appreciate any insights. For this specific use case in production (populating action when the value isn't in the event), you could use a lookup table. The challenge is that creating a lookup table JUST for this requires creating a transforms.conf, a props.conf (to apply the transforms), and the lookup itself. It's much simpler to use an extraction in transforms.conf to create the field.


martin_mueller
SplunkTrust

The key issue here is that Splunk by default assumes field values to be indexed tokens, i.e. words that appear in the raw text of an event.
In your hardcoded search-time extraction that's not the case. Splunk will expand the third=other-value into a base lispy looking for the token other-value, which will not match any events. To verify this, check your job inspector for the search.log and search for lispy.
Your piped search works because that's applied after loading all events off disk and applying the extraction.

Do set this in your fields.conf and try again:

[third]
INDEXED_VALUE = false

That will tell Splunk to not make this assumption. Depending on your actual use case, other settings may be more efficient - see http://docs.splunk.com/Documentation/Splunk/6.2.3/admin/fieldsconf

martin_mueller
SplunkTrust

This works because of the way Splunk understands a search string of third=value. First, Splunk retrieves all events containing the token value - regardless of fields. Then it applies search-time field extraction to those events. Lastly, it filters the now-extracted field third for the value.

In your case, the first step returns zero events. Hence you need to tell Splunk "don't pre-filter events by indexed tokens when filtering for field third".
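
To make the three steps concrete, here is a toy Python model of them. It is only a sketch under loud assumptions: the tokens function is a crude stand-in for Splunk's real index-time tokenization, and the search function and its indexed_value parameter are hypothetical names that mimic the effect of the INDEXED_VALUE setting rather than Splunk's actual implementation.

```python
import re

# A subset of the sample events from the question (two contain "quick").
EVENTS = [
    "Wed Jul 1 12:54:21.030 This is just some text",
    "Wed Jul 1 12:54:21.055 The quick brown fox.",
    "Wed Jul 1 12:54:21.100 I need to add dates.",
    "Wed Jul 1 12:54:22.154 I should probably have another one with "
    "The quick brown fox in it just so I have more than one.",
]


def tokens(raw):
    """Crude stand-in for indexed tokens: lowercase word-like chunks."""
    return set(re.findall(r"[a-z0-9_-]+", raw.lower()))


def extract(raw):
    """Search-time extraction from the thread:
    REGEX=(quick), FORMAT = third::other-value."""
    return {"third": "other-value"} if re.search(r"(quick)", raw) else {}


def search(field, value, indexed_value=True):
    # Step 1: pre-filter on tokens in the raw text (skipped when
    # indexed_value is False, mimicking INDEXED_VALUE = false).
    candidates = [e for e in EVENTS if not indexed_value or value in tokens(e)]
    # Steps 2 and 3: apply the extraction, then filter on the field value.
    return [e for e in candidates if extract(e).get(field) == value]


default_hits = search("third", "other-value")
fixed_hits = search("third", "other-value", indexed_value=False)
print(len(default_hits), len(fixed_hits))  # 0 2
```

With the pre-filter on, no event survives step 1 because other-value is not in any raw event, so the extraction never gets a chance to run. With it off, every event reaches the extraction and the two quick events match.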

According to the docs you linked,

INDEXED_VALUE = [true|false|<sed-cmd>|<simple-substitution-string>]
...
* Defaults to true.

Your test 3 is weird. Re-run btool with --debug to see where the false is coming from.


triest
Communicator

Interestingly, this worked (at least for the simplified test case; I'll try to test it in production tonight), but I'm confused about why.

According to the docs(1) it defaults to false.

Looking at $SPLUNK_HOME/etc/default/fields it defaults to False.

I thought maybe it was because the field wasn't listed anywhere, so the "global" setting wasn't getting applied to the field and it was either undefined or defaulted to true in code. To test that, I tried having just [third] in a fields.conf. That breaks the search again, even though the btool output shows the same values (see below, where I outline three scenarios).

Hopefully it just works and I don't need the why, but if someone could help me understand I would greatly appreciate it.

  1. http://docs.splunk.com/Documentation/Splunk/6.2.3/admin/fieldsconf

Test 1
fields.conf

[third]
 INDEXED_VALUE = false

btool output

[third]
INDEXED = False
INDEXED_VALUE = false
TOKENIZER = 

Works: Yes

Test 2
fields.conf

[third]
 INDEXED_VALUE = False

btool output

[third]
INDEXED = False
INDEXED_VALUE = False
TOKENIZER = 

Works: Yes

Test 3
fields.conf

[third]

btool output

[third]
INDEXED = False
INDEXED_VALUE = False
TOKENIZER = 

Works: No


triest
Communicator

I believe the query which shows the different counts rules out _time being the issue, but just to clarify...
For the tests on my laptop, I ran the search for all time.

When I originally discovered this, I hard-coded earliest and latest in my search (and in the sub-searches for append).
