After upgrading to 4.3 I noticed one of my timecharts was not working correctly:
searchterm NOT port=16 | timechart foo
UPDATE CLARIFICATION: The above worked fine on 4.0, 4.1 AND 4.2 without any changes to configuration being required. My question is about new behaviour as shown by 4.3
I eventually figured out that this was because of the NOT port=16 part.
In both the advanced charting and standard search views, the above term is removing far, far, far more events than just those where the field port is equal to the value 16.
Any suggestions about why this is the case?
Here is an example:
Searching for index=hosttype host=hosttype1 gives:
_raw,_time,date_hour,date_mday,date_minute,date_month,date_second,date_wday,date_year,date_zone,eventtype,host,index,linecount,port,punct,retry_indicator,source,sourcetype,splunk_server,station,success_indicator,tag::host,time_taken,timeendpos,timestartpos
" station1,10/01/2012-00:06:56,063,1,0",1326154016,0,10,6,january,56,tuesday,2012,local,poll_result,hosttype1,hosttype,1,1,"________________,//-::,,,",1,/home/splunk/logs/hosttype1/port1.stat,source-stat,vmhost-splunk,station1,0,"hosttype1,polling_local,polling_total",63,50,31
" station2,10/01/2012-00:06:27,037,0,0",1326153987,0,10,6,january,27,tuesday,2012,local,poll_result,hosttype1,hosttype,1,7,"____________________,//-::,,,",0,/home/splunk/logs/hosttype1/port7.stat,source-stat,vmhost-splunk,station2,0,"hosttype1,polling_local,polling_total",37,50,31
" station3,10/01/2012-00:06:15,041,0,0",1326153975,0,10,6,january,15,tuesday,2012,local,poll_result,hosttype1,hosttype,1,1,"____________________,//-::,,,",0,/home/splunk/logs/hosttype1/port1.stat,source-stat,vmhost-splunk,station3,0,"hosttype1,polling_local,polling_total",41,50,31
" station4,10/01/2012-00:05:32,043,1,0",1326153932,0,10,5,january,32,tuesday,2012,local,poll_result,hosttype1,hosttype,1,1,"_______________,//-::,,,",1,/home/splunk/logs/hosttype1/port1.stat,source-stat,vmhost-splunk,station4,0,"hosttype1,polling_local,polling_total",43,50,31
" station5,10/01/2012-00:05:30,057,1,1",1326153930,0,10,5,january,30,tuesday,2012,local,poll_result,hosttype1,hosttype,1,7,"____________________,//-::,,,",1,/home/splunk/logs/hosttype1/port7.stat,source-stat,vmhost-splunk,station5,1,"hosttype1,polling_local,polling_total",57,50,31
" station6,10/01/2012-00:05:29,036,1,0",1326153929,0,10,5,january,29,tuesday,2012,local,poll_result,hosttype1,hosttype,1,5,"______________,//-::,,,",1,/home/splunk/logs/hosttype1/port5.stat,source-stat,vmhost-splunk,station6,0,"hosttype1,polling_local,polling_total",36,50,31
" station7,10/01/2012-00:05:06,037,1,0",1326153906,0,10,5,january,6,tuesday,2012,local,poll_result,hosttype1,hosttype,1,8,"________________,//-::,,,",1,/home/splunk/logs/hosttype1/port8.stat,source-stat,vmhost-splunk,station7,0,"hosttype1,polling_local,polling_total",37,50,31
Searching for index=hosttype host=hosttype1 NOT port=16 gives no results.
EDIT: So as suggested in the first answer, 'port!=16' seems to work fine. Why doesn't 'NOT'? It's still in the documentation. Is there anywhere I can file a bug?
hexx was on the right track - the problem is that you are extracting the port from the source file path. By default, Splunk will look for the value of the extracted field in the event's text. In cases where it is not in the event text, you need to use fields.conf:
[port]
INDEXED_VALUE=false
This tells Splunk that the value of port is not in the indexed data. Once you do this, you still might need to change the field name wherein the port is extracted from the event itself - perhaps call it port_number. Then your search could be:
searchterm NOT port=16 NOT port_number=16 | eval port=if(isnotnull(port), port, port_number) | timechart foo
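The same search can be written a little more compactly with eval's coalesce function, which returns the first non-null argument. This is only a sketch: it assumes the event-based extraction has been renamed to port_number as suggested above, and searchterm and timechart foo are the placeholders from the question.

```
searchterm NOT port=16 NOT port_number=16 | eval port=coalesce(port, port_number) | timechart foo
```

As with the if(isnotnull(...)) form, port keeps its own value when present and falls back to port_number otherwise.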
Thanks @araitz, that explains the behaviour. I do remember that "feature" coming up before and wondering whether it was the correct word between my quotes.. Thanks for the help!
You could also use "NOT port=16 OR port!=16", which should cover you for both cases without having to edit conf files. Just in case you're interested 🙂
Put another way:
By default, Splunk is looking for the term "16" in the index, then looks for the field "port" with value "16". This is why fields.conf is required - to tell Splunk "don't bother looking for the term 16 in the index".
The search "port!=16" tells Splunk just to look for events with the field "port" with values not equal to "16", in effect achieving the same effect as fields.conf.
I also considered that bloom filters (new in 4.3) might be the cause, but running './splunk cmd searchtest "search NOT port=15"' did not indicate that the filters were excluding any buckets.
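One way to see the difference for yourself is to run the two forms as separate searches and compare the counts per sourcetype. This is an untested sketch that assumes the index name from the question; since port is extracted differently for each sourcetype, the split should show which one loses events under NOT.

```
index=hosttype NOT port=16 | stats count by sourcetype
index=hosttype port!=16 | stats count by sourcetype
```

If the counts diverge only for the sourcetype whose port comes from the source file path, that would be consistent with the explanation above.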
Aha right, thanks for the explanation. So I guess there's a good possibility that I will lose out in the future with more complex searches involving NOT where != isn't good enough. Not exactly the most ideal situation. I'll go for the 'indexed=false' and see how things go. Thanks for the help.
The reason it behaves differently, as I previously stated, is because you have some values of the port field extracted from the source, which is not an indexed value, and some extracted from indexed values. This is the way Splunk is expected to work.
I was just able to confirm this on 4.3 - if I have fields extracted from either indexed values (exclusively) or non-indexed values (exclusively), both NOT and != function as expected. If I have a field that is a mishmash of indexed and non-indexed values, it does not work, as expected. If this worked before 4.3, I don't think it should have.
Also, I'm not sure you should accept your own answer on my question. I've undone your acceptance until we can get the details worked out.
So why does NOT port=16 behave differently to port!=16? Why did I not need to do what you are suggesting in 4.0, 4.1 OR 4.2? This is the first time it has been an issue with identical setup.
You have the right to your own opinion, but your comment above doesn't make much sense. fields.conf has been required for fields extracted from non-indexed data for several versions of Splunk.
So why doesn't this still work properly with the data where port was extracted from the event itself? Why does != work but NOT doesn't? I'm going to vote you down because of this until I know better. Your "answer" involves too much work which should be unnecessary in an ideal world. @milestulett is currently in the lead, even if you voted him down.
@hexx "port" is an inline field extraction. For one sourcetype it's extracted from the source file path. For another sourcetype it's extracted from the log file line.
Perhaps try port!=16. This is shown in some of the training videos (which use 4.3).
ie: searchterm port!=16 | timechart foo
It might be that one is interpretive, while the other one is blind? Maybe I'm making things up, but that's the only explanation I can come up with at the moment..?
This seems to work... I might accept it later but for now I'm a little miffed that 'NOT' doesn't work despite it being in the documentation. Is this in the documentation? Can I rely on it?
Thank you. So, how is the "port" field extracted, and what is its value for that field in the events you've provided as samples?
@hexx here's 7! 🙂
Could you give us an example of an event excluded by the "NOT port=16" term which shouldn't be?
Oh and I also tried wrapping 16 in "s but that didn't help..