Hello all,
We are having some problems defining a time-based kvstore lookup on Splunk 6.2.0.
We tried defining a similar time-based csv lookup, and that works!
transforms.conf (kvstore lookup definition):
[timed_test_kv]
collection = timed_test
external_type = kvstore
fields_list = _key,_time,username,ip,test_kv_user
time_field = _time
collections.conf (we tried both of the following):
[timed_test]
enforceTypes = true
field.kvs__time = time
field.kvs_username = string
field.kvs_ip = string
field.kvs_test_kv_user = string
and
[timed_test]
field.kvs__time = string
field.kvs_username = string
field.kvs_ip = string
field.kvs_test_kv_user = string
_time                ip             username  test_kv_user
2015-01-14 12:53:32  10.15.182.117  carva     test_user_1
2015-01-14 12:53:42  10.15.182.117  carva     test_user_2
2015-01-14 12:53:52  10.15.182.117  carva     test_user_3
2015-01-14 12:54:02  10.15.182.117  carva     test_user_4
2015-01-14 16:47:32  10.15.182.117  carva     test_user_5
2015-01-14 16:47:42  10.15.182.117  carva     test_user_6
2015-01-14 16:47:52  10.15.182.117  carva     test_user_7
2015-01-14 16:48:02  10.15.182.117  carva     test_user_8
2015-01-14 18:28:02  10.15.182.117  carva     test_user_9
transforms.conf (csv lookup definition):
[timed_test]
filename = timed_test.csv
time_field = _time
_time                ip             username  test_temp_user
2015-01-14 12:53:32  10.15.182.117  carva     test_user_1
2015-01-14 12:53:42  10.15.182.117  carva     test_user_2
2015-01-14 12:53:52  10.15.182.117  carva     test_user_3
2015-01-14 12:54:02  10.15.182.117  carva     test_user_4
2015-01-14 16:47:32  10.15.182.117  carva     test_user_5
2015-01-14 16:47:42  10.15.182.117  carva     test_user_6
2015-01-14 16:47:52  10.15.182.117  carva     test_user_7
2015-01-14 16:48:02  10.15.182.117  carva     test_user_8
2015-01-14 18:28:02  10.15.182.117  carva     test_user_9
With | inputlookup, the _time field is displayed parsed, but every version of the lookup was created with the same epoch values in the _time field.
The lookup search query is the same (except for the lookup name), but the last lookup field, test_*_user, comes back empty with the kvstore version and populated with the csv version.
We've restarted Splunk each time each lookup was created (just to be sure 🙂 ).
I'm afraid we might be missing some parameter to make the kvstore time-based lookup work but the documentation doesn't say anything specific for the kvstores.
Thank you, and sorry for the long text.
I think we managed to understand why it didn't work, but it's still strange, as time-based csv lookups worked that way (or it was all a big coincidence).
Our problem was that, even though you define the time_field as last_seen, the match between the events and the lookup is always done as _time (on the event) = last_seen (on the lookup), and not last_seen = last_seen as we might have expected.
Well, we have a time-based kvstore lookup working now.
Sooo, after over three years, when you search for "time-based kv store lookups", you'll find this thread. Because the answer doesn't make clear what you need to define (and, more interestingly, what not), I'll give you an example, because we had the exact same issue. The issue is that Splunk doesn't recognise the time field as being in a time format.
So, what you want is something like this. I've taken the examples from above. Feel free to upvote if you run into the same issue and think this helps, because it did for us at least.
I'd recommend storing the time as epoch (standard).
KV store lookup definition:
[cdp_proxy]
collection = cdp_proxy
external_type = kvstore
fields_list = _key, src_ip, username, time
time_field = time
I'd rather not name the field in the KV store "_time".
In the collection definitions (collections.conf):
[cdp_proxy]
enforceTypes = true
field.time = time
In "field.time = time", "field." is the prefix, the first "time" is the name of the field, and the "time" after the equals sign is its type. Just refer to the collections.conf spec file for further information.
Skalli
Thank you @skalliger!!
I also dropped some feedback on https://docs.splunk.com/Documentation/Splunk/8.0.1/Knowledge/Defineatime-basedlookupinSplunkWeb so hopefully that will help as well.
Can someone post a final and working example for using kvstore for time based lookup? Maybe splunk should post it on their blog or in their documentation?
It seems that the confusing issues are the following:
I think we managed to understand why it didn't work, but it's still strange, as time-based csv lookups worked that way (or it was all a big coincidence).
Our problem was that, even though you define the time_field as last_seen, the match between the events and the lookup is always done as _time (on the event) = last_seen (on the lookup), and not last_seen = last_seen as we might have expected.
Well, we have a time-based kvstore lookup working.
thank you very much.
Great to hear that it is working for you now!
From the definition of "time_field" in http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf:
Used for temporal (time bounded) lookups. Specifies the name of the field in the lookup
table that represents the timestamp.
So the right logic is "_time = last_seen".
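To make that matching direction concrete, here is a minimal sketch of a time-based lookup stanza. The stanza and field names are borrowed from this thread; the max_offset_secs value is purely illustrative, and the comments reflect my reading of transforms.conf.spec rather than anything confirmed in this discussion.

```ini
# transforms.conf (sketch, not a tested configuration)
[cdp_proxy_time]
collection = cdp_proxy
external_type = kvstore
fields_list = _key, src_ip, username, first_seen, last_seen
# The event's _time is compared against last_seen in the lookup row.
time_field = last_seen
# Illustrative value: a row can match only if the event is at most
# 3600 seconds later than the row's last_seen.
max_offset_secs = 3600
# A row can match only if the event is at least this many seconds
# later than the row's last_seen (0 is the documented default).
min_offset_secs = 0
```

With settings like these, an event whose _time is earlier than every last_seen value for its src_ip and username would get no match, which is one way an output field can come back empty.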
Following your suggestion, i've tried the following:
collections.conf:
[cdp_proxy]
enforceTypes = true
field.src_ip = string
field.username = string
field.first_seen = time
field.last_seen = time
transforms.conf:
[cdp_proxy]
collection = cdp_proxy
external_type = kvstore
fields_list = _key, src_ip, username, first_seen, last_seen
[cdp_proxy_time]
collection = cdp_proxy
external_type = kvstore
fields_list = _key, src_ip, username, first_seen, last_seen
time_field = last_seen
last_seen src_ip username
1421664188 10.15.182.115 carvajp6
1421664638 10.15.182.115 carvajp6
last_seen src_ip username
1421664188 10.15.182.115 carvajp6
1421664638 10.15.182.115 carvajp6
...
| lookup cdp_proxy src_ip username output last_seen as active_session
...
| lookup cdp_proxy_time src_ip username output last_seen as active_session
Result: with the same search and an equivalent lookup command, the time-based kvstore lookup returns a null active_session field, while the plain kvstore lookup returns both lookup values.
this is strange. 😞
Why strange?
Does it actually mean that you just don't have any records after "1421664188" and "1421664638" for src_ip = "10.15.182.115"?
It's strange because the index has records, on the defined time field, after those dates...and a time-based csv lookup with the same structure and data works as expected.
i've opened a support case with Splunk and as soon as they reply i'll update this page.
thank you for your help.
Did you get answers from splunk support?
How did you import values in KVStore collection? Could you check your log for any errors / warnings?
The KV store collection stores time values in Unix epoch format (numbers), so my guess is that this is just a formatting issue somewhere.
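If a lookup table did hold formatted strings instead of epoch numbers, transforms.conf has a time_format setting for declaring the strptime pattern. Below is a minimal sketch for the csv lookup from the question, assuming (hypothetically) that its _time column contained strings like 2015-01-14 12:53:32; in this thread the stored values were reported to be epoch. Whether time_format is honored for kvstore lookups is not confirmed anywhere in this thread.

```ini
# transforms.conf (sketch; assumes the CSV stores formatted strings)
[timed_test]
filename = timed_test.csv
time_field = _time
# strptime pattern for values such as "2015-01-14 12:53:32"
time_format = %Y-%m-%d %H:%M:%S
```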
thank you for your reply.
So, to test 'every' possibility, we defined 6 different kvstore collections:
- one where the time field was a number, with enforceTypes on.
- one where the time field was a number, with enforceTypes off.
- one where the time field was a string, with enforceTypes on.
- one where the time field was a string, with enforceTypes off.
- one where the time field was a time, with enforceTypes on.
- one where the time field was a time, with enforceTypes off.
Then we defined 6 different time-based kvstore lookups, one per collection above. The only parameters used were time_field and the fields list; no time-based lookup limits were defined. We also defined a time-based csv lookup with the same fields/structure as the collections.
The next step was to populate the collections and the csv with data from the index; we inserted the same data into every one. We inserted a couple of rows: one with the time field set from _time via eval, others with it as a string value and as a number value. We used inputlookup to check the contents, and everything looked ok. Lookup permissions were global, read, etc., and we restarted Splunk just to be sure.
We then ran a query with a lookup command against each lookup defined above, but only the CSV version had a value in the output field; every other lookup left that field blank. We updated our search head to 6.2.1 and got the same result.
do you know if there's a working example of a time-based kvstore lookup on the web/doc?
I looked at your example again and found one issue: the field names in collections.conf. For example, "field.kvs__time" should be "field._time", because everywhere else you refer to the field as "_time".
I guess you copied one of the examples from http://dev.splunk.com/view/SP-CAAAEZJ, which describes a document whose fields have no kvs_ prefix and then goes on to use kvs_-prefixed fields. This is a bug in that example.
Please take a look at the collections.conf.spec file http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/Collectionsconf for more details.
Hope that this will fix your issue.
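For reference, here is how the collection stanza from the question might look with that fix applied. This is only a sketch: it drops the kvs_ prefix from every field so the names line up with the fields_list in the lookup definition, and it has not been tested here.

```ini
# collections.conf (sketch of the corrected stanza)
[timed_test]
enforceTypes = true
# Field names now match those used in transforms.conf (no kvs_ prefix).
field._time = time
field.username = string
field.ip = string
field.test_kv_user = string
```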
Oh, I see the big mistake. I read the documentation, but I think the examples made a stronger impression than the text before them. 😞
For some reason I understood that all kv store fields should start with field.kvs_ (I thought it was strange but...you never know :)).
One very strange thing: before we defined that time-based kvstore lookup, we defined a similar kvstore lookup in just that way (field.kvs_something in the collection, and the plain something field in the search query and in the lookup definition) and it worked!
i'll try that fix on monday. thank you very much for your patience and support.
Yes, we will fix this documentation "confusion".
Declaring string types for fields is not necessary: everything in Splunk is a string in some sense, and string is the default type when you write with outputlookup. I believe this is why everything worked for you before, because all your lookups were string-based.
ok, so if, in a stanza, we define:
field.test
it's the same as defining:
field.test = string
?
In our searches we will use time-based kvstore lookups where the values for the time_field come from the built-in _time field or from latest/earliest(_time) operations, that is, epoch time. What do you think the type of the time_field should be in the collection definition? number? time? Or doesn't it matter?
thank you once again.
I meant that you do not need to define every field you use in collections.conf, so "field.test = string" is equivalent to not defining the field at all. This is mostly true when the collection is used only in lookups. Defining "field.test = string" is useful only when the data can also be modified from JavaScript (REST endpoints), since it protects against someone putting a number in this field.
I would suggest using time. At the moment the time type is stored and implemented as a number, but because that could change in the future, or we might handle it differently, I would suggest keeping the type "time".
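Putting those two pieces of advice together, a collection definition for the cdp_proxy example might be sketched as follows. The field names come from earlier in this thread; leaving the string fields undeclared is my interpretation of the advice above, not a tested configuration.

```ini
# collections.conf (sketch based on the advice above)
[cdp_proxy]
enforceTypes = true
# String fields (src_ip, username) are left undeclared on purpose.
field.first_seen = time
field.last_seen = time
```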