I have a KV store that has several fields (IP addresses, timestamps, etc.) tied to a unique key (the default mode). When I do an inline |lookup kvstore myip as ip and the IP appears in several rows, all the other values become multi-valued. I have no way of telling which of the values in each of the multi-valued fields belongs to which key. I'd like the output to be tabulated instead, just as with the inputlookup command, which outputs one row per key.
Is this possible? This seems like a huge flaw. One would expect to be able to get the data tabulated or at least separated in some way, from the database?
@christoffertoft how many records do you have in the KV store? Also, if the KV store holds an IP along with a timestamp (i.e. the rows do not each have a unique IP address), is there any reason why the data is pushed to the KV store rather than indexed?
@niketnilay it's not necessarily a big KV store, say around 3-10k entries. It does, however, need to be read from and written to constantly, so I don't see a CSV file as the right option for me.
Additionally, since the data is time bound in a sense, indexes wouldn't suffice - I need CRUD-like interfacing with the data, which KV stores give me. Each row (possibly) has an IP, and an IP may appear in several rows.
If the KV store does not have more than 10K entries and inputlookup seems to work fine for you, maybe you can write a join with inputlookup as the inner subsearch and your index query as the outer search. The community would be able to assist you better if you can provide your current search with some sample data (mock/anonymized) from the lookup and the index.
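A rough sketch of what such a join could look like, assuming the lookup definition is called kvstore and its IP field is myip (as in the question). The index and sourcetype names are placeholders, not taken from the thread:

```spl
index=your_index sourcetype=your_sourcetype
| join type=inner max=0 ip
    [| inputlookup kvstore
     | rename myip AS ip]
```

Here max=0 asks join to keep every matching subsearch row instead of only the first one, so an event whose ip matches three KV store rows should come back as three separate results, each carrying the fields of exactly one row. The usual subsearch limits still apply, so this is only reasonable for a small collection like the one described.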
I think you need to elaborate more on what your expected behaviour is. If you really need all entries in a multivalued field in an individual event, you should use | mvexpand <field_name> after your lookup. This will give you one event per multi value entry in the specified field. Suppose the following data in your kv store:
my_ip  dns_name
1234   foo
5678   bar
1234   baz
Note that there are two dns_name values for the same ip, with no other way to discern the two rows (except for the hidden field _key). Now with the following events:
_time  ip    sourcetype
1000   1234  sourcetype_a
2000   5678  sourcetype_b
3000   1234  sourcetype_c
I would expect the output of your lookup to create these results:
_time  ip    sourcetype    dns_name
1000   1234  sourcetype_a  foo baz
2000   5678  sourcetype_b  bar
3000   1234  sourcetype_c  foo baz
This is because the input to your lookup matches more than one row and you haven't specified a maximum number of matches. If you want one row per output result of your lookup, you could use | mvexpand dns_name and get the following result:
_time  ip    sourcetype    dns_name
1000   1234  sourcetype_a  foo
1000   1234  sourcetype_a  baz
2000   5678  sourcetype_b  bar
3000   1234  sourcetype_c  foo
3000   1234  sourcetype_c  baz
If that's not what you need, please explain what you'd like to see.
Hi @jeffland, thanks for your reply!
First question first - how do you tabulate your code like that on Answers?
Second part - I have a list similar to the one you described, with IPs, MACs and hostnames, along with timestamps for when each of these was added or updated. This is intended to be the representation of a machine.
At any point in time, I need to, for example, look up a hostname, or a MAC, or a timestamp for one of these values. If a hostname, for example, appears in several rows, there is no way for me to determine the exact values belonging to each row, since all of them are bundled together. mvexpand, on the other hand, explodes the multi-values into every possible combination once it is applied to each field. A result which has 5 values in each of 5 fields will return 5^5 = 3125 events.
If the events that contained the value I wanted to search for, say host=jeffland, were represented as rows, I wouldn't have this problem.
The only solution right now is to use a combination of the hostname and, say, the updated time for that value, by using two |lookup commands: the first singles out the host, and after expanding every value I need to search for, a second | lookup filters for that host and finds the time field I'm looking for.
A follow-up issue is that I can't really remove IP collisions within the KV store, since there is no way of telling which of the values belong to which key or row or what have you.
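One way to sidestep the mvexpand cross product described above might be to zip the multi-valued fields together by position before expanding, so that only a single field gets expanded. This is only a sketch. The field names hostname, mac and updated are made up for illustration, the index and sourcetype are placeholders, and it assumes that _key is listed in the lookup definition's fields, that the lookup returns the values of each field in the same row order, and that no matched row has an empty value in any of the zipped fields (otherwise the positions shift):

```spl
index=your_index sourcetype=your_sourcetype
| lookup kvstore myip AS ip OUTPUT _key AS kv_key hostname mac updated
| eval row = mvzip(mvzip(mvzip(kv_key, hostname, "|"), mac, "|"), updated, "|")
| mvexpand row
| eval parts = split(row, "|")
| eval kv_key = mvindex(parts, 0), hostname = mvindex(parts, 1), mac = mvindex(parts, 2), updated = mvindex(parts, 3)
| fields - row, parts
```

With 5 matching rows this should give 5 results per event rather than 5^5, and each result carries the kv_key of the row it came from, which would also help when deciding which colliding entries to remove. The "|" delimiter is an arbitrary choice and must not occur inside any of the values.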