Splunk Search

Matching log events with dynamic context information (e.g. which version of package X was installed?)

almin
Engager

Hi everyone,

I am using Splunk Enterprise 7.0.8.5 with the Universal Forwarder 6.5.2/6.5.3 on multiple hosts running Ubuntu 14.04 LTS or 16.04 LTS, and I am trying to find a way to tie the versions of specific Debian packages installed on these hosts to the log events generated for specific sourcetypes.

Example: assuming I configured Splunk so that all log lines for sourcetype mysourcetype are generated by programs installed by the mydebian Debian package, I'd like to know, for each log event on mysourcetype, which version of the mydebian package was installed on the host when that event was created. Among other things, this would help me correlate the occurrences of various events with the software versions installed on the hosts ("Did failure X occur more on v2.3 than on v1.5?").

Note that I only need to tie log events to package versions on splunkcloud when doing searches there; I don't technically need this association at index time on the hosts.

With a bit of research on these forums I found the _meta setting for inputs.conf, which I can use to attach key/value pairs to log events in the form _meta = field1::foo field2::bar ( https://answers.splunk.com/answers/1453/how-do-i-add-metadata-to-events-coming-from-a-splunk-forward... ). However, 1) I understand that attaching this metadata directly on the hosts (index time) would increase the size of every ingested message and would mostly ingest the same information over and over (these package versions might change every few days at most), and 2) I think I understand how _meta can be used with extracted fields, but not how I would use it with, say, the output of a shell command ( dpkg [...] ).

Retrieving these package versions and the timestamps of when they changed is fairly easy to do with our Splunk setup, so I am thinking of an alternative to _meta where I would create a temporal lookup table (easy) of the form:

mydebian_installation_time,host,mydebian_version_installed
2018-11-13 13:10:05.908,host1,2.3.4
2018-12-18 19:26:45.000,host1,2.3.5
2018-12-31 21:03:03.000,host2,1.2
[...]

And then tie the corresponding package version to each log event in mysourcetype. Ideally I'd only have to do this association once, and maybe update it periodically, but not with every single search. I haven't found how to do this last part though (tying info from a temporal lookup table to log events), so I am looking for pointers there too.

Thanks

tom_frotscher
Builder

Hi,

If I understand your problem correctly, and you can easily keep your list of versions up to date as a lookup file for Splunk, you might be able to solve this problem with a time-based lookup. See the docs for time-based lookups here: https://docs.splunk.com/Documentation/Splunk/7.2.6/Knowledge/Defineatime-basedlookupinSplunkWeb

There is even an answer post with a solution for a host list quite similar to yours:
https://docs.splunk.com/Documentation/Splunk/7.2.6/Knowledge/Defineatime-basedlookupinSplunkWeb

Basically, a time-based lookup does not match your events with the lookup data on a key field alone. It also uses a time field (in your case this could be "mydebian_installation_time") as the earliest time from which an event can match that lookup entry.
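
To make the matching rule concrete, here is a minimal Python sketch of the idea, using the sample rows from the question. It is only a simplified model and not Splunk's implementation: the helper names build_index and version_at are hypothetical, it assumes that the most recent entry at or before the event time wins, and it ignores the offset limits (max_offset_secs / min_offset_secs) that a real time-based lookup can apply.

```python
import bisect
from datetime import datetime

# Sample rows copied from the lookup table in the question:
# (mydebian_installation_time, host, mydebian_version_installed)
LOOKUP_ROWS = [
    ("2018-11-13 13:10:05.908", "host1", "2.3.4"),
    ("2018-12-18 19:26:45.000", "host1", "2.3.5"),
    ("2018-12-31 21:03:03.000", "host2", "1.2"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def build_index(rows):
    """Group lookup rows by host, sorted by installation time (hypothetical helper)."""
    index = {}
    for installed_at, host, version in rows:
        ts = datetime.strptime(installed_at, TIME_FORMAT)
        index.setdefault(host, []).append((ts, version))
    for entries in index.values():
        entries.sort()
    return index


def version_at(index, host, event_time):
    """Return the version installed on `host` at `event_time`, or None.

    Assumption: the entry with the latest installation time that is not
    after the event time is the one that matches.
    """
    entries = index.get(host, [])
    times = [ts for ts, _ in entries]
    # Number of entries whose installation time is <= event_time.
    pos = bisect.bisect_right(times, event_time)
    if pos == 0:
        return None  # the event predates every known installation
    return entries[pos - 1][1]


index = build_index(LOOKUP_ROWS)
print(version_at(index, "host1", datetime(2018, 12, 1)))
```

With these rows, an event on host1 dated 1 December 2018 falls between the two host1 entries, so it is tied to the earlier installation.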

Greetings

Tom

almin
Engager

Thank you Tom, I did search for documentation on time-based lookups but somehow missed that page. This is great.
You copied the same link twice in your answer; I think you meant to include something else.
Anyhow, the help page you linked was enough for me to figure out how to define my versions_installation_time time-based lookup via the settings menu on Splunk Web, and I've been able to apply it to a simple example that looks something like:

| makeresults
| lookup versions_installation_time host OUTPUTNEW mydebian_version_installed
| timechart some_metric by mydebian_version_installed

Can I/can you promote your comment to an answer?

tom_frotscher
Builder

Hi,

I promoted it to an answer, so you can vote on it 🙂

Sorry for the duplicated link, the second one should have been this one:
https://answers.splunk.com/answers/617407/how-to-configure-a-time-based-lookup-temporal-look.html
