
How to derive logged application's name?

Path Finder

I have several enterprise applications which are split up into multiple services and tiers, all of which are being Splunked. Each component logs to a different filename, so each component has a unique source. Further complicating things, some of the components' log files will sometimes be prefixed with a GUID (due to a non-thread-blocking behavior in Microsoft EntLib Logging, where threads competing for the same log file will output to a one-off sibling file instead). So we end up with f57f623d-5c2b-4cff-b72b-68ee4e5b0dcaApp.Facade.log, App.Facade.log and UI.Service.log.

But I know which filenames correspond to which enterprise application. For example, *Special.Facade.log and *Another.UI.Service.log both belong to an application known as "foo", while "SomethingElse.Facade.log" and "Bar-log4net.log" belong to "bar". I'd like to roll that up into a derived field so I could search simply with "app=foo" to limit my search to only those sources which are a part of "foo".

What is my best strategy for doing this?

I've looked at tags, but tags wouldn't handle the wildcard necessary for the GUID-prefixed files. I've tried a lookup table with "match_type=WILDCARD(source)", but my field never shows up. I'm willing to keep hammering at these strategies if they're the best ones, but as a Splunk novice, I'm hoping someone out there has an even more ingenious strategy.

My backup plan is to use a saved search with all of the relevant wildcarded sources OR'ed together.


Path Finder

I think I've learned two things (and if anyone could confirm, I'd appreciate it):

  1. Lookup with match_type=WILDCARD(xxx) doesn't honor beginning wildcards.
  2. Lookup is case-sensitive

Armed with those assumptions, I've taken a two-step approach:

[1] Extract the filename from the source, being sure to exclude the possible GUID:

#props.conf
EXTRACT-SourceFilename = .*\\([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})?(?<sourcefilename>.*)$ in source

[2] Look up the source filename to get an app name, matching exact case:

    #props.conf
    LOOKUP-app = lookup_app_by_source_wildcard source as sourcefilename OUTPUTNEW app

    #my.csv
    source,app
    Special.Facade.log,foo
    Another.UI.Service.log,foo
    SomethingElse.Facade.log,bar
    Bar-log4net.log,bar
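
A minimal sketch of the two steps above, written in Python only so the logic can be checked outside Splunk. It is not Splunk configuration: Python's `re` needs `(?P<name>...)` where the props.conf regex uses `(?<name>...)`, the GUID group is made non-capturing, and the `C:\logs\` paths in the usage note are hypothetical. The dictionary stands in for my.csv and is exact-case, matching the assumption that the lookup is case-sensitive.

```python
import re

# Python spelling of the EXTRACT regex: everything up to the last backslash,
# then an optional lowercase-hex GUID, then the remaining filename.
GUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
SOURCE_FILENAME = re.compile(r".*\\(?:" + GUID + r")?(?P<sourcefilename>.*)$")

# Same rows as my.csv; dict keys are compared with exact case.
APP_BY_SOURCE = {
    "Special.Facade.log": "foo",
    "Another.UI.Service.log": "foo",
    "SomethingElse.Facade.log": "bar",
    "Bar-log4net.log": "bar",
}


def app_for_source(source):
    """Return the app name for a source path, or None if nothing matches."""
    match = SOURCE_FILENAME.match(source)
    if match is None:
        # The pattern requires at least one backslash in the source.
        return None
    return APP_BY_SOURCE.get(match.group("sourcefilename"))
```

For example, `app_for_source(r"C:\logs\f57f623d-5c2b-4cff-b72b-68ee4e5b0dcaSpecial.Facade.log")` and `app_for_source(r"C:\logs\Special.Facade.log")` both resolve to "foo", while a differently cased name such as `special.facade.log` resolves to nothing.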

This appears to be working, except I would've thought my extracted field name ("SourceFilename") and the field referenced in the lookup ("sourcefilename") would have to match case. I fear it's not using my extracted field name, but a field name I derived in search using "rex".

Also, my regex doesn't seem to be matching some filenames.
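
One way to narrow down the misses is to run the failing source values through a more tolerant pattern outside Splunk. The sketch below is in Python and rests on two guesses that the thread does not confirm: that some GUID prefixes contain uppercase hex digits, and that some sources use forward slashes or no directory part at all. The function name and sample paths are hypothetical.

```python
import re

# Tolerant variant: hex digits in either case, "\" or "/" as separator,
# and the directory part is optional.
GUID = r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}"
TOLERANT = re.compile(
    r"^(?:.*[\\/])?(?:" + GUID + r")?(?P<sourcefilename>[^\\/]*)$"
)


def source_filename(source):
    """Return the filename with any directory and GUID prefix removed."""
    match = TOLERANT.match(source)
    return match.group("sourcefilename") if match else None
```

If a source that the original regex missed comes back clean from `source_filename`, the difference between the two patterns (case of the hex digits, or the path separator) is a likely culprit.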


Splunk Employee

Extract the GUID with a field extraction based on the source.
Then use this field with a lookup that holds the GUID-to-application correspondence.

Try out the extraction inline with a regex, something like:
<mysearch> | rex field=source "\\(?<_GUID_>[a-zA-Z0-9-]+)\." | table source _GUID_


Path Finder

I've just learned (the hard way) that lookups are case sensitive. That has given me hope.


Path Finder

The GUIDs aren't anything identifiable. They're generated on the fly if one thread is writing to a log and another thread wants to also. Instead of thread #2 blocking, it just writes to a sibling file prefixed with a GUID. I wasn't clear about that in my original question, but I've tried to clarify now.

Still, you've given me two things:
1. Confidence that a lookup table is a good strategy
2. A different way to try and diagnose my problem using rex and table inline.
