Splunk Search

Trying to use lookup table instead of tagging hosts

gfriedmann
Communicator

I am trying to settle on a method for grouping hosts into hostgroups for easy searching and reporting. I have heard enough warnings about tags not scaling well. We have about 1000-2000 hosts.

I don't know which of these practices causes tagging scalability problems:

  1. a large number of distinct tags
  2. a large number of different field values being tagged
  3. tagging fields that are generated at search time

I HAVE seen eventtypes and tag::eventtype slow down a search monstrously (Windows apps).

So I am trying to work through cases using lookup tables. It looks like this:

 [| inputlookup hostgroups.csv | search group=pci-windows | fields + host]

I think I've run into two limitations with using inputlookup against a CSV for hostgroups at search time.

  1. I can't create an easy search macro using an inputlookup.
  2. The knowledge has to be managed outside of Splunk in a .csv. For example, users can no longer manage hostgroups/tag groups inside the Splunk UI.

Am I doing it right? Perhaps I should be returning all events and doing a 'where' clause of some sort with a lookup table?

Thank you, Answers!

1 Solution

gkanapathy
Splunk Employee

Here are the scaling problems you might find with tags. They may or may not be important:

  • It's harder to manage and edit tag mappings in a file, or to have them automatically generated and maintained by some other script, than it is with a CSV lookup table. On the other hand, yes, there's no GUI for editing lookup tables.
  • A large number of values attached to a particular tag expands into a very large search string. That's not a problem with fewer than, say, a few thousand values, but beyond maybe 3000 or 5000 it might not work so well. However, a lookup table offers no advantage here, because it runs into the same problem.
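
To illustrate the second point, here is a rough sketch of the expansion (not the exact string Splunk builds internally; the tag name and host names are hypothetical). A search on a host tag such as

 tag::host=dev

ends up behaving roughly like

 (host=myserver OR host=myserver3)

so with several thousand hosts under one tag, the OR list grows to several thousand terms.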

Eventtypes are a completely different issue from tags or lookups. Having a large number of complex event type searches can slow down the system overall, since basically every single event returned must be checked against every single event type search.

But you're not using lookups quite the right way. If I were using lookups to tag hosts, I would configure an automatic lookup in props.conf, say

 LOOKUP-1 = hosttogroup host OUTPUT group

This would reference a table like:

 host,group
 myserver,dev
 myserver,app
 myserver,j2ee
 myserver2,prod
 myserver2,db
 myserver3,dev
 myserver4,test
 myserver4,db

 ...

Then you would simply search using group="dev". This wouldn't require a macro at all, or the use of the inputlookup command.
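
For completeness, here is a minimal sketch of how that could be wired up. The sourcetype stanza [my_sourcetype] is a placeholder I made up for illustration, and I'm assuming the table is saved as hostgroups.csv (the file name from the question); only the lookup name hosttogroup and the LOOKUP-1 line come from the example above.

 # transforms.conf: define the lookup and point it at the CSV file
 [hosttogroup]
 filename = hostgroups.csv

 # props.conf: apply the lookup automatically at search time
 # ([my_sourcetype] is a placeholder; use whichever stanza matches your data)
 [my_sourcetype]
 LOOKUP-1 = hosttogroup host OUTPUT group

A search against that data would then look something like:

 sourcetype=my_sourcetype group="dev" | stats count by host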

gfriedmann
Communicator

Fricken rock. Thank you. I will test with this.

millarma
Path Finder

What would the CSV lookup for this look like? Can you paste 2-3 lines including the header?
