Trying to use lookup table instead of tagging hosts

gfriedmann
Communicator

I am trying to settle on a method for grouping hosts into hostgroups for easy searching and reporting. I have heard enough warnings of tags not scaling well. We have about 1000-2000 host sources.

I don't know which of these practices causes tagging scalability problems:

  1. a large number of distinct tags
  2. a large number of different field values being tagged
  3. tagging fields that are generated at search time

I HAVE seen eventtypes and tag::eventtypes slow down a search monstrously (windows apps).

So I am trying to work through cases using lookup tables. It looks like this:

 [| inputlookup hostgroups.csv | search group=pci-windows | fields + host]

I think I've run into two limitations with using inputlookup against a CSV for hostgroups at search time.

  1. I can't create an easy search macro using an inputlookup.
  2. The knowledge has to be managed outside of Splunk in a .csv. For example, users can no longer manage hostgroups/tag groups inside the Splunk UI.

Am I doing it right? Perhaps I should be returning all events and then filtering with a 'where' clause of some sort against a lookup table?

Thank you, Answers!

gkanapathy
Splunk Employee

Here are the scaling problems you might find with tags. They may or may not be important:

  • Tag mappings are harder to manage and edit in a file, or to generate and maintain automatically from some other script, than a CSV lookup table is. On the other hand, yes, there's no GUI for editing lookup tables.
  • A large number of values attached to a particular tag expands into a very large search string. That's not a problem if there are fewer than, say, a few thousand values, but beyond maybe 3000 or 5000 it might not work so well. However, using a lookup table doesn't offer any advantage here, because it runs into the same problem.
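
To make the second point concrete, here is a rough sketch of the expansion using the subsearch from the question. The host names are made up, and the exact text Splunk generates may differ; the point is only that the search string grows by one term per host in the group.

```
[| inputlookup hostgroups.csv | search group=pci-windows | fields + host]
```

Before the outer search runs, that subsearch is replaced by something along these lines:

```
( ( host="winpci01" ) OR ( host="winpci02" ) OR ( host="winpci03" ) )
```

A tag on host values should behave similarly: searching on the tag ends up as an OR over every host value carrying that tag. With 1000-2000 hosts spread over several groups, each group's list stays well under the few-thousand range mentioned above.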

Eventtypes are a completely different issue from tags or lookups. Having a large number of complex eventtype searches can slow down the system overall, since basically every single event returned must be checked against every single eventtype search.

But you're not using lookups quite the right way. If I were using lookups to tag hosts, I would configure an automatic lookup in props.conf, say

 LOOKUP-1 = hosttogroup host OUTPUT group

This would reference a table like:

 host,group
 myserver,dev
 myserver,app
 myserver,j2ee
 myserver2,prod
 myserver2,db
 myserver3,dev
 myserver4,test
 myserver4,db

 ...

Then you would simply search using group="dev". This wouldn't require a macro at all, or the use of the inputlookup command.
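
For anyone wiring this up, a minimal sketch of the two config pieces involved. Only the LOOKUP-1 line comes from the answer above. The transforms.conf stanza, the hostgroups.csv file name (borrowed from the question), and the my_sourcetype stanza name are assumptions for illustration, so adjust them to your environment.

```
# transforms.conf
# Defines the lookup named "hosttogroup" and points it at the CSV.
# Assumption: hostgroups.csv sits in the app's lookups/ directory.
[hosttogroup]
filename = hostgroups.csv

# props.conf
# Hypothetical sourcetype stanza; repeat for each sourcetype that
# should get the group field, or use a broader stanza if that suits you.
[my_sourcetype]
LOOKUP-1 = hosttogroup host OUTPUT group
```

With that in place, a search such as the following should return events from every host listed under dev in the table:

```
group="dev" | stats count by host
```

A host that appears on several rows of the CSV (like myserver above) would be expected to come back with group as a multivalue field, so it matches a search on any of its groups.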

gfriedmann
Communicator

Fricken rock. Thank you. I will test with this.

millarma
Path Finder

What would the CSV lookup for this look like? Can you paste 2-3 lines, including the header?
