Splunk Search

Creating a scheduled lookup table

eranday
New Member

Hi all,

I'm looking to create a lookup table and wondering what the best practice is.

My data source is SQL, and I already have the dbxlookup that imports data from SQL into Splunk.
I would like to build a lookup table that is refreshed from SQL twice a day.
I would like my searches to read the data from the lookup rather than from SQL.
The table currently contains 1,000 rows, and I do not expect it to exceed 10,000 rows.

I would appreciate it if someone could guide me to the best approach.
Thanks in advance,

1 Solution

somesoni2
Revered Legend

Since you want to refresh your data, and want to ensure it doesn't get emptied in case your DB query fails, you can use a lookup generation query like this:

| dbxquery .... | inputlookup yourLookup.csv append=t | dedup ...columns that uniquely identify a lookup row... | outputlookup yourLookup.csv

So it'll refresh your lookup table with new data coming from dbxquery, retaining the old values. It'll give a warning the first time you run it, as yourLookup.csv will not exist before the query's first run.
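As a concrete sketch of this pattern, assuming a hypothetical lookup file users.csv keyed by a unique user_id column and a DB Connect connection named my_sql_conn (both names are illustrative, not from the thread):

```
| dbxquery connection="my_sql_conn" query="SELECT user_id, name, email FROM users"
| inputlookup users.csv append=t
| dedup user_id
| outputlookup users.csv
```

Because dedup keeps the first occurrence of each user_id, rows freshly returned by dbxquery win over the older copies appended from the existing lookup file.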



nickhills
Ultra Champion

Is there a penalty for the order in which you query your input data?
I tend to start append in/out searches with
|inputlookup....|my new search...(dbx in this case) | outputlookup ....
I have no technical reason for doing it in this order (other than it sort of logically feels right), but having read your suggestion, which is better than mine, I now wonder about the performance.

If my comment helps, please give it a thumbs up!

somesoni2
Revered Legend

The dedup command keeps the latest data, i.e. the data that appears first in the result set, which should be from the source, so the main search comes first. Another reason we run it as the base search is that it may take significant time to execute, depending on the query, and we don't want subsearch limits to apply to it (auto-finalization due to a long run or the space limit).
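A toy search, built only from makeresults, demonstrates the keep-first behavior of dedup that this pattern relies on (field names are made up):

```
| makeresults | eval user_id=1, name="fresh from db"
| append [| makeresults | eval user_id=1, name="stale from lookup"]
| dedup user_id
```

Only the "fresh from db" row survives, since it appears first in the result set. That is exactly why dbxquery must be the base search and the old lookup contents get appended after it.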

nickhills
Ultra Champion

Take a look at outputlookup:

...your_search...|table cola, colb, colc, cold|outputlookup my-lookup.csv

Add that to a scheduled search, and it will create a new lookup file each time it runs.
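For the twice-a-day refresh, you can set the schedule in Splunk Web or, equivalently, in savedsearches.conf. The stanza below is a sketch with illustrative names (search name, connection, columns, and file are assumptions), running at 06:00 and 18:00:

```
[refresh_users_lookup]
enableSched = 1
cron_schedule = 0 6,18 * * *
search = | dbxquery connection="my_sql_conn" query="SELECT user_id, name, email FROM users" | table user_id, name, email | outputlookup users.csv
```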

If my comment helps, please give it a thumbs up!

eranday
New Member

Thanks, works great!
