Splunk Search

Creating a scheduled lookup table

eranday
New Member

Hi all,

I'm looking to create a lookup table and wondering what the best practice is.

The data source is SQL, and I already have a dbxlookup that imports data from SQL into Splunk.
I would like the lookup table to be refreshed from SQL twice a day.
I would like my searches to read the data from the lookup rather than from SQL directly.
The table currently contains about 1,000 rows, and I do not expect it to exceed 10,000 rows.

I would appreciate it if someone could guide me toward the best approach.
Thanks in advance,

1 Solution

somesoni2
Revered Legend

Since you want to refresh your data, and want to ensure it doesn't get emptied if your DB query fails, you can structure your lookup-generation query like this:

| dbxquery ... | inputlookup yourLookup.csv append=t | dedup <columns that uniquely identify a lookup row> | outputlookup yourLookup.csv

This refreshes the lookup table with new data coming from dbxquery while retaining the old values. It will give a warning the first time you run it, since yourLookup.csv will not exist before the query's first run.
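A rough Python analogy (not SPL, and the field names are made up for illustration) may help show why the append + dedup pattern above is resilient: new rows from the DB come first, the old lookup rows are appended after them, and dedup keeps only the first occurrence of each key, so fresh DB values win while rows missing from a failed or partial DB pull survive.

```python
def refresh_lookup(db_rows, old_lookup, key):
    """Merge fresh DB rows with the old lookup, keeping the first
    occurrence of each key (mirrors `| dedup key` semantics)."""
    seen = set()
    merged = []
    for row in db_rows + old_lookup:  # DB results first, old lookup appended
        if row[key] not in seen:      # keep only the first row per key
            seen.add(row[key])
            merged.append(row)
    return merged

# Hypothetical example data:
db_rows = [{"host": "web01", "owner": "alice"}]            # fresh pull
old_lookup = [{"host": "web01", "owner": "bob"},           # stale value
              {"host": "db01", "owner": "carol"}]          # missing from pull
print(refresh_lookup(db_rows, old_lookup, "host"))
# web01 keeps the fresh value (alice); db01 is retained from the old lookup
```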



nickhills
Ultra Champion

Is there a penalty for the order in which you query your input data? I tend to write append in/out searches as:

| inputlookup ... | my new search (dbx in this case) | outputlookup ...

I have no technical reason for doing it in this order (other than it sort of feels logically right), but having read your suggestion (better than mine), I now wonder about the performance?

If my comment helps, please give it a thumbs up!

somesoni2
Revered Legend

The dedup command keeps the first occurrence of each value, i.e. the data that appears first in the result set, which should be the fresh data from the source, so the main search comes first. Another reason to run dbxquery as the base search is that it may take a significant time to execute, depending on the query, and we don't want subsearch limits to apply to it (auto-finalization due to a long run time or the result-size limit).

nickhills
Ultra Champion

Take a look at outputlookup:

...your_search... | table cola colb colc cold | outputlookup my-lookup.csv

Add that to a scheduled search, and it will create a new lookup file each time it runs.
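As a sketch of the scheduling step, the saved search could be defined in savedsearches.conf roughly like this (the stanza name, run times, and table fields are hypothetical; the dbxquery arguments are left elided since they depend on your connection):

```
[refresh_my_lookup]
search = | dbxquery ... | table cola colb colc cold | outputlookup my-lookup.csv
enableSched = 1
# twice a day, at 06:00 and 18:00
cron_schedule = 0 6,18 * * *
```

The same schedule can of course be configured from the Splunk UI under Settings > Searches, reports, and alerts.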


eranday
New Member

Thanks, works great!
