I have created a CSV lookup table, loaded it into splunk successfully, and used it in a search command: sourcetype="access_log-2" | lookup title_lookup isbn_tag OUTPUTNEW title, pub
Now when I run other search commands that don't use this lookup, the data that was looked up is no longer available in the searches. I have to re-issue the lookup command to have the new fields available.
Do the new fields get added to the events permanently, so that I don't need to issue the lookup command on every search?
I wanted to be able to enrich the event log with one day's worth of lookup data, then create a new lookup table the next day and enrich the new events, and so on. The data would then be a permanent part of the events, and the old lookup tables would not be needed anymore.
Is this possible?
Another way I might do this is to append, nightly, the new fields retrieved from remote DBs onto the web log events. This would be done in Perl outside of splunk. Since the web data is ingested into splunk in real time, would a search on the web logs see the new fields that were added after the ingestion?
There seem to be some misconceptions about how splunk works. Let me see if I can clear any of this up for you:
lookup command.) This is how all search commands work.
If you want to set up automatic lookups based on your sourcetype, then you have to add a LOOKUP entry in your props.conf file. You can also do this via the web user interface: navigate to Manager » Lookups » Automatic lookups.
Your perl-based approach would likely accomplish what you are looking for too, however once you get more familiar with what all splunk can do, you probably will find such methods unnecessary. (I know I've had quite a number of ad-hoc scripts I've been able to decommission since we started using splunk.)
Update: Just a couple of other things to think about. I don't fully understand everything you are trying to do, but splunk is really flexible and gives you a bunch of options, and (based on your question) you seem like a person clever enough to whip up your own solution to problems. So, if you haven't considered either of these two other options, I'd suggest that you take a look and see if either of them gives you a good starting point:
Thanks. Since I will have a new 5 million event CSV file every day, generated from the previous day's web events, I guess loading that into splunk, adding it to the config file, and restarting splunk is not an efficient way to do this.
Real-time lookups are not practical either; it would take forever to retrieve all the data from Oracle.
Is there a way to manage the addition of a lookup file daily without manual intervention?
I added some additional thoughts to my post, but they may not be what you need. Not sure. I do want to point out that you can update your lookup .csv file anytime you want. You don't have to restart splunk to get the updates. Of course, if you are looking at adding millions of new records to your lookup table each day, then you probably also need to consider an expiration policy, which you would probably want to manage externally as well.
In terms of managing external lookup files, you may find this post useful: http://answers.splunk.com/questions/3769/does-outputlookup-append-or-overwrite (but you may have too many events to make this option practical, I dunno)
"It sounds like you want to use a date-effective lookup table." Incidentally, googling "date-effective lookup table" splunk only returns one result: a link to this answers page. For clarity, it might be worth re-wording this as a lookup table that indexes on "Effective Date".
As mentioned by Lowell, you can configure AUTOMATIC lookups which would allow your custom lookup fields to be appended to your data automatically.
The props.conf file would be configured something like the following:
    [access_log-2]
    LOOKUP-title_lookup = title_lookup isbn_tag AS ISBN OUTPUTNEW title pub
(Or, use the UI to complete the task - a restart is not necessary)
Once props.conf (and, presumably, transforms.conf) is properly configured, the lookup will be applied automatically to events with sourcetype=access_log-2.
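For reference, the transforms.conf side usually amounts to a stanza that names the CSV file behind the lookup. This is only a minimal sketch: the file name title_lookup.csv is an assumption (the thread never gives it), as is the location shown in the comment. The CSV header row would need to contain the isbn_tag, title, and pub columns used above.

```ini
# transforms.conf
# The stanza name must match the lookup name referenced in props.conf
# and in the search command (title_lookup).
[title_lookup]
# Assumed file name; the CSV typically lives in
# $SPLUNK_HOME/etc/apps/<app>/lookups/
filename = title_lookup.csv
```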
If you can refresh your lookup table via an external script, that will take care of keeping the table data current. You may want to configure a scheduled search that triggers your external script on a daily basis, so that this happens automatically.