The scenario I am building uses a dynamic txt or csv file to refine a search of an index full of syslog data.
I have index=syslog containing incoming syslog from 1000 servers, but for this specific scenario I only care about 150 high value hosts. Rather than searching index=syslog HOST1 OR HOST2 and so on, I would like to reference a CSV or txt file that contains the host names of the high value targets and create a search similar to index=syslog | search host=/opt/splunk/test/host.csv
I have been toying with the transaction command to make this work, but it produces undesirable output and I believe there should be an easier way.
Python is also an option; I just hoped that Splunk would have a field lookup option that does not consist of find and replace.
You could probably achieve what you want by using tags. Have a look at tags.conf.spec and tags.conf.example for info on how to write these files. If the list is fairly static in its nature you could also just manage the tags from the UI.
Just tag all the hosts you're interested in for this purpose with some value (let's say "highvaluehosts") and then issue a search looking something like this:
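A minimal example, assuming the tag is named "highvaluehosts" and applied to the host field:

index=syslog tag::host=highvaluehosts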
This search will only yield results from the hosts you've tagged.
I think you could very easily use a lookup to achieve this. Tags will work too, but lookups might be a bit more flexible.
props.conf:

[syslog]
LOOKUP-foobar = your_lookup host OUTPUTNEW host_value

transforms.conf:

[your_lookup]
filename = your_lookup.csv

fields.conf:

[host_value]
INDEXED = false
INDEXED_VALUE = false

your_lookup.csv:

host,host_value
finance1,high
vm7,low
printer6,low
mail3,medium
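With the lookup wired up this way, a search like the following (field names assumed from the sample CSV above) would restrict results to just the high value hosts:

index=syslog host_value=high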
Another solution is to put your 150 hosts into a lookup, and then use it in a subsearch:
<some search terms> [ inputlookup myHosts | fields host ]
The end result will be
<some search terms> host=host1 OR host=host2 OR host=host3 OR host=host4 ...
Ordinarily, gratuitous and/or strange use of subsearches is to be avoided, but if the number of hosts really is quite small this can be a very useful trick. For one thing, you can have a script regenerate the csv periodically. Or you can have another Splunk search running on a schedule that regenerates the lookup using the outputlookup command.
Or you can have a Splunk search that reads in the existing lookup, appends new rows, dedupes them, and writes the result back out to the lookup. Often summary indexing is a better idea overall, but still, there are neat tricks here.
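As a sketch of that maintenance search (the lookup name and search terms here are hypothetical), a scheduled search could look like:

index=syslog <some search terms> | fields host | inputlookup append=t myHosts | dedup host | outputlookup myHosts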
Also, although subsearches are never free, you actually end up with a pretty efficient search here. It may even be slightly more efficient than using tags; I'm not sure.