Currently I have my index data as below:
order_id, order_status
12345, Submitted
and currently I have one lookup file with data as below:
order_id, error_pos, error_messages
12345, activation failed, port 9238 not available
and currently I have another lookup file with data as below:
order_id, ticket_id, owner
12345, T001, user1
and the result is as below:
order_id, order_status, error_pos, error_messages, ticket_id
12345, Submitted, activation failed, port 9238 not available, T001
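For context, the enrichment above is presumably produced with two lookup invocations along these lines, where error_lookup and ticket_lookup stand in for the actual lookup definition names (they are not named in this thread):

index=sample_index_1
| lookup error_lookup order_id OUTPUT error_pos error_messages
| lookup ticket_lookup order_id OUTPUT ticket_id
| table order_id, order_status, error_pos, error_messages, ticket_id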
Currently I use the lookup method, but the lookup file is always growing because it has to be kept up to date. What I am currently doing is appending the new lookup data to the lookup file, and this is a bit troublesome in some ways.
My question to the experts out there is: is there any other way to automate the lookup method, i.e. can Splunk query the lookup items from the DB so that the lookup refers to the query result?
Thanks for attending to my question; any ideas are very welcome.
A lookup file (an external CSV file) is used to add extra info to your events that is not originally present there, where at times the lookup info can be modified, appended, etc. (considering the changes).
If you don't want to use lookup files, you may have to index your lookup data separately and combine it with your original events using Splunk commands like join, append, etc. The lookup logic changes here, and you are in trouble when the lookup file contents change (re-index, etc.); see the sketch below.
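For example, if the lookup contents were indexed, a minimal sketch of this join approach could look like the following, assuming index=lookup_index and sourcetype=error_lookup as placeholder names for wherever you index the lookup data:

index=sample_index_1 source=db_connect_01
| join type=left order_id
    [ search index=lookup_index sourcetype=error_lookup
    | dedup order_id
    | table order_id error_pos error_messages ]

Here dedup order_id keeps only the most recent indexed row per order_id, so newly indexed lookup rows supersede older ones without any file maintenance.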
You can do it in multiple ways:
1. Use DB Connect and get the data directly as lookup data, e.g.
| dbquery "SELECT myfield1,myfield2,myfield3 FROM mytable"
(see the sketch after this list for a fuller example).
2. Run a script/SQL job that dumps the data from the DB into a CSV file, then join the lookups with the "indexed" data. The key to join on is your primary key, order_id.
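On DB Connect v2/v3 the dbquery command was replaced by dbxquery, so a hedged sketch of option 1, with the connection name my_db_connection and the table error_table as assumptions, could be:

index=sample_index_1 source=db_connect_01
| join type=left order_id
    [| dbxquery connection=my_db_connection query="SELECT order_id, error_pos, error_messages FROM error_table"
    | table order_id error_pos error_messages ]

This queries the database at search time, so there is no CSV file to keep updated; the trade-off is that every search run hits the DB.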
Thanks for your input too @koshyk!
My question about your suggestion is: wouldn't the lookup files grow very large if you keep dumping the data into the CSV files? And what would the impact on performance be later?
The lookup files need to be updated every 15 minutes, with an estimated 900-1000 rows fetched per 15 minutes.
Lookups are very efficient in Splunk, but dumping all the data would be inefficient.
My suggestion then would be to use DB Connect, as it can efficiently read the data, or just the delta of the data; a sketch follows.
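For instance, a scheduled search running every 15 minutes could pull only the delta and append it to the lookup. This is a sketch under assumptions: my_db_connection, the tickets table, and the last_updated filter (whose exact syntax depends on your database) are not from this thread:

| dbxquery connection=my_db_connection query="SELECT order_id, ticket_id, owner FROM tickets WHERE last_updated > NOW() - INTERVAL 15 MINUTE"
| outputlookup append=true osm_history_transactions.csv

At roughly 1000 rows per 15 minutes each write stays small, although the file itself still grows over time, so consider a KV store lookup or periodic pruning if size ever becomes a problem.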
I would change the title of your question to "How to do a lookup against the results of a subsearch", which I believe is what you are asking to do, right (use one search's results as a lookup table in another search, but skipping the write/read to file)? Then again, if that is the case, why are you appending to your lookup file; why aren't you overwriting it? If appending is necessary, then the thrust of your question does not make sense to me.
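To make that distinction concrete (using the lookup name from the search further down in this thread): a scheduled search ending in

| outputlookup osm_history_transactions.csv

replaces the whole file on every run, whereas

| outputlookup append=true osm_history_transactions.csv

keeps the existing rows and only adds the new ones.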
Hi @woodcock, yes it is! How to do a lookup against the results of a subsearch. I was not sure whether the solution would use search, lookup, or maybe some other way.
Because I want to keep the historical lookup data for further data processing, I am always appending the new data to the lookup files.
Would you be so kind as to point me to any sources I can follow to do a lookup against search results?
Just to give you an idea of what I am doing:
index=sample_index_1 source=db_connect_01
| table _time, processed_systems, error_state, order_state, product, order_id, ext_sys, task_mnemonic
| dedup order_id _time
| lookup osm_history_transactions order_id OUTPUT EXT_SYS as ext_sys
| timechart count by ext_sys
| addtotals
| streamstats sum(Total) as Accumulative
Hi @woodcock, I finally ended up using the join command as you suggested; a sketch of what it looks like now is below. Other ideas are still very welcome.
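A minimal sketch of the join-based version, assuming the historical transactions are indexed under index=lookup_index with sourcetype=osm_history_transactions (both placeholder names; the subsearch could equally be a dbxquery against the DB):

index=sample_index_1 source=db_connect_01
| table _time, processed_systems, error_state, order_state, product, order_id, ext_sys, task_mnemonic
| dedup order_id _time
| join type=left order_id
    [ search index=lookup_index sourcetype=osm_history_transactions
    | dedup order_id
    | rename EXT_SYS as ext_sys
    | table order_id ext_sys ]
| timechart count by ext_sys
| addtotals
| streamstats sum(Total) as Accumulative

One caveat with join: the subsearch is capped (by default around 50,000 rows) and truncates silently beyond that, so for a large historical set a lookup kept fresh by a scheduled outputlookup search is usually more robust.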