Hello,
I am currently correlating an index with CSV file using lookup.
I am planning to move CSV file to database and will replace lookup with dbxlookup.
Below is my search query using lookup
index=student_grade
| lookup student_info.csv No AS No OUPUTNEW Name Address
Below is my "future" search query using DBXLookup
Is it going to be this simple?
Please share your experience. Thank you so much
index=student_grade
| dbxlookup connection="studentDB" query="SELECT * FROM student_info" No AS No OUTPUT Name, Address
index=student_grade
No | Class | Grade |
10 | math | A |
10 | english | B |
20 | math | B |
20 | english | C |
student_info.csv
No | Name | Address |
10 | student10 | Address10 |
20 | student20 | Address20 |
No | Class | Grade | Name | Address |
10 | math | A | student10 | Address10 |
10 | english | B | student10 | Address10 |
20 | math | B | student20 | Address20 |
20 | english | C | student20 | Address20 |
Hi @LearningGuy ,
this means thta you have much time to spent in front of your pc!
obviously I'm jocking!
this isn't a good idea: usually the approach is the opposite: use an external DB to take static data to store in a lookup or in an index, because the data extractions from a db are usually very slow.
Ciao.
Giuseppe
Hi @LearningGuy ,
this means thta you have much time to spent in front of your pc!
obviously I'm jocking!
this isn't a good idea: usually the approach is the opposite: use an external DB to take static data to store in a lookup or in an index, because the data extractions from a db are usually very slow.
Ciao.
Giuseppe
Hello,
Do you have any other alternative if I want to move away from CSV?
Thanks
Hi @LearningGuy ,
you can save your static data:
The more frequent approach is to use a csv lookup.
Ciao.
Giuseppe
Hi @LearningGuy ,
copy your csv in a lookup and use the lookup to enrich your searches.
Remember to create also the Lookup definition.
Ciao.
Giuseppe
Do you mean KVStore?
Have you tried DBXlookup yourself? How slow is it?
Thanks
Hi @LearningGuy,
use a lookup on a csv or on KV-store, to be eventually populated using a connection with a db, but don't use dbxlookup.
Ciao.
Giuseppe
Hi,
Can you rephrase your statement? I am not clear "eventually populated using a connection with a DB"
Did you mean this flow
DBxquery ==> KVStore
Or
KVStore ==> DBxquery
Thanks
Hi @LearningGuy,
I don't know where your csv is coming from:
if it comes from a csv, you can update your lookup (both based on csv or kv-store) from the csv itself.
If instead, you have data in a database table and you extract data from the database in the csv, you can use dbxquery to extract data to populate the lookup.
Anyway, never use lookup -> database, eventually the contrary!
Ciao.
Giuseppe
Hello,
My CSV data is coming from DB which get pulled from DBXquery.
DB ==> DBXquery ==> CSV
Since CSV have size limitation, so I am thinking just to use the DB via DBXLookup
Thanks
If you're hitting csv size limits, kv-store based lookup might indeed be the solution (the additional bonus - you can update kv-store, don't have to delete and re-create it from scratch as with csv-based lookup).
As for the performance - well, it really depends on the use case. External lookups and external commands will always be slower than Splunk's internal mechanisms because you have to spawn external process, interface with it and so on. But with sufficiently small data sample (you just have a small set of results you have to enrich with something external) it might be "bearable".
Hi
KVStore and CSV are considered internal, correct?
Based on your experience which one is the fastest? KVStore?
Thanks
It's... a bit more complicated.
Both kvstore and csv-based lookups are performed internally by Splunk. There are some differences though - see the details here - https://dev.splunk.com/enterprise/docs/developapps/manageknowledge/kvstore/
It gets more complicated if you want to use a lookup early on in the search pipeline when the processing is still being done on indexers - depending on the particular collection's configuration the data might either be replicated as a part of knowledge bundle in csv form to indexers or the search might be forced to the SH tier (losing the benefits of distributed processing).
Hi @LearningGuy ,
yes both the solutions are internal lookups.
They are almonst uqual fast: if you have few rows, hundreds until few thousands, you can use csv, if you have more rows, KV-Store is better.
In addition KV-store is prefereable is you need a key in your csv, eg for thacking.
Ciao.
Giuseppe