Solved: moving CSV to Database and replace lookup with dbx...

LearningGuy · ‎05-15-2024

Hello,

I am currently correlating an index with CSV file using lookup.
I am planning to move CSV file to database and will replace lookup with dbxlookup.
Below is my search query using lookup

index=student_grade
| lookup student_info.csv No AS No OUPUTNEW Name Address

Below is my "future" search query using DBXLookup
Is it going to be this simple?
Please share your experience. Thank you so much

index=student_grade
| dbxlookup connection="studentDB" query="SELECT * FROM student_info"  No AS No OUTPUT Name, Address

index=student_grade

No	Class	Grade
10	math	A
10	english	B
20	math	B
20	english	C

student_info.csv

No	Name	Address
10	student10	Address10
20	student20	Address20

No	Class	Grade	Name	Address
10	math	A	student10	Address10
10	english	B	student10	Address10
20	math	B	student20	Address20
20	english	C	student20	Address20

gcusello · ‎05-15-2024

Hi @LearningGuy ,

this means thta you have much time to spent in front of your pc!

obviously I'm jocking!

this isn't a good idea: usually the approach is the opposite: use an external DB to take static data to store in a lookup or in an index, because the data extractions from a db are usually very slow.

Ciao.

Giuseppe

View solution in original post

gcusello · ‎05-15-2024

Hi @LearningGuy ,

this means thta you have much time to spent in front of your pc!

obviously I'm jocking!

this isn't a good idea: usually the approach is the opposite: use an external DB to take static data to store in a lookup or in an index, because the data extractions from a db are usually very slow.

Ciao.

Giuseppe

LearningGuy · ‎05-16-2024

Hello,

Do you have any other alternative if I want to move away from CSV?

Thanks

gcusello · ‎05-19-2024

Hi @LearningGuy ,

you can save your static data:

in a csv lookup,
in a kv-store lookup,
in an Index, if you need time updates on your data.

The more frequent approach is to use a csv lookup.

Ciao.

Giuseppe

gcusello · ‎05-16-2024

Hi @LearningGuy ,

copy your csv in a lookup and use the lookup to enrich your searches.

Remember to create also the Lookup definition.

Ciao.

Giuseppe

LearningGuy · ‎05-16-2024

Do you mean KVStore?
Have you tried DBXlookup yourself? How slow is it?

Thanks

gcusello · ‎05-16-2024

Hi @LearningGuy,

use a lookup on a csv or on KV-store, to be eventually populated using a connection with a db, but don't use dbxlookup.

Ciao.

Giuseppe

LearningGuy · ‎05-16-2024

Hi,

Can you rephrase your statement? I am not clear "eventually populated using a connection with a DB"
Did you mean this flow

DBxquery ==> KVStore

Or

KVStore ==> DBxquery

Thanks

gcusello · ‎05-16-2024

Hi @LearningGuy,

I don't know where your csv is coming from:

if it comes from a csv, you can update your lookup (both based on csv or kv-store) from the csv itself.

If instead, you have data in a database table and you extract data from the database in the csv, you can use dbxquery to extract data to populate the lookup.

Anyway, never use lookup -> database, eventually the contrary!

Ciao.

Giuseppe

LearningGuy · ‎05-16-2024

Hello,

My CSV data is coming from DB which get pulled from DBXquery.
DB ==> DBXquery ==> CSV

Since CSV have size limitation, so I am thinking just to use the DB via DBXLookup

Thanks

PickleRick · ‎05-16-2024

If you're hitting csv size limits, kv-store based lookup might indeed be the solution (the additional bonus - you can update kv-store, don't have to delete and re-create it from scratch as with csv-based lookup).

As for the performance - well, it really depends on the use case. External lookups and external commands will always be slower than Splunk's internal mechanisms because you have to spawn external process, interface with it and so on. But with sufficiently small data sample (you just have a small set of results you have to enrich with something external) it might be "bearable".

LearningGuy · ‎05-16-2024

Hi
KVStore and CSV are considered internal, correct?
Based on your experience which one is the fastest? KVStore?

Thanks

PickleRick · ‎05-17-2024

It's... a bit more complicated.

Both kvstore and csv-based lookups are performed internally by Splunk. There are some differences though - see the details here - https://dev.splunk.com/enterprise/docs/developapps/manageknowledge/kvstore/

It gets more complicated if you want to use a lookup early on in the search pipeline when the processing is still being done on indexers - depending on the particular collection's configuration the data might either be replicated as a part of knowledge bundle in csv form to indexers or the search might be forced to the SH tier (losing the benefits of distributed processing).

gcusello · ‎05-16-2024

Hi @LearningGuy ,

yes both the solutions are internal lookups.

They are almonst uqual fast: if you have few rows, hundreds until few thousands, you can use csv, if you have more rows, KV-Store is better.

In addition KV-store is prefereable is you need a key in your csv, eg for thacking.

Ciao.

Giuseppe

moving CSV to Database and replace lookup with dbxlookup

lookup

other

How to Monitor Google Kubernetes Engine (GKE)

Index This | How can you make 45 using only 4?

Splunk Education Goes to Washington | Splunk GovSummit 2024