topic Re: Take sample events in an index from a CSV file in Splunk Search

How to take sample events in an index from a CSV file?

phamxuantung — Mon, 19 Dec 2022 15:53:57 GMT

Hello,

I have a csv file that have some summary stats from an index, but the requirement is to show an sample event with all the info in that index.

The CSV file have a hash number (from a account number), some calculated status. For example:

PAN	total_trans	total_amount	HASHPAN
1234******5678	15	15000	ABC123

The index have all the transaction and detail. We need to take an sample in the index from the csv and output to a new csv that have sample detail. Something like this:

PAN	total_trans	total_amount	HASHPAN	_time	TRACE	TRANSACTIONID
1234******5678	15	15000	ABC123	xxxxxxx	xxxxxx	xxxxxxxxx

I don't really like to use join, because the index have a lot of events (around 32 mils events). Are there any elegants way to get the data?

Re: Take sample events in an index from a CSV file

gcusello — Mon, 19 Dec 2022 11:06:25 GMT

Hi @phamxuantung,

if you have the csv in a lookup, you can use the lookup command, something like this:

index=your_index | lookup your_lookup.csv PAN | table PAN total_trans total_amount HASHPAN _time TRACE TRANSACTIONID

if instead the csv is in another index, in general you have to find a unique Correlation Key (e.g.: PAN or HASHPAN) and use it to correlate data using the stats command, something like this:

index=index1 OR index=index2 | stats values(total_trans) AS total_trans values(total_amount) AS total_amount values(HASHPAN) AS HASHPAN earliest(_time) AS _time values(TRACE) AS TRACE values(TRANSACTIONID) AS TRANSACTIONID BY PAN

Ciao.

Giuseppe

Re: Take sample events in an index from a CSV file

Manasa_401 — Mon, 19 Dec 2022 11:08:39 GMT

Hello @phamxuantung

If I understand well, your requirement is to join data from index and csv lookup without using Join command. If yes then try the below query

index=test_index
| lookup transactions.csv common_field OUTPUT total_amount total_trans

In the output you can specify all the required fields from the lookup.

Thanks,
Manasa

Re: Take sample events in an index from a CSV file

phamxuantung — Tue, 20 Dec 2022 02:14:28 GMT

Accroding to my understanding, if you specify the index first and |lookup the csv, the addition information will populate on all the events in the index (in my case, all the events that have the same HASHPAN), is it not?

But I want the opposite. The csv hold, for example, 50 uniques HASHPAN, and I want to take sample events of that HASHPAN in the index (first 50 or last 50 events that have that HASHPAN, base on which is faster).

My code for my intergration is

|inputlookup mycsv.csv |join HASHPAN [search index=myindex |dedup HASHPAN ] |table <detail fields that I need>

looking at this, I think you'll understand what I mean, but I want a more "efficient" waty to doing this.

Re: Take sample events in an index from a CSV file

gcusello — Tue, 20 Dec 2022 07:39:20 GMT

Hi @phamxuantung,

in other words, you want to filter (find) events in the index that match the values of the csv, if this is your need, please try this:

index=myindex [ | inputlookup mycsv.csv | fields HASHPAN ] | lookup your_lookup.csv HASHPAN | table PAN total_trans total_amount HASHPAN _time TRACE TRANSACTIONID

otherwise, if you're sure that a field of the lookup (e.g. PAN) has always values, you could also use a simpler solution:

index=your_index | lookup your_lookup.csv HASHPAN OUTPUTNEW PAN total_trans total_amount | where PAN=* | table PAN total_trans total_amount HASHPAN _time TRACE TRANSACTIONID

Ciao.

Giuseppe