How to take sample events in an index from a CSV f...

phamxuantung · ‎12-19-2022

Hello,

I have a csv file that have some summary stats from an index, but the requirement is to show an sample event with all the info in that index.

The CSV file have a hash number (from a account number), some calculated status. For example:

PAN	total_trans	total_amount	HASHPAN
1234******5678	15	15000	ABC123

The index have all the transaction and detail. We need to take an sample in the index from the csv and output to a new csv that have sample detail. Something like this:

PAN	total_trans	total_amount	HASHPAN	_time	TRACE	TRANSACTIONID
1234******5678	15	15000	ABC123	xxxxxxx	xxxxxx	xxxxxxxxx

I don't really like to use join, because the index have a lot of events (around 32 mils events). Are there any elegants way to get the data?

Manasa_401 · ‎12-19-2022

Hello @phamxuantung

If I understand well, your requirement is to join data from index and csv lookup without using Join command. If yes then try the below query

index=test_index
| lookup transactions.csv common_field OUTPUT total_amount total_trans

In the output you can specify all the required fields from the lookup.

Thanks,
Manasa

gcusello · ‎12-19-2022

Hi @phamxuantung,

if you have the csv in a lookup, you can use the lookup command, something like this:

index=your_index
| lookup your_lookup.csv PAN
| table PAN total_trans total_amount HASHPAN _time TRACE TRANSACTIONID

if instead the csv is in another index, in general you have to find a unique Correlation Key (e.g.: PAN or HASHPAN) and use it to correlate data using the stats command, something like this:

index=index1 OR index=index2
| stats 
   values(total_trans) AS total_trans
   values(total_amount) AS total_amount
   values(HASHPAN) AS HASHPAN
   earliest(_time) AS _time
   values(TRACE) AS TRACE
   values(TRANSACTIONID) AS TRANSACTIONID
   BY PAN

Ciao.

Giuseppe

phamxuantung · ‎12-19-2022

Accroding to my understanding, if you specify the index first and |lookup the csv, the addition information will populate on all the events in the index (in my case, all the events that have the same HASHPAN), is it not?

But I want the opposite. The csv hold, for example, 50 uniques HASHPAN, and I want to take sample events of that HASHPAN in the index (first 50 or last 50 events that have that HASHPAN, base on which is faster).

My code for my intergration is

|inputlookup mycsv.csv
|join HASHPAN 
[search index=myindex
|dedup HASHPAN
]
|table <detail fields that I need>

looking at this, I think you'll understand what I mean, but I want a more "efficient" waty to doing this.

gcusello · ‎12-19-2022

Hi @phamxuantung,

in other words, you want to filter (find) events in the index that match the values of the csv, if this is your need, please try this:

index=myindex [ | inputlookup mycsv.csv | fields HASHPAN ]
| lookup your_lookup.csv HASHPAN 
| table PAN total_trans total_amount HASHPAN _time TRACE TRANSACTIONID

otherwise, if you're sure that a field of the lookup (e.g. PAN) has always values, you could also use a simpler solution:

index=your_index
| lookup your_lookup.csv HASHPAN OUTPUTNEW PAN total_trans total_amount
| where PAN=*
| table PAN total_trans total_amount HASHPAN _time TRACE TRANSACTIONID

Ciao.

Giuseppe

How to take sample events in an index from a CSV file?

fields

join

lookup

stats

table

Observe and Secure All Apps with Splunk

Splunk Decoded: Business Transactions vs Business IQ

Fastest way to demo Observability

Are you a member of the Splunk Community?

How to take sample events in an index from a CSV file?