Splunk Search

Dedup / merge multiple events relating to one transaction

alexandermunce
Communicator

Hi,

Our system logs events in a bizarre way in which multiple lines of data will all relate to a single transaction, however each line will have a different attribute.

Current Relationship: Multiple events TO One transaction_id
Ideal relationship: One event TO One transaction_id

This also means that must data is duplicated - I want to merge these events all into one relating to the one unique transaction_id.

Is it possible that we upload this raw data each month and then format it correctly into single events automatically.

Example as follows:

transaction_id          status     date        type       attribute    value
10001                   complete   10/10/17    request    name         Jenny
10001                   complete   10/10/17    request    company      Ford
10001                   complete   10/10/17    request    reference    564682
10001                   complete   10/10/17    request    amount       $12,345

I would instead like it formatted as follows:

transaction_id          status     date        type       name   company   reference   amount
10001                   complete   10/10/17    request    Jenny  Ford      564682      $12,345

Now all events have the same number of related attributes.

Thanks in advance!

0 Karma

DalJeanis
Legend

Okay, so you have key-value pairs where these three together make the key - transaction_id status date type.

This code will create fields named after the attributes, set their values, and then roll the data together into a single event

| eval {attribute} = value
| fields - attribute value
| stats values(*) as * by transaction_id status date type 
| eval _time = strptime(date,"%m/%d/%y")

You might want to consider putting the csv into a temporary index each month, then using collect after the above lines to write the combined transactions to the permanent index.

0 Karma

HiroshiSatoh
Champion

There may be better search sentences・・・

(your search) | dedup transaction_id,status,date,type
| join transaction_id [search (your search) | chart first(value) by transaction_id,attribut]
| table transaction_id,status,date,type,name,company,reference,amount

0 Karma

alexandermunce
Communicator

Could I perhaps use:

... | transaction transaction_id

Or is the transaction command only for combining different fields of the same value (in this case it is the same field)

0 Karma

niketn
Legend

@alexandermunce, you can try the following, this will not create separate new fields for specific attributes and values but will create single row for each transaction id.

| eval key=attribute."=\"".value."\""
| stats values(key) as key by transaction_id status date type

If multiple status, date and/or type are possible for each transaction_id then you would need to use list() or values() for each of such fields instead of using these in by clause.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...