Splunk Search

Dedup / merge multiple events relating to one transaction



Our system logs events in a bizarre way in which multiple lines of data will all relate to a single transaction, however each line will have a different attribute.

Current Relationship: Multiple events TO One transaction_id
Ideal relationship: One event TO One transaction_id

This also means that must data is duplicated - I want to merge these events all into one relating to the one unique transaction_id.

Is it possible that we upload this raw data each month and then format it correctly into single events automatically.

Example as follows:

transaction_id          status     date        type       attribute    value
10001                   complete   10/10/17    request    name         Jenny
10001                   complete   10/10/17    request    company      Ford
10001                   complete   10/10/17    request    reference    564682
10001                   complete   10/10/17    request    amount       $12,345

I would instead like it formatted as follows:

transaction_id          status     date        type       name   company   reference   amount
10001                   complete   10/10/17    request    Jenny  Ford      564682      $12,345

Now all events have the same number of related attributes.

Thanks in advance!

0 Karma


Okay, so you have key-value pairs where these three together make the key - transaction_id status date type.

This code will create fields named after the attributes, set their values, and then roll the data together into a single event

| eval {attribute} = value
| fields - attribute value
| stats values(*) as * by transaction_id status date type 
| eval _time = strptime(date,"%m/%d/%y")

You might want to consider putting the csv into a temporary index each month, then using collect after the above lines to write the combined transactions to the permanent index.

0 Karma


There may be better search sentences・・・

(your search) | dedup transaction_id,status,date,type
| join transaction_id [search (your search) | chart first(value) by transaction_id,attribut]
| table transaction_id,status,date,type,name,company,reference,amount

0 Karma


Could I perhaps use:

... | transaction transaction_id

Or is the transaction command only for combining different fields of the same value (in this case it is the same field)

0 Karma


@alexandermunce, you can try the following, this will not create separate new fields for specific attributes and values but will create single row for each transaction id.

| eval key=attribute."=\"".value."\""
| stats values(key) as key by transaction_id status date type

If multiple status, date and/or type are possible for each transaction_id then you would need to use list() or values() for each of such fields instead of using these in by clause.

| makeresults | eval message= "Happy Splunking!!!"
0 Karma
Get Updates on the Splunk Community!

SplunkTrust | Where Are They Now - Michael Uschmann

The Background Five years ago, Splunk published several videos showcasing members of the SplunkTrust to share ...

Admin Your Splunk Cloud, Your Way

Join us to maximize different techniques to best tune Splunk Cloud. In this Tech Enablement, you will get ...

Cloud Platform | Discontinuing support for TLS version 1.0 and 1.1

Overview Transport Layer Security (TLS) is a security communications protocol that lets two computers, ...