Splunk Search

Dedup / merge multiple events relating to one transaction

alexandermunce
Communicator

Hi,

Our system logs events in a bizarre way in which multiple lines of data will all relate to a single transaction, however each line will have a different attribute.

Current Relationship: Multiple events TO One transaction_id
Ideal relationship: One event TO One transaction_id

This also means that must data is duplicated - I want to merge these events all into one relating to the one unique transaction_id.

Is it possible that we upload this raw data each month and then format it correctly into single events automatically.

Example as follows:

transaction_id          status     date        type       attribute    value
10001                   complete   10/10/17    request    name         Jenny
10001                   complete   10/10/17    request    company      Ford
10001                   complete   10/10/17    request    reference    564682
10001                   complete   10/10/17    request    amount       $12,345

I would instead like it formatted as follows:

transaction_id          status     date        type       name   company   reference   amount
10001                   complete   10/10/17    request    Jenny  Ford      564682      $12,345

Now all events have the same number of related attributes.

Thanks in advance!

0 Karma

DalJeanis
Legend

Okay, so you have key-value pairs where these three together make the key - transaction_id status date type.

This code will create fields named after the attributes, set their values, and then roll the data together into a single event

| eval {attribute} = value
| fields - attribute value
| stats values(*) as * by transaction_id status date type 
| eval _time = strptime(date,"%m/%d/%y")

You might want to consider putting the csv into a temporary index each month, then using collect after the above lines to write the combined transactions to the permanent index.

0 Karma

HiroshiSatoh
Champion

There may be better search sentences・・・

(your search) | dedup transaction_id,status,date,type
| join transaction_id [search (your search) | chart first(value) by transaction_id,attribut]
| table transaction_id,status,date,type,name,company,reference,amount

0 Karma

alexandermunce
Communicator

Could I perhaps use:

... | transaction transaction_id

Or is the transaction command only for combining different fields of the same value (in this case it is the same field)

0 Karma

niketn
Legend

@alexandermunce, you can try the following, this will not create separate new fields for specific attributes and values but will create single row for each transaction id.

| eval key=attribute."=\"".value."\""
| stats values(key) as key by transaction_id status date type

If multiple status, date and/or type are possible for each transaction_id then you would need to use list() or values() for each of such fields instead of using these in by clause.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to November Tech Talks, Office Hours, and Webinars!

🍂 Fall into November with a fresh lineup of Community Office Hours, Tech Talks, and Webinars we’ve ...

Transform your security operations with Splunk Enterprise Security

Hi Splunk Community, Splunk Platform has set a great foundation for your security operations. With the ...

Splunk Admins and App Developers | Earn a $35 gift card!

Splunk, in collaboration with ESG (Enterprise Strategy Group) by TechTarget, is excited to announce a ...