Splunk Search

Regex to match string followed by varying formats (multi-line event)

Communicator

I am working with data from a database which produces information on transactions.

The problem is that transactions can have any number of related attributes, and transaction details will be replicated with a new line for each attribute.

In the format of:

[Transaction ID] [tab] [Attribute name] [tab] [Attribute value] [tab] [date]

Example:

11111    Amount    12000
11111    Reference    101010
11111    Operator    John
11111    Subject    Credit
11111    Notes    XXXXXXXX
11112    Amount    75000
11112    Reference    202020
11112    Operator    Will

I am trying to identify a REGEX expression for EACH attribute which will match on the following logic;

"Amount" - followed by TAB - followed by variable length NUMBER - followed by TAB

"Reference" - followed by TAB - followed by variable length NUMBER - followed by TAB

"Operator" - followed by TAB - followed by variable length STRING - followed by TAB

"Subject" - followed by TAB - followed by variable length STRING- followed by TAB

"Notes" - followed by TAB - followed by variable length STRING- followed by TAB

Currently when I load this data into Splunk I am using auto event break so I have multiple events related to each "Request ID" and I can extract the Attribute and Value however this is obviously not ideal as I would like to extract the attribute itself, i.e.; Amount, Reference, Operator, etc.

Thanks!

0 Karma

Esteemed Legend

Like this:

| makeresults 
| eval raw="11111    Amount    12000::11111    Reference    101010::11111    Operator    John::11111    Subject    Credit::11111    Notes    XXXXXXXX::11112    Amount    75000::11112    Reference    202020::11112    Operator    Will" 
| makemv raw delim="::" 
| mvexpand raw 
| rename raw AS _raw

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| rex "^\d+\s+(?<key>\S+)\s+(?<value>\S+)" 
| eval {key}=value 
| fields - key value
0 Karma

Legend

Hi alexandermunce,
let me better understand: you extracted these data from a Db, they aren't events with a timestamp, and you want to extract fields to diplay them in a report or a dashboard, is it correct?
If this is your situation you can follow two ways:

  • collect all records in one event (LINEMERGE=true)
  • have an event for each record (LINEMERGE=false)

In the second case you can aggregate events using the transaction command.
Anyway, if you have only one event you can extract fields with one regex:

(?ms)(?<Transaction_ID>\d+)\s+Amount\s+(?<Amount>\d+)\s+\d+\s+Reference\s+(?<Reference>\d+)\s+\d+\s+Operator\s+(?<Operator>\w+)\s+\d+\s+Subject\s+(?<Subject>\w+)\s+\d+\s+Notes\s+(?<Notes>\w+)

If otherwise you have each record in one event, you have to create a regex for for each field, using the same regexes

^(?<Transaction_ID>\d+)\s+Amount\s+(?<Amount>\d+)
^(?<Transaction_ID>\d+)\s+Reference\s+(?<Reference>\d+)
^(?<Transaction_ID>\d+)\s+Operator\s+(?<Operator>\w+)
^(?<Transaction_ID>\d+)\s+Subject\s+(?<Subject>\w+)
^(?<Transaction_ID>\d+)\s+Notes\s+(?<Notes>\w+)

Bye.
Giuseppe

0 Karma