Splunk Search

Regex to match string followed by varying formats (multi-line event)

Communicator

I am working with data from a database which produces information on transactions.

The problem is that transactions can have any number of related attributes, and transaction details will be replicated with a new line for each attribute.

In the format of:

[Transaction ID] [tab] [Attribute name] [tab] [Attribute value] [tab] [date]

Example:

11111    Amount    12000
11111    Reference    101010
11111    Operator    John
11111    Subject    Credit
11111    Notes    XXXXXXXX
11112    Amount    75000
11112    Reference    202020
11112    Operator    Will

I am trying to identify a REGEX expression for EACH attribute which will match on the following logic;

"Amount" - followed by TAB - followed by variable length NUMBER - followed by TAB

"Reference" - followed by TAB - followed by variable length NUMBER - followed by TAB

"Operator" - followed by TAB - followed by variable length STRING - followed by TAB

"Subject" - followed by TAB - followed by variable length STRING- followed by TAB

"Notes" - followed by TAB - followed by variable length STRING- followed by TAB

Currently when I load this data into Splunk I am using auto event break so I have multiple events related to each "Request ID" and I can extract the Attribute and Value however this is obviously not ideal as I would like to extract the attribute itself, i.e.; Amount, Reference, Operator, etc.

Thanks!

0 Karma

Esteemed Legend

Like this:

| makeresults 
| eval raw="11111    Amount    12000::11111    Reference    101010::11111    Operator    John::11111    Subject    Credit::11111    Notes    XXXXXXXX::11112    Amount    75000::11112    Reference    202020::11112    Operator    Will" 
| makemv raw delim="::" 
| mvexpand raw 
| rename raw AS _raw

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| rex "^\d+\s+(?<key>\S+)\s+(?<value>\S+)" 
| eval {key}=value 
| fields - key value
0 Karma

Legend

Hi alexandermunce,
let me better understand: you extracted these data from a Db, they aren't events with a timestamp, and you want to extract fields to diplay them in a report or a dashboard, is it correct?
If this is your situation you can follow two ways:

  • collect all records in one event (LINEMERGE=true)
  • have an event for each record (LINEMERGE=false)

In the second case you can aggregate events using the transaction command.
Anyway, if you have only one event you can extract fields with one regex:

(?ms)(?<Transaction_ID>\d+)\s+Amount\s+(?<Amount>\d+)\s+\d+\s+Reference\s+(?<Reference>\d+)\s+\d+\s+Operator\s+(?<Operator>\w+)\s+\d+\s+Subject\s+(?<Subject>\w+)\s+\d+\s+Notes\s+(?<Notes>\w+)

If otherwise you have each record in one event, you have to create a regex for for each field, using the same regexes

^(?<Transaction_ID>\d+)\s+Amount\s+(?<Amount>\d+)
^(?<Transaction_ID>\d+)\s+Reference\s+(?<Reference>\d+)
^(?<Transaction_ID>\d+)\s+Operator\s+(?<Operator>\w+)
^(?<Transaction_ID>\d+)\s+Subject\s+(?<Subject>\w+)
^(?<Transaction_ID>\d+)\s+Notes\s+(?<Notes>\w+)

Bye.
Giuseppe

0 Karma
State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!