Splunk Search
Highlighted

Field extraction

Engager

I'm trying to extract fields from this event using regular expressions,

Multiple times I receive the following error.. I tried to avoid it by creating fields out of every part of the event

"The extraction failed. If you are extracting multiple fields, try removing one or more fields. Start with extractions that are embedded within longer text strings."

The fields I'd like to extract are:
FIRST ITEM (and every other item that goes after it)
FIRST ITEM AMOUNT ( The number that goes before first item)
GRAND TOTAL
LASTNAME
FIRSTNAME
The 8 digit number that goes before LASTNAME

  "receipt-text":"  COMPANY NAME INTERNATIONAL\n      Branch name goes here\n            App name\n\n         1/16/2020 22:30  \nServer:123  App Name          \nGuests:0      \n_________________________________\n          appName User\n      1 First Item     5.00\n      2 Second Item                 7.00\n        @ 3.50\n\n       \n      Gross Sales     12.00\n         Discount      0.00\n         Subtotal     12.00\n              Tax      0.00\n   Service Charge      2.16   \n  Delivery Charge      1.95  \n   Additional Tip      0.00\n            Total     16.11\n\n\nUser                       16.11\n  12345678 LASTNAME, FIRSTNAME\n\n       GRAND TOTAL         16.11\n\n______________________________________\nT1234 C300      1/16/2020 22:30\n______________________________________      \n\n \n\nAdditional Tip    $___________________\n\n\n\nTotal             $___________________\n\n\n\n\n______________________________________\nSignature\n         ***************\n   A 18% service charge is included\n  in your bill, however, please use\n   the Additional Tip line to\n  recognize outstanding service.\n         ***************\nCheck No.: 12345678\n@##12345678@"

Any help you can deliver to point me into the right direction, I'd be really happy.
I hope I'm soon able to help other people aswell
If you've come this far, thanks a lot for reading my issue!

0 Karma
Highlighted

Re: Field extraction

Builder

If the field extraction in the GUi didn't work for you, did you try six different regular expressions using the rex command? If so, what exactly were your expressions?

###

If this reply helps you, an upvote would be appreciated.
0 Karma
Highlighted

Re: Field extraction

Ultra Champion

UPDATED: add time and check_no

your_search
| eval raw=replace(_raw,"\\\n","
 ") 
| makemv delim="
 " raw 
| streamstats count as session 
| mvexpand raw 
| rename raw as _raw
| rex "^\s+(?<time>\d+.\d+.\d{4}\s\d+\:\d+)"
| rex "(?:\d+ )(?<items>[\w ]+)\s+(?<amounts>\d+\.\d{2})"
| rex "GRAND TOTAL\s+(?<ground_total>\d+\.\d{2})" 
| rex "\s+(?<user_id>\d{8})\s+(?<last_name>\w+), (?<first_name>\w+)" 
| rex "Check No\.:\s+(?<check_no>\d{8})" 
| stats list(*) as * by session 
| eval time=strptime(time,"%m/%d/%Y %H:%M")
| fieldformat time=strftime(time,"%m/%d/%Y %H:%M")
| table session time user_id first_name last_name items amounts ground_total check_no

Previous my answer:

| makeresults
| eval _raw="   \"receipt-text\":\"  COMPANY NAME INTERNATIONAL\n      Branch name goes here\n            App name\n\n         1/16/2020 22:30  \nServer:123  App Name          \nGuests:0      \n_________________________________\n          appName User\n      1 First Item     5.00\n      2 Second Item                 7.00\n        @ 3.50\n\n       \n      Gross Sales     12.00\n         Discount      0.00\n         Subtotal     12.00\n              Tax      0.00\n   Service Charge      2.16   \n  Delivery Charge      1.95  \n   Additional Tip      0.00\n            Total     16.11\n\n\nUser                       16.11\n  12345678 LASTNAME, FIRSTNAME\n\n       GRAND TOTAL         16.11\n\n______________________________________\nT1234 C300      1/16/2020 22:30\n______________________________________      \n\n \n\nAdditional Tip    $___________________\n\n\n\nTotal             $___________________\n\n\n\n\n______________________________________\nSignature\n         ***************\n   A 18% service charge is included\n  in your bill, however, please use\n   the Additional Tip line to\n  recognize outstanding service.\n         ***************\nCheck No.: 12345678\n@##12345678@"
| eval raw=replace(_raw,"\\\n","
")
| makemv delim="
" raw
| streamstats count as session
| mvexpand raw
| rename raw as _raw
| rex "(?:\d+ )(?<items>[\w ]+)\s+(?<amounts>\d+\.\d{2})"
| rex "GRAND TOTAL\s+(?<ground_total>\d+\.\d{2})"
| rex "\s+(?<user_id>\d{8})\s+(?<last_name>\w+), (?<first_name>\w+)"
| stats list(*) as * by session
| table session user_id first_name last_name items amounts ground_total

Hi, @maxitroncoso
items and amounts are one-on-one.
As I don't know how you display these, I create like above.

View solution in original post

0 Karma
Highlighted

Re: Field extraction

Engager

@to4kawa thanks a lot for your swift answer, I took what you wrote and it worked as intended!

I tried the following expression in order to add a date and time column to the table, but whenever I use it, instead of one date and time I get a lot per event

| eval indextime= strftime(_indextime,"%Y-%m-%d %H:%M:%S")

I also figured out that I would need to extract the number after Check No.
Tried to do so with the following expression:

| rex "nCheck No.:\s+(?\d{8})"

but i'm getting empty results on that column, any idea of why that may be?

0 Karma
Highlighted

Re: Field extraction

Ultra Champion

As my query runs, \n is remove.

| rex "Check No\.:\s+(?<check_no>\d{8})"

try this.

Highlighted

Re: Field extraction

Ultra Champion

hi, @maxitroncoso
I update my answer, add time and check_no. please confirm.

0 Karma
Highlighted

Re: Field extraction

Engager

Checknumber works as intended (I replaced it for order_number) but time is empty

| eval time=strptime(time,"%m/%d/%Y %H:%M")
| fieldformat time=strftime(time,"%m/%d/%Y %H:%M")
| table session time userid firstname lastname items amounts groundtotal check_no

0 Karma
Highlighted

Re: Field extraction

Ultra Champion

| rex "^\s+(?<time>\d+.\d+.\d{4}\s\d+\:\d+)"

This rex may not be working in your actual log.
Since there were actually two time displayed, this regular expression was used to exclude one.

| rex "(?<time>\d+.\d+.\d{4}\s\d+\:\d+)"
and
| eval time=strptime(mvdedup(time),"%m/%d/%Y %H:%M")
This might be better.

Run it line by line and check rex result, if time is extracted.

0 Karma
Highlighted

Re: Field extraction

Engager

| eval time=strptime(mvdedup(time),"%m/%d/%Y %H:%M")
| rex "(?\d+.\d+.\d{4}\s\d+:\d+)"

in this order it worked as expected, after the other rex expressions and before the stats, so it ended up like this

yoursearch
| eval raw=replace(
raw,"\\n","
")
| makemv delim="
" raw
| streamstats count as session
| mvexpand raw
| rename raw as raw
| rex "^\s+(?\d+.\d+.\d{4}\s\d+:\d+)"
| rex "(?:\d+ )(?[\w ]+)\s+(?\d+.\d{2})"
| rex "GRAND TOTAL\s+(?\d+.\d{2})"
| rex "\s+(?\d{8})\s+(?\w+), (?\w+)"
| rex "Check No.:\s+(?\d{8})"
| eval time=strptime(mvdedup(time),"%m/%d/%Y %H:%M")
| rex "(?\d+.\d+.\d{4}\s\d+:\d+)"
| stats list(*) as * by session
| table session time user
id firstname lastname items amounts groundtotal checkno

you've been amazingly helpful, thanks a lot!

0 Karma
Highlighted

Re: Field extraction

Ultra Champion

you're welcome and happy splunking.

0 Karma