Getting Data In

Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

Engager

I have a log file that is formatted like:

2019-06-06 11:10:09,029  some event
2019-06-06 11:10:10,029 ............  - Enqueuing interaction, PayCommand
TransactionInfo=[Command=Pay, TransactionId=9081161e-41d8-46ae-953b-df659c038da2
            CmdInfo=[TerminalId=1, OriginalTerminal=|null|, TableId=1048589, CheckId=1048589, CustomCommand=|null|, ScreenType=NotSet]
            PaymentInfo=[PaymentId=1048590, .............]
2019-06-06 11:12:12,00  next event

TimeStamp data
Data -- no timestamp
Data -- no timestamp
Data -- no timestamp
(Next) TimeStamp

How do I set my source type correctly so that I can extract the TransactionId GUID and the PaymentId?
Very confused here... thanks for any help!


Re: Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

Legend

Hi dowdag,
let me make sure I understand: your event starts with a timestamp, and the following rows belong to the same event, correct?
In other words, your event is:

2019-06-06 11:10:10,029 ............  - Enqueuing interaction, PayCommand
 TransactionInfo=[Command=Pay, TransactionId=9081161e-41d8-46ae-953b-df659c038da2
             CmdInfo=[TerminalId=1, OriginalTerminal=|null|, TableId=1048589, CheckId=1048589, CustomCommand=|null|, ScreenType=NotSet]
             PaymentInfo=[PaymentId=1048590, .............]

Correct?
This is a very standard log, so you shouldn't have any problems ingesting it. Anyway, try something like this in props.conf:

[my_sourcetype]
SOULD_LINEMERGE = True
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N

TIME_PREFIX and TIME_FORMAT aren't mandatory, but I prefer to use them.

To extract TransactionId and PaymentId at search time, you can use the rex command:

| rex "TransactionId\=(?P<TransactionId>[^ ]*).*\s+.*PaymentId\=(?P<PaymentId>[^,]*)"

or put this regex in a field extraction.
You can test it at https://regex101.com/r/IlOFp2/1
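
For example, a complete search (the index and sourcetype names here are only placeholders for your own) could look like this:

index=my_index sourcetype=my_sourcetype "Enqueuing interaction"
| rex "TransactionId\=(?P<TransactionId>[^ ]*).*\s+.*PaymentId\=(?P<PaymentId>[^,]*)"
| table _time TransactionId PaymentId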

Bye.
Giuseppe


Re: Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

SplunkTrust

You'll also need this line in your props.conf:

BREAK_ONLY_BEFORE_DATE = true

BTW, SOULD_LINEMERGE should be SHOULD_LINEMERGE.
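
Putting the two together, the stanza would look something like this (the sourcetype name is just an example):

[my_sourcetype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N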

---
If this reply helps you, an upvote would be appreciated.

Re: Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

Builder

Hi @dowdag,
You'll first need to configure your sourcetype to break events properly.
You should be using the following in your props.conf:


[<Your sourcetypename>]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d\d\d\d-\d\d-\d\d
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N

This will ensure that your event breaks only happen after a line break that is followed by a XXXX-XX-XX year-month-day value.
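
With those settings, your sample data would break into three events, roughly like this:

2019-06-06 11:10:09,029  some event

2019-06-06 11:10:10,029 ............  - Enqueuing interaction, PayCommand
TransactionInfo=[Command=Pay, TransactionId=9081161e-41d8-46ae-953b-df659c038da2
            CmdInfo=[TerminalId=1, OriginalTerminal=|null|, TableId=1048589, CheckId=1048589, CustomCommand=|null|, ScreenType=NotSet]
            PaymentInfo=[PaymentId=1048590, .............]

2019-06-06 11:12:12,00  next event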
Once you have that fixed, you'll be able to extract the field values for the data you're looking for by using a simple extract in props.conf for your sourcetype:

[<Your sourcetypename>]
EXTRACT-transactionandpayment_info = TransactionId=(?<TransactionId>[a-fA-F0-9-]+)[\S\s]+PaymentId=(?<PaymentId>[^,]+),

That said, if KV_MODE is set to auto, your fields might be extracted properly anyway once you fix the line-breaking issue.
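
KV_MODE is also a search-time setting in props.conf, so if you want to be explicit about relying on automatic key=value extraction you could add it to the same stanza (the sourcetype name is a placeholder):

[<Your sourcetypename>]
KV_MODE = auto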


Re: Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

Engager

re: MultiLine field extract not working...

Thanks for the suggestion -- I do not want to edit props.conf; I just want to affect one of my source types.
No matter what I try, I cannot get multi-line field extraction to work with Splunk Free. I am reading through the documentation and not understanding how this feature works....

https://docs.splunk.com/Documentation/Splunk/7.3.0/Data/Configureeventlinebreaking

Re: Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

Builder

That's because the document you're referring to is telling you to make changes to props.conf, like I suggested.

You're not talking about multi-line field extraction. You're talking about multi-event field extraction, because your events are not created properly at index time. In order to NOT change the event breaking AND extract fields across multiple events, you would have to group them in the same transaction somehow, and then extract the fields from that transaction.
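
If you really wanted to go down that road, a rough sketch (assuming every logical event begins with a timestamped line; the sourcetype name is a placeholder) would be something like:

sourcetype=your_sourcetype
| transaction startswith=eval(match(_raw, "^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"))
| rex "TransactionId=(?<TransactionId>[a-fA-F0-9-]+)[\S\s]+PaymentId=(?<PaymentId>[^,]+),"
| table _time TransactionId PaymentId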

I don't recommend doing it this way, because it's trying to get around the fact that the events are not properly created in the first place.


Re: Setting Source Type for Log file with some Multi Line data in between lines with TimeStamp

Esteemed Legend

Use this on your HF or Indexer tier inside props.conf:

[my_sourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
EXTRACT-my_sourcetype_ids = (?ms)TransactionId=(?<TransactionId>\S+).*?PaymentId=(?<PaymentId>\d+)

If you are doing a sourcetype override/overwrite, then USE THE ORIGINAL SOURCETYPE, deploy this to the first full instance of Splunk that handles the events (usually the HF or indexer tier), restart all Splunk instances there, and send in NEW events (old events will stay broken forever). Make sure that your test search is seeing only new events by adding _index_earliest=-5m to your search.
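
For example, a quick validation search (the index and sourcetype names are placeholders) could be:

index=my_index sourcetype=my_sourcetype _index_earliest=-5m
| table _time TransactionId PaymentId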
