Getting Data In

How to monitor a multi line log with a variable number of field value pairs?

arrowecssupport
Communicator

We monitor the log output of many file storage systems, some devices have only a few, others have hundreds, but there is no way of knowing how many disks each log file will contain.
The issue (in the real world) is that the customer has 2 non compatible drives; the 750gb HDD part code HRF750.
We want to be able to extract on the full line 750gb HDD partnumber: HRF750 s/n: 31564847877 from the log where ever we find the part code HRF750. We can then put this in a table or report, allowing us to find systems running on compatible hardware.

How do I go about doing this?

Below is an example of what a log file looks like.

Array model: RX-100
250gb SSD partnumber: XFA250 s/n: 12313123123
250gb SSD partnumber: XFA250 s/n: 56498787521
250gb SSD partnumber: XFA250 s/n: 95195195198
250gb SSD partnumber: XFA250 s/n: 51515151511
250gb SSD partnumber: XFA250 s/n: 95959595959
750gb HDD partnumber: HRF750 s/n: 31564847877
750gb HDD partnumber: HRF750 s/n: 89765432145
0 Karma

aaraneta_splunk
Splunk Employee
Splunk Employee

@arrowecssupport - Did one of the answers below help provide a solution your question? If yes, please click “Accept” below the best answer to resolve this post and upvote anything that was helpful. If no, please leave a comment with more feedback. Thanks.

0 Karma

woodcock
Esteemed Legend

Maybe this:

... | rex "^Array model\s*:\s*(?<arraymodel>[^\r\n\s]+)"
| rex max_match=0 "(?ms)(?<diskdetail>[^\r\n]+)"
| mvexpand diskdetail
| rex field=diskdetail "^(?<size>\S+)[^:]+:\s+(?<partnumber>\S+)[^:]+:\s+(?<serialnumber>.*)$"
| fields - diskdetail

Now you can add whatever logic that you would like to find mismatches.

0 Karma

gvmorley
Contributor

Hi,

It would be good to know if you're indexing these logs line by line, or as one long event?

Assuming that you've just pulled them in as one event (since you mention multi-line in the title), you can still use the rex command to extract the info you want.

What might be tripping you up is that by default rex only returns the first match. But if you set it to max_match=0 then it will do multiple matches.

So maybe something like this:

| rex max_match=0 "(?m)partnumber:\s(?<part_serial>[^\s]+\ss/n:\s[\w\d]+)"
| rex "Array\smodel:\s(?<array_model>[\w\d-]+)"
| mvexpand part_serial
| table array_model part_serial
| where match(part_serial,"HRF750")

This would return a table which looks like:

alt text

That will hopefully get you started. If the logs have something like the customer or system name in them, you could include that too.

0 Karma
Get Updates on the Splunk Community!

New in Observability - Improvements to Custom Metrics SLOs, Log Observer Connect & ...

The latest enhancements to the Splunk observability portfolio deliver improved SLO management accuracy, better ...

Improve Data Pipelines Using Splunk Data Management

  Register Now   This Tech Talk will explore the pipeline management offerings Edge Processor and Ingest ...

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud?

Register Join this Tech Talk to learn how unique features like Service Centric Views, Tag Spotlight, and ...