How to monitor a multi-line log with a variable number of field-value pairs?

arrowecssupport
Communicator

We monitor the log output of many file storage systems. Some devices have only a few disks and others have hundreds, and there is no way of knowing in advance how many disks each log file will contain.
The real-world issue is that the customer has two incompatible drives: the 750gb HDDs with part code HRF750.
We want to extract the full line, for example 750gb HDD partnumber: HRF750 s/n: 31564847877, wherever the part code HRF750 appears in the log. We can then put the results in a table or report to find the systems running incompatible hardware.

How do I go about doing this?

Below is an example of what a log file looks like.

Array model: RX-100
250gb SSD partnumber: XFA250 s/n: 12313123123
250gb SSD partnumber: XFA250 s/n: 56498787521
250gb SSD partnumber: XFA250 s/n: 95195195198
250gb SSD partnumber: XFA250 s/n: 51515151511
250gb SSD partnumber: XFA250 s/n: 95959595959
750gb HDD partnumber: HRF750 s/n: 31564847877
750gb HDD partnumber: HRF750 s/n: 89765432145

woodcock
Esteemed Legend

Maybe this:

... | rex "^Array model\s*:\s*(?<arraymodel>[^\r\n\s]+)"
| rex max_match=0 "(?ms)(?<diskdetail>[^\r\n]+)"
| mvexpand diskdetail
| rex field=diskdetail "^(?<size>\S+)[^:]+:\s+(?<partnumber>\S+)[^:]+:\s+(?<serialnumber>.*)$"
| fields - diskdetail

Now you can add whatever logic you like to find mismatches.
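
As one possible follow-on, here is a rough sketch that lists which systems contain the HRF750 part from the question. It assumes the partnumber, serialnumber and arraymodel fields extracted above, and it groups by the default host field on the assumption that each storage system logs under its own host. The names hrf750_disks and serials are made up for illustration.

```spl
... | search partnumber="HRF750"
| stats count AS hrf750_disks values(serialnumber) AS serials BY host arraymodel
```

If host does not identify a system in your environment, swap in whichever field does.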

gvmorley
Contributor

Hi,

It would be good to know whether you're indexing these logs line by line or as one long event.

Assuming that you've just pulled them in as one event (since you mention multi-line in the title), you can still use the rex command to extract the info you want.

What might be tripping you up is that by default rex returns only the first match (max_match=1). If you set max_match=0, it returns every match as a multivalue field.
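
To see the difference without touching your index, here is a minimal sketch you can paste into the search bar. It uses makeresults to build one fake event from three lines of the sample log. The field names first_only, all_parts and match_count are made up for illustration.

```spl
| makeresults
| eval _raw="Array model: RX-100
250gb SSD partnumber: XFA250 s/n: 12313123123
750gb HDD partnumber: HRF750 s/n: 31564847877"
| rex "partnumber:\s(?<first_only>\S+)"
| rex max_match=0 "partnumber:\s(?<all_parts>\S+)"
| eval match_count=mvcount(all_parts)
| table first_only all_parts match_count
```

With the default setting, first_only should hold only the first part number in the event, while all_parts should be a multivalue field with one value per disk line.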

So maybe something like this:

| rex max_match=0 "(?m)partnumber:\s(?<part_serial>[^\s]+\ss/n:\s[\w\d]+)"
| rex "Array\smodel:\s(?<array_model>[\w\d-]+)"
| mvexpand part_serial
| table array_model part_serial
| where match(part_serial,"HRF750")

This would return a table which looks like:

array_model   part_serial
RX-100        HRF750 s/n: 31564847877
RX-100        HRF750 s/n: 89765432145

That will hopefully get you started. If the logs have something like the customer or system name in them, you could include that too.
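
If you need the whole disk line rather than just the part number and serial, a variation along these lines might work. It is only a sketch: it assumes each disk sits on its own line within a single event, that host identifies the system, and disk_line is a made-up field name.

```spl
| rex max_match=0 "(?m)^(?<disk_line>[^\r\n]*partnumber:\s+HRF750\b[^\r\n]*)"
| rex "Array\smodel:\s(?<array_model>[\w-]+)"
| where isnotnull(disk_line)
| mvexpand disk_line
| table host array_model disk_line
```

Because the part code is matched inside the first rex, events with no HRF750 disks get no disk_line and are dropped by the where clause, so no separate match() filter is needed at the end.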
