I have the following output:
DEV#:   0  DEVICE NAME: vpath0  TYPE: 2107900  POLICY: Optimized
SERIAL: 123bac
=======================================================================
Path#      Adapter/Hard Disk   State   Mode    Select  Errors
    0      fscsi0/hidsk22      Open    NORMAL  123456  0
    1      fscsi0/hidsk29      Open    NORMAL  456789  0
I would like to extract four fields from this: the "path" numbers (in this case 0 and 1), which should be named path0 and path1, and the "select" values (in this case 123456 and 456789), which should be named select0 and select1.
I can't figure out how to get a regex to separate two lines and create the field extraction for me.
My ultimate goal is to be able to compare the two select fields against a common vpath (DEVICE NAME).
Is this possible?
Try this regex:
Note, I'm assuming that there's not going to be more than 2 paths specified.
We could also capture more information if necessary...
Hmm... Didn't seem to work. It didn't error, but I don't see the fields extracted.
So are you basically treating the last two lines of the output as one big string?
(And, no, there will never be more than 2 paths specified. Not unless we redo our entire SAN configuration. :-))
Hrrm.. It worked on my machine. Granted, I'm using the same test files I did for your other answer.
I am treating the output as one big string, using the \n (newline) as a delimiter for the lines.
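To illustrate the "one big string" idea outside of Splunk, here is a minimal Python sketch. It is not the regex from the answer above (that one is not shown in the thread); the pattern, the name ROW, and the exact column spacing of the sample text are all assumptions made for the example. It matches both data rows in a single pattern, with the newline between them acting as the row delimiter.

```python
import re

# Sample event text; the column spacing is an assumption, the values come from the question.
EVENT = (
    "DEV#:   0  DEVICE NAME: vpath0  TYPE: 2107900  POLICY: Optimized\n"
    "SERIAL: 123bac\n"
    "=======================================================================\n"
    "Path#      Adapter/Hard Disk   State   Mode    Select  Errors\n"
    "    0      fscsi0/hidsk22      Open    NORMAL  123456  0\n"
    "    1      fscsi0/hidsk29      Open    NORMAL  456789  0\n"
)

# Template for one data row: path number, three skipped columns
# (adapter/disk, state, mode), the select value, then the error count.
# {n} is filled in so that each row gets its own group names.
ROW = (
    r"^[ \t]*(?P<path{n}>\d+)[ \t]+\S+[ \t]+\S+[ \t]+\S+"
    r"[ \t]+(?P<select{n}>\d+)[ \t]+\d+[ \t]*"
)

# Two rows joined by a newline, so one match spans both lines.
PATTERN = re.compile(ROW.format(n=0) + r"\r?\n" + ROW.format(n=1), re.MULTILINE)

match = PATTERN.search(EVENT)
fields = match.groupdict() if match else {}
print(fields)
```

Using [ \t] instead of \s inside a row keeps each half of the pattern on its own line, so the only place the match can cross a line boundary is the explicit newline in the middle.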
Lowell's answer is more elegant than my regex-happy approach.
Thank you for the response. Strange that it works on your end but not mine. I tried a few variants too with no success, including extracting the fields from just one of the two lines. Still doesn't work... strange.
You may have some luck with the multikv command.
You could do something like this:
sourcetype=your_source_type | rex "^DEV#:\s+(?<dev_no>\d+)\s+DEVICE NAME:\s+(?<device_name>\S+)\s+TYPE:\s+(?<type>\d+)\s+POLICY:\s+(?<policy>\S+)" | rex "SERIAL:\s+(?<serial>\S+)" | multikv | stats list(Select), list(Disk), sum(Errors) by device_name, serial
The stats operation is pretty bogus at this point; it's mostly just demoing which fields you have after the multikv command.
Some of the column names are less than ideal, but you can always rename them if you really need to.
The multikv search command looks for a header line (the 4th line in your example) and then looks for fixed-width rows beneath that. In your case you have two rows of data, and each row will be transformed into its own event. (This is why it's important to extract the top-level fields (like serial ...) prior to using the multikv command, because after you call multikv everything but the individual "row" is removed from your raw event. But all the fields are kept.)
So if you look at the fields that exist after the multikv command, you'll see that the "Path#" column gets named "Path_" (because "#" is not valid in a field name, so it's replaced with a "_"). The next column is called just "Disk" (it looks like it is dropping the "Adapter/Hard" prefix for whatever reason, due to spaces I guess; like I said, it's a kind of kludgey command). The remaining columns (State, Mode, Select, and Errors) are all straightforward: the fields are named exactly as the column names appear in the text.
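If it helps to see the "one event per row" idea without Splunk in the way, here is a rough Python sketch. It is only an illustration of the result described above, not how multikv is actually implemented: the function name split_rows is made up, the column names are hard-coded from what the thread reports, and the sample text spacing is an assumption.

```python
# Field names as the thread reports them after multikv; hard-coded here as an assumption.
COLUMNS = ["Path_", "Disk", "State", "Mode", "Select", "Errors"]

# Sample event text; the column spacing is an assumption, the values come from the question.
EVENT = (
    "DEV#:   0  DEVICE NAME: vpath0  TYPE: 2107900  POLICY: Optimized\n"
    "SERIAL: 123bac\n"
    "=======================================================================\n"
    "Path#      Adapter/Hard Disk   State   Mode    Select  Errors\n"
    "    0      fscsi0/hidsk22      Open    NORMAL  123456  0\n"
    "    1      fscsi0/hidsk29      Open    NORMAL  456789  0\n"
)


def split_rows(event_text, parent_fields):
    """Return one record per data row, each carrying a copy of the parent fields."""
    lines = event_text.splitlines()
    # Find the header line (the one that starts with "Path#").
    header_idx = next(
        i for i, line in enumerate(lines) if line.lstrip().startswith("Path#")
    )
    records = []
    for line in lines[header_idx + 1:]:
        values = line.split()
        if len(values) != len(COLUMNS):
            continue  # skip blank or malformed rows
        record = dict(parent_fields)          # top-level fields are kept
        record.update(zip(COLUMNS, values))   # row values become fields
        records.append(record)
    return records


# Top-level fields extracted beforehand (the job of the two rex commands).
parent = {"device_name": "vpath0", "serial": "123bac"}
records = split_rows(EVENT, parent)
for record in records:
    print(record)
```

Each printed record corresponds to one of the two events you would see after the row split: the same device_name and serial on both, with different Path_ and Select values.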
So essentially you are now looking at multiple events. Instead of having "path0" and "path1" as you originally talked about, you will now have a single field called "Path_"; the first event will have the value "0" and the second will have the value "1". How you combine these back together will be completely determined by what you are trying to do with your data. You can recombine your events using transaction, but without a specific example of how you would like to interact with your fields, it's hard to give a usable example. If you never want to be able to deal with your fields individually like this, then perhaps the multi-line regex approach is the best for you.
If you're still struggling to figure out how all of this works, you may find it helpful to recreate the search I've shown above one search command at a time while looking at one event at a time. (Sometimes just simplifying the problem into its smallest parts will help you see what's going on.) If you're very new to Splunk, the whole thing can seem like voodoo (I've been there). I suggest just taking it one step at a time, and eventually it will all make sense.
I looked at multikv, but even after reading the document I don't understand how to apply it in this case. From what I read, it seems to assume the fields have already been defined.
Yeah, 'multikv' can be intimidating at first. I generally stay away from it as much as possible myself, but there are times when it is the most direct option, and your example is the classic use case for multikv. I've updated my answer to include an example search; hope it helps.
Okay, I'll just be brutally honest here: Your example worked great. But I don't understand WHY it worked. 🙂
I mean it extracted the Select field... But I don't see how/where you extracted it in your code. I'll have to look at it more closely.
One more question: using that, how do I pick out/separate the two different "selects"? Suppose I want to take the difference of the two values or something like that... how are they called?
Thank you so much, you both have been great. 🙂
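On the follow-up about telling the two Select values apart: once each row is its own record, the path number is what distinguishes them, so one option is to pivot on it. The Python sketch below only illustrates that logic and is not a Splunk search. The helper name select_by_device and the selectN naming are assumptions; the values come from the sample output.

```python
from collections import defaultdict

# Per-row records as they might look after the row split
# (values taken from the sample output in the question).
rows = [
    {"device_name": "vpath0", "Path_": "0", "Select": "123456"},
    {"device_name": "vpath0", "Path_": "1", "Select": "456789"},
]


def select_by_device(rows):
    """Pivot rows into {device_name: {"select0": int, "select1": int}}."""
    devices = defaultdict(dict)
    for row in rows:
        # The path number becomes part of the field name: select0, select1.
        devices[row["device_name"]]["select" + row["Path_"]] = int(row["Select"])
    return dict(devices)


devices = select_by_device(rows)

# Absolute difference between the two paths of each vpath.
diffs = {
    name: abs(fields["select0"] - fields["select1"])
    for name, fields in devices.items()
}
print(diffs)  # {'vpath0': 333333}
```

Grouping by device_name is what lets the two values be compared against a common vpath, which was the original goal.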