I have a query that extracts useful info from a storage system report.
rex "quota list --verbose (?<fs>[A-Z0-9_]+) " | rex max_match=1000 "ViVol: (?<vivol>(?!user)[A-Za-z0-9]+)\nUsage\s+:\s+(?<usage>[0-9. A-Za-z]+)\n\s*Limit\s+:\s+(?<limit>[0-9A-Z. ]+)" | table fs, vivol, usage, limit
There is a single line at the start of the report with the filesystem which I extract as the "fs" field. Then there are several volume descriptions containing separate lines for the volume, usage and limit.
This query produces a single-value field for "fs" then three multi-value fields "vivol", "usage" and "limit". e.g.
fs        vivol    usage  limit
FIRST_FS  VOL_ABC  100    300
          VOL_XYZ  320    800
          VOL_123  50     150
When I export this to Excel (using CSV) the multi-value fields are all within a single cell. I want them on separate rows. If I use mvexpand I get the unexpected behaviour that it will properly expand one field but leave the others unexpanded. If I expand all three fields they lose correlation so I get rows that are mixed-up.
FIRST_FS VOL_123 320 300
How do I turn my three multi-value fields into tuples? I want to keep them together so the first row in "vivol" matches the first rows in "usage" and "limit". Bear in mind there are many "fs" events (about 100 of them).
| makeresults
| eval x="another_single_value_field"
| eval f1=split("a1,a2,a3",",")
| eval f2=split("b1,b2,b3",",")
| eval f3=split("c1,c2,c3",",")
| eval f4=split("d1,d2,d3",",")
`comment(" this is solution multiple fields mvexpand ")`
`comment(" create counter ")`
| eval _counter=mvrange(0,mvcount(f1))
`comment(" prepare to batch ")`
| stats list(*) as * by _counter
`comment(" batch all fields except _counter ")`
| foreach * [ eval <<FIELD>>=if(mvcount(<<FIELD>>)=1,<<FIELD>>,mvindex(<<FIELD>>,_counter))]
Separate the data into lines (events):
| rex max_match=0 "(?<line>[^\)]+\)\n\N+)" | mvexpand line | table line
Next, do your extractions:
| rex field=line "quota list --verbose (?<fs>[A-Z0-9_]+) " | rex field=line max_match=1000 "ViVol: (?<vivol>(?!user)[A-Za-z0-9]+)\nUsage\s+:\s+(?<usage>[0-9. A-Za-z]+)\n\s*Limit\s+:\s+(?<limit>[0-9A-Z. ]+)" | table fs, vivol, usage, limit
Use the SPL command filldown:
| filldown fs
fs        vivol    usage  limit
FIRST_FS  VOL_ABC  100    300
FIRST_FS  VOL_XYZ  320    800
FIRST_FS  VOL_123  50     150
then take it from there.
Updated regex a bit to select the values as per the example:
| rex field=line "quota list --verbose (?<fs>[A-Z0-9_]+) "
| rex field=line max_match=1000 "ViVol: (?<vivol>(?!user)[A-Za-z0-9_]+)\nUsage\s+:\s+(?<usage>[0-9.]+)[A-Za-z\s\n]+Limit\s+:\s+(?<limit>[0-9]+)[A-Za-z\s+()]+"
| table fs, vivol, usage, limit
Use mvzip, makemv and then reset the fields based on index.
First, mvzip the multi-values into a new field (the // notes are annotations, not part of the search):
| eval reading=mvzip(vivol, usage)    // create multi-value field for reading
| eval reading=mvzip(reading, limit)  // add the third field
At this point you'll have a multi-value field called reading. Here's an example of a field value (a list of four items):
"VOL_ABC,100,300", "VOL_XYZ,320,800", "VOL_123, 50,150", "VOL_FOO, 80,120"
Expand the field and restore the values:
| mvexpand reading                  // separate the multi-value into separate events
| makemv reading delim=","          // convert the reading into a multi-value
| eval vivol=mvindex(reading, 0)    // set vivol to the first value of reading
| eval usage=mvindex(reading, 1)    // set usage to the second value of reading
| eval limit=mvindex(reading, -1)   // set limit to the last value of reading
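For anyone who wants to try this without the original report, here is a run-anywhere sketch of the same mvzip/mvexpand steps. The sample values are the ones from the question's table, built with makeresults and split purely for illustration. The nested mvzip call and the single eval with comma-separated assignments are just a compact way of writing the steps above, not a different method, and makemv is written with delim before the field name, as in the documented syntax.

```spl
| makeresults
| eval fs="FIRST_FS"
| eval vivol=split("VOL_ABC,VOL_XYZ,VOL_123", ",")
| eval usage=split("100,320,50", ",")
| eval limit=split("300,800,150", ",")
| eval reading=mvzip(mvzip(vivol, usage), limit)
| mvexpand reading
| makemv delim="," reading
| eval vivol=mvindex(reading, 0), usage=mvindex(reading, 1), limit=mvindex(reading, 2)
| table fs, vivol, usage, limit
```

This should produce one row per volume, with fs repeated on each row. It relies on mvzip's default "," delimiter, so it assumes none of the values contain a comma. If yours might, pass a different delimiter as the third argument to mvzip and use the same one in makemv.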
Assuming that all the mv fields MUST have the same number of items...
| eval myFan=mvrange(0,mvcount(vivol)) | mvexpand myFan | eval vivol=mvindex(vivol,myFan) | eval usage=mvindex(usage,myFan) | eval limit=mvindex(limit,myFan)
Thanks @sk314. To be fair, this question was left unanswered for four years and 35 hours. Some improvements have been made to the docs since this answer, but this example is still better, IMO.
Very helpful, thanks. I ended up with a completed search that did exactly what I wanted using the above stuff.
source="/Znfs200g/Mainframe/splunk/volSpaceReport.txt"
| rex max_match=0 "(?:PRIVATE\s+)(?<vol>\d+)\s+(?<vol_pct>\d+)"
| eval my_zip=mvzip(vol,vol_pct)
| mvexpand my_zip
| makemv my_zip delim=","
| eval vol=mvindex(my_zip,0)
| eval vol_pct=mvindex(my_zip,1)
| eventstats sum(vol) as vol_sum
| eval weighted_vol_pct=(vol_pct*vol/vol_sum)
| stats sum(weighted_vol_pct) as Average_HardDisk_Utilization
I ran into the same issue with two multi-valued fields and arrived at a different solution: copy the field to preserve its order for mvfind, run mvexpand on the original, use mvfind to get the position of the expanded value in the copy, use mvindex with that position on each field that was NOT expanded, then drop the copy. It would look something like:
...| eval vivolIndex=vivol | mvexpand vivol | eval idx=mvfind(vivolIndex,vivol) | eval usage=mvindex(usage,idx) | eval limit=mvindex(limit,idx) | fields - vivolIndex ...
This solution worked better for me as I was using a stats list(x) list(y) and needed to keep the values correlated.
I just had the same issue.
Create a single field with all the eventual fields you want, so you have a single MV, then use mvexpand to create the multiple entries, then do another parse on the (now single-) value to extract the three fields.
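A minimal sketch of that idea, assuming the three fields from the original question and sample values made up for the demo. The field name combined and the ";" delimiter are my own choices; pick any delimiter that cannot occur in your data.

```spl
| makeresults
| eval fs="FIRST_FS"
| eval vivol=split("VOL_ABC,VOL_XYZ,VOL_123", ",")
| eval usage=split("100,320,50", ",")
| eval limit=split("300,800,150", ",")
| eval combined=mvzip(mvzip(vivol, usage, ";"), limit, ";")
| fields - vivol, usage, limit
| mvexpand combined
| rex field=combined "^(?<vivol>[^;]+);(?<usage>[^;]+);(?<limit>.+)$"
| table fs, vivol, usage, limit
```

Dropping the original multi-value fields before mvexpand means they are not copied into every new event, and the final rex re-creates them as single values on each row.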