Hey all - thanks in advance!
I have _raw log data that contains a header section and then what appears to be two entries within itself. I want to split these entries (they are formatted the same, except the latter appends a '1' onto each fieldname) and then create two events from this one event, like so:
event1 = HEADER|PART1
event2 = HEADER|PART2
An event will come from the same IP and device name; the parts are paths and simple fields. Here is a sample log (bracketed to show how I want it split, but these brackets are not in the raw data):
[Jun 12 23:00:09 server.i.j WRD: 0|AName|Named Application Server|1|0|Rule|0|ClientTime=7:02:28-PM CompName=sparse.info.given.here IPv4=x.x.x.x] [Path=No-Results-Found MD5= Size= Modified= RuleID= ValidHits= InvalidHits= NoValidationHits=] [Path1=No-Results-Found MD51= Size1= Modified1= RuleID1= ValidHits1= InvalidHits1= NoValidationHits1= Count=1]
I would like the final results to be:
Jun 12 23:00:09 server.i.j WRD: 0|AName|Named Application Server|1|0|Rule|0|ClientTime=7:02:28-PM CompName=sparse.info.given.here IPv4=x.x.x.x Path=No-Results-Found MD5= Size= Modified= RuleID= ValidHits= InvalidHits= NoValidationHits=
Jun 12 23:00:09 server.i.j WRD: 0|AName|Named Application Server|1|0|Rule|0|ClientTime=7:02:28-PM CompName=sparse.info.given.here IPv4=x.x.x.x Path1=No-Results-Found MD51= Size1= Modified1= RuleID1= ValidHits1= InvalidHits1= NoValidationHits1= Count=1
Count is not really a big deal here; it can be on either log (the latter by default, as it is the final field in the log).
I have the regex to perform the part-splitting if rex is the move here:
| rex field=_raw "(?<header>.*IPv4=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) (?<part1>Path.*) (?<part2>Path.*)"
Once recombined, I will still perform manipulation on the resulting logs, and I do not need to write to file or CSV. The issue this is causing relates to finding accurate hits on files (the ValidHits1 field is annoying; same with Path1). I can happily rename fields after rejoining my Parts to the header so I can then correlate on top of all data with common field names.
Please feel free to ask for more information to help me out with this, and I appreciate any help you can give for this project!
How about this:
| eval data = split(_raw, "] [")
| eval header = ltrim(mvindex(data, 0), "["), path = mvappend(mvindex(data, 1), rtrim(mvindex(data, 2), "]"))
| mvexpand path
| eval _raw = mvjoin(mvappend(header, path), " ")
Your sample data gives:
_raw
Jun 12 23:00:09 server.i.j WRD: 0|AName|Named Application Server|1|0|Rule|0|ClientTime=7:02:28-PM CompName=sparse.info.given.here IPv4=x.x.x.x Path=No-Results-Found MD5= Size= Modified= RuleID= ValidHits= InvalidHits= NoValidationHits=
Jun 12 23:00:09 server.i.j WRD: 0|AName|Named Application Server|1|0|Rule|0|ClientTime=7:02:28-PM CompName=sparse.info.given.here IPv4=x.x.x.x Path1=No-Results-Found MD51= Size1= Modified1= RuleID1= ValidHits1= InvalidHits1= NoValidationHits1= Count=1
@yuanliu the brackets aren't in the data, but you are on the right lines
| eval data = split(_raw, " Path")
| eval header = mvindex(data, 0), path = mvappend("Path".mvindex(data, 1), "Path".mvindex(data, 2))
| mvexpand path
| eval _raw = mvjoin(mvappend(header, path), " ")
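If you want to check the split-and-rejoin logic outside Splunk, here is a rough Python sketch of what the four commands above do to a single event. The sample line is shortened and made up, and the function name split_event is only for illustration; it mimics split/mvindex/mvappend/mvexpand/mvjoin and is not how Splunk implements them.

```python
def split_event(raw):
    """Mimic the SPL above: split on " Path", then rebuild one event per part."""
    data = raw.split(" Path")              # like split(_raw, " Path")
    header = data[0]                       # like mvindex(data, 0)
    # Like mvappend("Path".mvindex(data, 1), "Path".mvindex(data, 2)):
    # put back the "Path" text that the split consumed.
    parts = ["Path" + p for p in data[1:3]]
    # mvexpand gives one result per part; mvjoin glues header and part together.
    return [" ".join([header, part]) for part in parts]


# Shortened, made-up sample in the same shape as the log in the question.
sample = (
    "Jun 12 23:00:09 server.i.j WRD: 0|AName|Rule|0|CompName=host IPv4=x.x.x.x "
    "Path=No-Results-Found MD5= Size= "
    "Path1=No-Results-Found MD51= Size1= Count=1"
)

events = split_event(sample)
for event in events:
    print(event)
```

One thing this makes visible: the approach relies on " Path" appearing exactly twice. If a real path value ever contains " Path" itself, the split produces extra pieces and only the first two are kept.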
This works, and I am now focused on renaming the fields within the path multivalue field: the second part (index 2) is the one whose field names have a '1' appended that needs to be removed.
After running mvexpand on path, I tried "rename var1 as var vary1 as vary" and so on, but to no avail. I may be out of my depth on how multivalue commands manipulate or stream data. I am going to keep searching for how to modify those field names with this mvindex, mvexpand and mvjoin answer. If I can't find what I want, I will accept this as a solution in the coming days.
Thanks to you both, @ITWhisperer @yuanliu
I think I recognized my mistake when doing my renames: they are not extracted fields. So I either have to get the search to recognize that there are Path and Path1, ValidHits and ValidHits1, etc. for every field, or change the data when doing the MV stuff.
@ITWhisperer Does mvindex just "look at" the data, or does it "copy" it into new, mutable fields? If it just looks, then I can understand why renaming (read: editing raw data) isn't possible. But if it's copying into a new field that we then rejoin, shouldn't I be able to manipulate those internal values? Not with rename (as they are not extracted fields), but with a combination of rex and something else?
You could do this (although it only works for the suffixes 1, 2 and 3, because the trailing digits in IPv4 and MD5 must not be stripped):
| eval data = split(_raw, " Path")
| eval header = mvindex(data, 0), path = mvappend("Path".mvindex(data, 1), "Path".mvindex(data, 2))
| mvexpand path
| eval _raw = mvjoin(mvappend(header, path), " ")
| rex mode=sed "s/(?<name>\w+)(?<digit>[1-3])=/\1=/g"
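To see why the [1-3] restriction matters, here is a rough Python equivalent of the sed expression that you can run outside Splunk. The two sample strings are shortened and made up, and strip_suffix is a hypothetical helper name; Python also writes named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# Same idea as the sed expression: a field name that ends in 1, 2 or 3,
# immediately followed by "=". The digit is dropped, the name is kept.
SUFFIX = re.compile(r"(?P<name>\w+)(?P<digit>[1-3])=")


def strip_suffix(raw):
    """Remove a trailing 1, 2 or 3 from every field name in the event text."""
    return SUFFIX.sub(r"\g<name>=", raw)


# Shortened, made-up events in the same shape as the two split results.
first = "IPv4=x.x.x.x Path=No-Results-Found MD5= Size= ValidHits="
second = "IPv4=x.x.x.x Path1=No-Results-Found MD51= Size1= ValidHits1= Count=1"

print(strip_suffix(first))
print(strip_suffix(second))
```

IPv4= and MD5= are left alone because 4 and 5 are outside [1-3], while MD51= becomes MD5=. The flip side is that any genuine field name ending in 1, 2 or 3 would be shortened as well, so it is worth checking your field list before relying on this.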
Thank you very much! That has got me what I needed!