I have a field which contains substitution placeholders:
message=User %s performed action %s on %s
message=Message %s from %s
message=User %s updated %s from version %s to version %s. Duration %s
I also have 1 or more (up to 6) matching argument fields:
arg1=ajones
arg2=delete
arg3=presentation.ppt
My aim is to produce a consolidated field which performs the substitution, and produces:
message=User ajones performed action delete on presentation.ppt
The replace command does not seem to let me expand the value of the arg1 field, and eval's replace() substitutes every instance of %s at once rather than one at a time.
I am left pondering the use of rex, but wonder whether I have overlooked a better alternative?
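To pin down the behaviour being asked for, here is a minimal reference sketch in Python (not SPL, purely to illustrate the logic). The function name substitute is hypothetical, and treating missing arguments as empty strings is an assumption rather than something stated in the question.

```python
import re

def substitute(message, args):
    """Replace each %s in message, left to right, with the next value in args."""
    remaining = iter(args)
    # Each %s consumes the next argument. Placeholders with no argument left
    # become empty strings (an assumption); surplus arguments are ignored.
    return re.sub(r"%s", lambda m: next(remaining, ""), message)

print(substitute("User %s performed action %s on %s",
                 ["ajones", "delete", "presentation.ppt"]))
```

Any SPL answer can be checked against this: the first %s takes arg1, the second takes arg2, and so on, regardless of how many arg fields exist.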
Needs more map; try this:
| makeresults
| eval message="User %s performed action %s on %s
Message %s from %s
User %s updated %s from version %s to version %s. Duration %s"
| eval arg1="ajones", arg2="delete", arg3="presentation.ppt", arg4="arg4", arg5="arg5"
| makemv delim="
" message
| mvexpand message
| rename COMMENT AS "Everything above creates sample event data; everything below is your solution"
| map search="|makeresults|eval message=$message$ | rex field=message mode=sed \"s/%s/$arg1$/ s/%s/$arg2$/ s/%s/$arg3$/ s/%s/$arg4$/ s/%s/$arg5$/\""
Try something like this...
| makeresults
| eval m0="User %s performed action %s on %s"
| eval m1=split(m0,"%s")
| eval m2=mvappend("arg1","arg2","arg3","arg4","arg5","arg6")
| eval m3=mvzip(m1,m2,"")
| eval m4=mvjoin(m3,"")
The results show the limitations of the technique as well. In the above example, args 5 and 6 go away because they are unmatched in the message string. Also, if the final %s is not at the very end of the string, you end up losing the final bit of the string. To make it work perfectly, you need to true up the lengths of the two multivalue fields, something like this...
| makeresults
| eval m0="User %s performed action %s on %s"
| eval m1=split(m0,"%s")
| eval m2=mvappend("arg1","arg2","arg3","arg4","arg5","arg6")
| eval spacer = mvappend("","","","","","","")
| eval c1=mvcount(m1)
| eval c2=mvcount(m2)
| eval c0=mvcount(m2) - mvcount(m1)
| eval m1=if(c0>0,mvappend(m1,mvindex(spacer,0,c0-1)),m1)
| eval m2=if(c0<0,mvappend(m2,mvindex(spacer,0,-c0-1)),m2)
| eval c1A=mvcount(m1)
| eval c2A=mvcount(m2)
| eval m3=mvzip(m1,m2,"")
| eval m4=mvjoin(m3,"")
Of course, that's a verbose way of coding it, for exploratory and educational purposes.
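For readers who want to see the split-and-zip idea outside SPL, here is a rough Python sketch. The name zip_substitute is hypothetical. Note one deliberate difference from the search above: it trims surplus arguments instead of padding the message pieces, so unmatched arguments are dropped rather than carried along.

```python
from itertools import zip_longest

def zip_substitute(message, args):
    # Text between placeholders; n placeholders give n + 1 pieces.
    pieces = message.split("%s")
    # Only the first n arguments can be matched to a placeholder.
    used = args[:len(pieces) - 1]
    # zip_longest pads the shorter list with "", so the trailing piece of
    # text survives even though it has no argument to pair with.
    return "".join(p + a for p, a in zip_longest(pieces, used, fillvalue=""))

# A plain zip stops at the shorter list and loses the trailing ".".
naive = "".join(p + a for p, a in zip("on %s.".split("%s"), ["arg1"]))

print(naive)
print(zip_substitute("User %s performed action %s on %s.",
                     ["arg1", "arg2", "arg3", "arg4", "arg5", "arg6"]))
```

The naive line shows the limitation described earlier: when the last %s is not at the very end of the string, an unpadded zip drops the remaining text.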
Thanks for the idea - it seems this is quite a troublesome problem to tackle cleanly
YW. The second example is actually generic enough that it should cover the bases. It can be collapsed into about 3-4 lines, but I wanted it to be absolutely clear what the steps are to meet the requirements.
Don't you just love how Splunk has so many useful tools? I'd bet that @somesoni2 and @woodcock have a couple more methods to add to mine and @niketnilay's ... or can clean up mine. There must be a more concise way to do lines 5-10 in my example; they basically just concatenate nulls onto each multivalue field until both are the same length.
@nickhillscpl, Following are some of the other options you may try
Option 1 - Using the printf evaluation function (Splunk 6.6). This is the most suitable for your use case; however, it requires Splunk 6.6. Following is a run anywhere search:
| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| eval cmsg=printf(msg,arg1,arg2,arg3)
Option 2 - Using eval replace with a regular expression match. This is close to what you are doing. Following is a run anywhere search:
| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| eval cmsg=replace(msg,"(User )\%s( performed action )\%s( on )\%s","\1".arg1."\2".arg2."\3".arg3)
Option 3 - Using the map command is another alternative, but it is limited by the maximum number of searches it will run (maxsearches, 10 by default), which means you would need | head 10 or a similar command in your base search before feeding results to map. Following is a run anywhere search:
| makeresults
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| map search="| makeresults
| eval cmsg=\"User $arg1$ performed action $arg2$ on $arg3$\""
As evident, there could be more options as well 🙂 Hopefully others may suggest!
@niketnilay
Thanks very much for the steer on printf - that was new to me, however my question failed to highlight additional challenges:
The msg field can contain any number of various messages, and the number of arg fields is also variable.
It seems printf works perfectly as long as I pass exactly the right number of args to the function, but not if I over (or under) provide. Therefore the following will not work.
| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| eval cmsg=printf(msg,arg1,arg2,arg3,arg4,arg5,arg6)
Nor does it work if I fillnull the args.
I will update the question to highlight these things which I omitted, but as you suggest there must be other ways to achieve this. Thanks for your ideas!
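One way to think about the mismatch is to fit the argument list to the number of placeholders before formatting. The sketch below does that in Python, using the % operator purely as a stand-in for printf; it makes no claim about how Splunk's printf behaves beyond what is reported above. The function name is hypothetical, and it assumes the message contains no % sequences other than %s.

```python
def format_with_available_args(msg, args):
    # Count the placeholders the message actually contains.
    needed = msg.count("%s")
    # Pad with empty strings in case there are too few arguments,
    # then cut the list down to exactly the number needed.
    fitted = (list(args) + [""] * needed)[:needed]
    return msg % tuple(fitted)

print(format_with_available_args(
    "User %s performed action %s on %s",
    ["ajones", "delete", "presentation.ppt", "", "", ""]))
```

With the counts aligned first, the same call handles all three sample messages regardless of how many arg fields are populated.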
@nickhillscpl, sorry that the above options do not work for you (I was so sure that one of them would fit your needs). I will give some thought to variable arguments. Meanwhile let me convert my answer to a comment so that the question is flagged for others to solve.
@nickhillscpl, while the number of %s is not fixed, will you always have the same number of arguments as %s placeholders?
Will your argument fields always be named arg1, arg2 ....?
For example, if you have "Text A %s Text B %s Text C", will you then have two arguments, arg1 and arg2?
The following option handles a variable number of arguments, but only if the arguments are in reverse sequence (otherwise a regular expression that picks up the first %s occurrence in your message needs to be added to the replace command).
Option 4 - foreach
| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg3="ajones"
| eval arg2="delete"
| eval arg1="presentation.ppt"
| foreach arg*
[ eval msg=replace(msg,"(.*)(%s)","\1".arg<<MATCHSTR>>) ]
I thought I had this, but the regular expression to pick only the first occurrence of %s did not work as expected with replace: it replaced every %s in a single shot (although the same expression worked fine on regex101 and selected only the first %s). So I had to use a greedy regex that picks the last %s and works back towards the first, instead of picking the first %s and moving towards the last. Hence the reversed sequence of arguments.
^([^%]+)(%s)
worked fine on regex101 and using rex as well to pick only 1st %s but did not work with replace 😞
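The difference between the greedy and the first-occurrence pattern can be sketched in Python's re module. This is only an illustration of the regex behaviour; Python's re.sub is not Splunk's replace, and whether a lazy anchored pattern behaves the same way inside eval replace is untested here. The name fill_in_order is hypothetical.

```python
import re

msg = "User %s performed action %s on %s"

# Greedy (.*) runs to the LAST %s, so repeated passes fill right to left.
last_first = re.sub(r"(.*)%s", lambda m: m.group(1) + "X", msg, count=1)

# Lazy (.*?) anchored at the start stops at the FIRST %s instead.
first_first = re.sub(r"^(.*?)%s", lambda m: m.group(1) + "X", msg, count=1)

def fill_in_order(message, args):
    # One pass per argument, each replacing only the first remaining %s.
    for value in args:
        message = re.sub(r"^(.*?)%s",
                         lambda m, v=value: m.group(1) + v,
                         message, count=1)
    return message

print(last_first)
print(first_first)
print(fill_in_order(msg, ["ajones", "delete", "presentation.ppt"]))
```

One caveat with any pass-per-argument approach: if an argument value itself contains %s, a later pass will treat it as a placeholder.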
I can't get this to work on my real data, but your makeresults example works fine - odd!
In any case I can't easily reorder the args, so this approach is not without its troubles.
Thank you for your suggestions; it's great to have a number of ideas which I would not have come up with on my own. I have updated my own comment above to illustrate how I am doing it, but thanks again for the great ideas.
The following broadly addresses the problem but feels woefully inefficient.
..|fillnull value="" arg1,arg2,arg3,arg4,arg5,arg6|
eval cmsg=msg+"|"+arg1+"|"+arg2+"|"+arg3+"|"+arg4+"|"+arg5+"|"+arg6 |
replace *%s* WITH *arg1* IN cmsg |
replace *%s* WITH *arg2* IN cmsg|
replace *%s* WITH *arg3* IN cmsg|
replace *%s* WITH *arg4* IN cmsg|
replace *%s* WITH *arg5* IN cmsg|
replace *%s* WITH *arg6* IN cmsg|
rex field=cmsg mode=sed "s/(arg1).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg2).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg3).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg4).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg5).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg6).+?|(.+?)|/\2/g"
In the above I concatenate all the arguments into a single pipe-delimited field. Then I replace the first occurrence of %s with a placeholder (arg1, for example) before finally using rex to substitute the second capture group for the first.
Because the number of arguments is not constant, I don't see a way to perform the rex in one pass, as the number of capture groups will vary, so I have done this in 6 passes.
Is there a better way?
In the end I am using this updated version, which solves the problem in my case. Thanks to everyone who contributed.
..|fillnull value="" arg1,arg2,arg3,arg4,arg5,arg6|
eval cmsg=msg+"|"+arg1+"|"+arg2+"|"+arg3+"|"+arg4+"|"+arg5+"|"+arg6 |
foreach arg* [ replace *%s* WITH *<<FIELD>>* IN cmsg ]|
foreach arg* [ rex field=cmsg mode=sed "s/(<<FIELD>>).+?|(.+?)|/\2/g"]|
rex field=cmsg mode=sed "s/(\|)//g"
@nickhillscpl, this seems to be a working solution; as long as you are using foreach, it will take care of dynamic arguments. Please convert this to an answer and accept it to close the question.