Splunk Search

Replace substitution placeholders in a field

nickhills
Ultra Champion

I have a field which contains substitution placeholders

message=User %s performed action %s on %s
message=Message %s from %s
message=User %s updated %s from version %s to version %s. Duration %s 

I also have one or more (up to 6) matching argument fields:

arg1=ajones
arg2=delete
arg3=presentation.ppt

My aim is to produce a consolidated field which performs the substitution, and produces:

message=User ajones performed action delete on presentation.ppt

The replace command does not seem to let me expand the value of the arg1 field, and eval's replace() replaces all instances of %s at once rather than selectively.

I am left pondering the use of rex, but wonder if I have overlooked a better alternative?

If my comment helps, please give it a thumbs up!

woodcock
Esteemed Legend

Needs more map; try this:

| makeresults 
| eval message="User %s performed action %s on %s
Message %s from %s
User %s updated %s from version %s to version %s. Duration %s"
| eval arg1="ajones", arg2="delete", arg3="presentation.ppt", arg4="arg4", arg5="arg5"
| makemv delim="
" message
| mvexpand message

| rename COMMENT AS "Everything above creates sample event data; everything below is your solution"

| map search="|makeresults|eval message=\"$message$\" | rex field=message mode=sed \"s/%s/$arg1$/ s/%s/$arg2$/ s/%s/$arg3$/ s/%s/$arg4$/ s/%s/$arg5$/\""

DalJeanis
Legend

Try something like this...

| makeresults 
| eval m0="User %s performed action %s on %s" 
| eval m1=split(m0,"%s") 
| eval m2=mvappend("arg1","arg2","arg3","arg4","arg5","arg6") 
| eval m3=mvzip(m1,m2,"") 
| eval m4=mvjoin(m3,"")

The results show the limitations of the technique as well. In the above example, args 5 and 6 go away because they are unmatched in the message string. Also, if the final %s is not at the very end of the string, you end up losing the final bit of the string. To make it work perfectly, you need to true up the lengths of the two multivalue fields, something like this...

| makeresults 
| eval m0="User %s performed action %s on %s" 
| eval m1=split(m0,"%s") 
| eval m2=mvappend("arg1","arg2","arg3","arg4","arg5","arg6") 
| eval spacer = mvappend("","","","","","","") 
| eval c1=mvcount(m1)  
| eval c2=mvcount(m2)  
| eval c0=mvcount(m2) - mvcount(m1)  
| eval m1=if(c0>0,mvappend(m1,mvindex(spacer,0,c0-1)),m1) 
| eval m2=if(c0<0,mvappend(m2,mvindex(spacer,0,-c0-1)),m2)   
| eval c1A=mvcount(m1)  
| eval c2A=mvcount(m2)  
| eval m3=mvzip(m1,m2,"") 
| eval m4=mvjoin(m3,"")

Of course, that's a verbose way of coding it, for exploratory and educational purposes.
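For readers who want to see the split / pad / zip / join idea outside SPL, here is a minimal Python sketch of the same logic. It is a model of the technique, not a translation of the search above: the function name fill_placeholders is hypothetical, and discarding surplus arguments (rather than appending them to the end of the message) is an assumption made for this illustration.

```python
from itertools import zip_longest


def fill_placeholders(template, args, placeholder="%s"):
    """Fill placeholders by interleaving text pieces with argument values."""
    # n placeholders split the template into n + 1 text pieces.
    pieces = template.split(placeholder)
    # Assumption: surplus arguments are discarded, not appended to the message.
    usable = list(args)[: len(pieces) - 1]
    # zip_longest pads the shorter list with "", so trailing text is never lost
    # and missing arguments become empty strings.
    pairs = zip_longest(pieces, usable, fillvalue="")
    return "".join(piece + arg for piece, arg in pairs)


print(fill_placeholders(
    "User %s performed action %s on %s",
    ["ajones", "delete", "presentation.ppt", "", "", ""],
))
# prints: User ajones performed action delete on presentation.ppt
```

Padding both lists to the same length is the step that the second search performs with the spacer field; here zip_longest does it in one call.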

nickhills
Ultra Champion

Thanks for the idea - it seems this is quite a troublesome problem to tackle cleanly.


DalJeanis
Legend

YW. The second example is actually generic enough that it should cover the bases. It can be collapsed into about 3-4 lines, but I wanted it to be absolutely clear what the steps are to meet the requirements.

Don't you just love how Splunk has so many useful tools? I'd bet that @somesoni2 and @woodcock have a couple more methods to add to mine and @niketnilay's ... or can clean up mine. There must be a more concise way to do lines 5-10 in my example; all they do is concatenate nulls onto each multivalue field until both are the same length.


niketn
Legend

@nickhillscpl, here are some other options you may try.

Option 1 - Using the printf evaluation function (Splunk 6.6). This is the most suitable for your use case; however, it requires Splunk 6.6. The following is a run-anywhere search:

| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| eval cmsg=printf(msg,arg1,arg2,arg3)

Option 2 - Using eval replace with a regular expression match. This is close to what you are doing. The following is a run-anywhere search:

| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| eval cmsg=replace(msg,"(User )\%s( performed action )\%s( on )\%s","\1".arg1."\2".arg2."\3".arg3)

Option 3 - Using the map command is another alternative, but it is limited by the maximum number of subsearches (10 by default, which implies you would need to run | head 10 or a similar command in your base search before feeding results to the map command). The following is a run-anywhere search:

| makeresults
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| map search="| makeresults
              | eval cmsg=\"User $arg1$ performed action $arg2$ on $arg3$\""

As evident, there could be more options as well 🙂 Hopefully others may suggest!

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

nickhills
Ultra Champion

@niketnilay
Thanks very much for the steer on printf - that was new to me, however my question failed to highlight additional challenges:
The msg field can contain any number of various messages, and the number of arg fields is also variable.

It seems printf works perfectly as long as I pass exactly the right number of args to the function, but not if I over- (or under-) provide. Therefore the following will not work:
| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg1="ajones"
| eval arg2="delete"
| eval arg3="presentation.ppt"
| eval cmsg=printf(msg,arg1,arg2,arg3,arg4,arg5,arg6)

Nor does it work if I fillnull the args.

I will update the question to highlight these things which I omitted, but as you suggest there must be other ways to achieve this. Thanks for your ideas!
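The argument-count mismatch described above is not specific to Splunk: Python's % string formatting likewise raises a TypeError when given too many or too few values. As a hedged illustration of one way around it, the sketch below counts the placeholders first and then trims or pads the argument list to fit. It is plain Python rather than SPL; the function name safe_printf is hypothetical, and it assumes the template contains no % sequences other than %s.

```python
def safe_printf(template, args, placeholder="%s"):
    """Format template after fitting the argument list to the placeholder count."""
    # Assumption: the only format specifier in the template is %s.
    needed = template.count(placeholder)
    # Pad with "" in case of too few args, then trim in case of too many.
    fitted = (list(args) + [""] * needed)[:needed]
    return template % tuple(fitted)


# Six argument fields, three of them empty, against a three-placeholder message.
print(safe_printf(
    "User %s performed action %s on %s",
    ["ajones", "delete", "presentation.ppt", "", "", ""],
))
# prints: User ajones performed action delete on presentation.ppt
```

The design choice here is to normalise the arguments before formatting rather than to make the formatter tolerant, which keeps the formatting call itself unchanged.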


niketn
Legend

@nickhillscpl, sorry that the above options would not work for you (I was so sure that one of them would fit your needs). I will give some thought to variable arguments. Meanwhile, let me convert my answer to a comment so that the question is flagged for others to solve.


niketn
Legend

@nickhillscpl, while the number of %s is not fixed, will you always have the same number of arguments as %s placeholders?

Will your argument fields always be named arg1, arg2 ....?

For example, if you have "Text A %s Text B %s Text C", will you have two arguments, arg1 and arg2?

The following option handles variable arguments, but it works only if the arguments are in reverse sequence; otherwise, a regular expression that picks up only the first %s occurrence in your message needs to be added to the replace command.

Option 4 - foreach

| makeresults
| eval msg="User %s performed action %s on %s"
| eval arg3="ajones"
| eval arg2="delete"
| eval arg1="presentation.ppt"
| foreach arg*
   [ eval msg=replace(msg,"(.*)(%s)","\1".arg<<MATCHSTR>>) ]

I thought I had this, but the regular expression to pick only the first occurrence of %s did not work as expected with replace and replaced every %s in a single shot (although the same expression worked fine on regex101 and selected only the first %s). So I had to use a greedy regex to pick the last %s and move towards the first, instead of picking up the first %s and moving towards the last. Hence the sequence of arguments is reversed.

^([^%]+)(%s) worked fine on regex101 and using rex as well to pick only 1st %s but did not work with replace 😞
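To make the intended forward-order behaviour concrete, here is a small Python sketch of "replace only the first remaining %s on each pass". It models the logic the foreach loop is aiming for and says nothing about how Splunk's replace evaluates the regex; the function name substitute_in_order is hypothetical. One caveat of this approach: if an argument value itself contains %s, a later pass would substitute inside it.

```python
def substitute_in_order(template, args, placeholder="%s"):
    """Replace placeholders left to right, one argument per pass."""
    result = template
    for arg in args:
        if placeholder not in result:
            break  # no placeholders left: surplus arguments are ignored
        # count=1 limits the replacement to the first remaining occurrence.
        result = result.replace(placeholder, arg, 1)
    return result


print(substitute_in_order(
    "User %s performed action %s on %s",
    ["ajones", "delete", "presentation.ppt"],
))
# prints: User ajones performed action delete on presentation.ppt
```

With too few arguments the unmatched placeholders are simply left in place, which makes under-provision easy to spot in the output.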


nickhills
Ultra Champion

I can't get this to work on my real data, but your makeresults example works fine - odd!

In any case, I can't easily reorder the args, so this approach is not without its troubles.
Thank you for your suggestions; it's great to have a number of ideas which I would not have come up with on my own. I have updated my own comment above to illustrate how I am doing it, but thanks again for the great ideas.


nickhills
Ultra Champion

The following broadly addresses the problem but feels woefully inefficient.

 ..|fillnull value="" arg1,arg2,arg3,arg4,arg5,arg6|
eval cmsg=msg+"|"+arg1+"|"+arg2+"|"+arg3+"|"+arg4+"|"+arg5+"|"+arg6 |
replace *%s* WITH *arg1* IN cmsg |
replace *%s* WITH *arg2* IN cmsg|
replace *%s* WITH *arg3* IN cmsg|
replace *%s* WITH *arg4* IN cmsg|
replace *%s* WITH *arg5* IN cmsg|
replace *%s* WITH *arg6* IN cmsg|
rex field=cmsg mode=sed "s/(arg1).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg2).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg3).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg4).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg5).+?|(.+?)|/\2/g"|
rex field=cmsg mode=sed "s/(arg6).+?|(.+?)|/\2/g"

In the above I am concatenating all the arguments into a consolidated, pipe-delimited field. Then I replace the first occurrence of %s with a placeholder (arg1, for example) before finally using rex to replace the match with the second capture group.

Because the number of arguments is not constant, I don't see a way to perform the rex in one pass, as the number of capture groups will vary, so I have done this with 6 passes.

Is there a better way?


nickhills
Ultra Champion

In the end I am using this updated version - this solves the problem in my case, but thanks to everyone who contributed:

 ..|fillnull value="" arg1,arg2,arg3,arg4,arg5,arg6|
eval cmsg=msg+"|"+arg1+"|"+arg2+"|"+arg3+"|"+arg4+"|"+arg5+"|"+arg6 |
foreach arg* [ replace *%s* WITH *<<FIELD>>* IN cmsg ]|
foreach arg* [ rex field=cmsg mode=sed "s/(<<FIELD>>).+?|(.+?)|/\2/g"]| 
rex field=cmsg mode=sed "s/(\|)//g"

niketn
Legend

@nickhillscpl, this seems to be a working solution; as long as you are using foreach, it will take care of dynamic arguments. Please convert this to an answer and accept it to close the question.
