Search query to remove the first occurrence of a word and replace the second occurrence with a comma

Kitteh

How do I use regex or replace to remove the first occurrence of a word and replace the second occurrence onward with a comma?

For example, the raw data is:
ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root

I want it to be:
CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0),CRON[2907]: pam_unix(cron:session): session closed for user root

cpetterborg
SplunkTrust

If the beginning string occurs only one more time after the start of the event, this will work:

| makeresults 
| eval _raw="ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root by (uid=0)" 
| rex mode=sed "s/^(\S+)(.*?)\s(\1)/\2, /"

The process for multiple occurrences is more complex. Is the data in that case similar to the example that you provided? If not, can you provide an example? Is there a maximum number of occurrences?
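
For more than one repeat, one possible approach is sketched below; it has not been run against the asker's real data. It captures the leading word into a field and then reuses that field inside eval's replace(), which substitutes every match. The field name first_word and the shortened sample event are made up for illustration. The sketch assumes the leading word contains no regex metacharacters and that every repeat is surrounded by whitespace. Unlike the single-repeat sed expression above, which keeps the whitespace that followed each removed word, this version consumes the whitespace around each repeat so the comma sits directly between the segments.

```spl
| makeresults
| eval _raw="ubuntu CRON[2907]: session opened ubuntu CRON[2907]: session closed ubuntu CRON[2908]: session opened"
| rex field=_raw "^(?<first_word>\S+)\s+"
| eval _raw=replace(_raw, "^" . first_word . "\s+", "")
| eval _raw=replace(_raw, "\s+" . first_word . "\s+", ",")
| table _raw
```

The first replace() strips the leading word and the whitespace after it; the second turns each later occurrence, together with its surrounding whitespace, into a single comma. If the leading word can contain characters such as "." or "[", it would need to be escaped before being used as a pattern.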

inventsekar
SplunkTrust

You can run rex twice: the first time to replace the first "ubuntu" with nothing, and the second time to replace the second "ubuntu" with a comma.

(If the string "ubuntu" is not known beforehand, please share more details, such as where it appears, so that the rex can be updated.)
(rex mode=sed cannot be tested on the regex101 website. I tested it directly on Splunk and it works fine; please check the screenshot.)

|makeresults
 | eval _raw = "ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
 | rex mode=sed field=_raw "s#(^ubuntu\s)##"
 | rex mode=sed field=_raw "s#ubuntu#,#"
 | table _raw
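
As written, the second rex replaces only the word itself, so the spaces on either side of it stay around the comma. If the output has to match the question exactly, or if there can be more than two occurrences, a variant along these lines might help. It assumes the word is the literal "ubuntu" and is always surrounded by whitespace; the trailing g flag asks sed mode to replace every match rather than only the first one.

```spl
| makeresults
| eval _raw="ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
| rex mode=sed field=_raw "s#^ubuntu\s+##"
| rex mode=sed field=_raw "s#\s+ubuntu\s+#,#g"
| table _raw
```

The first expression removes the leading word along with the whitespace after it; the second swaps each remaining occurrence, including the whitespace around it, for a comma.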

[Screenshot of the search results]

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, please consider upvoting. Thanks for reading!

inventsekar
SplunkTrust

Hi @cpetterborg, great rex command... Great learning!

For other rex beginners, let me explain it:
"s/^(\S+)(.*?)\s(\1)/\2, /"
^(\S+) --- captures the first word as "\1"
(.*?) --- captures the rest of the line as "\2", up to the second "ubuntu" match
\s(\1) --- matches a space followed by the word "ubuntu" again
Everything between the first and second "/" is the matching part; everything after the second "/" is the replacement part.
\2, --- in the replacement, leave out "\1", write the "\2" match, and then a comma ",". That's it.
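
To see what each group captures before doing any replacement, the same pattern can be run in plain extraction mode with named groups. This is only a sketch: the field names first_word, between and repeat are made up for illustration, and it relies on named groups still being numbered, so that "\1" refers back to first_word.

```spl
| makeresults
| eval _raw="ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
| rex field=_raw "^(?<first_word>\S+)(?<between>.*?)\s(?<repeat>\1)"
| table first_word between repeat
```

With the sample event, first_word and repeat should both come back as "ubuntu", and between should hold the text in the middle, including the space that followed the first word.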

thanks and best regards,
Sekar

cpetterborg
SplunkTrust

Thank you. I saw your original post in email. I'm glad you figured it all out. Congratulations! 🙂 I've upvoted your comment for the fine explanation!
