Splunk Search

Search query to replace first occurrence word with blank but second occurrence to replace with comma

Kitteh
Path Finder

How do I use regex or replace to remove the first occurrence word found and replace second occurrence onward with comma?

For example, the raw data is:
ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root

I want it to be:
CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0),CRON[2907]: pam_unix(cron:session): session closed for user root

0 Karma
1 Solution

cpetterborg
SplunkTrust
SplunkTrust

If you have only one second occurrence of the beginning string, this will work:

| makeresults 
| eval _raw="ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root by (uid=0)" 
| rex mode=sed "s/^(\S+)(.*?)\s(\1)/\2, /"

The process for multiple occurrences is more complex. Is the data in that case similar to the example that you provided? if not can you provide an example? Is there a maximum number of occurrences?

View solution in original post

inventsekar
Super Champion

You can run rex two times, first time to replace the first ubuntu with blank,
second ubuntu with a comma

(if the string "ubuntu" is not known before hand, please update some more details(which spot it appears), so that rex can be updated)
(rex mode=sed can not be tested on regex101 website, i have tested it on splunk directly, it works fine.. please check the screenshot)

|makeresults
 | eval _raw = "ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
 | rex mode=sed field=_raw "s#(^ubuntu\s)##"
 | rex mode=sed field=_raw "s#ubuntu#,#"
 | table _raw

alt text

>>> Happy Splunking !
0 Karma

cpetterborg
SplunkTrust
SplunkTrust

If you have only one second occurrence of the beginning string, this will work:

| makeresults 
| eval _raw="ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root by (uid=0)" 
| rex mode=sed "s/^(\S+)(.*?)\s(\1)/\2, /"

The process for multiple occurrences is more complex. Is the data in that case similar to the example that you provided? if not can you provide an example? Is there a maximum number of occurrences?

inventsekar
Super Champion

Hi @cpetterborg, great rex command... Great learning !

to other rex beginners, let me explain it -
"s/^(\S+)(.?)\s(\1)/\2, /"
^(\S+) --- captures the first word
`(.
?)------ remaining line is captured as "\2", till the 2nd ubuntu match
\s(\1)---- matching for "a space and word ubuntu"
before the "/", only matching part, after this "/", its the replacement part
\2,--- on the replacement, leave the\1`, write the "\2" match and then a comma ",". thats it.

>>> Happy Splunking !

cpetterborg
SplunkTrust
SplunkTrust

Thank you. I saw your original post in email. I'm glad you figured it all out. Congratulations! 🙂 I've upvoted your comment for the fine explanation!

Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...