
How do I write a regular expression to extract different formats of the altSecurityIdentities attribute I'm querying in LDAP?

jmaple
Communicator

I am trying to extract text from a specific attribute that I am querying in LDAP. Our "altSecurityIdentities" attribute is not formatted the same way for all users: the data either has an additional text string that I want to capture, or it doesn't. When I write the regex so that it captures the additional string, it gathers too much from the values that don't have that string; when I tighten it so those values capture less, my first example loses the extra string too. Confused yet?

{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 CN=DOE JANE (Affiliate)"]}
{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291"]}

Basically I'm trying to extract the name after CN=, but since the lines aren't structured the same way (OID value comes before CN in one event but not the other), I'm having trouble finding the balance where I can capture the extra string in one, but not gather the OID value of the other.

I started with this simple regex:

CN=[^"]+

While that captures CN=DOE JANE (Affiliate) correctly, it also captures CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291, since on the other object the closing quotation mark comes at the end of the value, after the OID string. I know I'm missing something fairly simple, but I just can't seem to get it.
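
To make the over-capture concrete, here is a minimal Python sketch run against the tail end of the two sample values above. Python's `re` module stands in for Splunk's PCRE engine, which is an assumption of this illustration; for a pattern this simple the two should behave the same way.

```python
import re

# Tail end of the two sample values shown above.
jane = 'OID.0.9.2342.19200300.100.1.1=16651003215794 CN=DOE JANE (Affiliate)"]}'
john = 'CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291"]}'

# The character class keeps matching until the next double quote,
# so anything between CN= and that quote is swallowed.
pattern = re.compile(r'CN=[^"]+')

jane_match = pattern.search(jane).group(0)
john_match = pattern.search(john).group(0)

print(jane_match)  # CN=DOE JANE (Affiliate)
print(john_match)  # CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291
```

The only stop condition in this pattern is the double quote, so the match needs a second stop condition for values where the OID follows the name.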


rsennett_splunk
Splunk Employee

This works for the scenarios you've given.

CN=(?P<CN>.*?)(?:OID|"]})

The question is: are those the only scenarios? It would be a good idea to first take a look at the patterns using the punct field with a simple stats count by punct, and then examine the differences between the patterns.

That way you can find other possible "run-on sentence" holes. But the fact that you can anchor on the CN makes this pretty clean. The only consequence in this case is that if the CN is not followed by OID or "]} it's not going to pick it up. So perhaps a bit of tweaking, or what I'm sure are many additional suggestions happening while I'm typing this, will help! 🙂
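
The pattern survey suggested above can also be tried offline before touching the search. The sketch below is a rough Python approximation of grouping by punctuation shape. The `shape` helper is hypothetical and is not Splunk's exact punct algorithm: it simply drops letters and digits and marks whitespace with an underscore.

```python
import re
from collections import Counter

# The two sample events from the question.
events = [
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 '
    'CN=DOE JANE (Affiliate)"]}',
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN '
    'OID.0.9.2342.19200300.100.1.1=16651002070291"]}',
]

def shape(event: str) -> str:
    """Hypothetical stand-in for punct: keep punctuation, mark whitespace as '_'."""
    without_alnum = re.sub(r'[A-Za-z0-9]', '', event)
    return re.sub(r'\s', '_', without_alnum)

# Count how many events share each shape, most common first.
shape_counts = Counter(shape(event) for event in events)
for event_shape, count in shape_counts.most_common():
    print(count, event_shape)
```

Each distinct shape is a format the regex has to handle, so a shape that shows up here but was never tested is a candidate hole.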

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

sundareshr
Legend

Try this

rex "CN=(?<cn>.+?)(OID|\")" | table cn
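
As a quick check outside Splunk, the same idea (a lazy match that stops at either OID or a double quote) can be sketched in Python. Two assumptions apply: Python's `re` stands in for the PCRE engine behind rex, and the named group is written `(?P<cn>...)` because Python does not accept the `(?<cn>...)` spelling. The `trimmed` variant is a hypothetical tweak, not part of the answer above.

```python
import re

# The two sample events from the question.
events = [
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 '
    'CN=DOE JANE (Affiliate)"]}',
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN '
    'OID.0.9.2342.19200300.100.1.1=16651002070291"]}',
]

# Lazy .+? grows one character at a time until OID or a double quote follows.
lazy = re.compile(r'CN=(?P<cn>.+?)(OID|")')
names = [lazy.search(event).group('cn') for event in events]
print(names)  # ['DOE JANE (Affiliate)', 'DOE JOHN ']  (note the trailing space)

# Hypothetical variant: \s* absorbs the space before OID so it stays out of cn.
trimmed = re.compile(r'CN=(?P<cn>.+?)\s*(OID|")')
trimmed_names = [trimmed.search(event).group('cn') for event in events]
print(trimmed_names)  # ['DOE JANE (Affiliate)', 'DOE JOHN']
```

One caveat with either form: a name that itself contains the letters OID would be cut short, so a stricter terminator may be worth considering if such names exist in the directory.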

jmaple
Communicator

That did the trick. I guess I stared at it too long. Much obliged.


rsennett_splunk
Splunk Employee

Zactly. However, it will be more efficient if you mark the second capturing group as non-capturing. Otherwise the regex engine captures that text only to discard it; better not to capture it at all (in the scheme of things). So (OID|\") becomes (?:OID|\")
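
The difference between the two forms can be seen in a small Python sketch (again assuming Python's `re` as a stand-in for PCRE, with `(?P<cn>...)` as the named-group spelling). It only shows that the extracted name is unchanged and that one fewer group is recorded; it does not measure performance.

```python
import re

# Tail end of the second sample value from the question.
subject = 'CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291"]}'

capturing = re.compile(r'CN=(?P<cn>.+?)(OID|")')
non_capturing = re.compile(r'CN=(?P<cn>.+?)(?:OID|")')

# Number of capture groups each pattern records.
print(capturing.groups)      # 2
print(non_capturing.groups)  # 1

# The terminator is stored only by the capturing form.
print(capturing.search(subject).groups())      # ('DOE JOHN ', 'OID')
print(non_capturing.search(subject).groups())  # ('DOE JOHN ',)
```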
