
How do I write a regular expression to extract different formats of the altSecurityIdentities attribute I'm querying in LDAP?

jmaple
Communicator

I am trying to extract text from a specific attribute that I am querying in LDAP. Our "altSecurityIdentities" attribute is not formatted the same way for all users: the value either contains an additional text string that I want to capture, or it doesn't. When I write the regex so that it captures the additional string, it captures too much on the values that lack that string; when I tighten it so those values capture less, my first example loses the additional string too. Confused yet?

{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 CN=DOE JANE (Affiliate)"]}
{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291"]}

Basically I'm trying to extract the name after CN=, but since the lines aren't structured the same way (OID value comes before CN in one event but not the other), I'm having trouble finding the balance where I can capture the extra string in one, but not gather the OID value of the other.

I started with this simple regex:

CN=[^"]+

While that captures CN=DOE JANE (Affiliate) correctly, it also captures CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291, because on the other object the closing quotation mark comes at the end of the value, after the OID string. I know I'm missing something fairly simple, but I just can't seem to get it.
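For anyone who wants to reproduce the over-capture outside Splunk, here is a minimal Python sketch run against the two sample events above. It uses Python's `re` module rather than Splunk's `rex`, so treat it as an illustration of the pattern's behavior, not of Splunk itself; the names `EVENTS` and `matches` are just for this example.

```python
import re

# The two sample events from the question, copied verbatim.
EVENTS = [
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 '
    'CN=DOE JANE (Affiliate)"]}',
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN '
    'OID.0.9.2342.19200300.100.1.1=16651002070291"]}',
]

# The original attempt: everything from CN= up to the next double quote.
pattern = re.compile(r'CN=[^"]+')

matches = [pattern.search(event).group(0) for event in EVENTS]
for match in matches:
    print(match)
```

The first event stops at the quote right after the name, but in the second event the only quote after CN= comes after the OID value, so the OID is swept into the match.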


rsennett_splunk
Splunk Employee

This works for the scenarios you've given.

CN=(?P<CN>.*?)(?:OID|"]})

The question is: are those the only scenarios? It would be a good idea to first look at the patterns using the punct field, with a simple stats count by punct, and then examine how the different patterns differ.
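To get a feel for what grouping by punctuation pattern reveals, here is a rough Python sketch. The helper `approx_punct` is hypothetical and only approximates the idea behind the punct field (drop letters and digits, turn whitespace into underscores); it is not Splunk's exact algorithm, and `EVENTS` simply holds the two sample events from the question.

```python
import re
from collections import Counter

# The two sample events from the question, copied verbatim.
EVENTS = [
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 '
    'CN=DOE JANE (Affiliate)"]}',
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN '
    'OID.0.9.2342.19200300.100.1.1=16651002070291"]}',
]


def approx_punct(event: str) -> str:
    """Approximate a punctuation signature (assumption: not Splunk's real punct logic)."""
    without_alnum = re.sub(r'[A-Za-z0-9]', '', event)  # keep only punctuation and whitespace
    return re.sub(r'\s', '_', without_alnum)           # make whitespace visible as underscores


# Count how many events share each punctuation signature.
pattern_counts = Counter(approx_punct(event) for event in EVENTS)
for signature, count in pattern_counts.most_common():
    print(count, signature)
```

Events with the OID before the CN and events with the CN before the OID end up with different signatures, which is the kind of distinction worth checking for before settling on a regex.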

That way you can find other possible "run-on sentence" holes. But the fact that you can anchor on the CN makes this pretty clean. The only consequence in this case is that if the CN is not followed by OID or "]} it's not going to pick it up. So perhaps a bit of tweaking, or what I'm sure are many additional suggestions happening while I'm typing this, will help! 🙂

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

sundareshr
Legend

Try this

rex "CN=(?<cn>.+?)(OID|\")" | table cn
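As a quick check outside Splunk, the same idea can be sketched in Python against the two sample events. Note the assumptions: Python's `re` module needs the `(?P<cn>...)` spelling for named groups instead of `(?<cn>...)`, and `extract_cn`, `CN_RE`, and `CN_TRIMMED_RE` are names made up for this example.

```python
import re

# The two sample events from the question, copied verbatim.
EVENTS = [
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>OID.0.9.2342.19200300.100.1.1=16651003215794 '
    'CN=DOE JANE (Affiliate)"]}',
    '{"altSecurityIdentities":["X509:<I>C=US,O=Entrust,OU=Certification Authorities,'
    'OU=Entrust Managed Services SSP CA<S>CN=DOE JOHN '
    'OID.0.9.2342.19200300.100.1.1=16651002070291"]}',
]

# Lazy match after CN=, stopping at the first "OID" or double quote.
CN_RE = re.compile(r'CN=(?P<cn>.+?)(OID|")')

# Variant that also drops whitespace sitting between the name and the terminator.
CN_TRIMMED_RE = re.compile(r'CN=(?P<cn>.+?)\s*(?:OID|")')


def extract_cn(event, pattern=CN_RE):
    """Return the captured name, or None when the pattern does not match."""
    match = pattern.search(event)
    return match.group('cn') if match else None


names = [extract_cn(event) for event in EVENTS]
trimmed_names = [extract_cn(event, CN_TRIMMED_RE) for event in EVENTS]
print(names)
print(trimmed_names)
```

One detail this surfaces: when the OID follows the name, the lazy group keeps the space before "OID", so the second name comes back with a trailing space unless it is trimmed or the `\s*` variant is used.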


jmaple
Communicator

That did the trick. I guess I stared at it too long. Much obliged.


rsennett_splunk
Splunk Employee

Zactly. However, it will be more efficient if you mark the second capturing group as non-capturing. Otherwise the regex engine will capture that text only to discard it; better not to capture it at all (in the scheme of things). So (OID|\") becomes (?:OID|\")
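The difference between the two forms is easy to see in a small Python sketch. It uses Python's `re` module (with the `(?P<cn>...)` spelling) instead of `rex`, and the `sample` string is just the tail of the second event from the question; it shows what each form captures and does not measure performance.

```python
import re

# Tail of the second sample event, where the OID follows the name.
sample = 'CN=DOE JOHN OID.0.9.2342.19200300.100.1.1=16651002070291"'

# Capturing terminator: the engine stores "OID" as a second group.
capturing = re.search(r'CN=(?P<cn>.+?)(OID|")', sample)

# Non-capturing terminator: same match, but nothing extra is stored.
non_capturing = re.search(r'CN=(?P<cn>.+?)(?:OID|")', sample)

print(capturing.groups())      # two groups: the name and the terminator
print(non_capturing.groups())  # one group: just the name
```

Both versions extract the same cn value; the non-capturing form simply avoids creating a group nobody reads.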
