How to filter the log using REGEX?

sarwshai
Communicator

I have logs that contain 'LogonType=Owner' and others that contain 'InternalLogonType=Owner'.
I want to send events with 'LogonType=Owner' to the nullQueue but keep the latter. How can I write a regex for that?
I assume a regex for 'LogonType=Owner' would also match 'InternalLogonType=Owner' and send those events to the nullQueue as well.

Note: the events are large; 'LogonType=Owner' and 'InternalLogonType=Owner' are each just one string within them.

woodcock
Esteemed Legend

You need to use negative look-behind, like this:

props.conf

[yoursourcetype]
TRANSFORMS-filter = logontype-setnull

transforms.conf

[logontype-setnull]
REGEX = (?<!Internal)LogonType="Owner"
DEST_KEY = queue
FORMAT = nullQueue
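
A lookbehind pattern like this can be sanity-checked against sample events before it is deployed. The sketch below is only an approximation: it uses Python's `re` module rather than Splunk's own regex engine, and the two events are shortened, hypothetical stand-ins for the samples posted in this thread. It includes the `="Owner"` value in the pattern, because a pattern that stops at `LogonType=` would also match events carrying `LogonType="Admin"`.

```python
import re

# Shortened, hypothetical stand-ins for the two sample events in this thread.
EVENT_OWNER = '2019-04-01T00:14:59+02:00 Operation="" LogonType="Owner" ExternalAccess="False"'
EVENT_INTERNAL = '2019-04-01T00:14:59+02:00 Operation="" LogonType="Admin" InternalLogonType="Owner"'

# Negative lookbehind: match LogonType="Owner" only when the text
# immediately before it is not "Internal".
PATTERN = re.compile(r'(?<!Internal)LogonType="Owner"')


def would_be_dropped(event: str) -> bool:
    """Return True if the pattern matches, i.e. the event would be routed to nullQueue."""
    return PATTERN.search(event) is not None


print("owner event dropped:   ", would_be_dropped(EVENT_OWNER))
print("internal event dropped:", would_be_dropped(EVENT_INTERNAL))
```

Only the first event should be reported as dropped. The same check is worth repeating with full, real events, since the shortened ones here leave out most fields.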

jeffland
SplunkTrust

There are already valid answers here, but I think the regex can be improved. Instead of using a negative lookbehind, I would either use \b to find a word boundary before the literal LogonType (which InternalLogonType will not match):

\bLogonType=Owner

But based on your comment showing example events, an even better option is to go with an explicit match including the * before LogonType:

\*LogonType="Owner

Either of these works, but compare the step counts yourself and you'll see that they are not equally performant: negative lookbehind 116 steps, word boundary 69 steps, and explicit match 32 steps for your two sample events.
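
To confirm that the lookbehind and word-boundary variants select the same events, a small comparison can be run outside Splunk. This is a rough sketch under a few assumptions: it uses Python's `re` module instead of Splunk's regex engine, the events are shortened, hypothetical versions of the posted samples without the asterisks, and it does not reproduce the step counts from the regex debugger. It only shows which events each pattern matches.

```python
import re

# Hypothetical, shortened events modelled on the samples in this thread.
events = {
    "owner": 'Operation="" LogonType="Owner" MailboxOwnerUPN=""',
    "internal": 'LogonType="Admin" ClientVersion="" InternalLogonType="Owner" MailboxOwnerUPN=""',
}

# The two candidate patterns under comparison.
patterns = {
    "lookbehind": r'(?<!Internal)LogonType="Owner',
    "word_boundary": r'\bLogonType="Owner',
}

# For every pattern, record whether it matches each event.
results = {
    name: {label: bool(re.search(rx, text)) for label, text in events.items()}
    for name, rx in patterns.items()
}

for name, outcome in results.items():
    print(name, outcome)
```

Both patterns should match the "owner" event and skip the "internal" one. There is no word boundary between the "l" of Internal and the "L" of LogonType, which is why `\b` rejects InternalLogonType.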

FrankVl
Ultra Champion

I think he was trying to make those bits bold; I don't think those asterisks are part of his log. Assuming the events all follow the same structure (the timestamp and first few fields are missing from his first sample), \sLogonType="Owner should do the trick.

jeffland
SplunkTrust

Yeah, probably. No accurate regexes without accurate data 😕
\s and \b should both do the trick, but \b is still way better because it has fewer matches. I've recreated the two events this time (both now have the timestamp and the other fields, and the asterisks are removed), and they compare at 66 to 1569 steps. Check out the debugger to see why.

sarwshai
Communicator

\b also worked. The forum doesn't let me accept this one as an answer as well, but thanks much!

FrankVl
Ultra Champion

Those numbers are a bit confusing. When I look at the debugger, the \s option is actually quicker at finding the 1st match, since \b also matches some of the special characters in the timestamp, while \s doesn't.
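
The difference between the two anchors can be illustrated with a short script. This is a sketch only: it uses Python's `re` module, not Splunk's regex engine, the events are hypothetical and shortened, and it counts candidate positions rather than debugger steps. It shows that `\b` has many more places to try in a timestamped event, and that `\s` cannot match when the field is the very first thing in the event.

```python
import re

# Hypothetical, shortened events modelled on the samples in this thread.
at_start = 'LogonType="Owner" MailboxOwnerUPN=""'
mid_event = '2019-04-01T00:14:59+02:00 Operation="" LogonType="Owner"'
internal = 'ClientVersion="" InternalLogonType="Owner"'

whitespace_rx = re.compile(r'\sLogonType="Owner')
boundary_rx = re.compile(r'\bLogonType="Owner')


def matches(rx: re.Pattern, text: str) -> bool:
    """Return True if the compiled pattern is found anywhere in the text."""
    return rx.search(text) is not None


# Number of positions where each anchor alone can succeed in the timestamped event.
boundary_positions = len(re.findall(r'\b', mid_event))
space_positions = len(re.findall(r'\s', mid_event))

for label, text in [("at_start", at_start), ("mid_event", mid_event), ("internal", internal)]:
    print(label, "\\s:", matches(whitespace_rx, text), "\\b:", matches(boundary_rx, text))
print("word-boundary positions:", boundary_positions, "whitespace positions:", space_positions)
```

If every real event starts with a timestamp, the `at_start` case never occurs and either anchor gives the same routing result; the remaining difference is then purely about how much work the engine does.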

sarwshai
Communicator

Yes, the ** ** was meant to make it bold; somehow it didn't work. The structure is the same as in the logs I pasted above.

woodcock
Esteemed Legend

You need to use negative look-behind, like this:

props.conf

[yoursourcetype]
TRANSFORMS-filter = logontype-setnull

transforms.conf

[logontype-setnull]
REGEX = (?<!Internal)LogonType="Owner"
DEST_KEY = queue
FORMAT = nullQueue

sarwshai
Communicator

Thanks, I will try this and confirm.

sarwshai
Communicator

This worked, thanks!

FrankVl
Ultra Champion

One option is to define a regex that matches only when the field is exactly LogonType="Owner". A REGEX like \s+LogonType="Owner" will quite likely work, since it only matches LogonType="Owner" when it is preceded by whitespace (incl. newline). But as @richgalloway mentions: if you want proper help with that, we would need to see a full sample.

Alternatively, you can use 2 transforms (naturally, this is less efficient):

props.conf

[yoursourcetype]
TRANSFORMS-filter = logontype-setnull,internallogontype-setparse

transforms.conf

[logontype-setnull]
REGEX = LogonType="Owner"
DEST_KEY = queue
FORMAT = nullQueue

[internallogontype-setparse]
REGEX = InternalLogonType="Owner"
DEST_KEY = queue
FORMAT = indexQueue

This first sets the queue to nullQueue for both types (because the first regex matches both options) and then sets the queue back to indexQueue for the InternalLogonType="Owner" case.
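
The ordering logic of the two transforms can be modelled with a few lines of Python. This is a toy simulation, not Splunk code: the `route` function and the `TRANSFORMS` list are hypothetical names, the events are shortened stand-ins for the posted samples, and the model only captures the idea that a later matching transform overwrites the queue set by an earlier one.

```python
import re

# Ordered (regex, queue) pairs, mirroring the order listed in TRANSFORMS-filter.
TRANSFORMS = [
    (re.compile(r'LogonType="Owner"'), "nullQueue"),           # also matches InternalLogonType="Owner"
    (re.compile(r'InternalLogonType="Owner"'), "indexQueue"),  # puts those events back
]


def route(event: str, default: str = "indexQueue") -> str:
    """Apply each transform in order; a later match overwrites the queue set earlier."""
    queue = default
    for regex, destination in TRANSFORMS:
        if regex.search(event):
            queue = destination
    return queue


# Hypothetical, shortened events.
print(route('Operation="" LogonType="Owner" MailboxOwnerUPN=""'))
print(route('LogonType="Admin" InternalLogonType="Owner" MailboxOwnerUPN=""'))
print(route('Operation="" LogonType="Admin" MailboxOwnerUPN=""'))
```

In this model, events that match neither regex keep the default queue. One side effect of the approach: an event containing both LogonType="Owner" and InternalLogonType="Owner" would also be kept, because the second transform matches it.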

sarwshai
Communicator

I have pasted the sample above. However, I have a doubt about your suggestion to move InternalLogonType to the indexQueue: there are many other logs in the same sourcetype, so would I need to write a regex for every log other than LogonType=Owner to move it to the indexQueue, or would the other logs be indexed directly?

FrankVl
Ultra Champion

Other logs are not affected: they will not match LogonType=Owner, so they keep their original queue destination (the indexQueue).

FrankVl
Ultra Champion

PS: from your sample log, it seems there is a " before the field value? I added that to the REGEX in my answer.

richgalloway
SplunkTrust

Can you share a sample log or two? We need to see what comes before "LogonType=Owner" to create the regex.

sarwshai
Communicator

Below are logs that contain both 'LogonType=Owner' and 'InternalLogonType=Owner':

1.    **LogonType="Owner"** MailboxOwnerUPN="" MailboxOwnerSid="" DestMailboxOwnerUPN="" DestMailboxOwnerSid="" DestMailboxGuid="" CrossMailboxOperation="" LogonUserDisplayName="" LogonUserSid="" SourceItems="" SourceFolders="" SourceItemIdsList="" SourceItemSubjectsList="" SourceItemAttachmentsList="" SourceItemFolderPathNamesList="Inbox" SourceFolderPathNamesList="" ItemId="" ItemSubject="" ItemAttachments="" DirtyProperties="" OriginatingServer="" MailboxGuid="" MailboxResolvedOwnerName="" LastAccessed="" Identity="=" IsValid="True" ObjectState="New"


2.  2019-04-01T00:14:59+02:00 Operation="" OperationResult="" LogonType="Admin" ExternalAccess="False" DestFolderId="" DestFolderPathName="" FolderId="" FolderPathName="" ClientInfoString="Client=POP3/IMAP4;Protocol=IMAP4" ClientIPAddress="" ClientMachineName="" ClientProcessName="" ClientVersion="" **InternalLogonType="Owner"** MailboxOwnerUPN="" MailboxOwnerSid="" DestMailboxOwnerUPN="" DestMailboxOwnerSid="" DestMailboxGuid="" CrossMailboxOperation="" LogonUserDisplayName="" LogonUserSid="" SourceItems="" SourceFolders="" SourceItemIdsList=" SourceItemSubjectsList="" SourceItemAttachmentsList="" SourceItemFolderPathNamesList="Inbox" SourceFolderPathNamesList="" ItemId="" ItemSubject="" ItemAttachments="" DirtyProperties="" OriginatingServer="" MailboxGuid="" MailboxResolvedOwnerName="" LastAccessed="" Identity="" IsValid="True" ObjectState="New"

I want the first log to be discarded but not the second one.

FrankVl
Ultra Champion

Would that first log also have the timestamp and operation= and operationresult fields preceding the logontype field?

Also: the ** are because you were trying to make these parts bold I guess? Not because those * characters are in your actual logs?

sarwshai
Communicator

Yes, the ** was for bold, my bad! The structure is the same as I pasted above.
