Splunk Search

Complex Regex, HELP!

Builder

I have a transform that I need help writing a regex for. It has two conditions.

  1. It needs to match the value in this field <BusinessDomainId>*</BusinessDomainId>
  2. It needs to not match this exact string <LogEventTypeCode>SEC_EVENT</LogEventTypeCode>

Here's what I have so far. It meets the requirements for condition #1 and works correctly:

REGEX=(?m)\<BusinessDomainId\>(BusinessDomainId1|BusinessDomainId2|BusinessDomainId3)\</BusinessDomainId\>

How can I add condition #2 in there so it will not match if it sees <LogEventTypeCode>SEC_EVENT</LogEventTypeCode> in the same event?

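Here are some raw events.
This one contains both the BusinessDomainId value from condition 1 and the SEC_EVENT string from condition 2, so the transform should not match it: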

    <ELLogInputMessage> 
        <Header> 
            <LogEventTypeCode>SEC_EVENT</LogEventTypeCode> 
            <LogSeverityCode>CRITICAL</LogSeverityCode> 
            <LogEventDateTime>2014-05-06T23:59:59.9999999-05:00</LogEventDateTime> 
        </Header> 
        <SourceInformation> 
            <EAPMId>1</EAPMId> 
            <HostMachineName>HostMachineName3</HostMachineName> 
            <HostEnvironmentName>HostEnvironmentName3</HostEnvironmentName> 
            <ComponentId>ComponentId3</ComponentId> 
            <ComponentName>ComponentName3</ComponentName> 
            <ApplicationEventCorrelationId>ApplicationEventCorrelationId3</ApplicationEventCorrelationId> 
            <UserId>UserId1</UserId> 
            <UserSrc>UserSrc1</UserSrc> 
            <BusinessDomainId>BusinessDomainId1</BusinessDomainId> 
            <BusinessDomainName>BusinessDomainName1</BusinessDomainName> 
        </SourceInformation> 
        <DataAccessInformation> 
            <DataCompId>DataCompId2</DataCompId> 
            <TypeOfAccess>VIEW</TypeOfAccess> 
            <SubjectOfInterest> 
                <SubjectId>SubjectId13</SubjectId> 
                <SubjectName>SubjectName13</SubjectName> 
                <SubjectDomainName>SubjectDomainName3</SubjectDomainName> 
            </SubjectOfInterest> 
            <AccessDateTime>2014-05-06T23:59:59.9999999-05:00</AccessDateTime> 
        </DataAccessInformation> 
        <DetailedLogInformation>anyType</DetailedLogInformation>
    </ELLogInputMessage>

This one contains only the BusinessDomainId value from condition 1, so the transform should match it.
The only difference here is this line: <LogEventTypeCode>APP_EVENT</LogEventTypeCode>

    <ELLogInputMessage> 
        <Header> 
            <LogEventTypeCode>APP_EVENT</LogEventTypeCode> 
            <LogSeverityCode>CRITICAL</LogSeverityCode> 
            <LogEventDateTime>2014-05-06T23:59:59.9999999-05:00</LogEventDateTime> 
        </Header> 
        <SourceInformation> 
            <EAPMId>1</EAPMId> 
            <HostMachineName>HostMachineName3</HostMachineName> 
            <HostEnvironmentName>HostEnvironmentName3</HostEnvironmentName> 
            <ComponentId>ComponentId3</ComponentId> 
            <ComponentName>ComponentName3</ComponentName> 
            <ApplicationEventCorrelationId>ApplicationEventCorrelationId3</ApplicationEventCorrelationId> 
            <UserId>UserId1</UserId> 
            <UserSrc>UserSrc1</UserSrc> 
            <BusinessDomainId>BusinessDomainId1</BusinessDomainId> 
            <BusinessDomainName>BusinessDomainName1</BusinessDomainName> 
        </SourceInformation> 
        <DataAccessInformation> 
            <DataCompId>DataCompId2</DataCompId> 
            <TypeOfAccess>VIEW</TypeOfAccess> 
            <SubjectOfInterest> 
                <SubjectId>SubjectId13</SubjectId> 
                <SubjectName>SubjectName13</SubjectName> 
                <SubjectDomainName>SubjectDomainName3</SubjectDomainName> 
            </SubjectOfInterest> 
            <AccessDateTime>2014-05-06T23:59:59.9999999-05:00</AccessDateTime> 
        </DataAccessInformation> 
        <DetailedLogInformation>anyType</DetailedLogInformation>
    </ELLogInputMessage>

Communicator

You're going to have to use a negative assertion.
Example:

(?!.*"fill in the blank here") The .* lets the lookahead scan everything after the current position, so the match is rejected wherever the string appears later in the event. My response is very late, so for more info just respond to this comment.
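To make the idea concrete, here is a minimal Python sketch of a negative lookahead anchored at the start of the event. It assumes Python's re module is a fair stand-in for the PCRE engine behind the transform for these particular constructs. make_event is a hypothetical helper that builds a trimmed-down version of the sample events, and (?s) is added so that . can cross line breaks.

```python
import re

# Assumption: Python's re stands in for PCRE here; the lookahead, \A and (?s)
# constructs used below exist in both engines.
PATTERN = re.compile(
    r"(?s)\A(?!.*<LogEventTypeCode>SEC_EVENT</LogEventTypeCode>)"
    r".*<BusinessDomainId>"
    r"(BusinessDomainId1|BusinessDomainId2|BusinessDomainId3)"
    r"</BusinessDomainId>"
)


def make_event(type_code):
    """Hypothetical helper: a shortened version of the sample events."""
    return (
        "<ELLogInputMessage>\n"
        "  <Header>\n"
        f"    <LogEventTypeCode>{type_code}</LogEventTypeCode>\n"
        "  </Header>\n"
        "  <SourceInformation>\n"
        "    <BusinessDomainId>BusinessDomainId1</BusinessDomainId>\n"
        "  </SourceInformation>\n"
        "</ELLogInputMessage>"
    )


# The lookahead runs once, at the start of the event, and rejects the whole
# event if the SEC_EVENT element appears anywhere in it.
sec_match = PATTERN.search(make_event("SEC_EVENT"))
app_match = PATTERN.search(make_event("APP_EVENT"))

print("SEC_EVENT event matched:", sec_match is not None)
print("APP_EVENT event matched:", app_match is not None)
```

Anchoring with \A matters: without it the engine could retry the lookahead at a later position, past the SEC_EVENT element, and still find a match.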

Hope you found an answer.


SplunkTrust

I'm going to suggest this -->

REGEX = (?ms)<LogEventTypeCode>(?!SEC_EVENT).*\<BusinessDomainId\>(BusinessDomainId1|BusinessDomainId2|BusinessDomainId3)\</BusinessDomainId\>

(Which wraps poorly)

I think this is what you want, just stated (a little) differently. I'm using a "negative lookahead assertion" operator, which says (roughly):

"LogEventTypeCode followed by anything BUT SEC_EVENT, followed by some anything, then BusinessDomainId"

If this isn't exactly it, it's close 😄
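A pattern of this shape can be checked against the two sample events with a short Python sketch. It assumes Python's re behaves like PCRE for these constructs, and make_event is a hypothetical helper that builds shortened copies of the events. Because the events span several lines, the sketch compares the pattern with and without the (?s) flag, which is what lets . match a line break.

```python
import re

ALTERNATION = r"(BusinessDomainId1|BusinessDomainId2|BusinessDomainId3)"
BODY = (
    r"<LogEventTypeCode>(?!SEC_EVENT).*<BusinessDomainId>"
    + ALTERNATION
    + r"</BusinessDomainId>"
)

# (?m) only changes how ^ and $ behave; (?s) lets "." match newlines.
multiline_only = re.compile(r"(?m)" + BODY)
with_dotall = re.compile(r"(?ms)" + BODY)


def make_event(type_code):
    """Hypothetical helper: a shortened version of the sample events."""
    return (
        "<ELLogInputMessage>\n"
        "  <Header>\n"
        f"    <LogEventTypeCode>{type_code}</LogEventTypeCode>\n"
        "  </Header>\n"
        "  <SourceInformation>\n"
        "    <BusinessDomainId>BusinessDomainId1</BusinessDomainId>\n"
        "  </SourceInformation>\n"
        "</ELLogInputMessage>"
    )


app_event = make_event("APP_EVENT")
sec_event = make_event("SEC_EVENT")

app_without_dotall = multiline_only.search(app_event) is not None
app_with_dotall = with_dotall.search(app_event) is not None
sec_with_dotall = with_dotall.search(sec_event) is not None

print("APP_EVENT, (?m) only:", app_without_dotall)
print("APP_EVENT, (?ms):    ", app_with_dotall)
print("SEC_EVENT, (?ms):    ", sec_with_dotall)
```

One thing to keep in mind with this form: (?!SEC_EVENT) only checks the text right after the opening tag, so it would also reject a value that merely starts with SEC_EVENT.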

Legend

@araitz - kind of big, but I like it

Splunk Employee

This picture is hanging on the 2nd floor of Splunk HQ:

[image]

SplunkTrust

If the goal of the transforms rule is, for example, routing, then you could split this into two regexes. One looks for your BusinessDomainId and does something; the other looks for your SEC_EVENT and undoes that something.
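The two-rule idea can be sketched in plain Python as follows. This is not Splunk configuration: route, "domain_route" and "default_route" are hypothetical names, and the sketch only assumes that a later rule is allowed to override what an earlier rule set.

```python
import re

# Rule 1: the asker's original condition #1 pattern.
DOMAIN_RE = re.compile(
    r"<BusinessDomainId>"
    r"(BusinessDomainId1|BusinessDomainId2|BusinessDomainId3)"
    r"</BusinessDomainId>"
)
# Rule 2: a plain positive match for the string that must not be present.
SEC_EVENT_RE = re.compile(r"<LogEventTypeCode>SEC_EVENT</LogEventTypeCode>")


def route(event, default="default_route", special="domain_route"):
    """Hypothetical router: apply rule 1, then let rule 2 undo it."""
    destination = default
    if DOMAIN_RE.search(event):
        destination = special   # rule 1 "does something"
    if SEC_EVENT_RE.search(event):
        destination = default   # rule 2 "undoes that something"
    return destination


def make_event(type_code):
    """Hypothetical helper: a shortened version of the sample events."""
    return (
        "<ELLogInputMessage>\n"
        f"  <Header><LogEventTypeCode>{type_code}</LogEventTypeCode></Header>\n"
        "  <SourceInformation>\n"
        "    <BusinessDomainId>BusinessDomainId1</BusinessDomainId>\n"
        "  </SourceInformation>\n"
        "</ELLogInputMessage>"
    )


app_route = route(make_event("APP_EVENT"))
sec_route = route(make_event("SEC_EVENT"))
print(app_route, sec_route)
```

Neither pattern needs a lookahead or a . that spans lines, which is the main appeal of splitting the logic in two.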


SplunkTrust

So your goal here is to build a transform that will extract a field for "BusinessDomainId", but only when the event does not also include <LogEventTypeCode>SEC_EVENT</LogEventTypeCode>?


Builder

Made an edit and added sample events


Legend

A "not this" in regular expressions is notoriously difficult. I would suggest that you break this into two separate transforms if you can (and maybe you can't...)


Motivator

Please post the actual characters (anonymized of course) for event 1 and event 2, and we can help much better. RegEx is VERY specific, and it helps to see what you're trying to match.


Path Finder

Could you post two (anonymized if you want) examples of the full event? One that does match condition 2 and one that doesn't? I'm not an expert on regex by any means, but I'm confident that where condition 2 appears in the event will affect the final regex format...
