Mismatched ']' error in a search run through the Python Splunk SDK

bergen288
Engager

I am using Python 3.8.5 and splunk-sdk 1.6.16. My Splunk developer gave me a URL, and I took the search string from it to retrieve data, as shown below.

[Screenshot: bergen288_0-1636141902635.png]

Below are my search string and the additional Python code. I added the "search" keyword and the earliest/latest time bounds after copying and pasting the search string.

SEARCH_STRING = f"""
    search sourcetype="builder:payeeservice" host="JWPP*BLDR*P*" "*PayeeAddResponse" "*" "*" "*" "*" "*" "*" "*"
    earliest=-1h@h latest=-0h@h
    |rex d5p1:Description>(?<Description>.*</d5p1:Description>)
    |eval Description = replace(Description,"<[/]*[d]5p1:[\S]*>|<[d]5p1:[\S\s\"\=]*/>", "")
    |rex "GU\(((?P<SponsorId>[^;]+);(?P<SubscriberId>[^;]+);(?P<SessionId>[^;]*);(?P<CorrelationId>[^;]+);(?P<Version>\w+))\)"
    |table _time,SponsorId, SubscriberId,SessionId, CorrelationId,Description
    |join type=left CorrelationId [search sourcetype="builder:payeeservice" host="JWPP*BLDR*P*"  "*AdditionalInformation*" |xmlkv ]
    |eval Timestamp = if((TenantId != ""),Timestamp,_time),PayeeName = if((TenantId != ""),PayeeName,""), Message = if((Description != ""),Description,Message), Exception = if((TenantId != ""),Exception,""), Address = if((TenantId != ""),Address,""), PayeeType = if((TenantId != ""),PayeeType,""),MerchantId = if((TenantId != ""),MerchantId,""),AccountNumber = if((TenantId != ""),AccountNumber,""),SubscriberId = if((TenantId != ""),UserId,SubscriberId),SponsorId = if((TenantId != ""),TenantId,SponsorId)
    |table Timestamp, SponsorId,SubscriberId, PayeeName,Message,Exception,CorrelationId,SessionId,PayeeName,Address,PayeeType,MerchantId,AccountNumber
"""
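import splunklib.results as results

# connect_Splunk() is a helper defined elsewhere (not shown)
service = connect_Splunk()
# jobs.oneshot() runs the search and returns a results stream that
# ResultsReader can parse; jobs.create() returns a Job object instead
rr = results.ResultsReader(service.jobs.oneshot(SEARCH_STRING))
ord_list = []
for result in rr:
    if isinstance(result, results.Message):
        # Skip diagnostic messages
        continue
    if isinstance(result, dict):
        # Normal events are returned as dicts
        ord_list.append(result)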
 
I get the error below, so something is wrong in my search string. How can I fix it?
splunklib.binding.HTTPError: HTTP 400 Bad Request -- Error in 'SearchParser': Mismatched ']'.
 
Thanks.
 

PickleRick
SplunkTrust

Firstly - what is this abomination

 search sourcetype="builder:payeeservice" host="JWPP*BLDR*P*" "*PayeeAddResponse" "*" "*" "*" "*" "*" "*" "*"

Wildcards at the beginning of a term cause you to scan whole events. Not a very good idea. And those repeated "*" terms are pointless.

Secondly, consider whether you can do it with some form of stats. Joins are much less efficient and have limitations.

Thirdly - start your search from the beginning and add the subsequent steps one at a time to see where the error is. It's much easier to pinpoint a mistake this way than to debug a whole complicated search.
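
The step-by-step approach can be sketched in Python as follows. This is only an illustration: cumulative_searches is a hypothetical helper, not part of splunk-sdk, and the base search and stages are shortened stand-ins for the real query.

```python
def cumulative_searches(base, stages):
    """Yield the base search, then the base plus one more pipeline stage each time."""
    query = base
    yield query
    for stage in stages:
        query = f"{query} | {stage}"
        yield query


# Shortened stand-ins for the real search (assumption, for illustration only)
base = 'search sourcetype="builder:payeeservice" earliest=-1h@h latest=-0h@h'
stages = [
    "table _time, CorrelationId",
    'join type=left CorrelationId [search sourcetype="builder:payeeservice" | xmlkv]',
]

for query in cumulative_searches(base, stages):
    # Each query could be submitted to Splunk here; the first one that is
    # rejected contains the stage that introduced the error.
    print(query)
```

Keeping the stages in a list rather than splitting the full string on "|" avoids breaking on pipes that appear inside a subsearch or a regex.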

And lastly - it has nothing to do with Python, since the search itself gives you errors.
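
One Python-side detail may still be worth ruling out: in a normal (non-raw) Python string literal, \" is collapsed to a bare " before the text ever reaches Splunk, which can leave the SPL with unbalanced quotes. Whether that is what triggers the Mismatched ']' here is an assumption; the thread does not confirm it. A minimal sketch with a simplified fragment (unescaped_quotes is a hypothetical helper):

```python
import re


def unescaped_quotes(spl):
    """Count double quotes that are not preceded by a backslash."""
    return len(re.findall(r'(?<!\\)"', spl))


# The same simplified SPL fragment, typed identically, in a normal and a raw
# Python string. Only the raw string keeps the backslash in front of the quote.
plain = """|eval D = replace(D, "<tag[\"=]*/>", "")"""
raw = r"""|eval D = replace(D, "<tag[\"=]*/>", "")"""

print(plain)
print(raw)
print(unescaped_quotes(plain), unescaped_quotes(raw))
```

The plain version ends up with an odd number of unescaped quotes, so everything after it is quoted differently from what was intended. Printing the string exactly as Python sends it, and pasting that into the Splunk search bar, is a quick way to check.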


bergen288
Engager

Good advice. I have now cut the search down to the simple statement below, which returns only the "_raw" column, since it contains all the fields I need.

SEARCH_STRING = """
   search sourcetype="builder:payeeservice" host=JWPP*BLDRBP* "*AdditionalInformation*"
    earliest=-1h@h latest=-0h@h
    |table _raw
"""
 
The sample data is in OrderedDict format, as shown below. I need to extract all fields between <NetworkPayeeAddManager> and </NetworkPayeeAddManager>, or between <PayeeAddManager> and </PayeeAddManager>, and save them to a pandas DataFrame. What's the best way to do it?
 
OrderedDict([('_raw', '2021-11-08 08:58:23,832 [42] INFO  FiservLog.stdlog - <NetworkPayeeAddManager><TenantId>13744</TenantId><UserId>999176993878</UserId><SourceMethodName>LogInfoSecure</SourceMethodName><SourceLineNumber>234</SourceLineNumber><Message>NetworkPayee was added successfully</Message><Timestamp>2021-11-08T13:58:23.831628Z</Timestamp><Exception /><AdditionalInformation><SessionId>F7E65ED4D8C74E6699C62F23ECF5D000200TWNQ9X1AA1754513234A6367FEE06</SessionId><Timestamp>11/8/2021 1:58:23 PM</Timestamp><CorrelationId>2461b5d9839a46739e9a3e918ca0681b-01</CorrelationId><PayeeName>Louisville fire brick</PayeeName><Address>{"Address1":"Po 9229","Address2":null,"City":"Louisville","State":"KY","Zip5":"40209","Zip4":null,"Zip2":null}</Address><PayeeType>UnManagedPayee</PayeeType><AccountNumber>XX2222</AccountNumber></AdditionalInformation></NetworkPayeeAddManager>')])
OrderedDict([('_raw', '2021-11-08 08:58:24,783 [105] INFO  FiservLog.stdlog - <PayeeAddManager><TenantId>DI737</TenantId><UserId>344801483</UserId><SourceMethodName>LogInfoSecure</SourceMethodName><SourceLineNumber>234</SourceLineNumber><Message>Payee was added successfully</Message><Timestamp>2021-11-08T13:58:24.7831103Z</Timestamp><Exception /><AdditionalInformation><SessionId>7FC6442718864CE4838E50B026C8D0A0000TWNXSV1721BE0D804F295706DD39E</SessionId><Timestamp>11/8/2021 1:58:24 PM</Timestamp><CorrelationId>ab33b59c-756e-4144-ad62-6f0afadbe8eb</CorrelationId><PayeeName>Gail Nezworski</PayeeName><Address>{"Address1":"2280 S 460 E","Address2":null,"City":"LaGrange","State":"IN","Zip5":"46761","Zip4":null,"Zip2":null}</Address><PayeeType>UnManagedPayee</PayeeType><AccountNumber>XXXXX1888</AccountNumber></AdditionalInformation></PayeeAddManager>')])

bergen288
Engager

I would expect the output DataFrame to have columns from the first, "TenantId", to the last, "AccountNumber", with values such as 13744 and XX2222.
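
For comparison, a client-side parse of such "_raw" values might look like the sketch below. It assumes the XML fragment is well formed (an unescaped & in a payee name would break it). parse_raw is a hypothetical helper, and the sample is shortened from the events above. Where a tag appears twice, such as Timestamp, the first occurrence is kept.

```python
import re
import xml.etree.ElementTree as ET

# Matches the XML fragment for either root element seen in the sample events
FRAGMENT = re.compile(r"<(NetworkPayeeAddManager|PayeeAddManager)>.*</\1>", re.DOTALL)


def parse_raw(raw):
    """Return a flat dict of leaf tag -> text from the XML part of a _raw value."""
    match = FRAGMENT.search(raw)
    if match is None:
        return {}
    root = ET.fromstring(match.group(0))
    row = {}
    for elem in root.iter():
        if elem is root or len(elem):
            # Skip the root and container elements such as AdditionalInformation
            continue
        # setdefault keeps the first value when a tag name is repeated
        row.setdefault(elem.tag, elem.text)
    return row


sample = (
    "2021-11-08 08:58:23,832 [42] INFO  FiservLog.stdlog - "
    "<NetworkPayeeAddManager><TenantId>13744</TenantId>"
    "<Timestamp>2021-11-08T13:58:23.831628Z</Timestamp><Exception />"
    "<AdditionalInformation><Timestamp>11/8/2021 1:58:23 PM</Timestamp>"
    "<AccountNumber>XX2222</AccountNumber></AdditionalInformation>"
    "</NetworkPayeeAddManager>"
)
rows = [parse_raw(sample)]
print(rows)
```

The resulting list of dicts can be passed to pandas.DataFrame(rows); keys missing from a row become NaN in that column.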


PickleRick
SplunkTrust

Come on. You have Splunk, don't just pull the raw data and process it on the receiver's side.

Do a proper search and retrieve the final results.

In your case the events look XML-ish. Maybe you should use spath or xpath to extract the data you want from the events.

And once again - avoid "*something" as a condition.


bergen288
Engager

Rick:

I modified my search string based on your hints. In the one-minute window at 9:33 am today, it returned 1672 rows. Unfortunately, 23 of them have no PayeeType field, so they have 12 columns while all the others have 13, which will cause the load into a pandas DataFrame to fail. Below is an example "_raw" value that lacks PayeeType. AccountNumber may have the same problem. Is there a way to have Splunk generate a "null" value for these fields, so that every row has 13 columns even when PayeeType and/or AccountNumber is missing from "_raw"?

Thanks.

"2021-11-18 09:33:06,900 [59] INFO FiservLog.stdlog - <PayeeAddManager><TenantId>FI05</TenantId><UserId>559852410</UserId><SourceMethodName>LogInfoSecure</SourceMethodName><SourceLineNumber>234</SourceLineNumber><Message>WARNING:Error adding Payee:Subscriber status prevents this action from being completed</Message><Timestamp>2021-11-18T14:33:06.899739Z</Timestamp><Exception /><AdditionalInformation><SessionId>463949F06E9F4B93A57570E8B56489A0201T4Q4P019019D467AADD625BC88A04</SessionId><Timestamp>11/18/2021 2:33:06 PM</Timestamp><CorrelationId>1637245986853</CorrelationId><PayeeName>PNC CARD SERVICES</PayeeName><Address>null</Address><AccountNumber>XXXXXXXXXXXX8590</AccountNumber></AdditionalInformation></PayeeAddManager>"

    search sourcetype="builder:payeeservice" host=JWPP*BLDRBP* "*AdditionalInformation*"
    earliest=-27m@m latest=-26m@m    
    |xpath outfield=Timestamp "//NetworkPayeeAddManager/Timestamp"
    |xpath outfield=TenantId "//NetworkPayeeAddManager/TenantId"
    |xpath outfield=UserId "//NetworkPayeeAddManager/UserId"
    |xpath outfield=SourceMethodName "//NetworkPayeeAddManager/SourceMethodName"
    |xpath outfield=SourceLineNumber "//NetworkPayeeAddManager/SourceLineNumber"
    |xpath outfield=Message "//NetworkPayeeAddManager/Message"
    |xpath outfield=Exception "//NetworkPayeeAddManager/Exception"
    |xpath outfield=SessionId "//NetworkPayeeAddManager/AdditionalInformation/SessionId"
    |xpath outfield=CorrelationId "//NetworkPayeeAddManager/AdditionalInformation/CorrelationId"
    |xpath outfield=PayeeName "//NetworkPayeeAddManager/AdditionalInformation/PayeeName"
    |xpath outfield=Address "//NetworkPayeeAddManager/AdditionalInformation/Address"
    |xpath outfield=AccountNumber "//NetworkPayeeAddManager/AdditionalInformation/AccountNumber"
    |xpath outfield=PayeeType "//NetworkPayeeAddManager/AdditionalInformation/PayeeType"
    |table Timestamp TenantId UserId SourceMethodName SourceLineNumber Message Exception SessionId
     CorrelationId PayeeName Address AccountNumber PayeeType
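
The thirteen near-identical xpath stages could also be generated from a field list rather than typed by hand. The sketch below only builds the SPL text; xpath_stages and the two list names are hypothetical, and the root parameter is there because the sample events use both NetworkPayeeAddManager and PayeeAddManager as root elements.

```python
# Fields directly under the root element, and fields under AdditionalInformation
TOP_LEVEL = ["Timestamp", "TenantId", "UserId", "SourceMethodName",
             "SourceLineNumber", "Message", "Exception"]
NESTED = ["SessionId", "CorrelationId", "PayeeName", "Address",
          "AccountNumber", "PayeeType"]


def xpath_stages(root="NetworkPayeeAddManager"):
    """Build the xpath and table stages of the search as one multi-line string."""
    lines = [f'|xpath outfield={name} "//{root}/{name}"' for name in TOP_LEVEL]
    lines += [
        f'|xpath outfield={name} "//{root}/AdditionalInformation/{name}"'
        for name in NESTED
    ]
    lines.append("|table " + " ".join(TOP_LEVEL + NESTED))
    return "\n".join(lines)


print(xpath_stages())
```

Calling xpath_stages("PayeeAddManager") produces the same stages for the other root element.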
 

PickleRick
SplunkTrust

Yup.

You can use fillnull https://docs.splunk.com/Documentation/Splunk/8.2.3/SearchReference/Fillnull

I think it's what you need.


bergen288
Engager

Yes, I now get a "null" value for PayeeType after adding "|fillnull value=null PayeeType" to my SEARCH_STRING.

Thanks.
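
If a field can still be absent when the rows reach Python, the same normalization can be done on the client as a safety net. This is a minimal sketch, not a replacement for fillnull; fill_missing and the shortened column list are hypothetical.

```python
def fill_missing(rows, columns, fill="null"):
    """Return rows where every column is present; absent or None values become fill."""
    normalized = []
    for row in rows:
        normalized.append({
            col: row[col] if row.get(col) is not None else fill
            for col in columns
        })
    return normalized


# Shortened column list and made-up rows, for illustration only
columns = ["TenantId", "AccountNumber", "PayeeType"]
rows = [
    {"TenantId": "13744", "AccountNumber": "XX2222", "PayeeType": "UnManagedPayee"},
    {"TenantId": "FI05", "AccountNumber": "XXXXXXXXXXXX8590"},  # no PayeeType
]
print(fill_missing(rows, columns))
```

Every returned dict has the same keys in the same order, so the list loads into a DataFrame with a fixed set of columns.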
