Splunk Search

Help optimising search

tomjb94
New Member

Hi -  

I am currently looking to optimise the search below as it is using a lot of search head resource:

index=idem attrs.GW2_ENV_CLASS=preprod http_status=5* http_status!=503 NOT "mon-tx-"

Sample JSON result set:

   

   @timestamp: 2024-07-31T12:41:20+00:00
   attrs.AWS_AMI_ID:
   attrs.AWS_AZ: eu-west-1c
   attrs.AWS_INSTANCE_ID: i-0591d93b5e5881da9
   attrs.AWS_REGION: eu-west-1
   attrs.GW2_APP_VERSION:
   attrs.GW2_ENV_CLASS: preprod
   attrs.GW2_ENV_NUMBER: 0
   attrs.GW2_SERVICE: idem
   body_bytes: 1620
   bytes_sent: 2060
   client_cert_expire_in_days: 272
   client_cert_expiry_date: Apr 30 10:11:07 2025 GMT
   client_cert_issuer_dn: CN=******* PROD SUB CA2,O=Fidelity National Information Services,L=Jacksonville,ST=Florida,C=US
   client_cert_verification: SUCCESS
   client_dn: CN=idem-semantic-monitoring-preprod,OU=Gateway2Cloudops,O=Fidelity National Information Services,L=London,C=GB
   container_id: 17b7167ec5f2d20ec10704550fc8f2c2b9daedc835ce5fe0828ac86651983517
   container_name: /idem-kong-1
   correlationId:
   hostname: 17b7167ec5f2
   http_content_type: application/vnd.*******.idempotency-v1.0+json
   http_referer:
   http_status: 200
   http_user_agent: curl/8.5.0
   log: {"@timestamp": "2024-07-31T12:41:20+00:00", "correlationId": "", "request_method": "POST", "hostname": "17b7167ec5f2", "http_status": 200, "bytes_sent": 2060, "body_bytes": 1620, "request_length": 1689, "request": "POST /idempotency/entries/update HTTP/2.0", "http_user_agent": "curl/8.5.0", "http_referer": "", "body_bytes": 1620, "remote_addr": "10.140.49.156", "remote_user": "", "response_time_s": 0.007, "client_dn": "CN=idem-semantic-monitoring-preprod,OU=Gateway2Cloudops,O=Fidelity National Information Services,L=London,C=GB", "client_cert_issuer_dn": "CN=******* RSA PROD SUB CA2,O=Fidelity National Information Services,L=Jacksonville,ST=Florida,C=US", "client_cert_expiry_date": "Apr 30 10:11:07 2025 GMT", "client_cert_expire_in_days": "272", "client_cert_verification": "SUCCESS", "wpg_correlation_id": "mon-tx-ecs-1722429678-idem-pp-2.preprod.euw1.gw2.*******.io", "http_content_type": "application/vnd.******.idempotency-v1.0+json", "uri_path": "/idempotency/entries/update"}
   parser: json
   remote_addr: 10.140.49.156
   remote_user:
   request: POST /idempotency/entries/update HTTP/2.0
   request_length: 1689
   request_method: POST
   response_time_s: 0.007
   source: stdout
   uri_path: /idempotency/entries/update
   wpg_correlation_id: mon-tx-ecs-1722429678-idem-pp-2.preprod.euw1.gw2.*******.io

 

I have tried adding additional filtering on particular fields, but it is not having the desired effect.

Please note: the asterisks in the JSON are where I have masked values for the purposes of this community post.

Thanks,


PickleRick
SplunkTrust

OK. One of the obvious culprits is probably the exclusion (NOT "mon-tx-"), but it would be very useful to see the job report, because depending on your data you might also be having problems elsewhere.

For example, the

http_status=5*

condition, while seemingly harmless, can be very "heavy" if you have many different fields containing strings beginning with 5, because the wildcard is first matched against every indexed term starting with 5, whichever field it came from. So this might be a case for either some form of acceleration (summary indexing?) or one of the borderline cases where it is actually a good idea to use an indexed field.
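If http_status happens to be an index-time field (for example via INDEXED_EXTRACTIONS = json, which is an assumption here and not something the thread confirms), a tstats search can count the 5xx responses from the index files alone, without reading raw events. A minimal sketch under that assumption:

```spl
| tstats count where index=idem http_status=5* NOT http_status=503 by http_status
```

If this returns no rows, http_status is probably extracted at search time, and the approach does not apply as written.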

BTW, the sample is not JSON.


bowesmana
SplunkTrust

Using the TERM() directive in a search can dramatically improve speed, as the term is looked up in the index instead of being found by scanning the event data, so using

TERM(preprod)

would avoid having to extract the fields and compare values from the data. A question: are you using JSON INDEXED_EXTRACTIONS?

The NOT search will be expensive. Depending on what proportion of events have mon-tx- in the data, you may benefit from filtering those out in a subsequent | where clause.
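One way to express that, as a sketch: drop the raw-text NOT from the base search and filter afterwards on the field that carries the marker. It assumes the mon-tx- prefix only ever appears in wpg_correlation_id, as in the sample event; if it can appear elsewhere, this is not equivalent to the original search.

```spl
index=idem attrs.GW2_ENV_CLASS=preprod http_status=5* http_status!=503
| where isnull(wpg_correlation_id) OR NOT like(wpg_correlation_id, "mon-tx-%")
```

The isnull() guard keeps events that have no wpg_correlation_id at all, since NOT like() on a null value would otherwise discard them. Whether this is faster than the original depends on how many events the base search returns, so it is worth comparing both in the job inspector.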

You could also do

(http_status=5* AND (TERM(500) OR TERM(501) OR ... ))

i.e. include all the 5xx codes you want - if you know what they can be.

But it may be that the rest of your search is where some of the performance problems are - can you share a bit more of the search?
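Putting the TERM() suggestions together might look like the sketch below. The status codes 500, 502 and 504 are placeholders for whichever 5xx codes actually occur in your data, and the search assumes that preprod and the status code each appear in the raw event as standalone tokens delimited by major breakers (spaces, quotes, commas), which is what TERM() requires.

```spl
index=idem TERM(preprod) attrs.GW2_ENV_CLASS=preprod
    (TERM(500) OR TERM(502) OR TERM(504))
    http_status IN (500, 502, 504)
    NOT "mon-tx-"
```

The field comparisons stay in place so that a stray 500 elsewhere in the event (a byte count, say) does not produce a false match; the TERM() clauses only narrow down which events have to be read.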

gcusello
SplunkTrust

Hi @tomjb94,

Is there some word (e.g. "preprod") or string that you can add to your main search (not replacing the search by field, but adding to it)?

This approach will make your searches faster.

index=idem 
attrs.GW2_ENV_CLASS=preprod 
http_status=5* 
http_status!=503 
NOT "mon-tx-"
preprod 

Then, can you reduce the time window?

If you have too many events, you could accelerate your searches by scheduling a search that saves its results in a summary index, and then using that index for your searches.
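A rough sketch of the summary index approach: a scheduled search writes pre-aggregated counts with collect, and dashboards or ad hoc searches read the summary instead of the raw events. The index name idem_summary and the 5-minute span are hypothetical; the index has to exist already, and the schedule and time range of the populating search need to line up so that no interval is counted twice.

```spl
index=idem attrs.GW2_ENV_CLASS=preprod http_status=5* http_status!=503 NOT "mon-tx-"
| bin _time span=5m
| stats count by _time, http_status, uri_path
| collect index=idem_summary
```

Reporting then runs against the much smaller summary:

```spl
index=idem_summary
| stats sum(count) AS errors by http_status, uri_path
```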

Ciao.

Giuseppe
