Hi All, I am getting the logs from this query, but I need a query to get the deviation of the error count between two time periods:
index="prod_k8s_onprem_dii--prod1" "k8s.namespace.name"="abc-secure-dig-servi-prod1" "k8s.container.name"="abc-cdf-cust-profile"
For this I need to consider the volume of logs as well.
Depending on the deviation percentage I will decide whether to promote or stop the deployment.
Something like this. Obviously you will need to adjust it depending on your events and required time periods:
index="prod_k8s_onprem_dii--prod1" "k8s.namespace.name"="abc-secure-dig-servi-prod1" "k8s.container.name"="abc-cdf-cust-profile" (earliest=first_earliest latest=first_latest) OR (earliest=second_earliest latest=second_latest)
| eval period=if(_time>=first_earliest AND _time<first_latest,"First","Second")
| stats count(eval(status="Error")) as error_count count as event_count by period
We are doing an API/app deployment in one region at 12:00 PM EST.
The 1st time frame would be 11:30 AM to 12:00 PM EST (I need to get the error count).
The 2nd time frame would be 12:00 PM to 12:30 PM EST (I need to get the error count).
We need to consider the generated log volume as well,
and get the deviation of the error count across these two time frames.
Let's say, if it exceeds a certain threshold, I will either proceed with or stop the deployment.
So the output of the query is a deviation threshold or percentage.
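The promote-or-stop decision described above can be sketched in Python, assuming the error and event counts for each period have already been pulled from the Splunk search. The function names and the 20% threshold here are hypothetical placeholders, not anything defined in this thread:

```python
def deviation_pct(before_errors, before_total, after_errors, after_total):
    # Normalize the error count by log volume, so a busier post-deployment
    # window does not look worse just because it produced more events.
    before_rate = before_errors / before_total if before_total else 0.0
    after_rate = after_errors / after_total if after_total else 0.0
    if before_rate == 0.0:
        # No baseline errors: any post-deployment error is an increase.
        return float("inf") if after_rate > 0 else 0.0
    return (after_rate - before_rate) / before_rate * 100.0

def decide(dev, threshold=20.0):
    # Hypothetical policy: stop the rollout if the error rate grew by
    # more than `threshold` percent, otherwise promote.
    return "stop" if dev > threshold else "promote"

# 10 errors in 1,000 events before vs. 30 in 2,000 after:
# the rate went from 1% to 1.5%, i.e. +50%, so this would stop the rollout.
print(decide(deviation_pct(10, 1000, 30, 2000)))
```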
That's better. So, you are looking at adjacent, equal time intervals. In this case, a time bucket is perhaps the simplest approach. Let me first give you a hard-coded example.
index="prod_k8s_onprem_dii--prod1" "k8s.namespace.name"="abc-secure-dig-servi-prod1" "k8s.container.name"="abc-cdf-cust-profile" (earliest="07/25/2024:11:30:00" latest="07/25/2024:12:30:00")
| addinfo
| bin _time span=30m@m
| stats count(eval(status="Error")) as error_count by _time
| eventstats stdev(error_count)
Is this something you are looking for?
Why am I not getting any results? I see there are events.
index="prod_k8s_onprm_dig-k8-prod1" "k8s.namespace.name"="apl-secure-dig-svc-prod1" "k8s.container.name"="abc-def-cust-prof" NOT k8s.container.name=istio-proxy NOT log.level IN(DEBUG,INFO) (error OR exception)(earliest="07/25/2024:11:30:00" latest="07/25/2024:12:30:00")
| addinfo
| bin _time span=30m@m
| stats count(eval(log.level="ERROR")) as error_count by _time
| eventstats stdev(error_count)
This is because you have a multisegment field name and eval doesn't like it. Use single quotes to tell eval that log.level is a field name, not some random string.
index="prod_k8s_onprm_dig-k8-prod1" "k8s.namespace.name"="apl-secure-dig-svc-prod1" "k8s.container.name"="abc-def-cust-prof" NOT k8s.container.name=istio-proxy NOT log.level IN(DEBUG,INFO) (error OR exception)(earliest="07/25/2024:11:30:00" latest="07/25/2024:12:30:00")
| addinfo
| bin _time span=30m@m
| stats count(eval('log.level'="ERROR")) as error_count by _time
| eventstats stdev(error_count)
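For reference, `eventstats stdev(error_count)` appends the standard deviation of `error_count` across the 30-minute buckets to every row; with only two buckets it is simply a measure of how far apart the two counts are. Python's `statistics.stdev` computes the sample standard deviation, which as far as I recall matches Splunk's `stdev` (`stdevp` being the population variant). The bucket counts below are made up for illustration:

```python
from statistics import stdev

# Hypothetical per-bucket error counts for the two 30-minute windows.
error_counts = [12, 48]

# Sample standard deviation: sqrt(((12-30)^2 + (48-30)^2) / (2-1))
spread = stdev(error_counts)
print(round(spread, 2))  # → 25.46
```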
Thanks, I am able to get the error count now. Could you please let me know how to get this value in Python code? If I run the code below I am getting events instead of statistics. How do I get statistics in the code?
import os
import urllib.parse

import requests

# Parenthesize so the pieces concatenate into one string. Without the
# parentheses only the first line is assigned to payload and the search is
# sent with no stats pipeline, which is why raw events come back instead of
# statistics. Note also the single quotes around log.level inside eval,
# per the earlier reply.
payload = ('search index="prod_k8s_onprem_vvvb_nnnn" "k8s.namespace.name"="apl-siii-iiiii" "k8s.container.name"="uuuu-dss-prog" NOT k8s.container.name=istio-proxy NOT log.level IN(DEBUG,INFO) (error OR exception) (earliest="07/25/2024:11:30:00" latest="07/25/2024:12:30:00")\n'
           '| addinfo\n'
           '| bin _time span=5m@m\n'
           '| stats count(eval(\'log.level\'="ERROR")) as error_count by _time\n'
           '| eventstats stdev(error_count)')
print(payload)
payload_escaped = f'search={urllib.parse.quote(payload)}'
headers = {
    'Authorization': f'Bearer {splunk_token}',
    'Content-Type': 'application/x-www-form-urlencoded'
}
url = f'https://{splunk_host}:{splunk_port}/services/search/jobs/export?output_mode=json'
response = requests.post(url, headers=headers, data=payload_escaped, verify=False)
print(f'{response.status_code=}')
txt = response.text
if response.status_code == 200:
    json_txt = f'[\n{txt}]'
    os.makedirs('data', exist_ok=True)
    with open("data/output_deploy.json", "w") as f:
        f.write(json_txt)  # the with block closes the file automatically
else:
    print(txt)
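To pick the statistic out of the response rather than dumping raw lines to a file, the export output can be parsed directly: with `output_mode=json`, `/services/search/jobs/export` streams one JSON object per line, and final result rows carry a `"result"` key. A minimal parsing sketch; the sample payload below is made up to resemble two exported stats rows:

```python
import json

def parse_export(text):
    # The export endpoint returns newline-delimited JSON. Rows of the
    # final result table sit under a "result" key; preview and control
    # messages do not, so they are skipped here.
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if "result" in obj:
            rows.append(obj["result"])
    return rows

# Made-up sample of what two exported stats rows could look like:
sample = (
    '{"preview":false,"result":{"_time":"2024-07-25T11:30:00","error_count":"12"}}\n'
    '{"preview":false,"result":{"_time":"2024-07-25T12:00:00","error_count":"48"}}\n'
)
for row in parse_export(sample):
    print(row["_time"], int(row["error_count"]))
```

Field values arrive as strings, so cast `error_count` to `int` before doing any deviation arithmetic on it.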
You need to tell volunteers what kind of "two time frames" you are concerned about. Two adjacent, equal time intervals? Two equal intervals days apart? Or some random intervals?