How to Join entries for a summary index

wayn23 · ‎09-27-2017

I have two indexes that I want to create a summary from every hour.

Index1
request_type, request_guid, request_timestamp, meta_field1, meta_field2, ...

Index1 contains log entries from each processing steps in each service request. Each service request is assigned a unique request_guid and all ~10 logs for the processing of a request have that id. The time the request was made is stored in request_timestamp and also remains the same through all logs for a request..

index2
request_guid, meta_fieldA, meta_fieldB, ...

index2 contains more data for the logs, but is in a separate index so that it can be secured differently from index1. The request_guid is the same value as in index1

I want to summarize by collecting stats for each request type by hour.

The approach I have taken is to select all the logs from Index1 where the request_timestamp is in the hour. I cannot use the log time directly as a request logs might span into the next hour ( as in started at 9:59:59 and ended at 10:00:01)

 index=index1 earliest=0 
       | addinfo | eval timemillis=strftime(strptime(request_timestamp,"%Y-%m-%dT%H:%M:%S.%3N%z"),"%s") 
       | where timemillis>=info_min_time AND timemillis<info_max_time

This gives me a list of all the logs from index1 that I want to summarize. I then join the logs from index2

| join  usetime=false type=left overwrite=false request_guid  max=0
  [search index=index2]

I then create a summary with the following

| sort _time desc
| transaction request_guid  maxspan=10m mvlist=true
| stats 
    count as requests_in_period
    by request_type

What results is elevated counts per period. They all are multiples of 2, which tells me that a cross product seems to be happening. I think the problem is in the join command. There are multiple entries selected from index1 that have duplicate values for request_guid that I think are over selecting from index2, yet I want all the matching entries from index2 with the request_guid.

When I view the events right after the join, all the logs are from index1 and none are from index2

Background
I started with a subselect of the request_guids that should be summarized, but a subselect is limited to 10500 entries, which will not be enough for an hours of transactions logs.

DalJeanis · ‎09-27-2017

Try something like this. This would run every 30 minutes, at x:15 and x:45. We're summing at the 15m level, and doing so for two periods, -45m to -30m and -30m to -15m past.

earliest=-45m@m  latest=@m index=index1 OR index=index2 

| rename COMMENT as "get rid of all unneeded fields"
| fields index request_guid request_timestamp, request_type

| rename COMMENT as "calculate desired timestamp from the index1 record, roll it and the request type from the index2 record onto the index1 records for each guid"
| eval mytime=strptime(request_timestamp,"%Y-%m-%dT%H:%M:%S.%3N%z")
| eventstats min(mytime) as mytime min(request_type) as mytype by request_guid 

| rename COMMENT as "get rid of unneeded index2 records, and records that are too old or too young to have settled"
| addinfo
| where (mytime>=info_min_time) AND (mytime<info_max_time-900) AND (index="index1") 

| rename COMMENT as "chunk up the records into 15 minute chunks, stats together all records for each guid, then find out how many unique guids there were in each period."
| eval _time=900*floor(mytime/900) 
| stats  min(mytype) as request_type by _time request_guid
| stats dc(request_guid) as requests_in_period by _time request_type

wayn23 · ‎09-29-2017

Thanks for the response. You inspired me to look into using stats more instead of a join and that proved very effective and fast.

How to Join entries for a summary index

Routing logs with Splunk OTel Collector for Kubernetes

New This Month - Observability Updates Give Extended Visibility and Improve User ...

Intro to Splunk Synthetic Monitoring