I am looking to create a splunk query but finding it complex to start with.
Use case:
Index 1 has two logs like
Log 1: Received from client C for user Y and request id: X
Log 2: request id:X completed
Index 2 looks like:
User Y has total sent items count : Z
I want output if all user items count , from particular client say , D, for which request is completed.
Basically, if from D client, there was a request and that request is completed, give me the User and items count for that.
Presuming an item is not counted if it is not complete, we can combine Log 1 and index2 like this.
(index=index1 "Received from client * for user * and request id: *" OR
(index=index2 "User * has total sent items count : *")
| rex "[uU]ser (?<user>\S+)"
| rex "items count : (?<itemCount>\d+)"
| stats sum(itemCount) as "Items count" by user
Adding to Rich’s answer this is a good resource. https://conf.splunk.com/files/2020/slides/TRU1761C.pdf
want to combine, log1 and log2 from index1 with index2?
Using the construct of @richgalloway's answer,
(index=index1 "Received from client * for user * and request id: *" OR "request id:* completed") OR
(index=index2 "User * has total sent items count : *")
| rex "request id:\s*(?<id>\w+)
| rex "from client (?<client>\S+)"
| rex "[uU]ser (?<user>\S+)"
| rex "items count : (?<itemCount>\d+)"
| stats sum(itemCount) as "Items count" values(source) as source by user id
| where source == "log2"
Note this uses a side effect of SPL's multivalue equality test and deduced source of "completed" status. You can reconstruct this to be more semantically explicit.
How is it making sure that the request id is completed. I cannot see computation on "request id"
As I said, it was a deduction. From your description, "request id:* completed" is only found in log2, "Received from client * for user * and request id: *" is only found in log1. Assuming log1, log2 are in source field, the search term
(index=index1 "Received from client * for user * and request id: *" OR "request id:* completed")
makes sure that if an event has source=="log2", it must contain "request id:* completed". Computation of "request id" or, in my code sample, "id", is computed via stats groupby. But the stats command in the previous post was incorrect. Group by id and group by user cannot be in a single stats.
The correct command should be
(index=index1 "Received from client * for user * and request id: *" OR "request id:* completed") OR
(index=index2 "User * has total sent items count : *")
| rex "request id:\s*(?<id>\w+)"
| rex "from client (?<client>\S+)"
| rex "[uU]ser (?<user>\S+)"
| rex "items count : (?<itemCount>\d+)"
| eventstats values(source) as source values(user) as user by id ``` operates on index=index1 source IN (log1, log2) ```
| where source == "log2" ``` only users who have at least one id that is in log2 will remain in index1 ```
| stats sum(itemCount) as "Items count" values(index) as index by user
| where isnotnull('Items count') AND index == "index1" ``` only users who have a remaining entry in index1, i.e., who have at least one id that is in log2, i.e., who have at least one "completed" record, AND who also have at least one 'Items count' ```
Similar to the deduction about log2, index == "index1" deduces that the user has at least one request id that is in log2, therefore must has at least one "completed" request id.
You need to make sure that you have the same field name for "user" between both indexes. Assuming that the "user field" in both indexes is called "user", and assuming that you have fields extracted which are called "client" and "count_field" and "requestID" then something like the following should work:
index=indexA "Received from client" OR ("request id:" "completed") | transaction startsWith="Received from client" endsWith="completed" requestID
| stats values(client) as client values(user) as user by requestID
| join type=outer user [| search index=indexB "has total sent items count" | stats sum(count_field) as count_total by user]
The thing to watch out for here is that the count that is returned from index2 will essentially be a sum of the count for the user. Because there is no "requestID" present in Index2. So the count_field that is seen in the index2 is not directly tied to the requestID that is seen in index1.