Splunk Search

How to group by two or many fields fields?

sshubh
Explorer

Hi,

Here is my Data in 2 logs having 3 fields:

Log1:

Books Bought AccountName={} , BookIds={}

(here BookId can contains multiple bookIds)

eg: 

Books Bought AccountName={ABC} , BookIds={book1, book2, book3}

Books Bought AccountName={ABC} , BookIds={book1}

Books Bought AccountName={DEF} , BookIds={book1, book2}

Books Bought AccountName={EPF} , BookIds={book1, book3}

Books Bought AccountName={EPF} , BookIds={book1}

 

Log2:

Books Sold AccountName={} , BookId={}

(here BookId contains only one bookId)

 

Eg.

Books Sold AccountName={ABC} , BookId={book2}

Books Sold AccountName={EPF} , BookId={book1}

Books Sold AccountName={EPF} , BookId={book1}

 

Result I want:

AccountName Total Books bookName bought sold
ABC 4 book1
book2
book3
2
1
1
0
1
0
DEF 2 book1
book2
1
1
0
0
EPF 3 book1
book3
2
1
2
0
         

 

can anyone please help me. Tried but not getting the result.



Labels (3)
0 Karma

woodcock
Esteemed Legend

Here is why I stopped helping so much.  You guys don't make an effort to frame it up with fake data.  I can whip the answer up in about 5 minutes for just about anything but it takes double or triple that to frame up the fake data.  So it just too much extra effort.  YOU should have written the first 19 lines of SPL below when you posted your question.  In any case, here is the answer.  NOTE that my answer is not quite what you asked for but it is better.  To make it EXACTLY what you asked for switch to | eval _total = 'Books Bought':

|makeresults
| eval _raw="
sourcetype | AccountName | BookId
Books Bought | ABC | book1,book2,book3
Books Bought | ABC | book1
Books Bought | DEF | book1,book2
Books Bought | EPF | book1,book3
Books Bought | EPF | book1"
| append [
| makeresults
| eval _raw="
sourcetype | AccountName | BookId
Books Sold | ABC | book2
Books Sold | EPF | book1
Books Sold | EPF | book1" ]
| multikv forceheader=1
| makemv delim="," BookId
| table sourcetype AccountName BookId
| rename COMMENT AS "Everything above this line is setup, AND YOU SHOULD HAVE PROVIDED THIS FOR US!!!"
| stats count BY sourcetype AccountName BookId
| eval {sourcetype}=count
| fields - sourcetype count
| stats first(*) AS * BY AccountName BookId
| fillnull value=0
| eval _total = 'Books Bought' - 'Books Sold'
| stats sum(_total) AS Total list(*) AS * BY AccountName

 

0 Karma

woodcock
Esteemed Legend

Something broke the rendering of the markup here ...???

0 Karma

yuanliu
SplunkTrust
SplunkTrust

You need to be careful about data presentation because the actual solution depends on precise format of data.  In order to illustrate thought process, I will assume that the examples are in the precise format of raw events.  Additionally, I will assume that log1 and log2 are the values of sourcetype.

If this assumption is correct, Splunk would have given you a field AccountName in both sourcetypes; a BookId field in log1, and a BookIds field in log2.  AccountName, BookId1, and BookIds all begins and ends with paired curly brackets.  The separator in BookId2 is a comma followed by exactly one white pace.

With this, you can use

 

| foreach AccountName BookId BookIds
    [ rex field=<<FIELD>> "{(?<<<FIELD>>>[^}]+)"]
| eval BookIds = split(BookIds, ", ")
| eval book = coalesce(BookId, BookIds)
| mvexpand book
| eval bought = if(sourcetype == "log1", 1, 0)
| eval sold = if(sourcetype == "log2", 1, 0)
| stats sum(bought) as bought sum(sold) as sold by AccountName book
| stats list(*) as * by AccountName ``` this is solely for display ```

 

Your sample data will give

AccountName
book
bought
sold
ABC
book1
book2
book3
2
1
1
0
1
0
DEF
book1
book2
1
1
0
0
EPF
book1
book3
2
1
2
0

 

You can use the following data emulation to compare with your real data.

 

| makeresults
| eval sourcetype="log1", log = split("Books Bought AccountName={ABC} , BookIds={book1, book2, book3}
Books Bought AccountName={ABC} , BookIds={book1}
Books Bought AccountName={DEF} , BookIds={book1, book2}
Books Bought AccountName={EPF} , BookIds={book1, book3}
Books Bought AccountName={EPF} , BookIds={book1}", "
")
| mvexpand log
| rename log as _raw
| append [
    |makeresults
    | eval sourcetype="log2", log = split("Books Sold AccountName={ABC} , BookId={book2}
Books Sold AccountName={EPF} , BookId={book1}
Books Sold AccountName={EPF} , BookId={book1}", "
")
    | mvexpand log 
    | rename log AS _raw]
| extract
``` data emulation above ```

 

AccountNameBookIdBookIds_raw_timesourcetype
{ABC} {book1, book2, book3}Books Bought AccountName={ABC} , BookIds={book1, book2, book3}2023-03-18 13:10:46log1
{ABC} {book1}Books Bought AccountName={ABC} , BookIds={book1}2023-03-18 13:10:46log1
{DEF} {book1, book2}Books Bought AccountName={DEF} , BookIds={book1, book2}2023-03-18 13:10:46log1
{EPF} {book1, book3}Books Bought AccountName={EPF} , BookIds={book1, book3}2023-03-18 13:10:46log1
{EPF} {book1}Books Bought AccountName={EPF} , BookIds={book1}2023-03-18 13:10:46log1
{ABC}{book2} Books Sold AccountName={ABC} , BookId={book2}2023-03-18 13:10:46log2
{EPF}{book1} Books Sold AccountName={EPF} , BookId={book1}2023-03-18 13:10:46log2
{EPF}{book1} Books Sold AccountName={EPF} , BookId={book1}2023-03-18 13:10:46log2

I have doubts if the raw events are truly as illustrated.  But if they are, you can use that search.

0 Karma

acharlieh
Influencer

@sshubh People answering posts on here are doing so by donating their time. As such you should not be preemptively tagging people to help with your question.

You'll attract responses by showing what you've tried, what result you got and how you thought it should be different, and showing a willingness to learn.

You posted the exact same question a few days ago, and were given an answer. Instead of asking again with no additional information, you could follow up with what is not working for you or what you're not understanding with that answer. It may help to break down the problem into steps and check each one.

Start with are your fields extracted from your events properly? Do you get all values of the multi-valued fields as you expected on each event? Are your field names lined up between the bought and sold records so you can correlate them together? Do you have a field marking a result as a bought or a sold transaction already? ITWhisperer did some field manipulation and extraction with eval. 

Assuming you have the above done correctly, did you know that using makeresults and eval we can actually simulate your example data set with a Splunk Search and anyone can build from it? (names of fields might be slightly different, but this should be where you are at this stage. 

| makeresults count=8
| streamstats count
| eval AccountName=case(count in(1,2,6),"ABC", count=3,"DEF", true(),"EPF"), TransactionType=if(count<=5,"bought","sold"), BookId=case(count=1,split("book1,book2,book3",","),count in (2,5,7,8),"book1",count=3,split("book1,book2",","),count=4,split("book1,book3",","),count=6,"book2")
| fields - _time count


If you don't have the above done correctly, then anything afterwards isn't going to work, and you should be talking about that problem first. (I'll also note that field names are always case sensitive)

From this point IT Whisperer already showed you how stats can group by multiple fields, and even showed you the trick with eval and french braces {} in order to create fields with names based on the values of other fields, and running stats multiple times to combine things down. 

You can use the same tricks in a slightly different order to not need the fillnull command (but it's still useful to know). 

| eval T_{TransactionType}=1
| stats count(T_*) as * by AccountName BookId
| stats list(BookId) list(bought) list(sold) by AccountName

 
I leave the total books calculation as an exercise for you, but also the hint that stats can perform multiple statistical functions in a single pass on multiple different fields of the input data set.

PickleRick
SplunkTrust
SplunkTrust

1. What have you tried so far?

2. You won't get your results reliably in that form. Separate multivalued fields are independent and you can't sort one related to an order of another field.

 

Get Updates on the Splunk Community!

Detecting Remote Code Executions With the Splunk Threat Research Team

WATCH NOWRemote code execution (RCE) vulnerabilities pose a significant risk to organizations. If exploited, ...

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...