Hi folks!
I need to group by two variables but am having trouble figuring it out.
time ip_address user eventid sizesent
12:01 1.1.1.1 oneuser 9839 1
12:01 1.1.1.1 twouser 7382 2
12:02 2.3.4.5 oneuser 8211 3
12:04 1.1.1.1 threeuser 9222 4
That's an example of my data. I want to group by the timestamp and the IP address. Basically, if two events have the same time and the same ip_address, combine them and, preferably, keep the highest value of sizesent.
I tried dedup but that gets rid of vital information. Any help would be appreciated!
As you are looking for the highest value of sizesent, sort on that field in descending order and then use dedup:
... | sort time, ip_address, -sizesent | dedup time ip_address
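If it helps to check the logic outside Splunk, here is a rough Python sketch of what sort-then-dedup does on the sample rows from the question. It only mimics the behavior (sort by time and ip_address ascending, sizesent descending, then keep the first row per time/ip_address pair); the function name sort_then_dedup and the tuple layout are just assumptions made for this illustration.

```python
# Sample rows from the question: (time, ip_address, user, eventid, sizesent)
rows = [
    ("12:01", "1.1.1.1", "oneuser", 9839, 1),
    ("12:01", "1.1.1.1", "twouser", 7382, 2),
    ("12:02", "2.3.4.5", "oneuser", 8211, 3),
    ("12:04", "1.1.1.1", "threeuser", 9222, 4),
]


def sort_then_dedup(rows):
    """Hypothetical helper mimicking: sort time, ip_address, -sizesent | dedup time ip_address."""
    # Ascending on time and ip_address, descending on sizesent.
    ordered = sorted(rows, key=lambda r: (r[0], r[1], -r[4]))
    seen = set()
    kept = []
    for row in ordered:
        key = (row[0], row[1])
        # Keep only the first row seen for each (time, ip_address) pair.
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


result = sort_then_dedup(rows)
for row in result:
    print(*row)
```

Because the largest sizesent sorts first within each group, the row that survives keeps its user and eventid as well.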
Let me know if that answers your question.
Hi Tullir,
Did you get a chance to test this? I just want to know whether it worked or not.
The dedup command takes the first event it finds for each unique combination of fields in its arguments. For example, dedup time ip_address will give you
12:01 1.1.1.1 twouser 7382 2
12:02 2.3.4.5 oneuser 8211 3
12:04 1.1.1.1 threeuser 9222 4
Keep in mind that "first" means first in the order the events are returned, which is reverse time order unless you sort the events first. So 'reverse | dedup time ip_address' will give you
12:01 1.1.1.1 oneuser 9839 1
12:02 2.3.4.5 oneuser 8211 3
12:04 1.1.1.1 threeuser 9222 4
The stats command is another option.
... | stats max(sizesent) as maxsizesent by time, ip_address | ...
produces
12:01 1.1.1.1 2
12:02 2.3.4.5 3
12:04 1.1.1.1 4
but gets rid of more vital information than dedup.
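To make it clearer why the other fields disappear, here is a small Python sketch of a max-by-group aggregation over the sample rows. It is only an approximation of what stats max(sizesent) by time, ip_address computes; the helper name max_sizesent_by_group is made up for this example.

```python
# Sample rows from the question: (time, ip_address, user, eventid, sizesent)
rows = [
    ("12:01", "1.1.1.1", "oneuser", 9839, 1),
    ("12:01", "1.1.1.1", "twouser", 7382, 2),
    ("12:02", "2.3.4.5", "oneuser", 8211, 3),
    ("12:04", "1.1.1.1", "threeuser", 9222, 4),
]


def max_sizesent_by_group(rows):
    """Hypothetical helper: largest sizesent per (time, ip_address) pair."""
    best = {}
    for time, ip_address, _user, _eventid, sizesent in rows:
        key = (time, ip_address)
        # user and eventid are never stored, so they are lost in the result.
        if key not in best or sizesent > best[key]:
            best[key] = sizesent
    return best


result = max_sizesent_by_group(rows)
for (time, ip_address), maxsizesent in result.items():
    print(time, ip_address, maxsizesent)
```

Only the grouping fields and the aggregate survive, which matches the three-column output above.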
Where does the "corA" in your sample output come from?
Thanks Rich!!
I was thinking of adding the value corX to all the entries where the IP address and the time both match. This way I could list all the cor's found and analyze from there.
I'm thinking I need to correlate the events somehow? I would want the output to show something like
12:01 1.1.1.1 oneuser 9839 1 corA
12:02 2.3.4.5 oneuser 8211 3
12:04 1.1.1.1 threeuser 9222 4
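One possible reading of the corX idea, sketched in Python rather than SPL: every event whose time and ip_address match at least one other event gets the same label, and events with no match get an empty label. The labels (corA, corB, ...), the helper name tag_correlated, and the choice to label only groups with more than one event are all assumptions for this sketch, not something the thread settles.

```python
from collections import Counter
from string import ascii_uppercase

# Sample rows from the question: (time, ip_address, user, eventid, sizesent)
rows = [
    ("12:01", "1.1.1.1", "oneuser", 9839, 1),
    ("12:01", "1.1.1.1", "twouser", 7382, 2),
    ("12:02", "2.3.4.5", "oneuser", 8211, 3),
    ("12:04", "1.1.1.1", "threeuser", 9222, 4),
]


def tag_correlated(rows):
    """Hypothetical helper: append a shared label to rows with matching time and ip_address."""
    counts = Counter((r[0], r[1]) for r in rows)
    labels = {}
    tagged = []
    for row in rows:
        key = (row[0], row[1])
        if counts[key] > 1:
            if key not in labels:
                # Letters wrap around after 26 groups in this simple sketch.
                labels[key] = "cor" + ascii_uppercase[len(labels) % 26]
            tagged.append(row + (labels[key],))
        else:
            # No other event shares this time and ip_address.
            tagged.append(row + ("",))
    return tagged


tagged = tag_correlated(rows)
for row in tagged:
    print(*row)
```

With the sample data, both 12:01 events for 1.1.1.1 end up with corA and the other two rows stay unlabeled, so no event is dropped.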