I am working with the fields srcip and malware-type. I need to show how many instances of each type of malware have been observed for each srcip, with the results grouped by srcip.
For example:
127.0.0.1: TrojanHorse (2)
Worm (1)
192.168.7.256: TrojanHorse (1)
1.8.9.10: Worm (5)
Rootkit (2)
The search "stats values(malware-type) by srcip" lists the types of malware grouped by srcip, but getting the counts in there (while still grouping by srcip) has me stumped. Can anyone help me out? Thanks!
Yep, although the parentheses aren't necessary; I just need the data. Thanks!
Are the counts of malware types represented by the digits in the parentheses?
To answer this question we need to agree on the shape of the data, so I am making a couple of assumptions.
If we use your sample data, here is how we will see that in Splunk.
From there we extract the IP, the malware type, and the count (assuming the count is the number embedded in the parentheses).
The inline extractions to obtain these fields are:
... | rex "(?<srcip>\d+\.\d+\.\d+\.\d+):\s(?<info>.+)"
| rex field=info max_match=0 "(?<malware_type>[a-zA-Z]+)\s+\((?<mcount>\d+)\)"
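To try these extractions without indexing anything, one option is to generate the sample events with makeresults. This is only a sketch: it assumes each srcip and its malware entries arrive as a single-line event (the first rex uses ".+", which will not cross a line break), and the field name n is just a hypothetical helper for numbering the generated rows.

```spl
| makeresults count=3
| streamstats count AS n
| eval _raw=case(n==1, "127.0.0.1: TrojanHorse (2) Worm (1)", n==2, "192.168.7.256: TrojanHorse (1)", n==3, "1.8.9.10: Worm (5) Rootkit (2)")
| rex "(?<srcip>\d+\.\d+\.\d+\.\d+):\s(?<info>.+)"
| rex field=info max_match=0 "(?<malware_type>[a-zA-Z]+)\s+\((?<mcount>\d+)\)"
| table srcip malware_type mcount
```

Because of max_match=0, malware_type and mcount should come back as multivalue fields, one value per entry found in info.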
You can combine a number of functions with the stats command; the Splunk documentation for stats has a complete list.
You may want to try:
... | stats list(malware_type) count(malware_type) by srcip
That returns the list of malware types and a count of them for each srcip. I don't know that this is what you are looking for, but here it is nonetheless.
This is probably a more effective way to represent the data:
... | stats list(malware_type) AS malware_type list(mcount) AS mcount sum(mcount) AS total by srcip
All together:
... | rex "(?<srcip>\d+\.\d+\.\d+\.\d+):\s(?<info>.+)"
| rex field=info max_match=0 "(?<malware_type>[a-zA-Z]+)\s+\((?<mcount>\d+)\)"
| stats list(malware_type) AS malware_type list(mcount) AS mcount sum(mcount) AS total by srcip
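If the counts are not already in the data and each event simply carries one srcip and one malware-type value (which is how I read the original question, but that is an assumption), a two-pass stats may be closer to what you want. The sketch below first renames the hyphenated field so it is easier to use in eval, counts events per srcip and type, builds a "Type (count)" string, and then groups those strings by srcip. The names entry and malware are hypothetical and only there for illustration.

```spl
... | rename "malware-type" AS malware_type
| stats count by srcip, malware_type
| eval entry=malware_type." (".count.")"
| stats list(entry) AS malware by srcip
```

This should give one row per srcip with a multivalue malware field, in the same shape as the example in the question. Drop the eval and list both malware_type and count in the last stats if you would rather keep the counts in their own column.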