Simple one for you all!
I have a query that shows files(cs_uri_stem) on a webserver accessed and the http status codes (status) as a result of that transaction.
I would like an output that shows each of the files accessed(unique value), as well as how many status 200 codes (for each file), as well as a percentage of the 200 codes in relation to all other http status codes for that particular file.
output example:
cs_uri_stem requests percent of 200 codes
file1.txt 150 50% (assuming there were 300 requests with 150x http 200's and 150x http 404's)
file2.pdf 120 100% (assuming there were 120 requests with 120x http 200's)
Thanks everyone!
I made an assumption about what field contains the 200 values. Please change "status" throughout to the correct field name for your data.
Also, try changing "sum" to "count".
It would help you know what you've tried so far and how those attempts failed you.
I would use something like the following:
<your query for relevant events>
| stats count as requests, sum(eval(status=200)) as success by cs_uri_stem
| eval pct=success*100/requests
| table cs_uri_stem requests pct
Thanks for the response Rich.
i tried out your query, the uri stem field and requests count field does populate, however the pct field remains blank for all rows.
Any thoughts?
I made an assumption about what field contains the 200 values. Please change "status" throughout to the correct field name for your data.
Also, try changing "sum" to "count".
So this is what worked in the end.
| generating search
| stats count as requests, count(eval('sc-status'=200)) as success by cs-uri-stem
| eval pct=success*100/requests
| table cs-uri-stem requests pctApologies for giving incorrect field names that complicated things. i didnt think it mattered if a field name had punctuation. Lesson has been learned.
What do your event actually look like - please provide an obfuscated example and your current query so we can see how you are processing the events / files?
These are AWS s3 cloudfront logs.
for 1 log event, its basically 30+ fields like..
cs-bytes: xxx
cs-method: xxx
cs-protocol: xxx
cs-uri-stem: xxx
sc-status: xxx
Does that help any?
Good thing you attempt to illustrate data. But as several have already pointed out, you need to be precise when discussing data analytics - qualitatively precise. Not only did you use underscore in one place and dashes in another place as @bowesmana notices, but in this illustration, you give prefix cs in most fields, but sc in the last field. Is that supposed to be cs-status as opposed to sc-status? As you change field names, you are asking volunteers here to read you mind about what each field name means in your original post.
I think @richgalloway already gives you the solution if you just apply it with the correct field names. If I have to guess, cs-uri-stem is what you call cs_uri_stem in OP, and cs-status is what you call status in OP. Hence,
<your query for relevant events>
| stats count as requests, sum(eval('cs-status'=200)) as success by cs-uri-stem
| eval pct=success*100/requests
| table cs-uri-stem requests pct
Apologies, for the inaccurate original post. I was going by memory rather than referencing the actual log.
While everyone had essentially the same SPL, Apparently the quotes around 'sc-status' made the difference. Everyone elses 200 code count was bringing back 0. when i put quotes around sc-status in their SPL, it worked as well.
can you explain what the difference in using quotes did in this case?
Without the single quotes (and it is important to use single quotes not double quotes), Splunk treats it as two fields 'sc' minus 'status'. With the single quotes, the field name becomes a single field name i.e. 'cs-status'. Whenever you have a field name with "special characters" including spaces, you should use single quotes when referencing the field, e.g.' field name with spaces'
oh wow, i did not know that (obviously). So basically me giving everyone wrong info status vs sc-status made everyones SPL incorrect. Apologies everyone, i didnt think it made a difference.
@ITWhisperer wrote:... Whenever you have a field name with "special characters" including spaces, you should use single quotes when referencing the field, e.g.' field name with spaces'..
I wont forget that.. thanks!
If you are not getting any results, is that data JSON and are the fields being auto extracted?
If you look at your posts you will see you have described the field as cs_uri_stem (with underscores) in one post and then shown cs-uri-stem (with hyphens) in another post. These are different fields, so you must use the correct name for your field.
As an example using dummy generated data:
| makeresults count=10
| eval cs_url_stem=mvindex(split("ABC",""),random()%3)
| eval sc_status=(1+random()%4)*100
``` Lines above generate dummy data - remove when copying to your search ```
| stats count as requests count(eval(sc_status=200)) as success by cs_url_stem
| eval pct=100*success/requests