As a newbie, I am currently working on a mini internship project that requires me to analyse a dataset using Splunk. I have completed everything except the last part, which reads "gender that performed the most fraudulent activities and in what category". Basically, I'm supposed to find the gender (F or M) that committed the most fraud, and in which category. The dataset consists of the columns steps, customer, age, gender, Postcodeorigin, merchant, category, amount and fraud, and comes from a file named fraud_report.csv. The file has already been uploaded to Splunk. I am just stuck at the query part.
Hi @Cleanhearty ,
I suppose that you have already ingested the CSV file into a lookup or an index.
If it is in a lookup, you first need to define what you mean by "gender that performed the most fraudulent activities and in what category". I suppose that you mean most fraudulent by amount,
so you could try something like this:
| inputlookup fraud_report.csv
| stats max(amount) AS amount BY gender category
| sort -amount
| head 10
In this way, you get the top 10 gender and category combinations with the greatest amount.
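If "most fraudulent" is instead meant as the number of fraudulent transactions, a variant along these lines might be closer to what the task asks. This is only a sketch under assumptions: it assumes the fraud flag column is named fraud and holds 1 for fraudulent rows, so adjust the field name and value to whatever your file actually contains. The names fraud_count and fraud_amount are just labels chosen for the output.

```
| inputlookup fraud_report.csv
| search fraud=1
| stats count AS fraud_count sum(amount) AS fraud_amount BY gender category
| sort - fraud_count
| head 1
```

The single remaining row should be the gender and category pair with the most fraudulent transactions. Remove the last line to see the full ranking, or sort by fraud_amount to rank by total amount instead.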
I also suggest following the Splunk Search Tutorial (https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchTutorial/WelcometotheSearchTutorial) to learn how to run similar searches.
Ciao.
Giuseppe
Thanks for the help. Unfortunately, it didn't return any results (Statistics (0)). That's weird.
PS: I replaced the file name with the original one.
Hi @Cleanhearty ,
Check that the fields used in the search (amount, gender, and category) exist in the lookup and that their names match exactly (field names are case sensitive).
Then check the format of the amount field.
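One possible way to run both checks at once is to summarise the lookup's fields, for example with the sketch below. It assumes the lookup is still called fraud_report.csv, so replace that with your real file name.

```
| inputlookup fraud_report.csv
| fieldsummary
| table field count distinct_count numeric_count values
```

The field column shows the exact spelling and case of each field name, and numeric_count shows how many values Splunk could read as numbers. If numeric_count for amount is 0, the values are probably not stored in a plain numeric format.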
ciao.
Giuseppe