Difficulty in determining the query to extract a dataset

Cleanhearty — Thu, 19 Sep 2024 06:06:44 GMT

As a newbie, I am currently working on a mini internship project that requires me to analyse a dataset using Splunk. I have completed all but the last part, which reads "gender that performed the most fraudulent activities and in what category". Basically, I'm supposed to find the gender (F or M) that committed the most fraud, and in which category. The dataset comes from a file named fraud_report.csv and consists of the columns steps, customer, age, gender, Postcodeorigin, merchant, category, amount and fraud. The file has already been uploaded to Splunk. I am just stuck on the query.

Re: Difficulty in determining the query to extract a dataset

gcusello — Thu, 19 Sep 2024 06:30:06 GMT

Hi @Cleanhearty ,

I suppose that you have already ingested the CSV file into a lookup or an index.

If it is in a lookup, you first need to define what you mean by "gender that performed the most fraudulent activities and in what category". I suppose that you mean most fraudulent by amount,

so you could try something like this:

| inputlookup fraud_report.csv | stats max(amount) AS amount BY gender category | sort -amount | head 10

In this way, you get the top 10 gender and category combinations, ranked by their largest single amount.
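
If "most fraudulent" is meant as the number of fraudulent transactions rather than the amount, a variant along these lines may be closer to the question. This is only a sketch: it assumes the fraud flag column is named fraud and holds 1 for fraudulent rows, which the thread does not confirm, and the output names fraud_count and total_amount are made up for illustration.

```spl
| inputlookup fraud_report.csv
| search fraud=1
| stats count AS fraud_count sum(amount) AS total_amount BY gender category
| sort - fraud_count
| head 1
```

With head 1 only the single top gender and category pair is returned. Raising it to head 10 shows the runners-up, and sorting on total_amount instead ranks by money rather than by number of events.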

My hint is also to follow the Splunk Search Tutorial (https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchTutorial/WelcometotheSearchTutorial) to learn how to run similar searches.

Ciao.

Giuseppe

Re: Difficulty in determining the query to extract a dataset

Cleanhearty — Tue, 24 Sep 2024 13:13:11 GMT

Thanks for the help. Unfortunately, it didn't return any results (Statistics (0)). That's weird.

PS: I replaced the file name with the original.

Re: Difficulty in determining the query to extract a dataset

gcusello — Tue, 24 Sep 2024 13:29:52 GMT

Hi @Cleanhearty ,

Check that the fields used (amount, gender, and category) exist in the lookup and that their names match exactly (field names are case sensitive).

Then check the format of the amount field.
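
One possible way to run both checks at once, as a sketch that assumes the file is available as a lookup named fraud_report.csv: fieldsummary lists every field name exactly as Splunk sees it, together with count and numeric_count per field, so a differently spelled name or a non-numeric amount should show up there.

```spl
| inputlookup fraud_report.csv
| fieldsummary
```

If numeric_count for amount is lower than count, some values are not being read as numbers. If the search returns an error or nothing at all, the file may have been indexed rather than stored as a lookup, and the search would need to start from the index instead of inputlookup.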

ciao.

Giuseppe