Getting Data In

How do you run a subquery for each row of a CSV file by passing the field in a search string?

known_user
Engager

I want to run a Splunk query once for each value in a CSV file, substituting that value into the search string. I've imported the file into Splunk as a lookup table, and I can view the fields with an inputlookup query. Now I want to pass each value into a set of subsearches that fetch the maximum count on a per-hour, per-day, per-week, and per-month basis.

The input file is ids.csv, which has around 800 rows and just one column, like below:

1234
2345
2346
4567

...

The query I'm using:

| inputlookup ids.csv | fields ids as id
| [ search index="abc" id "search string here" | bin _time span="1hour" | stats count as maxHour by _time | sort - count | head 1 ]
| appendcols [ search index="abc" id "search string here" | bin _time span="1day" | stats count as maxDay by _time | sort - count | head 1 ]
| appendcols [ search index="abc" id "search string here" | bin _time span="1week" | stats count as maxWeek by _time | sort - count | head 1 ]
| appendcols [ search index="abc" id "search string here" | bin _time span="1month" | stats count as maxMonth by _time | sort - count | head 1 ]

I'm not getting the expected results. I'm expecting a tabular output with one row per ID, showing the maximum count for each time range, with the ID field passed into each subsearch.

How can I solve this?

Thanks


jkat54
SplunkTrust

I've never tried more than one bin command, but maybe this works:

index=abc
| lookup ids ids AS id OUTPUT ids AS matched
| where isnotnull(matched)
| eval month=_time, week=_time, day=_time
| bin month span=1mon
| bin week span=1w
| bin day span=1d
| top 1 id by month week day
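
Note that top 1 id by month week day returns the most frequent id within each bucket combination, not one row per id. If the goal is each id's busiest bucket, a stats rollup could be swapped in instead; for example, for the daily maximum (a sketch, untested, assuming a lookup definition named ids with field ids):

index=abc
| lookup ids ids AS id OUTPUT ids AS matched
| where isnotnull(matched)
| bin _time span=1d
| stats count AS perDay BY id _time
| stats max(perDay) AS maxDay BY id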


woodcock
Esteemed Legend

Assuming that you are trying to limit the search results to only those id values that are in ids.csv, then like this:

index="abc" id "search string here" [ |inputlookup ids.csv | rename ids AS id | format ]
| timechart limit=0 useother=false span=1h count AS maxHour BY id
| untable _time id maxHour
| multireport

[ sort 0 - maxHour
| dedup id ]

[ timechart limit=0 useother=false span=1d sum(maxHour) AS maxDay BY id
| untable _time id maxDay
| sort 0 - maxDay
| dedup id ]

[ timechart limit=0 useother=false span=1mon sum(maxDay) AS maxMonth BY id
| untable _time id maxMonth
| sort 0 - maxMonth
| dedup id ]

This gives you each id's maxHour, maxDay, and maxMonth.
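
The question also asks for a per-week maximum. Since each multireport branch appears to receive the same hourly input, a weekly branch could be slotted in between the daily and monthly ones, following the same pattern (a sketch, untested):

    [ timechart limit=0 useother=false span=1w sum(maxHour) AS maxWeek BY id
      | untable _time id maxWeek
      | sort 0 - maxWeek
      | dedup id ]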


richgalloway
SplunkTrust

The multireport command looks interesting, but I can find no documentation for it.

---
If this reply helps you, Karma would be appreciated.

woodcock
Esteemed Legend

Yes, I only know it by using it. It needs to be documented because it is very useful.


richgalloway
SplunkTrust

Docs feedback, @woodcock

---
If this reply helps you, Karma would be appreciated.

richgalloway
SplunkTrust

First, unless you just left it out of the question, your CSV needs a header row so Splunk knows what the field name is. I will call it "id".
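
For example, with a header row naming the field "id", ids.csv would start like:

id
1234
2345
2346
4567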

The literal translation of your question would employ the map command. Something like

| inputlookup ids.csv
| map [ search index="abc" $id$ "search string here" | bin _time span="1hour" | stats count as maxHour by _time | sort - count | head 1
  | appendcols [ search index="abc" $id$ "search string here" | bin _time span="1day" | stats count as maxDay by _time | sort - count | head 1 ]
  | appendcols [ search index="abc" $id$ "search string here" | bin _time span="1week" | stats count as maxWeek by _time | sort - count | head 1 ]
  | appendcols [ search index="abc" $id$ "search string here" | bin _time span="1month" | stats count as maxMonth by _time | sort - count | head 1 ] ]

But this would be horrendously inefficient as the searches would be executed 800 times each.
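
Note, too, that map defaults to maxsearches=10, so with roughly 800 rows you would also have to raise that limit explicitly. A minimal sketch (the stats clause is just a placeholder):

| inputlookup ids.csv
| map maxsearches=800 [ search index="abc" $id$ "search string here" | stats count ]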

More efficient would be to make the lookup file a subsearch of each search.

search index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1hour" | stats count as maxHour by _time, id | sort - count | dedup id]  
 |appendcols[search
  index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1day" | stats count as maxDay by _time, id | sort - count | dedup id ]
 |appendcols[search
  index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1week" | stats count as maxWeek by _time, id | sort - count | dedup id ]
 |appendcols[search
  index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here" |bin _time span="1month" | stats count as maxMonth by _time, id | sort - count | dedup id ]

Now you only make four passes through the data.
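
In principle this could even be reduced to a single pass: compute hourly counts once, roll them up with eventstats, and take the maxima per id. A sketch, untested, which assumes weeks starting on Sunday ("@w0"):

search index="abc" [ | inputlookup ids.csv | fields id | format ] "search string here"
| bin _time span=1h
| stats count AS hourly BY id _time
| eval day=relative_time(_time, "@d"), week=relative_time(_time, "@w0"), month=relative_time(_time, "@mon")
| eventstats sum(hourly) AS daily BY id day
| eventstats sum(hourly) AS weekly BY id week
| eventstats sum(hourly) AS monthly BY id month
| stats max(hourly) AS maxHour max(daily) AS maxDay max(weekly) AS maxWeek max(monthly) AS maxMonth BY id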

---
If this reply helps you, Karma would be appreciated.