Hi,
Is it possible to search the results of a scheduled report?
I scheduled a search in a report because it takes some time to execute. I would then like to use the results of that scheduled search in other searches.
I am using the map function in my search, which is why I am interested in using the results of a scheduled report (for performance reasons).
In the main search I am searching my logs; each event has a user identifier, sponsor_imsi. I need to find which customer this user belongs to. Each customer has a range of IDs, so I need to find the customer whose range includes the sponsor_imsi.
Here is my search:
index=pbx billing_duration="*" | table _time Date Application msisdn mvno_imsi sponsor_imsi
| map maxsearches=50000 search="| from datamodel:CustomerRange | search RangeStart<=$sponsor_imsi$ RangeEnd>=$sponsor_imsi$ | eval sponsor_imsi=$sponsor_imsi$"
| table Allocation RangeStart RangeEnd sponsor_imsi msisdn
So far I have managed to use a data model (CustomerRange) for the subsearch in the map function, but I cannot find a way to search a scheduled report's results instead.
Thanks
To get the results of a scheduled report, you use loadjob:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.0/SearchReference/Loadjob
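For example (a minimal sketch, assuming a scheduled report named "CustomerRange Report" owned by admin in the search app; substitute your own owner, app, and report name):
| loadjob savedsearch="admin:search:CustomerRange Report"
That loads the results of the report's most recent scheduled run without re-executing the search.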
Hi, I moved the rant-mode comments to the bottom, and left them for the amusement of fellow senior splunkers. Here's the kind and gentle version:
There are a couple of dozen better ways to do this. This is just one. This should run about a half zillion times faster than using map.
I'm going to assume you can get a record that looks like this ( | table customer RangeStart RangeEnd ) out of your data model, with one record per customer, including the start and end of their imsi range. You can do that with a subsearch, or you could just create a csv file with the information and use | inputcsv.
Insert that search or inputcsv where all the asterisks *********> are.
index=pbx billing_duration="*"
| table _time Date Application msisdn mvno_imsi sponsor_imsi
| eval sortorder="B"
| rename COMMENT as "The above takes your records and puts them by sponsor_imsi in the middle of the sort order"
| rename COMMENT as "The following takes your beginning and ending sponsor_imsi records into the mix, having the customer number start on the A record."
| rename COMMENT as "The Z end-of-range record blanks out the customer number for all following records in case an invalid imsi does not fit a range."
[ your search that gets *********> | table customer RangeStart RangeEnd
| eval sortorder=mvappend("A","Y","Z")
| mvexpand sortorder
| eval sponsor_imsi = if(sortorder="A",RangeStart,RangeEnd)
| eval customer=if(sortorder="Z","((unknown))",customer)
| table customer sponsor_imsi sortorder
]
| rename COMMENT as "Now we sort the records into imsi order, and take the most recent customer for each imsi."
| sort 0 sponsor_imsi sortorder
| streamstats current=t last(customer) as customer
| rename COMMENT as "Finally, we kill all records that are not from the original search, and set customer to ((unknown)) if the imsi was below the lowest valid range."
| where sortorder=="B"
| fillnull value="((unknown))" customer
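For example (a sketch, assuming a hypothetical file customer_ranges.csv with the fields customer, RangeStart, and RangeEnd), the first line of the bracketed subsearch above would become:
[ | inputcsv customer_ranges.csv | table customer RangeStart RangeEnd
with the rest of the subsearch unchanged.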
(Rant mode ON)
Okay, NO. No No No.
Map should be avoided whenever possible. This use of map is ... completely avoidable. Of the hundred ways you could do this, map might work, but it is arguably the very worst option. Calling a search 50K times sequentially? NO.
Have I mentioned, NO?
(/Rant mode OFF)
Thanks for the answer.
So I have a similar problem (if I understand correctly) that I work around with rest. Below is a rough sketch of what I do. I grab the results of my saved searches using the rest command (a lot of it can probably be stripped out or made more efficient; it was a first-run effort). Then I count by label (which is the name of my saved search), and I map all of the results of those saved searches together. After that I do a bunch of stats/evals to join them all.
| rest /services/search/jobs | search isSaved=1 isSavedSearch=1 author=cmerrima isDone=1 delegate="scheduler" label="Scheduled Search Name" | stats count by label | fields - count | map maxsearches=10000 search="loadjob savedsearch=\"cmerrima:search:$label$\""
I think you could do something similar if you have a saved search for the first part and map it to the data model.
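One note: if you only need a single named report, the rest/map scaffolding can be skipped, since loadjob accepts the saved search name directly:
| loadjob savedsearch="cmerrima:search:Scheduled Search Name"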
Thanks for the quick answer, but I don't think it's applicable in this case.
My subsearch is actually a database request (dbxquery). The DB request only returns 10 rows. The map function takes a lot of time because the request is executed for each event. This is why I want to reuse the results of the scheduled report.
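Given that the DB request returns only ~10 rows, one option (a sketch, assuming a hypothetical DB Connect connection named billing_db and a table customer_ranges; adjust the SQL and field names to your schema) would be to run dbxquery once inside the bracketed subsearch of the answer above, instead of once per event via map:
[ | dbxquery connection="billing_db" query="SELECT customer, RangeStart, RangeEnd FROM customer_ranges"
| eval sortorder=mvappend("A","Y","Z")
| mvexpand sortorder
| eval sponsor_imsi = if(sortorder="A",RangeStart,RangeEnd)
| eval customer=if(sortorder="Z","((unknown))",customer)
| table customer sponsor_imsi sortorder
]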
You can enable summary indexing for the search and then search the results in the summary index.
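For instance (a sketch, assuming the scheduled report has summary indexing enabled and is named "CustomerRange Report"): summary-indexed results carry a search_name field identifying the report, so the ranges could then be retrieved with:
index=summary search_name="CustomerRange Report"
| table customer RangeStart RangeEnd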