I am basically doing two searches where the results of the 1st search serves as input for the 2nd search.
There are also two sourcetypes: TICKET_OPENED & TICKET_CLOSED, both with a common field TICKET_NUMBER. Search on TICKET_OPENED is filtered using the field TYPE=EQUIPMENT, which is not on TICKET_CLOSED. The resulting TICKET_NUMBER from TICKET_OPENED is used to do a subsearch on TICKET_CLOSED. Search command below:
index=xyz sourcetype=TICKET_OPENED TYPE=EQUIPMENT | dedup TICKET_NUMBER | join TICKET_NUMBER [ search index=xyz sourcetype=TICKET_CLOSED | fields TICKET_NUMBER, CLOSE_DATE ] | table TICKET_NUMBER CREATE_DATE CLOSE_DATE
As from above, the fields CLOSE_DATE from TICKET_CLOSED and CREATE_DATE from TICKET_OPENED are also being extracted since the main purpose of the search is to find the open to close duration of tickets.
The problem is that the above search does not return all results. Using "All time"(Jan to now) only returns x events, however, some per month searches yield more than the all time. I've read that "join" has a certain limit. As additional info, TICKET_NUMBER are in the millions (both for TICKET_OPENED & TICKET_CLOSED) and the TYPE=EQUIPMENT filter on TICKET_OPENED narrows it to thousands.
Sorry for the long narrative and thank you in advance.
how about something like this? The remove eval will search for TICKET_NUMBERS that only have a TICKET_CLOSED
index=xyz (sourcetype=TICKET_OPENED TYPE=EQUIPMENT) OR (sourcetype=TICKET_CLOSED)
|stats latest(CREATE_DATE) as CREATE_DATE earliest(CLOSE_DATE) as CLOSE_DATE values(sourcetype) as sourcetypes dc(souretype) as numberSourcetypes by TICKET_NUMBER
|eval remove=if(numberSourcetypes=1 AND sourcetypes="TICKET_CLOSED",1,0)
|search remove=0
Again, only TICKET_NUMBER that are being filtered using the "TYPE=EQUIPMENT" of the sourcetype=TICKET_OPENED are being used as input for sourcetype=TICKET_CLOSED.
The suggested searches are pulling tickets that do not have the primary criteria of sourcetype=TICKET_OPENED TYPE=EQUIPMENT and missing the point of finding an alternative for the subsearch I am doing while avoiding the limitation of the JOIN command.
I accidentally put |search remove=1
instead of |search remove=0
in the search I provided above, the initial search looks for TICKET_OPEN events with TYPE=EQUIPMENT and any TICKET_CLOSED events. the stats command should grab the latest CREATE_DATE and the earliest CLOSE_DATE for each TICKET_NUMBER as well as which sourcetypes and how many sourcetypes reported for each TICKET_NUMBER. By running the remove eval and searching for 0, you should be removing any TICKET_NUMBERS that only have a TICKET_CLOSED TICKET_NUMBER and should be left with any TICKET_NUMBER that has TYPE=EQUIPMENT.
If the above truly doesn't work and I'm not understanding clearly what the issue is, the only other option I can think of would be to reverse the joins, although you still might run into your limit issue.
index=xyz sourcetype=TICKET_CLOSED | fields TICKET_NUMBER, CLOSE_DATE| join TICKET_NUMBER [ search index=xyz sourcetype=TICKET_OPENED TYPE=EQUIPMENT | dedup TICKET_NUMBER ] | table TICKET_NUMBER CREATE_DATE CLOSE_DATE
Well, for many reasons you're better off taking a step back and trying to do this without join. Here's a way forward, although I have a couple questions and I'll have to come back and update this answer once or twice before we get there.
First, run this search -
index=xyz (sourcetype=TICKET_OPENED TYPE=EQUIPMENT ) OR sourcetype=TICKET_CLOSED |
stats values(CREATE_DATE) values(CLOSE_DATE) by TICKET_NUMBER
questions - 1) For each row (ie each TICKET_NUMBER), how many values come back for CREATE_DATE and CLOSE DATE ? Is it just one of each per ticket? 2) what do the values of CREATE_DATE and CLOSE_DATE look like, exactly? If you have multiple values coming back for each TICKET_NUMBER here, then we'll have to convert them to epochtime values with the eval command, so we can give back only the earliest create date and the latest close date.
UPDATES:
1) Incorporating findings that there's only ever one CREATE_DATE and one CLOSE_DATE per ticket.
technically you can use earliest() and latest() instead of values() in the stats command. Since that's more common I've done that below. However there's a certain virtue in using values() even stil, because if that assumption ever breaks, you'll notice it right away.
2) Incorporating the requirement I missed before, -- that type=equipment
is present only on the opened
side of the equation, but that you need to only end up with those ticket numbers. Note that we pass along the type field to stats, and then filter after the aggregation.
3) Below I'm also calculating the number of days each ticket was open.
index=xyz (sourcetype=TICKET_OPENED TYPE=EQUIPMENT ) OR sourcetype=TICKET_CLOSED |
stats latest(CREATE_DATE) as CREATE_DATE earliest(CLOSE_DATE) as CLOSE_DATE values(TYPE) as type by TICKET_NUMBER
| search type="EQUIPMENT"
| eval created_epochtime=strptime(CREATE_DATE,"%Y-%m-%d %H:%M:%S.%Q")
| eval closed_epochtime=strptime(CLOSE_DATE,"%Y-%m-%d %H:%M:%S.%Q")
| eval secondsOpen=closed_epochtime-created_epochtime
| eval daysOpen=round(secondsOpen/(60*60*24),2)
Extra ideas, directions.
4) if you want a graph of the average number of days tickets were open, tack a | timechart avg(daysOpen)
on the end.
5) If the TYPE=EQUIPMENT
events are only a tiny subset of the overall tickets, you might also use a subsearch in the search clause to improve efficiency, and I mean a true subsearch, not just square bracket subsearch syntax as implemented by join/append.
index=xyz (sourcetype=TICKET_OPENED TYPE=EQUIPMENT ) OR ( sourcetype=TICKET_CLOSED [ search index=xyz sourcetype=TICKET_OPENED TYPE=EQUIPMENT |dedup TICKET_NUMBER] |
stats latest(CREATE_DATE) as CREATE_DATE earliest(CLOSE_DATE) as CLOSE_DATE values(TYPE) as type by TICKET_NUMBER
| search type="EQUIPMENT"
| eval created_epochtime=strptime(CREATE_DATE,"%Y-%m-%d %H:%M:%S.%Q")
| eval closed_epochtime=strptime(CLOSE_DATE,"%Y-%m-%d %H:%M:%S.%Q")
| eval secondsOpen=closed_epochtime-created_epochtime
| eval daysOpen=round(secondsOpen/(60*60*24),2)
Thank you for the detailed explanation, however, this does not work as your suggested search pulls all TICKET_CLOSED TICKET_NUMBERs. Keep i mind that I am using sourcetype=TICKET_OPENED TYPE=EQUIPMENT to filter TICKET_NUMBER and use that as input for TICKET_CLOSED. The result I am getting shows a lot of TICKET_NUMBER without CREATE_DATEs.
Thank you for the response. Yes, just one result each for CREATE_DATE and CLOSE DATE, sample result below:
TICKET_NUMBER CREATE_DATE CLOSE_DATE
12345607 2016-08-09 04:37:02.0202 2016-08-30 02:45:30.3030
12345608 2016-08-16 06:01:18.1818 2016-08-29 05:11:46.4646
12345609 2016-08-18 21:31:05.0505 2016-08-26 11:54:39.3939
12345610 2016-08-19 18:22:25.2525 2016-08-30 20:34:33.3333
12345611 2016-08-23 15:51:35.3535 2016-08-26 15:45:15.1515
12345612 2016-08-25 08:56:39.3939 2016-08-27 02:04:08.0808
12345613 2016-08-25 09:32:58.5858 2016-08-27 01:42:17.1717
12345614 2016-08-25 19:06:55.5555 2016-08-25 19:37:45.4545
CREATE_DATE & CLOSE_DATE are actually derived from the timestamps of the events.
Oh cool. Well that makes it nice and simple. I assume you'd like to calculate the number of days each ticket was open, so I'll update my answer with that.