Hi there,
I have a live feed from DB Connect for incident data in the format below:
dateRaised, IncID, Location, Status, closedDate
05-05-20, 12345, ABC, Closed, 06-05-20
05-05-20, 23456, DEF, In Progress,
06-05-20, 34567, GHI, Closed, 06-05-20
24-04-20, 45678, JKL, On Hold,
28-04-20, 56789, MNO, Closed, 06-05-20
Here, closedDate is NULL for all incidents that are not yet closed, and _time is set to dateRaised.
I want to create a rolling report of daily raised and closed incident counts for the last 30 days, with these columns:
Date, RaisedCount, ClosedCount
I have tried the solutions given for similar questions on this site, but with no joy.
I want to write a search that I can run with the time picker set to Yesterday and save as a scheduled report. The report would then run daily and store its results in a summary index, from which I can get Date, RaisedCount, and ClosedCount.
One thing I have tried is overriding _time with closedDate in a subsearch (time picker set to Yesterday).
This gives me only the closed tickets that were also raised yesterday; I want to include tickets that were raised as far back as last month but closed yesterday.
index="idx" source="incident_data_source" sourcetype="incident_live_feed"
| eval _time=strptime(dateRaised, "%Y-%m-%d %H:%M:%S.%N")
| stats dc(dateRaised) as RaisedCount
| table RaisedCount
| appendcols [search source="Inc_Open_Closed_Test.csv" host="xyz" sourcetype="csv"
| eval _time=strptime(closedDate, "%Y-%m-%d %H:%M:%S.%N")
| stats dc(closedDate) as ClosedCount
| table ClosedCount]
| table RaisedCount ClosedCount
Any help would be greatly appreciated.
Thank you.
Madhav
It is a serious design error to set _time to dateRaised on ingestion. That serves literally none of your needs. It should be set to the date of extraction and ingestion, which is the date that the data is correct as of. (Ideally there would be a rising column, lastUpdated, in your other database, and you'd be using that and extracting only the records that had changed.)
We can get around that issue, but we shouldn't have to.
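For context, the usual DB Connect pattern for this is a rising-column input: the input keeps a checkpoint and the query only pulls rows whose rising column has moved past it. A rough sketch of what that query tends to look like (the table name and the lastUpdated column are assumptions based on your description, not your actual schema):
SELECT dateRaised, IncID, Location, Status, closedDate, lastUpdated
FROM incidents
WHERE lastUpdated > ?
ORDER BY lastUpdated ASC
DB Connect substitutes the stored checkpoint for the ?, and the input's timestamp setting is then the place to point _time at lastUpdated (or at the current index time) rather than at dateRaised.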
Here's your solution. We use the index time for the search, instead of _time.
index="idx" source="incident_data_source" sourcetype="incident_live_feed"
_index_earliest=-30d@d _index_latest=@d
| rename COMMENT as "Keep only the most recent version of each incident"
| sort 0 - _indextime
| dedup IncID
| rename COMMENT as "Now we have only one record per incident extracted in the last 30 days."
| rename COMMENT as "Calculate the dates using the formats you gave us in the examples."
| eval dateRaised=strptime(dateRaised, "%d-%m-%y")
| eval dateClosed=strptime(closedDate, "%d-%m-%y")
| rename COMMENT as "Drop all fields we don't need, then create one mv field with all relevant dates."
| fields IncID dateRaised dateClosed
| eval relevantDates=mvappend(dateRaised,dateClosed)
| rename COMMENT as "stats to count opened and closed records for each of relevantDates, then get rid of records from prior months."
| stats count(dateRaised) as raised count(dateClosed) as closed by relevantDates
| where relevantDates >= relative_time(now(),"-30d@d")
| rename relevantDates as _time
There you go. It's air code, so you'll need to debug it against your real data, and we've assumed the data all comes from the same feed rather than half of it being in a CSV, which makes very little sense.
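One more thing to watch while you debug: stats with a multivalue by field puts each event into every group its values name, so an incident raised on one day and closed on another contributes to both count(dateRaised) and count(dateClosed) on both days. A sketch of one way to split the two actions apart before counting, replacing everything from the mvappend line onward (same field names as above, untested against your data):
| rename COMMENT as "Tag each date with whether it is a raise or a close, then expand to one row per incident per date."
| eval relevantDates=mvappend("raised,".dateRaised, if(isnotnull(dateClosed), "closed,".dateClosed, null()))
| mvexpand relevantDates
| eval action=mvindex(split(relevantDates, ","), 0)
| eval day=tonumber(mvindex(split(relevantDates, ","), 1))
| rename COMMENT as "Count raises and closes separately per day, keeping only the last 30 days."
| where day >= relative_time(now(), "-30d@d")
| chart count over day by action
| rename day as _time, raised as RaisedCount, closed as ClosedCount
If you still want the daily counts written to a summary index as you described, adding collect index=your_summary_index (your_summary_index being whatever index you set up for it) to the end of the scheduled search should cover that part.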
Thank you, @DalJeanis, for your help. If I understand correctly: my current rising column is LastUpdated, and _time is set to dateRaised. If I change _time to LastUpdated, will that solve my problem?
Thank you.
Thank you, @DalJeanis. I have changed my data feed to set _time to LastUpdated, and I am almost getting the required results.
Does the "live feed" include all tickets, or only the ones that change?