I have a CSV file with headers which have date and time stamp fields in a single column. I want to extract date and time as separate fields. e.g. on 1st row of dataset the DeclaredDate is "26/12/2008 10:30" I want to extract fields DeclareDate as 26/12/2008 in date format and 10:30 as DeclareTime in time format.
If I import data using default CSV input type, the DeclareDate is extracted as timestamp for the event. However, I want both DeclaredDate as well as RevokedDate to be extracted in time format..
Can someone give me some clue? I do not have background in regex.
Sample data below:
Id,Name,DeclaredDate,RevokedDate,RegionId,LgaId,Area,AgencyId,ReferenceNo
1,Mt Mologone S44,26/12/2008 10:30,29/12/2008 12:00,4,800,Bland District,NULL,NULL
2,Gulp Road,6/01/2009 15:00,18/01/2009 20:00,1,8350,"Wingecarribee District, nm",NULL,NULL
3,Beaumont,15/01/2009 14:00,16/01/2009 17:00,1,4000,Hornsby Local Goverment Area,NULL,NULL
Like this:
... | rex field=DeclaredDate "(?<DeclareDate>\S+)\s+(?<DeclareTime>.*)" | rex field=RevokedDate "(?<RevokeDate>\S+)\s+(?<RevokeTime>.*)"
| rex field=DeclaredDate "(?<DeclareDate>\S+)\s+(?<DeclareTime>.*)" | rex field=RevokedDate "(?<RevokeDate>\S+)\s+(?<RevokeTime>.*)" | eval DeclaredDate=DeclaredDate." GMT" | eval RevokedDate=RevokedDate." GMT" | eval DeclaredDateEpoch=strptime(DeclaredDate, "%m/%e/%Y %H:%M %Z") | eval RevokedDateEpoch=strptime(RevokedDate, "%m/%e/%Y %H:%M %Z") | eval duration_sec = RevokedDateEpoch - DeclaredDateEpoch | eval duration=tostring(durationSeconds,"duration")
Time zone is an integral part of creating epoch time. "GMT" is replaced with the respective time zones. The difference on times of the timezones will not care what the timezone is set to, but other operations will care.
Thank you both Woddcock and Landen. Both works!
Like this:
... | rex field=DeclaredDate "(?<DeclareDate>\S+)\s+(?<DeclareTime>.*)" | rex field=RevokedDate "(?<RevokeDate>\S+)\s+(?<RevokeTime>.*)"
@landen99, I encountered a bug editing the comments in this post (when deleting a duplicate comment of yours) and it deleted the last few comments of yours on accident. I am working to have it restored.
No problem, Woodcock. I was just trying to be helpful with a solution which was already on the right track. I'll just compile all of my ideas and corrections to your solution into my own solution. I did not intend to "wander" to topics that you were not interested in discussing. I do think that they are fully relevant to the question, though.
PS: I was not aware that users could delete each other's comments. I've never done that before or seen that option.
you don't need "[" or "]".
\S+ works just as well by itself
Thanks Woodcock. It certainly works.
One other question in relation to this. If I now want the duration between DeclareDate and RevokeDate, how can I get that. I tried the example in the url below, but seems not working.
I tried this:
| convert timeformat="%d/%m/%yT%H:%M" mktime(RevokedDate) mktime(DeclaredDate) | eval duration=RevokedDate-DeclaredDate
This is where you do my solution with eval strftime, strptime, subtraction and convert duration.
Oh, and time zones may be a factor, because it is going to Unix time to be subtracted.
For duration TZ should not be a factor because both times should be the same TZ and we don't care what it is because we are subtracting them; do it like this:
... | rex field=DeclaredDate "(?<DeclareDate>\S+)\s+(?<DeclareTime>.*)" | rex field=RevokedDate "(?<RevokeDate>\S+)\s+(?<RevokeTime>.*)" | DeclaredDateEpoch=strptime(DeclaredDate, "%m/%d/%Y %H:%M") | RevokedDateEpoch=strptime(RevokedDate, "%m/%d/%Y %H:%M") | durationSeconds = RevokedDateEpoch - DeclaredDateEpoch
You Rex date and time separately, so they will not work. Also, the timezones may not be the same, as in the case with bit9 source. Lastly, duration in sec can be made pretty with tostring(sec, "duration").
He asked for 2 things: to separate the TimeDates and to calculate the difference. My answer is in 2 parts that both use the original field. Your comment about tostring
is fair, and I thought of it, but this Q&A has been wondering very much so I decided to stick to the main path.
True. I will update answer and delete comments.
Otherwise, a very elegant answer, I must say.
I considered giving the eval strftime, and convert or fieldformat strptime approach instead of this one based on rex, but the author never mentioned the need for sorting or filtering the time fields or formatting it from Unix time to various formats or between different time zones, so I feel your rex answer meets the question perfectly.