I have a CSV file that I would like to index one time only. There are two fields (Date, Time) that I want to be able to use as _time so that I can create a correlation of avg/median response times with data being forwarded in to a different index down the line. Are there attributes I can add to props.conf to help with this? Here is what my current props configurations look like:
[oms_invoice_data]
HEADER_MODE = firstline
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y%m%d,%H%M%S
MAX_TIMESTAMP_LOOKAHEAD = 15
EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)
Alternatively, is there a way I could manipulate the search to concatenate these values, and then later coalesce _time
with time
(custom time field) to be able to calculate earliest/latest _time
and find durations that way? I tried doing something like this but things didn't work out as I had hoped (knowing I can't reassign _time
respectively:
index=main sourcetype=oms_invoice_data source="/tmp/oms_invoice_data_0504_0510.csv" EDI=1 OR LegacyEDI=1 InvoiceTypes=EDI* Date=20150504
| table Date,Time, InvoiceNumber InvoiceAmount,
| mvexpand InvoiceNumber
| eval SourceSystem="OMS"
| eval Date = strftime(Date, "%Y/%m/%d")
| eval Time = strftime(Time,"%H:%M:%S")
| eval time=Date. " " .Time
The weird thing here is that time is displayed as 1970/08/22 09:46:53
rather than 2015/05/04 09:46:53
Essentially what I am looking for is being able to calculate earliest(_time)
and latest(_time)
so that for one day, 05/04 for example), I can calculate the end-to-end avg and median response time for each transaction.
Any thoughts on this would be greatly appreciated.
Thanks in Advance!
Here is the solution:
Props.conf
[foo]
TIME_PREFIX=^
MAX_TIMESTAMP_LOOKAHEAD=15
DATETIME_CONFIG=/etc/apps/props/default/oms_invoice_data_datetime.xml
SHOULD_LINEMERGE=false
EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)
XML File that Strips Datetime
<datetime>
<define name="_oms_csv_timestamp" extract="year, month, day, hour, minute, second">
<text>(\d{4})(\d{2})(\d{2}),(\d{1,2})(\d{2})(\d{2})</text>
</define>
<timePatterns>
<use name="_oms_csv_timestamp"/>
</timePatterns>
<datePatterns>
<use name="_oms_csv_timestamp"/>
</datePatterns>
</datetime>
This pulls out the exact timestamp from the CSV. The XML file resides in props.conf.
If your InvoiceTime
column had a leading zero for the hour you could use these settings:
[foo]
TIME_FORMAT=%Y%m%d%H%M%S
TIMESTAMP_FIELDS=InvoiceDate,InvoiceTime
SHOULD_LINEMERGE=false
INDEXED_EXTRACTIONS=csv
NO_BINARY_CHECK=true
KV_MODE=none
disabled=false
pulldown_type=true
That would assign _time
correctly from the get-go so you don't need to do any more maths at search time.
Did you add a leading zero to the hour of your InvoiceTime
column?
There is no XML at index time so I'm not sure what you're referring to.
Thanks! I modified the props.conf and am re-indexing the file now. I will let you know how things look!
This didn't work, unfortunately. I tried using a few different attributes but no luck. Is it possible to use custom XML to format the logging?
Are you sure that's the right CSV file? I see no commas separating the values...
I copied that right from the original spreadsheet. Here is the same one I am indexing from /tmp/:
InvoiceDate,InvoiceTime,Division,Customer,BillTo,DiversityCustomer,InvoiceTypes,EDI,PDV,PAPER,InvoiceNumber,InvoiceAmount,LegacyEDI
20150504,53141,ATL,12345,12345,N,PDV,0,1,0,001,382.61,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,002,231.14,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,003,245.19,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,004,19.03,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,005,251.54,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,006,146.92,0
then in props I extract InvoiceDate and InvoiceTime as Date, Time
It'd help if you posted a sample of your CSV file.
Sure thing - Here are the top 5 lines with edited values:
Date Time Division Customer Bill-To DiversityCustomer InvoiceTypes EDI PDV PAPER InvoiceNumber InvoiceAmount LegacyEDI
20150504 53141 ATL 1234567 12345 N PDV 0 1 0 001 382.61 0
20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 002 231.14 0
20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 003 245.19 0
20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 004 19.03 0
Also, this CSV file contains 7 days worth of data (5/4 to 5/10). With the above search I am only looking to start with 5/4. If I need to re-index the file I can do that. Thanks!