Getting Data In

How to Assign Values for _time at Index Time for Future Searching?

_gkollias
Builder

I have a CSV file that I would like to index one time only. There are two fields (Date, Time) that I want to be able to use as _time so that I can create a correlation of avg/median response times with data being forwarded in to a different index down the line. Are there attributes I can add to props.conf to help with this? Here is what my current props configurations look like:

[oms_invoice_data]
HEADER_MODE = firstline
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y%m%d,%H%M%S
MAX_TIMESTAMP_LOOKAHEAD = 15
EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)

Alternatively, is there a way I could manipulate the search to concatenate these values, and then later coalesce _time with time (custom time field) to be able to calculate earliest/latest _time and find durations that way? I tried doing something like this but things didn't work out as I had hoped (knowing I can't reassign _time respectively:

index=main sourcetype=oms_invoice_data source="/tmp/oms_invoice_data_0504_0510.csv" EDI=1 OR LegacyEDI=1 InvoiceTypes=EDI* Date=20150504
| table Date,Time, InvoiceNumber InvoiceAmount, 
| mvexpand InvoiceNumber 
| eval SourceSystem="OMS"
| eval Date = strftime(Date, "%Y/%m/%d")
| eval Time = strftime(Time,"%H:%M:%S")
| eval time=Date. " " .Time

The weird thing here is that time is displayed as 1970/08/22 09:46:53 rather than 2015/05/04 09:46:53

Essentially what I am looking for is being able to calculate earliest(_time) and latest(_time) so that for one day, 05/04 for example), I can calculate the end-to-end avg and median response time for each transaction.

Any thoughts on this would be greatly appreciated.

Thanks in Advance!

0 Karma

youngc_splunk
Splunk Employee
Splunk Employee

@_gkollias It looks like you are trying to process EDI data.  We do have a solutions accelerator for EDI, we can share.  Are you interesting in learning about it?  Love to set-up some time to share what we have.

0 Karma

_gkollias
Builder

Here is the solution:

Props.conf

[foo]
    TIME_PREFIX=^
    MAX_TIMESTAMP_LOOKAHEAD=15
    DATETIME_CONFIG=/etc/apps/props/default/oms_invoice_data_datetime.xml
    SHOULD_LINEMERGE=false
    EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)

XML File that Strips Datetime

<datetime>
    <define name="_oms_csv_timestamp" extract="year, month, day, hour, minute, second">
        <text>(\d{4})(\d{2})(\d{2}),(\d{1,2})(\d{2})(\d{2})</text>
    </define>
    <timePatterns>
        <use name="_oms_csv_timestamp"/>
    </timePatterns>
    <datePatterns>
        <use name="_oms_csv_timestamp"/>
    </datePatterns>
</datetime>

This pulls out the exact timestamp from the CSV. The XML file resides in props.conf.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

If your InvoiceTime column had a leading zero for the hour you could use these settings:

[foo]
TIME_FORMAT=%Y%m%d%H%M%S
TIMESTAMP_FIELDS=InvoiceDate,InvoiceTime
SHOULD_LINEMERGE=false
INDEXED_EXTRACTIONS=csv
NO_BINARY_CHECK=true
KV_MODE=none
disabled=false
pulldown_type=true

That would assign _time correctly from the get-go so you don't need to do any more maths at search time.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Did you add a leading zero to the hour of your InvoiceTime column?

There is no XML at index time so I'm not sure what you're referring to.

0 Karma

_gkollias
Builder

Thanks! I modified the props.conf and am re-indexing the file now. I will let you know how things look!

0 Karma

_gkollias
Builder

This didn't work, unfortunately. I tried using a few different attributes but no luck. Is it possible to use custom XML to format the logging?

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Are you sure that's the right CSV file? I see no commas separating the values...

0 Karma

_gkollias
Builder

I copied that right from the original spreadsheet. Here is the same one I am indexing from /tmp/:

InvoiceDate,InvoiceTime,Division,Customer,BillTo,DiversityCustomer,InvoiceTypes,EDI,PDV,PAPER,InvoiceNumber,InvoiceAmount,LegacyEDI
20150504,53141,ATL,12345,12345,N,PDV,0,1,0,001,382.61,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,002,231.14,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,003,245.19,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,004,19.03,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,005,251.54,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,006,146.92,0

0 Karma

_gkollias
Builder

then in props I extract InvoiceDate and InvoiceTime as Date, Time

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

It'd help if you posted a sample of your CSV file.

0 Karma

_gkollias
Builder

Sure thing - Here are the top 5 lines with edited values:

Date Time Division Customer Bill-To DiversityCustomer InvoiceTypes EDI PDV PAPER InvoiceNumber InvoiceAmount LegacyEDI

20150504 53141 ATL 1234567 12345 N PDV 0 1 0 001 382.61 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 002 231.14 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 003 245.19 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 004 19.03 0

Also, this CSV file contains 7 days worth of data (5/4 to 5/10). With the above search I am only looking to start with 5/4. If I need to re-index the file I can do that. Thanks!

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...