Getting Data In

How to Assign Values for _time at Index Time for Future Searching?

_gkollias
Builder

I have a CSV file that I would like to index one time only. There are two fields (Date, Time) that I want to be able to use as _time so that I can create a correlation of avg/median response times with data being forwarded in to a different index down the line. Are there attributes I can add to props.conf to help with this? Here is what my current props configurations look like:

[oms_invoice_data]
HEADER_MODE = firstline
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y%m%d,%H%M%S
MAX_TIMESTAMP_LOOKAHEAD = 15
EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)

Alternatively, is there a way I could manipulate the search to concatenate these values, and then later coalesce _time with time (custom time field) to be able to calculate earliest/latest _time and find durations that way? I tried doing something like this but things didn't work out as I had hoped (knowing I can't reassign _time respectively:

index=main sourcetype=oms_invoice_data source="/tmp/oms_invoice_data_0504_0510.csv" EDI=1 OR LegacyEDI=1 InvoiceTypes=EDI* Date=20150504
| table Date,Time, InvoiceNumber InvoiceAmount, 
| mvexpand InvoiceNumber 
| eval SourceSystem="OMS"
| eval Date = strftime(Date, "%Y/%m/%d")
| eval Time = strftime(Time,"%H:%M:%S")
| eval time=Date. " " .Time

The weird thing here is that time is displayed as 1970/08/22 09:46:53 rather than 2015/05/04 09:46:53

Essentially what I am looking for is being able to calculate earliest(_time) and latest(_time) so that for one day, 05/04 for example), I can calculate the end-to-end avg and median response time for each transaction.

Any thoughts on this would be greatly appreciated.

Thanks in Advance!

0 Karma

youngc_splunk
Splunk Employee
Splunk Employee

@_gkollias It looks like you are trying to process EDI data.  We do have a solutions accelerator for EDI, we can share.  Are you interesting in learning about it?  Love to set-up some time to share what we have.

0 Karma

_gkollias
Builder

Here is the solution:

Props.conf

[foo]
    TIME_PREFIX=^
    MAX_TIMESTAMP_LOOKAHEAD=15
    DATETIME_CONFIG=/etc/apps/props/default/oms_invoice_data_datetime.xml
    SHOULD_LINEMERGE=false
    EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)

XML File that Strips Datetime

<datetime>
    <define name="_oms_csv_timestamp" extract="year, month, day, hour, minute, second">
        <text>(\d{4})(\d{2})(\d{2}),(\d{1,2})(\d{2})(\d{2})</text>
    </define>
    <timePatterns>
        <use name="_oms_csv_timestamp"/>
    </timePatterns>
    <datePatterns>
        <use name="_oms_csv_timestamp"/>
    </datePatterns>
</datetime>

This pulls out the exact timestamp from the CSV. The XML file resides in props.conf.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

If your InvoiceTime column had a leading zero for the hour you could use these settings:

[foo]
TIME_FORMAT=%Y%m%d%H%M%S
TIMESTAMP_FIELDS=InvoiceDate,InvoiceTime
SHOULD_LINEMERGE=false
INDEXED_EXTRACTIONS=csv
NO_BINARY_CHECK=true
KV_MODE=none
disabled=false
pulldown_type=true

That would assign _time correctly from the get-go so you don't need to do any more maths at search time.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Did you add a leading zero to the hour of your InvoiceTime column?

There is no XML at index time so I'm not sure what you're referring to.

0 Karma

_gkollias
Builder

Thanks! I modified the props.conf and am re-indexing the file now. I will let you know how things look!

0 Karma

_gkollias
Builder

This didn't work, unfortunately. I tried using a few different attributes but no luck. Is it possible to use custom XML to format the logging?

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Are you sure that's the right CSV file? I see no commas separating the values...

0 Karma

_gkollias
Builder

I copied that right from the original spreadsheet. Here is the same one I am indexing from /tmp/:

InvoiceDate,InvoiceTime,Division,Customer,BillTo,DiversityCustomer,InvoiceTypes,EDI,PDV,PAPER,InvoiceNumber,InvoiceAmount,LegacyEDI
20150504,53141,ATL,12345,12345,N,PDV,0,1,0,001,382.61,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,002,231.14,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,003,245.19,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,004,19.03,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,005,251.54,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,006,146.92,0

0 Karma

_gkollias
Builder

then in props I extract InvoiceDate and InvoiceTime as Date, Time

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

It'd help if you posted a sample of your CSV file.

0 Karma

_gkollias
Builder

Sure thing - Here are the top 5 lines with edited values:

Date Time Division Customer Bill-To DiversityCustomer InvoiceTypes EDI PDV PAPER InvoiceNumber InvoiceAmount LegacyEDI

20150504 53141 ATL 1234567 12345 N PDV 0 1 0 001 382.61 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 002 231.14 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 003 245.19 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 004 19.03 0

Also, this CSV file contains 7 days worth of data (5/4 to 5/10). With the above search I am only looking to start with 5/4. If I need to re-index the file I can do that. Thanks!

0 Karma
Get Updates on the Splunk Community!

Automatic Discovery Part 1: What is Automatic Discovery in Splunk Observability Cloud ...

If you’ve ever deployed a new database cluster, spun up a caching layer, or added a load balancer, you know it ...

Real-Time Fraud Detection: How Splunk Dashboards Protect Financial Institutions

Financial fraud isn't slowing down. If anything, it's getting more sophisticated. Account takeovers, credit ...

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

 Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...