Getting Data In

How to Assign Values for _time at Index Time for Future Searching?

_gkollias
Builder

I have a CSV file that I would like to index one time only. There are two fields (Date, Time) that I want to be able to use as _time so that I can create a correlation of avg/median response times with data being forwarded in to a different index down the line. Are there attributes I can add to props.conf to help with this? Here is what my current props configurations look like:

[oms_invoice_data]
HEADER_MODE = firstline
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %Y%m%d,%H%M%S
MAX_TIMESTAMP_LOOKAHEAD = 15
EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)

Alternatively, is there a way I could manipulate the search to concatenate these values, and then later coalesce _time with time (custom time field) to be able to calculate earliest/latest _time and find durations that way? I tried doing something like this but things didn't work out as I had hoped (knowing I can't reassign _time respectively:

index=main sourcetype=oms_invoice_data source="/tmp/oms_invoice_data_0504_0510.csv" EDI=1 OR LegacyEDI=1 InvoiceTypes=EDI* Date=20150504
| table Date,Time, InvoiceNumber InvoiceAmount, 
| mvexpand InvoiceNumber 
| eval SourceSystem="OMS"
| eval Date = strftime(Date, "%Y/%m/%d")
| eval Time = strftime(Time,"%H:%M:%S")
| eval time=Date. " " .Time

The weird thing here is that time is displayed as 1970/08/22 09:46:53 rather than 2015/05/04 09:46:53

Essentially what I am looking for is being able to calculate earliest(_time) and latest(_time) so that for one day, 05/04 for example), I can calculate the end-to-end avg and median response time for each transaction.

Any thoughts on this would be greatly appreciated.

Thanks in Advance!

0 Karma

_gkollias
Builder

Here is the solution:

Props.conf

[foo]
    TIME_PREFIX=^
    MAX_TIMESTAMP_LOOKAHEAD=15
    DATETIME_CONFIG=/etc/apps/props/default/oms_invoice_data_datetime.xml
    SHOULD_LINEMERGE=false
    EXTRACT-oms_invoice_data=(?<Date>[^,]*),\s*(?<Time>[^,]*),\s*(?<Division>[^,]*),\s*(?<Customer>[^,]*),\s*(?<BillTo>[^,]*),\s*(?<DiversityCustomer>[^,]*),\s*(?<InvoiceTypes>[^,]*),\s*(?<EDI>[^,]*),\s*(?<PDV>[^,]*),\s*(?<PAPER>[^,]*),\s*(?<InvoiceNumber>[^,]*),\s*(?<InvoiceAmount>[^,]*),\s*(?<LegacyEDI>.*)

XML File that Strips Datetime

<datetime>
    <define name="_oms_csv_timestamp" extract="year, month, day, hour, minute, second">
        <text>(\d{4})(\d{2})(\d{2}),(\d{1,2})(\d{2})(\d{2})</text>
    </define>
    <timePatterns>
        <use name="_oms_csv_timestamp"/>
    </timePatterns>
    <datePatterns>
        <use name="_oms_csv_timestamp"/>
    </datePatterns>
</datetime>

This pulls out the exact timestamp from the CSV. The XML file resides in props.conf.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

If your InvoiceTime column had a leading zero for the hour you could use these settings:

[foo]
TIME_FORMAT=%Y%m%d%H%M%S
TIMESTAMP_FIELDS=InvoiceDate,InvoiceTime
SHOULD_LINEMERGE=false
INDEXED_EXTRACTIONS=csv
NO_BINARY_CHECK=true
KV_MODE=none
disabled=false
pulldown_type=true

That would assign _time correctly from the get-go so you don't need to do any more maths at search time.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Did you add a leading zero to the hour of your InvoiceTime column?

There is no XML at index time so I'm not sure what you're referring to.

0 Karma

_gkollias
Builder

Thanks! I modified the props.conf and am re-indexing the file now. I will let you know how things look!

0 Karma

_gkollias
Builder

This didn't work, unfortunately. I tried using a few different attributes but no luck. Is it possible to use custom XML to format the logging?

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Are you sure that's the right CSV file? I see no commas separating the values...

0 Karma

_gkollias
Builder

I copied that right from the original spreadsheet. Here is the same one I am indexing from /tmp/:

InvoiceDate,InvoiceTime,Division,Customer,BillTo,DiversityCustomer,InvoiceTypes,EDI,PDV,PAPER,InvoiceNumber,InvoiceAmount,LegacyEDI
20150504,53141,ATL,12345,12345,N,PDV,0,1,0,001,382.61,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,002,231.14,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,003,245.19,0
20150504,53218,ATL,12345,12345,N,EDI & PAPER,1,0,1,004,19.03,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,005,251.54,0
20150504,53153,ATL,12345,12345,N,EDI & PAPER,1,0,1,006,146.92,0

0 Karma

_gkollias
Builder

then in props I extract InvoiceDate and InvoiceTime as Date, Time

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

It'd help if you posted a sample of your CSV file.

0 Karma

_gkollias
Builder

Sure thing - Here are the top 5 lines with edited values:

Date Time Division Customer Bill-To DiversityCustomer InvoiceTypes EDI PDV PAPER InvoiceNumber InvoiceAmount LegacyEDI

20150504 53141 ATL 1234567 12345 N PDV 0 1 0 001 382.61 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 002 231.14 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 003 245.19 0

20150504 53218 ATL 1234567 12345 N EDI & PAPER 1 0 1 004 19.03 0

Also, this CSV file contains 7 days worth of data (5/4 to 5/10). With the above search I am only looking to start with 5/4. If I need to re-index the file I can do that. Thanks!

0 Karma
Get Updates on the Splunk Community!

Cloud Platform & Enterprise: Classic Dashboard Export Feature Deprecation

As of Splunk Cloud Platform 9.3.2408 and Splunk Enterprise 9.4, classic dashboard export features are now ...

Explore the Latest Educational Offerings from Splunk (November Releases)

At Splunk Education, we are committed to providing a robust learning experience for all users, regardless of ...

New This Month in Splunk Observability Cloud - Metrics Usage Analytics, Enhanced K8s ...

The latest enhancements across the Splunk Observability portfolio deliver greater flexibility, better data and ...