Due to the nature of the data, we can't use any delimiters.
The data layout, by character position, is as follows:
Name = 1-8
Department = 9-12
Location = 13-24
New Department = 25-28
Status = 29-30
Is there a way to specify the lookup definition based on these character positions?
The field names are not an issue. Knowing that the data is abstracted and/or encrypted is enough.
Splunk CAN bring in and process binary files...
Assuming the data is all in one 30-byte field, then this would extract the binary-valued fields...
| rex "(?s)^(?<Name>.{8})(?<Department>.{4})(?<Location>.{12})(?<New_Department>.{4})(?<Status>.{2})$"
(Note the underscore in New_Department: regex named capture groups can't contain spaces. The (?s) flag lets . match any byte, including embedded newlines.)
...but I'm just not sure what other gotchas there might be involved with just slapping that data into a lookup and trying to use it as is.
I am TEMPTED to convert each of those fields except Status into one to three 4-byte numbers apiece. I know that would perform the function without issue, but I don't know whether I'm introducing unneeded complexity that the vanilla system would handle straight out of the box.
I SUSPECT, based on other questions and answers about binary data, that Splunk just isn't architected to handle it very well.
The best option that I can suggest is to convert the binary data into display-hex. It takes up twice as much space, but it consists only of the characters [0-9A-F], so it can then be treated as character data.
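To illustrate that round trip in Python (a sketch; the sample record bytes are invented), hex encoding turns any binary field, including embedded x'00' bytes, into plain ASCII that character-oriented tooling can handle, and the original bytes remain recoverable:

```python
# Round-trip demo: arbitrary binary bytes (including x'00') become
# plain ASCII hex, which CSV/lookup handling treats as ordinary text.
record = b"AB\x00CD"            # invented sample bytes with an embedded NUL
encoded = record.hex().upper()   # display-hex, characters [0-9A-F] only
restored = bytes.fromhex(encoded)
assert restored == record        # lossless round trip
```

The doubling in size (2 hex characters per byte) is the cost of making the data safe for delimiters, line breaks, and text processing.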
Much appreciated @DalJeanis
Yes, but no.
First, there is no reason your delimiter can't be a string that cannot appear in the data, such as "!!!!".
Second, unless the data is encrypted, those fields don't look like data types that would necessarily include ALL of the special characters: semicolons, exclamation points, commas, @ # $ ^ &, and so on.
So, what's up here?
Great - thank you @DalJeanis
Instead of the field names mentioned before please consider the following -
Field1 = positions 01-08
Field2 = positions 09-12
Field3 = positions 13-24
Field4 = positions 25-28
Field5 = positions 29-30
These fields may contain any combination of characters (displayable and non-displayable), including special characters. So there is no sequence of characters that could reliably be used as a delimiter.
So the question is - Can a lookup table be built from a structured file where the records are fixed length as defined before, and how?
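Assuming a preprocessing step outside Splunk is acceptable, one way to build such a lookup could be a small converter that slices each fixed-length record at the positions defined above and hex-encodes each field, producing a plain CSV usable as a lookup table file. This is a sketch: the file paths are hypothetical, and the field names and widths are taken from the layout in this thread.

```python
# Sketch: convert a file of fixed-length 30-byte binary records into a
# hex-encoded CSV lookup table. Field names/widths follow the layout
# described above; paths are hypothetical.
import csv

WIDTHS = [("Field1", 8), ("Field2", 4), ("Field3", 12),
          ("Field4", 4), ("Field5", 2)]
RECORD_LEN = sum(w for _, w in WIDTHS)  # 30 bytes per record

def record_to_hex_fields(record: bytes) -> dict:
    """Slice one fixed-length record and hex-encode each field."""
    assert len(record) == RECORD_LEN
    out, pos = {}, 0
    for name, width in WIDTHS:
        out[name] = record[pos:pos + width].hex().upper()
        pos += width
    return out

def convert(binary_path: str, csv_path: str) -> None:
    """Read fixed-length records and write a hex-encoded CSV lookup."""
    with open(binary_path, "rb") as src, \
         open(csv_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=[n for n, _ in WIDTHS])
        writer.writeheader()
        while chunk := src.read(RECORD_LEN):
            if len(chunk) < RECORD_LEN:
                break  # ignore a trailing partial record
            writer.writerow(record_to_hex_fields(chunk))
```

Because every field value is hex-encoded, embedded x'00' bytes, commas, and line breaks in the raw data cannot break the CSV; searches would then match against the hex form of the lookup keys.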
Are you indexing this data? Do you want to use the data as-is as a lookup table file?
We would like to use the data as-is...
Will the system have to deal with any binary zeroes x'00' in the data?