Do I use props.conf on Kepware or Splunk Server, when using Kepware IDF for Splunk?

jhumkey
Path Finder

I'm trying to split a Value="????" into individual subfields. Starting from the answer to this question, answers.splunk.com/answers/310671/how-to-split-differently-ordered-values-into-subfi.html, I believe I know how to achieve the different splits for the different line types.

And SEDCMD seems to be something I can put in the props.conf file.

But . . . the datastream is coming in through Kepware IDF for Splunk ("Industrial Data Forwarder"). And the bottom of this page, docs.splunk.com/Documentation/Splunk/6.3.0/Data/Anonymizedatausingconfigurationfiles, indicates that data coming in through a forwarder skips parsing and props.conf, and goes straight to the indexer as is.

So . . . is there a way to split up the subfields on the Kepware server (under the IDF for Splunk) or . . . does that forwarder honor props.conf on the Splunk server?
Where would the props.conf go on the Kepware server? (If it must be there, and will work there.)

Related question 2: as I'm debugging the props.conf changes, is there anywhere in Splunk I can see syntax errors? Or do I just keep making blind changes to the SEDCMD lines until it's perfect and suddenly it all works?

Thanks.

1 Solution

jhumkey
Path Finder

OP here. I'll post my own answer in case anyone is headed down the same path. The general suggestions are correct . . . for my case (PLC data coming in from a factory) . . . my best option is to import the data "as is", then use Settings -> Fields -> Field Extraction to form regular expressions and define each of the fields (subfields, actually) within each Tag from the PLC.

Yes, this seems non-obvious. It means the Tags are being interpreted at SEARCH time, not at ingestion time. You'd naturally think this would be inefficient, but Splunk seems to be designed for speed around this concept. So, for each of the (approx. 20) types of PLC Tags I'm monitoring, I've split up the fields with regex, and can search and report on those fields as if they were "there all along".

So, I can't answer the original question I asked (where to actually split up the fields coming from Kepware and the "IDF for Splunk" forwarder). But I've at least determined that I shouldn't be trying. Using regex and splitting fields up at search time was the way to go for me. Thanks to any who read the question and attempted to answer.
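
For anyone following the same route, here is a minimal sketch of how such a search-time extraction might be prototyped inline with rex before saving it. The sourcetype name, the Tag name, the comma-separated layout of Value, and the field names Barcode14 and StationId are all assumptions for illustration; the real layout of each Tag type will differ.

```
sourcetype=kepware Tag="Line1.Scanner.Result"
| rex field=Value "^(?<Barcode14>\d{14}),(?<StationId>\w+)$"
| table _time Tag Barcode14 StationId
```

Once the same regular expression is saved as a field extraction under Settings -> Fields, the fields should be available without the rex line, so a search such as sourcetype=kepware Barcode14=12345678901234 can filter on the subfield directly.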

bgilmore_splunk
Splunk Employee

Great question! There is a Kepware Explorer app available on Splunkbase, https://splunkbase.splunk.com/app/2851/, that demonstrates how to do extractions like that at search time, which would be configured on the search head or indexer. It includes a good example of extracting components of the Tag and turning those components into new fields.

There's always more than one way to skin a cat, though, so let me know and we can work on the specifics of your environment. Assuming that you run a Universal Forwarder on the same machine as the Kepserver, all documentation on extractions, transforms, etc. would apply once data hits the UF.

jhumkey
Path Finder

"how to do extractions like that at search time".

I guess I didn't explain well. I (think I) need the fields split up at import time, such that they exist as independent indexed fields. If I wait until search time . . . it "looks" better on the output, but then I can't search based on the subfield. Searching for 1234 is very different from searching for Barcode14=1234. The first will find 1234 "anywhere" in "any" subfield. The second will ONLY find 1234 if it exists in Barcode14. And I don't think you can "split" and "search" on a field at the same time during the search process.
So . . . I'm looking for where/how to split into subfields a stream coming directly in from the IDF for Splunk.
Thanks though.

bgilmore_splunk
Splunk Employee

We typically don't recommend extracting new fields and indexing them, but it is possible to configure this on the indexer, see http://docs.splunk.com/Documentation/Splunk/6.1.3/Data/Configureindex-timefieldextraction for more information.

But based on your question, I think you are still trying to do something that is very possible at search time. Remember that your search runs against the _raw of your event, so in order to find 1234 in an event, you can just use an asterisk wildcard on both sides of 1234, which will filter to any events that contain that set of characters, with any other characters on the left and right. This may be oversimplifying what you are trying to accomplish, though.

If you run a search-time extraction using rex (http://docs.splunk.com/Documentation/Splunk/6.3.0/SearchReference/Rex), new fields you create from the content of Value will be available once you run the search. Any search after the next pipe would be able to use the extracted fields as if they were in the data originally. The benefit of this is that you can use your filter constraint to run the rex only against Tags that match your multivalue rule. Let me know if this solves it for you. Thanks!
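
A rough sketch of that pattern: filter first, extract with rex, then use the new field after the next pipe. The Tag name, the semicolon-separated layout of Value, and the field names Barcode14 and LotCode are hypothetical and would need to match your actual data.

```
Tag="Line2.Packing.Label"
| rex field=Value "^(?<Barcode14>[^;]+);(?<LotCode>[^;]+)"
| search Barcode14=1234
| stats count by LotCode
```

The leading Tag filter keeps the rex from running against events whose Value has a different layout, and Barcode14=1234 matches only that subfield rather than 1234 anywhere in the event.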

jhumkey
Path Finder

So far . . . "String" String2 OR String3 | rex+sed . . . works to find the Strings and use the rex+sed to expand the fields. I haven't found ANY combination of reversing that order . . . rex+sed | "String" String2 OR String3 . . . that will work, or allow me to search on the expanded fields. Search seems (in the 20 cases I've attempted) to demand that the search strings come before the rex+sed expansion. Any attempt to leave the Strings off the front end gets syntax errors at the first double quote in the sed, for missing search parameters. Any attempt to search for an initial string, then pipe the output of rex+sed to expand fields, and search for more strings after the rex, gives "unknown command STRING" as a syntax error after the rex+sed.

Even if I manage to accomplish it . . . mere mortal "users" will be using this. The test search is four lines long to break up the strings for three input types, and I'm going to have maybe 40 total types of lines to expand. So I'll end up with (estimating) 53 lines of field expansion, with search strings at the end, and an "end user" will have to find and delicately edit the barcodes on the suffix? That's expecting too much of my end users (to not screw up).

Plus, to "use rex+sed first" . . . expand everything for every line, then narrow by String, seems to demand a "full table scan" to get the strings to search for. Whereas finding the pre-indexed strings . . . seems the more efficient option once the volume of historical data grows large.

And I still haven't gotten past the scary statement in the documentation: "Splunk Enterprise does not parse structured data that has been forwarded to an indexer." (Does that mean it's not possible to split input data coming from Kepware's IDF for Splunk at all, since it's forwarding straight in to the indexer?)
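
One likely cause of the "unknown command" error, assuming the terms after the rex were written bare: only the first segment of a search has an implied search command, so terms after a pipe need an explicit search (or where) command in front of them. A sketch, with a hypothetical semicolon separator inside Value:

```
"String" String2 OR String3
| rex mode=sed field=Value "s/;/ /g"
| search Value="*1234*"
```

Here the sed expression rewrites Value in place, and the final search line filters on the rewritten field.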
