I am struggling to figure this out. Here is my situation:
1) I have a tab-delimited data file. I have defined a transform in transforms.conf to parse this file with a tab delimiter into its 12 fields.
2) I defined a field extraction in Manager to extract those 12 fields, referencing the transform that I defined in #1. All of the fields are being extracted successfully.
3) One of those fields is a URI which appears in a couple of different patterns. The name of the field is uri_query.
4) I have defined a secondary transform which operates on the uri_query field that was extracted by the earlier transform (#1 above). This transform extracts several fields from that URI. To define this transform, I added the following definition to transforms.conf, below the transform defined in #1 above.
    [cdn-uri-v3]
    REGEX = /(?<producer>[^\/]+)/(?<content_id>\d+)/(?<encoding>(iPad|iPhone)[^\/]+)/(?<file>.[^\t]+)\t
    SOURCE_KEY = uri_query
5) I then defined a field extraction using Manager to extract those fields.
But nothing is getting extracted from this secondary transform.
I suspect that order of precedence might be the culprit here. That is, do I need to define all of these using transforms.conf and props.conf? If so, the existing documentation is not very clear on the syntax and order that I need to use. I have tried a couple of these variants, but with little success.
Can you please edit your question and put four spaces just before [cdn-uri-v3]? This would put the line in "code sample mode" and allow us to see the < and > chars (and their content), which are not currently displayed.
I generally do this kind of thing directly in the props.conf and transforms.conf files, but it is generally possible to extract new fields out of existing fields. Have you checked your regex? What if you run a search like
<your own filters> uri_query=* | rex field=uri_query "<text in your regex>"
Do the extracted fields appear?
Thanks Paolo. I added the additional spaces to the conf snippet. They are appearing now in my post.
I have checked my regexp, although only against the _raw record. Will try now to actually run it using the actual field.
By restricting the rex test to the actual field, I did find a number of problems with my original regexp. Thanks for the help. It's all working great now.
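For anyone hitting the same symptom, here is a minimal sketch in Python of one plausible cause (the poster did not say which problems were found, so this is an assumption): the pattern ends in `\t`, which can match in the raw record but not in the extracted uri_query value, since the field no longer carries the delimiter. The sample values are made up, and the named groups use Python's `(?P<name>...)` syntax instead of the `(?<name>...)` form used in transforms.conf.

```python
import re

# Pattern from the [cdn-uri-v3] stanza, rewritten with Python named groups.
ORIGINAL = re.compile(
    r"/(?P<producer>[^/]+)/(?P<content_id>\d+)/"
    r"(?P<encoding>(iPad|iPhone)[^/]+)/(?P<file>.[^\t]+)\t"
)

# Hypothetical variant without the trailing tab, for use against the field only.
FIELD_ONLY = re.compile(
    r"/(?P<producer>[^/]+)/(?P<content_id>\d+)/"
    r"(?P<encoding>(iPad|iPhone)[^/]+)/(?P<file>.[^\t]+)"
)

# Made-up sample data: the field value, and a raw record where a tab follows it.
uri_query = "/acme/12345/iPad_640x480/clip.mp4"
raw_record = "2011-01-01\t" + uri_query + "\t200"

raw_match = ORIGINAL.search(raw_record)    # tab is present after the URI
field_match = ORIGINAL.search(uri_query)   # no tab inside the field value
fixed_match = FIELD_ONLY.search(uri_query)

print("original vs raw record:", bool(raw_match))
print("original vs uri_query: ", bool(field_match))
print("variant vs uri_query:  ", fixed_match.groupdict() if fixed_match else None)
```

This mirrors the advice above: test the pattern against the same input the transform will see (the value named by SOURCE_KEY), not against _raw.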
One last question for you though. Are there any rules of precedence that I need to keep in mind if I choose to use Manager vs. the configuration files to define the extractions and transforms?
I'd think you can add config either way. In the end, for search-time operations, what you've configured in Manager takes precedence over what's in the other config files. This is because Manager configs of that kind are stored in the etc/users/username/appname/... directory, which has higher priority than etc/apps/... I'd suggest reading through this, though: http://www.splunk.com/base/Documentation/latest/Admin/Wheretofindtheconfigurationfiles
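As a rough mental model of the ordering described above, a layered lookup in Python behaves the same way: the user-level layer is consulted first, and the app-level layer only fills in what the user level does not define. This is a simplified sketch, not Splunk's actual merge logic, and the setting values are hypothetical.

```python
from collections import ChainMap

# Hypothetical settings for the same transform stanza at two layers.
user_layer = {"REGEX": "user-level pattern"}  # saved through Manager
app_layer = {"REGEX": "app-level pattern", "SOURCE_KEY": "uri_query"}  # edited by hand

# Earlier maps win, so the user layer shadows the app layer.
effective = ChainMap(user_layer, app_layer)

print(effective["REGEX"])       # value from the user layer
print(effective["SOURCE_KEY"])  # only defined at the app layer
```

In practice this means a hand-edited stanza can appear to have no effect if a same-named definition saved through Manager is shadowing it.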