I am working on ingesting the WSJT-X log. I got to where I have the basic fields in Splunk and wanted to create a date and time stamp from the poorly formatted data. I started with a very basic eval statement to test this and I am not seeing the new field. So, what did I miss?
I created the following:
transforms.conf
[wsjtx_log]
REGEX = (\d{2})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\s+(\d+\.\d+)\s+(\w+)\s+(\w+)\s+(\d+|-\d+)\s+(-\d+\.\d+|\d+\.\d+)\s+(\d+)\s+(.+)
FORMAT = year::$1 month::$2 day::$3 hour::$4 min::$5 sec::$6 freqMhz::$7 action::$8 mode::$9 rxDB::$10 timeOffset::$11 freqOffSet::$12 remainder::$13
[add20]
INGEST_EVAL = fyear="20" . $year$
props.conf
[wsjtx_log]
REPORT-wsjtx_all = wsjtx_log
TRANSFORMS = add20
fields.conf
[fyear]
INDEXED = TRUE
The fields extracted with REPORT are extracted in search time, so they're not available in index time for INGEST_EVAL.
That makes sense. The docs mentioned the order of operations, but sometimes that doesn't sink in. It was easy enough to switch over, but I am still not seeing the field. I do see the fields as parsed from props.conf:
Props:
[wsjtx_log]
#REPORT-wsjtx_all = wsjtx_log
EXTRACT-wsjtx = (?<year>\d{2})(?<month>\d{2})(?<day>\d{2})_(?<hour>\d{2})(?<min>\d{2})(?<sec>\d{2})\s+(?<freqMhz>\d+\.\d+)\s+(?<action>\w+)\s+(?<mode>\w+)\s+(?<rxDB>\d+|-\d+)\s+(?<timeOffset>-\d+\.\d+|\d+\.\d+)\s+(?<freqOffSet>\d+)\s+(?<remainder>.+)
TRANSFORMS = add20
Transform:
[wsjtx_log]
#REGEX = (\d{2})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\s+(\d+\.\d+)\s+(\w+)\s+(\w+)\s+(\d+|-\d+)\s+(-\d+\.\d+|\d+\.\d+)\s+(\d+)\s+(.+)
#FORMAT = year::$1 month::$2 day::$3 hour::$4 min::$5 sec::$6 freqMhz::$7 action::$8 mode::$9 rxDB::$10 timeOffset::$11 freqOffSet::$12 remainder::$13
[add20]
INGEST_EVAL = fyear="20" . $year$
Fields:
[fyear]
INDEXED = TRUE
No. It's not about the order of operations.
It's about search-time vs. index-time.
REPORT and EXTRACT are two operations that are done on the event in search time - when the event is being read from the index and processed before presenting to the user. INGEST_EVAL is an operation which is done in index-time - when the event is initially received from the source and before it's written to the index. Your search-time operations are not performed in index-time (and vice-versa).
So regardless of whether you define your search-time operations inline or with a transform (in other words - as EXTRACT or REPORT), they are not active in index-time. You can only operate on indexed fields with INGEST_EVAL. So if you want to extract a part of your event in order to use it in INGEST_EVAL, you have to first extract it with TRANSFORMS as an indexed field (if you don't need it stored later, you can afterwards rewrite it with another INGEST_EVAL to null()).
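A minimal sketch of that approach, assuming each event starts with the YYMMDD_HHMMSS stamp that your regex expects. The stanza name wsjtx_year_indexed and the class name wsjtx are placeholders I made up for illustration, so adjust them to your setup and test on a non-production index first:

```ini
# transforms.conf
# Index-time extraction: WRITE_META = true stores "year" as an indexed field.
# (wsjtx_year_indexed is a hypothetical stanza name.)
[wsjtx_year_indexed]
REGEX = ^(\d{2})\d{4}_\d{6}
FORMAT = year::$1
WRITE_META = true

# Runs after the extraction above, so "year" exists by the time this is evaluated.
# If you don't want to keep "year" in the index, appending ", year:=null()"
# to this line should drop it again (check this against your Splunk version).
[add20]
INGEST_EVAL = fyear="20" . year

# props.conf
# Transforms in one TRANSFORMS class run left to right.
[wsjtx_log]
TRANSFORMS-wsjtx = wsjtx_year_indexed, add20

# fields.conf
[fyear]
INDEXED = true
```

The ordering in the TRANSFORMS line matters here: the extraction has to come before add20, otherwise year is still empty when the eval runs.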
I removed the $ signs from the field (I copied from the web UI). I also used this as a guide but still no go.
https://docs.splunk.com/Documentation/Splunk/9.2.2/Data/IngestEval
OK. From the start.
Your INGEST_EVAL looks like this:
INGEST_EVAL = fyear="20" . year
Right?
Where does the "year" field come from?
From the EXTRACT in props.conf.
EXTRACT-wsjtx = (?<year>\d{2})(?<month>\d{2})(?<day>\d{2})_(?<hour>\d{2})(?<min>\d{2})(?<sec>\d{2})\s+(?<freqMhz>\d+\.\d+)\s+(?<action>\w+)\s+(?<mode>\w+)\s+(?<rxDB>\d+|-\d+)\s+(?<timeOffset>-\d+\.\d+|\d+\.\d+)\s+(?<freqOffSet>\d+)\s+(?<remainder>.+)
As I wrote before - EXTRACT and REPORT are run in search-time. TRANSFORMS (including INGEST_EVAL) is run in index-time. You don't have search-time stuff in index-time. So you don't have your "year" field when you're trying to run INGEST_EVAL.
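If you only need the indexed "year" as a stepping stone, one option is to skip it and build the value straight from _raw, which is available in index-time. A rough sketch, assuming the two-digit year is always the first two characters of the event (the class name wsjtx is again just a placeholder):

```ini
# transforms.conf
# substr() is 1-based: take 2 characters starting at position 1 of the raw event.
[add20]
INGEST_EVAL = fyear="20" . substr(_raw, 1, 2)

# props.conf
[wsjtx_log]
TRANSFORMS-wsjtx = add20

# fields.conf
[fyear]
INDEXED = true
```

This keeps the search-time EXTRACT untouched and only adds the one indexed field.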
OK, between your commentary and my rewrite of the documentation I got this working. I will post my rewrite of the Splunk instructions and the confs ASAP. Wish I could attach docs.