How to do field extraction using transform?

arun_kant_sharm — Mon, 28 Mar 2022 16:29:01 GMT

Hello Experts,

I am having difficulty with index-time field extraction.

My sample log file format:

Time stamp: Fri Mar 18 00:00:49 2022
File: File_name_1
Renamed to: Rename_1
Time stamp: Fri Mar 18 00:00:50 2022
File: File_name_1
Renamed to: Rename_1
Time stamp: Fri Mar 18 00:00:51 2022
File: File_name_1
Renamed to: Rename_1
Time stamp: Fri Mar 18 00:00:52 2022
File: File_name_1
Renamed to: Rename_1
Time stamp: Fri Mar 18 00:00:53 2022
File: File_name_1
Renamed to: Rename_1

props.conf

[ demo ]
CHARSET=AUTO
LINE_BREAKER=([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD=24
NO_BINARY_CHECK=true
SHOULD_LINEMERGE=true
TIME_FORMAT=%a %b %d %H:%M:%S %Y
TIME_PREFIX=^Time stamp:\s+
TRANSFORMS-extractfield=extract_demo_field
TRUNCATE=100000

transforms.conf

[extract_demo_field]
REGEX =^Time stamp:\s*(?<timeStamp>.*)$\s*^File:\s*(?<file>.*)$\s*^Renamed to:\s+(?<renameFile>.*)$
FORMAT = time_stamp::$1 file::$2 renamed_to::$3
WRITE_META = true

Re: field extraction using transform

PickleRick — Mon, 28 Mar 2022 11:57:28 GMT

Firstly, what is your problem? 🙂

Secondly, if you use $1, $2 and so on, you don't need to name the capture groups.
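For illustration, the transform with unnamed capture groups might look like the sketch below. The `(?m)` prefix is an assumption that is not in the original post: in PCRE, `^` and `$` only match at line boundaries inside a multi-line event when multiline mode is on, so the original pattern may not match without it. Test it against real events before relying on it.

```ini
# transforms.conf (sketch; the (?m) flag is an assumption, not part of the original post)
[extract_demo_field]
# Unnamed groups are enough, because FORMAT refers to them by position ($1, $2, $3)
REGEX = (?m)^Time stamp:\s*(.*)$\s*^File:\s*(.*)$\s*^Renamed to:\s+(.*)$
FORMAT = time_stamp::$1 file::$2 renamed_to::$3
WRITE_META = true
```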

And finally, did you define entries in fields.conf for indexed fields?
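For reference, a minimal sketch of what such fields.conf entries could look like, assuming the field names from the FORMAT line in the original post (time_stamp, file, renamed_to). These entries are normally needed where searches run, so the search head is the usual place for them.

```ini
# fields.conf (sketch; field names taken from the FORMAT line in the question)
# INDEXED = true tells Splunk at search time that the field was written at index time
[time_stamp]
INDEXED = true

[file]
INDEXED = true

[renamed_to]
INDEXED = true
```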

Re: field extraction using transform

dhirendra761 — Mon, 28 Mar 2022 16:04:29 GMT

@arun_kant_sharm

Don't force settings in Splunk when you don't have to.

Splunk recognizes this timestamp format on its own.

There is no need to extract the time or to configure transforms.conf.

The stanza below also works; these are just the default settings for the sourcetype:

[demo]
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
pulldown_type = true

Re: field extraction using transform

PickleRick — Mon, 28 Mar 2022 16:36:00 GMT

Sorry, but I have to disagree here.

If you can, tell Splunk as much as you know. Telling Splunk the explicit time format and position is one of the most important things (if not the most important) for making inputs faster. Don't leave Splunk guessing. I know that it might work in low-volume environments, but once you hit several thousand EPS, you want all the performance you can get (and you want to avoid ambiguities).
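As an illustration, the timestamp-related settings from the original props.conf are the kind of explicit hints meant here. This is only a sketch that isolates those three settings; the stanza name `[demo]` follows the thread, and the comments are added for explanation.

```ini
# props.conf (sketch; only the timestamp settings from the original post)
[demo]
# Where the timestamp starts within the event
TIME_PREFIX = ^Time stamp:\s+
# Exact layout of a value such as "Fri Mar 18 00:00:49 2022"
TIME_FORMAT = %a %b %d %H:%M:%S %Y
# How many characters past TIME_PREFIX to scan (the sample timestamp is 24 characters long)
MAX_TIMESTAMP_LOOKAHEAD = 24
```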