Splunk Search

How to do field extraction with regex?

zacksoft_wf
Contributor

My events are in JSON format.
The data I need is at this JSON path:
 "alert.smtp-message.smtp-header"

Within "smtp-header" I have content like the sample below, and I would like help extracting some fields from it using rex.
============

 

"smtp-header": "Received: from mxdinx66.Gramyabnk.com (mxdinx66.Gramyabnk.com [159.45.78.215])\n\tby mn-svdc-epi-ran11.ist.Gramyabnk.net (Postfix) with ESMTP id 4JyJsN6m8kzVKnNg\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:28 +9999 (UTC)\nReceived: from pps.filterd (mxdinx66.Gramyabnk.com [127.9.9.1])\n\tby mxdinx66.Gramyabnk.com (8.16.9.42/8.16.9.42) with SMTP id 21EMIuas425197\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:28 GMT\nReceived: from mx9a-99994996.pphosted.com (mx9a-99994996.pphosted.com [295.229.165.191])\n\tby mxdinx66.Gramyabnk.com with ESMTP id 6e65wvawac-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA684 bits=256 verify=NOT)\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:27 +9999\nReceived: from pps.filterd (m9216616.ppops.net [127.9.9.1])\n\tby mx9b-99994996.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 21EIDxq8928666\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:26 GMT\nAuthentication-Results: ppops.net;\n\tspf=pass smtp.mailfrom=info@efk.admin.ch;\n\tdkim=pass header.d=efk.admin.ch header.s=dkimkey1;\n\tdmarc=pass header.from=efk.admin.ch\nReceived: from mail11.admin.ch (mail11.admin.ch [162.26.62.11])\n\tby mx9b-99994996.pphosted.com (PPS) with ESMTPS id 6e625qnsf9-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA684 bits=256 verify=NOT)\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:26 +9999\nDKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=efk.admin.ch; h=to\n\t:subject:date:to:from:reply-to:subject:message-id:mime-version\n\t:content-type:content-transfer-encoding; s=dkimkey1; bh=uoC6bt5q\n\thKVezRrk1ux9j7rGCMvkx/6cA9/rS1xbvwE=; b=V9mOEgc1tAyvbFpvkKFgHbnD\n\tHDh67iweoPEV7ZYCPpLW8KSBRU+uX+uL64xdJu9E1mp+BvITob98PRfIaCSIi6HC\n\tIf74+dtpxcVyfo9JXZmCj49tJdilXquYWoCu+OhLeONYd9/NMVs4S/IFHnYT/hmN\n\tNBzuP/5C6MKdlHavIwo=\nTo: \"Pretty Eloisa send you naughty videos https://vk.cc/cb5mIY\" <tran.cu@Gramyabnk.com>\nSubject: 
=?utf-8?Q?Pretty_Eloisa_send_you_naughty_videos_https://vk.cc/cb5mIY,_bitte?= =?utf-8?Q?_best=C6=A4tigen_Sie_ihre_EFK-Newsletter-Anmeldung?=\nDate: Mon, 14 Feb 2922 22:61:28 +9999\nTo: \"Pretty Eloisa send you naughty videos https://vk.cc/cb5mIY\" <tran.cu@Gramyabnk.com>\nFrom: \"Eidg. Finanzkontrolle\" <info@efk.admin.ch>\nReply-To: \"Eidg. Finanzkontrolle\" <info@efk.admin.ch>\nSubject: =?utf-8?Q?Pretty_Eloisa_send_you_naughty_videos_https://vk.cc/cb5mIY,_bitte?=\n =?utf-8?Q?_best=C6=A4tigen_Sie_ihre_EFK-Newsletter-Anmeldung?=\nMessage-ID: <MjQ1NzA5MwAC75229Y8BAMTY9NDg6Nzg4ODM6NzM@www.efk.admin.ch>\nContent-Type: multipart/alternative;\n\tboundary=\"b1_292f6ee91b9de8a92268de4c4ce5b57f\"\nX-TM-AS-GCONF: 99\nX-MSH-Id: E7195F2B6F624BA184EA6D9F12CD98AE\nContent-Transfer-Encoding: 7bit\nX-Proofpoint-GUID: 5sQWXU-CRjHoWtaxmd54Yn68A2IDf2Eu\nX-CLX-Shades: MLX\nX-Proofpoint-ORIG-GUID: 5sQWXU-CRjHoWtaxmd54Yn68A2IDf2Eu\nX-CLX-Response: 1TFkXGxgaEQpMehcaEQpZRBd6GF1SX9ZiBWNEcxEKWFgXbGdhYnBoGkBpaxo 7GxAHGRoRCnBsF6oeXwEBQkZDfXBTEAc ZGhEKcEwXZ1MfZ6t5RRkTE9AQGhEKbX4XGhEKWE9XSxEg\nMIME-Version: 1.9\nX-Brightmail-Tracker: True\nx-env-sender: info@efk.admin.ch\nX-Proofpoint-Virus-Version: vendor=nai engine=6699 definitions=19258 signatures=676461\nX-Proofpoint-Spam-Details: rule=inbound_aggressive_notspam policy=inbound_aggressive score=9\n clxscore=129 suspectscore=9 adultscore=9 bulkscore=9 mlxlogscore=472\n malwarescore=9 phishscore=9 spamscore=9 priorityscore=9 lowpriorityscore=9\n impostorscore=9 mlxscore=9 classifier=spam adjust=9 reason=mlx scancount=1\n engine=8.12.9-2291119999 definitions=main-2292149128",

 

 


==============================================

I only need the fields listed below, which appear in the X-Proofpoint-Spam-Details lines at the end of the header. I want the value after each = sign, excluding the \n.
clxscore
suspectscore
adultscore
bulkscore
mlxlogscore
malwarescore
phishscore
spamscore
priorityscore
lowpriorityscore
impostorscore
mlxscore
classifier

1 Solution

somesoni2
Revered Legend

If the order of the fields is not static, try adding a rex for each field, like this:

 | rex field="alert.smtp-message.smtp-header" "clxscore\=(?<clxscore>[^\s\\\]+)"
 | rex field="alert.smtp-message.smtp-header" "suspectscore\=(?<suspectscore>[^\s\\\]+)"
 | rex field="alert.smtp-message.smtp-header" "scancount\=(?<scancount>[^\s\\\]+)"
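
One way to try per-field patterns before pointing them at real events is to fake a single event with makeresults. The sketch below is only an illustration under stated assumptions: header is a hypothetical field name, its value is a shortened copy of the score line from the sample with spaces in place of the \n sequences, and \w+ is swapped in for the character class above so that each value stops at the first whitespace or backslash.

```spl
| makeresults
| eval header="policy=inbound_aggressive score=9 clxscore=129 suspectscore=9 mlxlogscore=472 classifier=spam scancount=1"
| rex field=header "clxscore=(?<clxscore>\w+)"
| rex field=header "mlxlogscore=(?<mlxlogscore>\w+)"
| rex field=header "classifier=(?<classifier>\w+)"
| table clxscore mlxlogscore classifier
```

Because each rex searches the whole field independently, the result does not depend on the order in which the key=value pairs appear.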

 


yuanliu
SplunkTrust

Why is it important to use regex rather than standard commands? If the event is proper JSON, smtp-header will already be extracted. (If not, just use spath.) Assuming smtp-header exists, you can then use kv (an alias for extract) to obtain the fields.

 

| rename smtp-header as _raw
| kv kvdelim=":" pairdelim="\n" limit=0 mv_add=true
| fields - _raw _time
| fields *score

 

 (The above specifies kvdelim=":", but = is also used by default. It also works directly on _raw as you listed.) With your sample data, the output is:

 
field              value
adultscore         9
bulkscore          9
clxscore           129
impostorscore      9
lowpriorityscore   9
malwarescore       9
mlxlogscore        472
mlxscore           9
phishscore         9
priorityscore      9
score              9
spamscore          9
suspectscore       9
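
If smtp-header has not been extracted automatically, something like the following might pull it out with spath before any further extraction. This is a sketch, not a tested solution for this data: the path is the one given in the question, smtp_header is a hypothetical output field name chosen to avoid hyphens and dots, and the two rex patterns are illustrative stand-ins for whichever fields are needed.

```spl
| spath path="alert.smtp-message.smtp-header" output=smtp_header
| rex field=smtp_header "mlxlogscore=(?<mlxlogscore>\w+)"
| rex field=smtp_header "spamscore=(?<spamscore>\w+)"
| table mlxlogscore spamscore
```

Writing the value to a simply named field keeps later commands free of quoting around the dotted, hyphenated JSON path.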

ITWhisperer
SplunkTrust
| rex "clxscore=(?<clxscore>\S+) suspectscore=(?<suspectscore>\S+) adultscore=(?<adultscore>\S+) bulkscore=(?<bulkscore>\S+) mlxlogscore=(?<mlxlogscore>\S+).+ malwarescore=(?<malwarescore>\S+) phishscore=(?<phishscore>\S+) spamscore=(?<spamscore>\S+) priorityscore=(?<priorityscore>\S+) lowpriorityscore=(?<lowpriorityscore>\S+).+ impostorscore=(?<impostorscore>\S+) mlxscore=(?<mlxscore>\S+) classifier=(?<classifier>\S+)"