Splunk Search

How to do field extraction with regex?

zacksoft_wf
Contributor

My events are in JSON format.
The JSON path to my data is:
 "alert.smtp-message.smtp-header"

Within "smtp-header" I have content like the following, and I could use help extracting some fields from it using rex.
============

 

"smtp-header": "Received: from mxdinx66.Gramyabnk.com (mxdinx66.Gramyabnk.com [159.45.78.215])\n\tby mn-svdc-epi-ran11.ist.Gramyabnk.net (Postfix) with ESMTP id 4JyJsN6m8kzVKnNg\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:28 +9999 (UTC)\nReceived: from pps.filterd (mxdinx66.Gramyabnk.com [127.9.9.1])\n\tby mxdinx66.Gramyabnk.com (8.16.9.42/8.16.9.42) with SMTP id 21EMIuas425197\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:28 GMT\nReceived: from mx9a-99994996.pphosted.com (mx9a-99994996.pphosted.com [295.229.165.191])\n\tby mxdinx66.Gramyabnk.com with ESMTP id 6e65wvawac-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA684 bits=256 verify=NOT)\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:27 +9999\nReceived: from pps.filterd (m9216616.ppops.net [127.9.9.1])\n\tby mx9b-99994996.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 21EIDxq8928666\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:26 GMT\nAuthentication-Results: ppops.net;\n\tspf=pass smtp.mailfrom=info@efk.admin.ch;\n\tdkim=pass header.d=efk.admin.ch header.s=dkimkey1;\n\tdmarc=pass header.from=efk.admin.ch\nReceived: from mail11.admin.ch (mail11.admin.ch [162.26.62.11])\n\tby mx9b-99994996.pphosted.com (PPS) with ESMTPS id 6e625qnsf9-1\n\t(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA684 bits=256 verify=NOT)\n\tfor <tran.cu@Gramyabnk.com>; Mon, 14 Feb 2922 22:66:26 +9999\nDKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=efk.admin.ch; h=to\n\t:subject:date:to:from:reply-to:subject:message-id:mime-version\n\t:content-type:content-transfer-encoding; s=dkimkey1; bh=uoC6bt5q\n\thKVezRrk1ux9j7rGCMvkx/6cA9/rS1xbvwE=; b=V9mOEgc1tAyvbFpvkKFgHbnD\n\tHDh67iweoPEV7ZYCPpLW8KSBRU+uX+uL64xdJu9E1mp+BvITob98PRfIaCSIi6HC\n\tIf74+dtpxcVyfo9JXZmCj49tJdilXquYWoCu+OhLeONYd9/NMVs4S/IFHnYT/hmN\n\tNBzuP/5C6MKdlHavIwo=\nTo: \"Pretty Eloisa send you naughty videos https://vk.cc/cb5mIY\" <tran.cu@Gramyabnk.com>\nSubject: 
=?utf-8?Q?Pretty_Eloisa_send_you_naughty_videos_https://vk.cc/cb5mIY,_bitte?= =?utf-8?Q?_best=C6=A4tigen_Sie_ihre_EFK-Newsletter-Anmeldung?=\nDate: Mon, 14 Feb 2922 22:61:28 +9999\nTo: \"Pretty Eloisa send you naughty videos https://vk.cc/cb5mIY\" <tran.cu@Gramyabnk.com>\nFrom: \"Eidg. Finanzkontrolle\" <info@efk.admin.ch>\nReply-To: \"Eidg. Finanzkontrolle\" <info@efk.admin.ch>\nSubject: =?utf-8?Q?Pretty_Eloisa_send_you_naughty_videos_https://vk.cc/cb5mIY,_bitte?=\n =?utf-8?Q?_best=C6=A4tigen_Sie_ihre_EFK-Newsletter-Anmeldung?=\nMessage-ID: <MjQ1NzA5MwAC75229Y8BAMTY9NDg6Nzg4ODM6NzM@www.efk.admin.ch>\nContent-Type: multipart/alternative;\n\tboundary=\"b1_292f6ee91b9de8a92268de4c4ce5b57f\"\nX-TM-AS-GCONF: 99\nX-MSH-Id: E7195F2B6F624BA184EA6D9F12CD98AE\nContent-Transfer-Encoding: 7bit\nX-Proofpoint-GUID: 5sQWXU-CRjHoWtaxmd54Yn68A2IDf2Eu\nX-CLX-Shades: MLX\nX-Proofpoint-ORIG-GUID: 5sQWXU-CRjHoWtaxmd54Yn68A2IDf2Eu\nX-CLX-Response: 1TFkXGxgaEQpMehcaEQpZRBd6GF1SX9ZiBWNEcxEKWFgXbGdhYnBoGkBpaxo 7GxAHGRoRCnBsF6oeXwEBQkZDfXBTEAc ZGhEKcEwXZ1MfZ6t5RRkTE9AQGhEKbX4XGhEKWE9XSxEg\nMIME-Version: 1.9\nX-Brightmail-Tracker: True\nx-env-sender: info@efk.admin.ch\nX-Proofpoint-Virus-Version: vendor=nai engine=6699 definitions=19258 signatures=676461\nX-Proofpoint-Spam-Details: rule=inbound_aggressive_notspam policy=inbound_aggressive score=9\n clxscore=129 suspectscore=9 adultscore=9 bulkscore=9 mlxlogscore=472\n malwarescore=9 phishscore=9 spamscore=9 priorityscore=9 lowpriorityscore=9\n impostorscore=9 mlxscore=9 classifier=spam adjust=9 reason=mlx scancount=1\n engine=8.12.9-2291119999 definitions=main-2292149128",

 

 


==============================================

I just need to extract the fields in the last 3 lines (in bold): the values after the = sign, excluding the \n.
clxscore
suspectscore
adultscore
bulkscore
mlxlogscore
malwarescore
phishscore
spamscore
priorityscore
lowpriorityscore
impostorscore
mlxscore
classifier


somesoni2
Revered Legend

If the order of fields is not static, try adding a rex for each field, like this:

| rex field="alert.smtp-message.smtp-header" "clxscore\=(?<clxscore>[^\s\\\]+)"
| rex field="alert.smtp-message.smtp-header" "suspectscore\=(?<suspectscore>[^\s\\\]+)"
| rex field="alert.smtp-message.smtp-header" "scancount\=(?<scancount>[^\s\\\]+)"
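
To see why one search per field is independent of field order, here is a minimal Python sketch of the same idea (Python rather than SPL, purely so it can be run outside Splunk). The `header` string is a shortened stand-in for the sample, and `extract_fields` is a hypothetical helper name. The value pattern uses a character class that stops at whitespace or a backslash, which appears to be the intent of the value pattern above.

```python
import re

# Shortened stand-in for the smtp-header value as it appears in the raw JSON
# text, where each line break is the two characters backslash + n.
header = (
    r"X-Proofpoint-Spam-Details: rule=inbound_aggressive_notspam "
    r"policy=inbound_aggressive score=9\n clxscore=129 suspectscore=9 "
    r"adultscore=9 bulkscore=9 mlxlogscore=472\n malwarescore=9 "
    r"reason=mlx scancount=1\n engine=8.12.9-2291119999"
)


def extract_fields(text, names):
    """Run one independent search per field, so field order does not matter."""
    found = {}
    for name in names:
        # [^\s\\]+ stops at whitespace or a backslash, so a trailing
        # literal \n is never included in the captured value.
        match = re.search(re.escape(name) + r"=([^\s\\]+)", text)
        if match:
            found[name] = match.group(1)
    return found


# Fields are requested in a different order than they occur in the text.
fields = extract_fields(header, ["scancount", "clxscore", "suspectscore", "mlxlogscore"])
print(fields)
```

A field that is missing from an event is simply absent from the result, which mirrors how a rex that does not match leaves its field unset.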

 


yuanliu
SplunkTrust

Why is it important to use regex rather than standard commands? If the event is proper JSON, smtp-header will already be extracted. (If not, just use spath.) Assuming smtp-header exists, you can then use kv (aka extract) to obtain the fields.

 

| rename smtp-header as _raw
| kv kvdelim=":" pairdelim="\n" limit=0 mv_add=true
| fields - _raw _time
| fields *score

 

(The above lists kvdelim=":", but = is also used by default. The above also works directly with _raw as you listed.) Using your sample data, the output is:

 
adultscore  bulkscore  clxscore  impostorscore  lowpriorityscore  malwarescore  mlxlogscore  mlxscore  phishscore  priorityscore  score  spamscore  suspectscore
9           9          129       9              9                 9             472          9         9           9              9      9          9
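
As a rough illustration of the key=value idea (not of how kv is implemented), the following Python sketch parses a hypothetical, heavily shortened event, follows the JSON path from the question, collects every key=value token, and keeps only keys ending in "score". The event contents are an assumption trimmed from the sample; only the path and the field names come from the thread.

```python
import json
import re

# Hypothetical, heavily shortened event; only the JSON path and the
# Spam-Details lines are taken from the question.
event = json.dumps({"alert": {"smtp-message": {"smtp-header": (
    "X-CLX-Shades: MLX\n"
    "X-Proofpoint-Spam-Details: rule=inbound_aggressive_notspam policy=inbound_aggressive score=9\n"
    " clxscore=129 suspectscore=9 adultscore=9 bulkscore=9 mlxlogscore=472\n"
    " malwarescore=9 phishscore=9 spamscore=9 priorityscore=9 lowpriorityscore=9\n"
    " impostorscore=9 mlxscore=9 classifier=spam adjust=9 reason=mlx scancount=1\n"
)}}})

# Parsing the JSON turns each escaped \n into a real newline, so the
# values no longer carry a literal backslash-n.
header = json.loads(event)["alert"]["smtp-message"]["smtp-header"]

# Collect every key=value token, then keep the keys ending in "score",
# sorted by name.
pairs = dict(re.findall(r"(\w+)=(\S+)", header))
scores = {key: value for key, value in sorted(pairs.items()) if key.endswith("score")}
print(scores)
```

Because the JSON is decoded first, there is no literal \n left to exclude, which is the main convenience of working on the extracted field instead of the raw text.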


ITWhisperer
SplunkTrust

| rex "clxscore=(?<clxscore>\S+) suspectscore=(?<suspectscore>\S+) adultscore=(?<adultscore>\S+) bulkscore=(?<bulkscore>\S+) mlxlogscore=(?<mlxlogscore>\S+).+ malwarescore=(?<malwarescore>\S+) phishscore=(?<phishscore>\S+) spamscore=(?<spamscore>\S+) priorityscore=(?<priorityscore>\S+) lowpriorityscore=(?<lowpriorityscore>\S+).+ impostorscore=(?<impostorscore>\S+) mlxscore=(?<mlxscore>\S+) classifier=(?<classifier>\S+)"
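
One thing worth checking when adapting a single ordered pattern like the one above is what \S+ captures just before a literal \n in the raw text. The Python sketch below cuts the pattern down to three fields and compares it with a variant whose value class also excludes the backslash. Python writes named groups as (?P<name>...), and the `raw` string plus the names `loose` and `strict` are assumptions for illustration only.

```python
import re

# Shortened raw text: the line break is the two characters backslash + n,
# as it appears inside the JSON event before any field extraction.
raw = r"bulkscore=9 mlxlogscore=472\n malwarescore=9 phishscore=9"

# Same shape as the single ordered pattern, cut down to three fields.
loose = re.compile(
    r"bulkscore=(?P<bulkscore>\S+) mlxlogscore=(?P<mlxlogscore>\S+).+"
    r" malwarescore=(?P<malwarescore>\S+)"
)

# Variant whose value class also excludes the backslash.
strict = re.compile(
    r"bulkscore=(?P<bulkscore>[^\s\\]+) mlxlogscore=(?P<mlxlogscore>[^\s\\]+).+"
    r" malwarescore=(?P<malwarescore>[^\s\\]+)"
)

loose_fields = loose.search(raw).groupdict()
strict_fields = strict.search(raw).groupdict()

print(repr(loose_fields["mlxlogscore"]))   # value ends with a backslash
print(repr(strict_fields["mlxlogscore"]))  # value without the backslash
```

In this sketch the \S+ version keeps the backslash of the literal \n as part of the value, while the stricter class does not. Either way, a single pattern only matches when the fields appear in exactly this order.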