Hi team,
I need to extract the highlighted field in the below message using regex... I have tried Splunk's built-in field extraction, but it throws an error when I select multiple fields...
{ "eventTime": "2024-06-24T06:15:42Z", "leaduuid": "1234455", "CrmId": "11111111", "studentCrmUuid": "634543564", "externalId": "", "SiteId": "xxxx", "subCategory": "", "category": "Course Enquiry", "eventId": "", "eventRegistrationId": "", "status": "Open", "source": "Online Enquiry", "leadId": "22222222", "assignmentStatusCode": "", "assignmentStatus": "", "isFirstLead": "yes", "c4cEventId": "", "channelPartnerApplication": "no", "applicationReceivedDate": "", "referredBy": "", "referrerCounsellor": "", "createdBy": "Technical User", "lastChangedBy": "Technical User" , "leadSubAgentID": "", "cancelReason": ""}, "offersInPrinciple": {"offersinPrinciple": "no", "oipReferenceNumber": "", "oipVerificationStatus": ""}, "qualification": {"qualification": "Unqualified", "primaryFinancialSource": ""}, "online": {"referringUrl": "", "idpNearestOffice": "", "sourceSiteId": "xxxxx", "preferredCounsellingMode": "", "institutionInfo": "", "courseName": "", "howDidYouHear": "Social Media"}
I need to extract the highlighted field in the below message using regex...
Not only do you not NEED to do this using regex, you MUST NOT use regex for this task. As @ITWhisperer points out, your data is JSON, a structured format. As @PickleRick points out, never treat structured data as plain text.
As @PickleRick notes, you can set KV_MODE = json in your sourcetype. But even if you do not, Splunk should already have figured this out and given you CrmId, status, source, etc. Do you not get these field names and values?
field name | field value |
CrmId | 11111111 |
SiteId | xxxx |
applicationReceivedDate | |
assignmentStatus | |
assignmentStatusCode | |
c4cEventId | |
cancelReason | |
category | Course Enquiry |
channelPartnerApplication | no |
createdBy | Technical User |
eventId | |
eventRegistrationId | |
eventTime | 2024-06-24T06:15:42Z |
externalId | |
isFirstLead | yes |
lastChangedBy | Technical User |
leadId | 22222222 |
leadSubAgentID | |
leaduuid | 1234455 |
referredBy | |
referrerCounsellor | |
source | Online Enquiry |
status | Open |
studentCrmUuid | 634543564 |
subCategory | |
Even if you do not for some oddball reason, using spath should suffice. This is an example with spath using @ITWhisperer's makeresults emulation.
| makeresults
| eval _raw="{ \"eventTime\": \"2024-06-24T06:15:42Z\", \"leaduuid\": \"1234455\", \"CrmId\": \"11111111\", \"studentCrmUuid\": \"634543564\", \"externalId\": \"\", \"SiteId\": \"xxxx\", \"subCategory\": \"\", \"category\": \"Course Enquiry\", \"eventId\": \"\", \"eventRegistrationId\": \"\", \"status\": \"Open\", \"source\": \"Online Enquiry\", \"leadId\": \"22222222\", \"assignmentStatusCode\": \"\", \"assignmentStatus\": \"\", \"isFirstLead\": \"yes\", \"c4cEventId\": \"\", \"channelPartnerApplication\": \"no\", \"applicationReceivedDate\": \"\", \"referredBy\": \"\", \"referrerCounsellor\": \"\", \"createdBy\": \"Technical User\", \"lastChangedBy\": \"Technical User\" , \"leadSubAgentID\": \"\", \"cancelReason\": \"\"}, \"offersInPrinciple\": {\"offersinPrinciple\": \"no\", \"oipReferenceNumber\": \"\", \"oipVerificationStatus\": \"\"}, \"qualification\": {\"qualification\": \"Unqualified\", \"primaryFinancialSource\": \"\"}, \"online\": {\"referringUrl\": \"\", \"idpNearestOffice\": \"\", \"sourceSiteId\": \"xxxxx\", \"preferredCounsellingMode\": \"\", \"institutionInfo\": \"\", \"courseName\": \"\", \"howDidYouHear\": \"Social Media\"}"
``` ITWhisperer's data emulation ```
| spath
It gives the above field names and values.
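If only a handful of the extracted fields matter, the result can be narrowed after spath. The following is a minimal sketch rather than a definitive search: the event is a shortened, hypothetical version of the sample above, and the field list is only an illustration.

```spl
| makeresults
| eval _raw="{ \"CrmId\": \"11111111\", \"status\": \"Open\", \"leadId\": \"22222222\", \"isFirstLead\": \"yes\" }"
| spath
| table CrmId status leadId isFirstLead
```

On real events, replace the makeresults and eval lines with your base search and keep the spath and table lines.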
Hi @ITWhisperer
The rex you provided is also not working as expected.
Given your sample data, the extraction does work, as shown by this runanywhere example:
| makeresults
| eval _raw="{ \"eventTime\": \"2024-06-24T06:15:42Z\", \"leaduuid\": \"1234455\", \"CrmId\": \"11111111\", \"studentCrmUuid\": \"634543564\", \"externalId\": \"\", \"SiteId\": \"xxxx\", \"subCategory\": \"\", \"category\": \"Course Enquiry\", \"eventId\": \"\", \"eventRegistrationId\": \"\", \"status\": \"Open\", \"source\": \"Online Enquiry\", \"leadId\": \"22222222\", \"assignmentStatusCode\": \"\", \"assignmentStatus\": \"\", \"isFirstLead\": \"yes\", \"c4cEventId\": \"\", \"channelPartnerApplication\": \"no\", \"applicationReceivedDate\": \"\", \"referredBy\": \"\", \"referrerCounsellor\": \"\", \"createdBy\": \"Technical User\", \"lastChangedBy\": \"Technical User\" , \"leadSubAgentID\": \"\", \"cancelReason\": \"\"}, \"offersInPrinciple\": {\"offersinPrinciple\": \"no\", \"oipReferenceNumber\": \"\", \"oipVerificationStatus\": \"\"}, \"qualification\": {\"qualification\": \"Unqualified\", \"primaryFinancialSource\": \"\"}, \"online\": {\"referringUrl\": \"\", \"idpNearestOffice\": \"\", \"sourceSiteId\": \"xxxxx\", \"preferredCounsellingMode\": \"\", \"institutionInfo\": \"\", \"courseName\": \"\", \"howDidYouHear\": \"Social Media\"}"
| rex "\"CrmId\": \"(?<CrmId>[^\"]+).*\"status\": \"(?<status>[^\"]+).*\"source\": \"(?<source>[^\"]+).*\"leadId\": \"(?<leadId>[^\"]+).*\"isFirstLead\": \"(?<isFirstLead>[^\"]+).*\"offersinPrinciple\": \"(?<offersinPrinciple>[^\"]+).*\"sourceSiteId\": \"(?<sourceSiteId>[^\"]+).*\"howDidYouHear\": \"(?<howDidYouHear>[^\"]+)"
Please provide more details on what exactly is "not working", and more examples of your events demonstrating the failure.
Firstly, this looks like JSON so you should probably look to use JSON extractions. If you are getting errors with this, then perhaps you could share what you tried and what errors you got, and perhaps it can be resolved that way.
However, if you want to continue down the rex track (not recommended), you could try something like this
| rex "\"CrmId\": \"(?<CrmId>[^\"]+).*\"status\": \"(?<status>[^\"]+).*\"source\": \"(?<source>[^\"]+).*\"leadId\": \"(?<leadId>[^\"]+).*\"isFirstLead\": \"(?<isFirstLead>[^\"]+).*\"offersinPrinciple\": \"(?<offersinPrinciple>[^\"]+).*\"sourceSiteId\": \"(?<sourceSiteId>[^\"]+).*\"howDidYouHear\": \"(?<howDidYouHear>[^\"]+)"
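One thing to keep in mind with the single pattern above: it only matches when every listed key is present, appears in that order, and has a non-empty value, because each capture group uses [^\"]+. A possible alternative, sketched here under the assumption that some values may be empty in real events, is one rex per field so that each extraction succeeds or fails independently. The shortened event below is hypothetical.

```spl
| makeresults
| eval _raw="{ \"CrmId\": \"11111111\", \"status\": \"Open\", \"source\": \"Online Enquiry\", \"leadId\": \"\" }"
| rex "\"CrmId\": \"(?<CrmId>[^\"]+)"
| rex "\"status\": \"(?<status>[^\"]+)"
| rex "\"leadId\": \"(?<leadId>[^\"]+)"
```

In this sketch the empty leadId should simply be left unset while CrmId and status are still extracted. This remains a workaround; JSON extraction is still the recommended route.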
Hi @ITWhisperer
I got the error message below:
The extraction failed. If you are extracting multiple fields, try removing one or more fields. Start with extractions that are embedded within longer text strings.
1. Don't use the "graphical" field extractor. It is meant for simple cases; for more complicated ones it may fail to find a way to extract the fields, and even when it succeeds the result will most likely be neither correct nor efficient.
2. As @ITWhisperer already pointed out, this looks like a JSON structure. Use the proper KV_MODE and don't try to be smart. Fiddling with regexes against structured data usually ends badly.
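For reference, the KV_MODE setting mentioned above lives in props.conf and is applied at search time, so it belongs on the search head. This is a minimal sketch; the stanza name my_json_sourcetype is a placeholder, not the poster's actual sourcetype.

```
# props.conf on the search head
# "my_json_sourcetype" is a hypothetical name; use your own sourcetype
[my_json_sourcetype]
KV_MODE = json
```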