Splunk Search

Remove the special character ' from the beginning and end of field values

mustafag
Path Finder

Hi,
I am receiving logs from an email gateway. All the field values are enclosed in ' characters, and those characters are captured as part of the field value. Below is a sample log.

<22>May 21 14:16:30 meg234 : app='smtp', name='Email Status', policy_name='', dvc_host='', virtual_host='meg.test.com', event_id=50006, reason_id=77, direction=1, src_ip='1.1.1.1', src_host='meg.test.com', dest_ip='2.2.2.2', dest_host='', rhdr_ip='', is_primary_action=, scanner='', action='', status='Email Delivered', sender=, recipient='', msgid='5b69_036d_201e8739_1cef_495b_a267_8ce04d4b9c36', orig_msgid='2ecf5795f6ea4e1ca81c732102316082@test.local', nrcpts=1, relay='', subject='sadeer Final', encryption_type='0', orig_subject='', orig_sender='', size=238141, attachments='Companytest.docx, test.xlsx', number_attachments=2, virus_name='', file_name='', spamscore=, spamthreshold=, spamrules='', URL='', contentrule='[]', content_terms='[]', tz='GMT', tz_offset='+0000', dlpfile='', dlprules='', dlpclassification='', dlpfileuploaded='', dlpfiledigest='', dlpfilesize='', iascore=, iathreshold=, ts_reputation_score=, ts_geo_location='', ts_ip_rep_status=, ts_hash_length=, ts_lookup_hash='', local-time='2017-05-21_14:16:23_GMT' scan-host-name='meg', scan-host-ip='1.1.1.1', host-name='meg234', host-domain-name='test.com', mac-address='00:00:00:34:79:45', product='FG (9.9) PM5600', user-name='test'

Every captured field value includes the special character ' at its beginning and end. I want to remove the ' from the beginning and end of the value for all fields.
Please help me with this.

woodcock
Esteemed Legend

The Search-Time Order of Operations is this:

Sourcetype RENAME
EXTRACT-xxx
REPORT-xxx
KV_MODE
FIELDALIAS-xxx
EVAL-xxx
LOOKUP-xxx
MILLISECONDS
FILTER
EVENTTYPING
TAGGING

So use EVAL instead of EXTRACT and try this:

[YourSourcetypeHere]
EVAL-app=replace(app, "^'|'$", "")

And so on for all of the field names.
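
To see what the pattern in the replace() call does, here is a minimal Python sketch of the same regular expression. It is only an illustration outside Splunk: the helper name strip_outer_quotes is hypothetical, and the sample values are taken from the log in the question.

```python
import re

# Same pattern as in the EVAL above: a quote at the very start
# OR a quote at the very end of the value.
OUTER_QUOTES = re.compile(r"^'|'$")


def strip_outer_quotes(value: str) -> str:
    """Remove one leading and one trailing single quote, if present."""
    return OUTER_QUOTES.sub("", value)


# Values as they might be captured from the sample event.
samples = ["'smtp'", "'Email Status'", "''", "50006", "'Companytest.docx, test.xlsx'"]
cleaned = [strip_outer_quotes(v) for v in samples]
print(cleaned)
```

Values that were never quoted (such as event_id=50006) pass through unchanged, and an empty quoted value becomes an empty string.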

aakwah
Builder

Hello,

To handle it at search time, you can add the following to props.conf (on the search heads):

[Sourcetype]
EXTRACT-app = app=\'(?<app>\w+)\'

and so on for other fields.
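
One thing to check before copying this pattern to other fields: \w+ matches only letters, digits, and underscores, so it fits app='smtp' but not values containing spaces or dots, such as status='Email Delivered'. The Python sketch below illustrates the difference. Note that Python spells the named group (?P<name>...) where the props.conf line uses (?<name>...). The [^']* alternative and the helper name extract are my own assumptions for illustration, not something from the post above.

```python
import re
from typing import Optional

# A shortened version of the sample event from the question.
event = (
    "app='smtp', name='Email Status', virtual_host='meg.test.com', "
    "status='Email Delivered'"
)


def extract(field: str, value_pattern: str) -> Optional[str]:
    """Return the quoted value of `field`, or None if the pattern does not match."""
    match = re.search(rf"\b{field}='(?P<value>{value_pattern})'", event)
    return match.group("value") if match else None


fields = ("app", "status", "virtual_host")

# \w+ only succeeds for values made of word characters.
word_only = {f: extract(f, r"\w+") for f in fields}

# [^']* accepts anything up to the closing quote, including empty values.
any_non_quote = {f: extract(f, r"[^']*") for f in fields}

print(word_only)
print(any_non_quote)
```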

Regards

woodcock
Esteemed Legend

No, this will not help, because EXTRACT happens before KV_MODE; that's why I asked how the fields were being created.

aakwah
Builder

@woodcock, in my test (Splunk version 6.5.3), EXTRACT works fine with KV_MODE=auto.

woodcock
Esteemed Legend

This is very strange but hey; there you go!

aakwah
Builder

I believe that allowing EXTRACT to work after KV_MODE is intended to make some tweaks on the automatically extracted fields.

woodcock
Esteemed Legend

During search, you can do it like this:

... | foreach * [rex field=<<FIELD>> mode=sed "s/^'// s/'$//"]

During indexing, we will need to know how you are indexing your fields.

mustafag
Path Finder

At search time, the command below works, but I need to fix this in props.conf.
index=test | rex mode=sed "s/'//g"

woodcock
Esteemed Legend

Your SEDCMD approach is wrong because it ignores the fact that the ' character frequently occurs inside the field data, preceded by an escape character. Stripping every quote removes the quote but leaves the escape character behind, which is very confusing. How are you creating your fields now? Are you using KV_MODE=auto?
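
To illustrate the escaped-quote problem, here is a small Python sketch comparing a global strip with an anchored one. It assumes a value can contain a backslash-escaped quote; the value below is made up for illustration and does not come from the sample log.

```python
import re

# Hypothetical field value, captured together with its surrounding quotes,
# with a backslash-escaped quote inside the data.
captured = r"'Re: John\'s report'"

# Equivalent of s/'//g: removes every quote, including the escaped one,
# and leaves the backslash behind.
stripped_everywhere = re.sub(r"'", "", captured)

# Equivalent of the anchored pattern ^'|'$: removes only the outer quotes.
stripped_outer = re.sub(r"^'|'$", "", captured)

print(stripped_everywhere)
print(stripped_outer)
```

The global version leaves a stray backslash in the middle of the value, while the anchored version keeps the inner text exactly as it was logged.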

mustafag
Path Finder

Yes, I am using auto mode.

mustafag
Path Finder

Just to add: when I run the search below, the special character ' is removed.
index=test | rex mode=sed "s/'//g"

But when I add the line below to props.conf, the special characters are not removed.

SEDCMD-RemoveSingleQuotes = s//'//g

aakwah
Builder

SEDCMD is used at index time.

As per the docs (http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf):

SEDCMD-<class> = <sed script>
* Only used at index time.