Getting Data In

How can I get rid of this strange field delimiter and get fields "data" and "UID" correctly separated?

chthies
Explorer

Hi! Hope everyone is doing well, and thanks in advance for any help.

I'm having problems ingesting Linux audit logs. For some reason, Splunk is not interpreting a strange field delimiter correctly. I've attached screenshots of the examples.

How can I get rid of this and get fields "data" and "UID" correctly separated?

[Attached screenshots: chthies_0-1652383648745.png, chthies_1-1652383661450.png, chthies_2-1652383669224.png]

1 Solution

m_pham
Splunk Employee

I would take a look at the CHARSET setting in props.conf on the instance where your input is configured. From the props.conf spec:

CHARSET = <string>
* When set, Splunk software assumes the input from the given [<spec>] is in
  the specified encoding.
* Can only be used as the basis of [<sourcetype>] or [source::<spec>],
  not [host::<spec>].
* A list of valid encodings can be retrieved using the command "iconv -l" on
  most *nix systems.
* If an invalid encoding is specified, a warning is logged during initial
  configuration and further input from that [<spec>] is discarded.
* If the source encoding is valid, but some characters from the [<spec>] are
  not valid in the specified encoding, then the characters are escaped as
  hex (for example, "\xF3").
* When set to "AUTO", Splunk software attempts to automatically determine the
  character encoding and convert text from that encoding to UTF-8.
* For a complete list of the character sets Splunk software automatically
  detects, see the online documentation.
* This setting applies at input time, when data is first read by Splunk
  software, such as on a forwarder that has configured inputs acquiring the
  data.
* Default (on Windows machines): AUTO
* Default (otherwise): UTF-8

https://docs.splunk.com/Documentation/Splunk/latest/Admin/propsconf

https://docs.splunk.com/Documentation/Splunk/latest/Data/Configurecharactersetencoding
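
As a rough illustration of the setting described above, a stanza like the following could be tried. This is only a sketch: "my_sourcetype" is a placeholder for whatever sourcetype the audit log input uses, and AUTO is just one value to experiment with; if you know the file's actual encoding, name it explicitly instead.

```ini
# props.conf on the instance that reads the file (for example, the forwarder)
# "my_sourcetype" is a placeholder, not a real sourcetype name
[my_sourcetype]
# Ask Splunk to detect the encoding instead of assuming the default
CHARSET = AUTO
```

Since CHARSET applies at input time, it only affects data read after the change; events that are already indexed stay as they are.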

 

If you have access to the raw log, you can paste a sample into regex101 and build a regex that replaces the character with a space. Redact any sensitive data before you paste it into regex101.

Example props.conf on whichever instance parses the data:

[my_sourcetype]
SEDCMD-removeWeirdCharacter = s/<square_character_here>/ /
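
If the character turns out to be a non-printable control character, pasting it into a config file can be awkward, and a hex escape may be easier. The sketch below assumes the character is the ASCII group separator (hex 1D). That is only a guess, since the thread never identifies the character, so check the raw bytes first (for example with "od -c" on a sample line) and substitute the real value. It also assumes the SEDCMD regex accepts PCRE-style \xHH escapes. The stanza and class names are placeholders.

```ini
# props.conf on the instance that parses the data
# "my_sourcetype" and the class name are placeholders
[my_sourcetype]
# ASSUMPTION: the stray delimiter is byte 0x1D; replace \x1d with the byte you actually find
# The trailing "g" replaces every occurrence in the event, not just the first
SEDCMD-removeControlCharacter = s/\x1d/ /g
```

Like other parse-time settings, this only changes events indexed after the config is in place.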

  

chthies
Explorer

Excellent advice! Works!


ITWhisperer
SplunkTrust

What is the "weird" character? What settings do you have already configured?


chthies
Explorer

Thanks for answering! I've attached an image of how I'm seeing the character. Did you see it?
