Getting Data In

SEDCMD for MAC addresses that are missing leading zeroes between colons

craigkleen
Communicator

So, some companies, in their infinite wisdom, strip the leading zero from each byte within a MAC address, so we end up with logs that are hard to search consistently. I suppose a little mental arithmetic is easy enough, but I'd really love to fix this at index time, even though I know it will marginally increase my license usage. Here's an example of one log:

Oct 17 10:43:28 10.0.0.1 dhcpd[13431]: Option 82: received a REQUEST DHCP packet from relay-agent 10.0.1.1 with a circuit-id of "31:31:31:2d:31:31:31:2d:41:31" and remote-id of "6c:6d:6e:6f:70:71:72:73:74:75:76:77" for 10.0.1.100 (0:5:a6:b:ea:80) lease time is undefined seconds.

I'd like to replace ALL single hex digits that show up in the parenthesized MAC address (bolded in the original post) so that it becomes a compliant xx:xx:xx:xx:xx:xx style MAC address, inserting leading zeroes along the way, without having to run SEDCMD multiple times in a row.

maciep
Champion

You'd probably want to make this more robust, but to give you an idea, you could do something like this in props.conf on your indexer/parsing layer:

[your:sourcetype]
SEDCMD-mac_1 = s/\(([\d\w]):/(0\1:/g
SEDCMD-mac_2 = s/:([\d\w]):/:0\1:/g
SEDCMD-mac_3 = s/:([\d\w])\)/:0\1)/g

Each of the sed commands replaces a single character with the same character preceded by a leading 0.

The first sed command takes care of the first octet (anchored on the open paren). The second takes care of the middle four (anchored on :x:). The last takes care of the final octet (anchored on the close paren). There might be a better way, but this should get you started. Keep in mind that the second command matches any single character between colons anywhere in the event, and [\d\w] matches any word character, not just hex digits, so make sure you're actually replacing the MAC address and not something else in your event.
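To make the behaviour of the three substitutions concrete, here is a minimal Python sketch that applies the same patterns in the same order to the MAC address from the sample log. It only approximates what Splunk does at index time: Python's `re` module stands in for the SEDCMD regex engine, and the names `RULES` and `apply_rules` are made up for this illustration.

```python
import re

# The three substitutions from the props.conf stanza above, applied in order.
RULES = [
    (r"\(([\d\w]):", r"(0\1:"),   # first octet, anchored on "("
    (r":([\d\w]):", r":0\1:"),    # middle octets, anchored on ":" on both sides
    (r":([\d\w])\)", r":0\1)"),   # last octet, anchored on ")"
]

def apply_rules(event: str) -> str:
    """Run each substitution globally, one after another (hypothetical helper)."""
    for pattern, replacement in RULES:
        event = re.sub(pattern, replacement, event)
    return event

sample = "for 10.0.1.100 (0:5:a6:b:ea:80) lease time is undefined seconds."
print(apply_rules(sample))
# for 10.0.1.100 (00:05:a6:0b:ea:80) lease time is undefined seconds.
```

On this particular sample every short octet gets padded, because no two single-character octets sit next to each other.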

Hope it helps.

craigkleen
Communicator

Unfortunately, that middle one doesn't work well. Because the regex matches the colons on both sides, each match consumes its trailing colon. If the MAC address is entirely single digits, it matches the first :x:, but the next field then looks like "x:" to sed, with no leading colon left to match, so it gets skipped. What I've had to do so far is:

SEDCMD-macfix1 = s/(.*Option 82.*)\(([a-f0-9]:)/\1(0\2/g
SEDCMD-macfix2 = s/(.*Option 82.*\([a-f0-9]{2}):([a-f0-9]:)/\1:0\2/g
SEDCMD-macfix3 = s/(.*Option 82.*\([a-f0-9]{2}:[a-f0-9]{2}):([a-f0-9]:)/\1:0\2/g
SEDCMD-macfix4 = s/(.*Option 82.*\([a-f0-9]{2}:[a-f0-9]{2}:[a-f0-9]{2}):([a-f0-9]:)/\1:0\2/g
SEDCMD-macfix5 = s/(.*Option 82.*\([a-f0-9]{2}:[a-f0-9]{2}:[a-f0-9]{2}:[a-f0-9]{2}):([a-f0-9]:)/\1:0\2/g
SEDCMD-macfix6 = s/(.*Option 82.*\([a-f0-9]{2}:[a-f0-9]{2}:[a-f0-9]{2}:[a-f0-9]{2}:[a-f0-9]{2}):([a-f0-9]\))/\1:0\2/g

I was just hoping someone out there would know some more sed tricks to shorten that.
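One possible way to shorten this is to check the neighbouring characters with lookarounds instead of matching them, so that no colon is consumed and adjacent single-digit octets can all match in one pass. The Python sketch below reproduces the skipped-field problem described above and then shows the lookaround variant. It is an illustration only: Python's `re` stands in for the SEDCMD engine, and the names `CONSUMING`, `LOOKAROUND`, `pad_consuming` and `pad_lookaround` are hypothetical.

```python
import re

# Consuming pattern: both colons are part of the match, so the trailing ":"
# is used up and cannot start the next match.
CONSUMING = re.compile(r":([0-9a-f]):")

# Lookaround pattern: the neighbours are checked but not consumed.
# A single hex digit must be preceded by "(" or ":" and followed by ":" or ")".
LOOKAROUND = re.compile(r"(?<=[(:])([0-9a-f])(?=[:)])")

def pad_consuming(event: str) -> str:
    return CONSUMING.sub(r":0\1:", event)

def pad_lookaround(event: str) -> str:
    return LOOKAROUND.sub(r"0\1", event)

worst_case = "(1:2:3:4:5:6)"
print(pad_consuming(worst_case))    # (1:02:3:04:5:6)
print(pad_lookaround(worst_case))   # (01:02:03:04:05:06)
print(pad_lookaround("(0:5:a6:b:ea:80)"))  # (00:05:a6:0b:ea:80)
```

Assuming Splunk's SEDCMD accepts PCRE-style lookbehind and lookahead (not verified here), the equivalent single setting might look like SEDCMD-macfix = s/(?<=[(:])([0-9a-f])(?=[:)])/0\1/g. Unlike the six commands above, this sketch is not restricted to "Option 82" events or to six-octet addresses, so it would need testing against real events before use.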
