Hello, my searches are suddenly offset by one field: for example, the action field now contains the clientip value, and so on down the line. I do not use IFX (the Interactive Field Extractor) because its regex was occasionally thrown off for a similar reason, so I handle field extractions with a static, field-by-field list in transforms.conf. This has always worked, but today I noticed the offset, which seems to be caused by the single-digit day in the first date field.
My appliance sends SQUID syslog data in the following format. The first set of date fields uses MMM D HH:MM:SS and the second set uses MMM DD HH:MM:SS. For this reason, I believe the white space in the first Day field is being ignored, which offsets the rest of the field extractions.
month,day,systime,host,month,day,systime,format,time,duration,server_ip,uri_host,clientip,action,bytes,method,uri_path,username,hierarchy,content_type
Here is a sample Squid syslog entry (using fake/sanitized data):
Sep 6 12:26:39 192.168.1.68 Sep 06 12:26:39 AN_SQUID_VIP_HOST_LOG 1378484799.674 339 172.16.40.40 www.testdomain.com 90.90.90.90 TCP_MISS/200 45667 GET /jobs/saved?cmd=save&save_job=4431425 - DIRECT/172.16.40.43 -
Below are my props and transforms on the search head. I capitalized the first group of date fields to distinguish them from the second set; only the lowercase fields are used by my Splunk for Squid app. I have heard of using delims="/s" or delims="/t" as a way to handle white space in fields, but that isn't working. Please advise. Thanks in advance!
props.conf
[squid]
SHOULD_LINEMERGE=false
REPORT-squidfields = squid_custom_fields
TIME_FORMAT = %b %e %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=19
KV_MODE = none
transforms.conf
[squid_custom_fields]
DELIMS = " "
FIELDS = Month,Day,Systime,host,month,day,systime,format,time,duration,server_ip,uri_host,clientip,action,bytes,method,uri_path,username,hierarchy,content_type
I don't know why this works, or whether it will permanently resolve my issue, but I fixed it by changing the transforms.conf file as follows.
Before
FIELDS = Month,Day,Systime,host,month,day,systime,format,time,duration,server_ip,uri_host,clientip,action,bytes,method,uri_path,username,hierarchy,content_type
After
FIELDS = Month,Day,Day,Systime,host,month,day,systime,format,time,duration,server_ip,uri_host,clientip,action,bytes,method,uri_path,username,hierarchy,content_type
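One possible explanation for why the extra Day entry helps, sketched in Python rather than Splunk. This is only a model: it assumes the raw event pads the single-digit day with a space (two spaces between "Sep" and "6", which the sample as pasted here may not show) and that DELIMS = " " behaves like a plain split on every single space. The names RAW_PREFIX, BEFORE, AFTER and assign are made up for the illustration.

```python
# Assumption: the raw event pads the single-digit day with a space, so two
# spaces sit between "Sep" and "6". Only the first nine tokens are modelled.
RAW_PREFIX = "Sep  6 12:26:39 192.168.1.68 Sep 06 12:26:39 AN_SQUID_VIP_HOST_LOG"

# Leading field names from the Before and After FIELDS lists.
BEFORE = ["Month", "Day", "Systime", "host", "month", "day", "systime", "format"]
AFTER = ["Month", "Day", "Day", "Systime", "host", "month", "day", "systime", "format"]


def assign(raw, names):
    """Split on every single space and pair the tokens with names in order."""
    return list(zip(names, raw.split(" ")))


if __name__ == "__main__":
    for label, names in (("before", BEFORE), ("after", AFTER)):
        print(label)
        for name, value in assign(RAW_PREFIX, names):
            print(f"  {name} = {value!r}")
```

Under these assumptions the two adjacent spaces produce an empty token. With the Before list that empty token lands in Day and everything after it shifts by one; with the After list the first Day absorbs the empty token and the remaining fields line up again.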
Update: I'm no longer convinced this behavior is caused by white space inside the 2nd field. Rather, the 2nd field appears to be skipped altogether and its value is picked up by the 3rd field. So for the following event, Month = Sep and Systime = 6 (space, then 6), and so on.
Sep 6 12:26:39 192.168.1.68 Sep 06 12:26:39 AN_SQUID_VIP_HOST_LOG 1378484799.674 339 172.16.40.40 www.testdomain.com 90.90.90.90 TCP_MISS/200 45667 GET /jobs/saved?cmd=save&save_job=4431425 - DIRECT/172.16.40.43 -
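If the behavior in the update really comes from a space-padded day, a fixed extra field would presumably shift things the other way once the day has two digits and no padding. A rough Python sketch of a split that treats any run of white space as one delimiter is below. It is an illustration only, not Splunk configuration: FIELDS is the original list, while TAIL, PADDED, TWO_DIGIT and parse are hypothetical names, and the two-digit event is an invented variation of the sample used only to check alignment.

```python
import re

# The original FIELDS list from transforms.conf (no extra Day entry).
FIELDS = [
    "Month", "Day", "Systime", "host", "month", "day", "systime", "format",
    "time", "duration", "server_ip", "uri_host", "clientip", "action",
    "bytes", "method", "uri_path", "username", "hierarchy", "content_type",
]

# Everything after the first day field of the sanitized sample event.
TAIL = ("12:26:39 192.168.1.68 Sep 06 12:26:39 AN_SQUID_VIP_HOST_LOG "
        "1378484799.674 339 172.16.40.40 www.testdomain.com 90.90.90.90 "
        "TCP_MISS/200 45667 GET /jobs/saved?cmd=save&save_job=4431425 - "
        "DIRECT/172.16.40.43 -")

PADDED = "Sep  6 " + TAIL      # assumed: single-digit day padded with a space
TWO_DIGIT = "Sep 16 " + TAIL   # hypothetical two-digit day, no padding


def parse(raw):
    """Split on runs of white space so a padded day yields no empty token."""
    tokens = re.split(r"\s+", raw.strip())
    return dict(zip(FIELDS, tokens))


if __name__ == "__main__":
    for event in (PADDED, TWO_DIGIT):
        fields = parse(event)
        print(fields["Day"], fields["clientip"], fields["action"])
```

In Splunk terms, the comparable idea would be a REGEX-based transform that matches \s+ between fields instead of DELIMS = " ", but I have not tested that against this sourcetype.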