I am ingesting a CSV file from my server. I have tried many props.conf configurations without success. Can anyone help me see what I am doing incorrectly?
Props.conf
[OpsCenter]
INDEXED_EXTRACTRIONS = csv
FIELD_DELIMITER = ,
FIELD_QUOTE = "
DATETIME_CONFIG = CURRENT
FIELD_HEADER_REGEX = ^Report
HEADER_FIELD_LINE_NUMBER = 4
File being ingested:
Daily Status Splunk
Customizable tabular report of all backup data (size, start time, status code etc) by client. In this report, accelerator optimization, savings, or factor is not considered as part of deduplication optimization, savings, or factor. In all other reports, the deduplication values include accelerator values.
Report Time Frame: Previous 24 Hours
Client Name Job Duration Job File Count Throughput (KB/sec) Job Primary ID Policy Name Post Deduplication Size(MB) Job Status Storage Unit Name
cpchiisi04.chi.cintas.com 4:31:00 1,439 106,432 161510 Isilon_SQL_Chi 824,210.78 Successful stu_disk_cpchibk002
cpmasisi04 23:40:57 1,595,462 55,103 952085 Isilon_Common 7,455.01 Successful stu_disk_cpmasbk006
cpmasisi04 24:00:58 1,051,251 101,290 952091 Isilon_Shares 3,440.43 Successful stu_disk_cpmasbk006
cpmasisi04 22:19:14 4,180,140 59,338 952093 Isilon_Shares 1,683.77 Successful stu_disk_cpmasbk006
cpmasisi04 35:13:56 9,271,112 51,439 952095 Isilon_Shares 11,470.67 Successful stu_disk_cpmasbk006
cpmasisi04 9:01:01 0 0 952807 Isilon_SQL 0 Failed UNKNOWN
crbo042.na.cintas.com 1:05:14 31,711,314 128,426 953499 VMDK_NonProd 7,785.33 Successful stu_disk_cpmasbk007
cdmasalc13.na.cintas.com 0:06:15 328,231 636,212 953575 VMDK_NonProd 297.52 Successful stu_disk_cpmasbk007
cpmasisi04 7:04:13 23,488 126,773 953825 Isilon_SQL 1,635,803.66 Successful stu_disk_cpmasbk007
cpmasfs01 1:30:53 53,000 24,894 953915 Creative_Marketing 126,956 Failed cpmasbk006-hcart-robot-tld-0
Report generated on May 20, 2018 8:00:17 AM
I copied your data, added commas, and removed the thousand separators. It looks like this:
Daily Status Splunk
Customizable tabular report of all backup data (size, start time, status code etc) by client. In this report, accelerator optimization, savings, or factor is not considered as part of deduplication optimization, savings, or factor. In all other reports, the deduplication values include accelerator values.
Report Time Frame: Previous 24 Hours
Client Name,Job Duration,Job File Count,Throughput (KB/sec),Job Primary ID,Policy Name,Post Deduplication Size(MB),Job Status,Storage Unit Name
cpchiisi04.chi.cintas.com,4:31:00,1439,106432,161510,Isilon_SQL_Chi,824210.78,Successful,stu_disk_cpchibk002
cpmasisi04,23:40:57,1595462,55103,952085,Isilon_Common,7455.01,Successful,stu_disk_cpmasbk006
cpmasisi04,24:00:58,1051251,101290,952091,Isilon_Shares,3440.43,Successful,stu_disk_cpmasbk006
cpmasisi04,22:19:14,4180140,59338,952093,Isilon_Shares,1683.77,Successful,stu_disk_cpmasbk006
cpmasisi04,35:13:56,9271112,51439,952095,Isilon_Shares,11470.67,Successful,stu_disk_cpmasbk006
cpmasisi04,9:01:01,0,0,952807,Isilon_SQL,0,Failed,UNKNOWN
crbo042.na.cintas.com,1:05:14,31711314,128426,953499,VMDK_NonProd,7785.33,Successful,stu_disk_cpmasbk007
cdmasalc13.na.cintas.com,0:06:15,328231,636212,953575,VMDK_NonProd,297.52,Successful,stu_disk_cpmasbk007
cpmasisi04,7:04:13,23488,126773,953825,Isilon_SQL,1635803.66,Successful,stu_disk_cpmasbk007
cpmasfs01,1:30:53,53000,24894,953915,Creative_Marketing,126956,Failed,cpmasbk006-hcart-robot-tld-0
As you can see, the field names are on line 4.
The following sourcetype configuration lets Splunk index the data:
[sourcetype_answer]
DATETIME_CONFIG = CURRENT
HEADER_FIELD_LINE_NUMBER = 4
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true
Just a few comments:
I hope this answers your question.
Help me understand why this worked. I had tried a config with:
DATETIME_CONFIG = CURRENT
HEADER_FIELD_LINE_NUMBER = 4
INDEXED_EXTRACTIONS = CSV
FIELD_DELIMITER = ,
I am not sure of the order I used, but this didn't work. Is there a specific order you need to follow in props.conf? Any insight into why it may not have worked for me would be wonderful. Thank you so much for your response.
Hi. Can you please share the content of the csv file (first 10 lines), as I did?
Did you take care of the thousand separator?
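If the thousand separators are still in the file, one way to script the clean-up is sketched below. This is only an illustration, not something Splunk does for you: the function name clean_row is made up, and it assumes that each data row is whitespace-separated and that no field contains a space (so it does not apply to the header line, whose field names contain spaces).

```python
import re

# Matches numbers written with thousand separators, e.g. "1,439" or "824,210.78".
THOUSANDS = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?")


def clean_row(line: str) -> str:
    """Turn one whitespace-separated data row into a comma-separated row.

    Assumption: no field in the row contains a space.
    """
    tokens = line.split()
    # Drop the commas only from tokens that look like grouped numbers.
    cleaned = [t.replace(",", "") if THOUSANDS.fullmatch(t) else t for t in tokens]
    return ",".join(cleaned)


if __name__ == "__main__":
    row = ("cpchiisi04.chi.cintas.com 4:31:00 1,439 106,432 161510 "
           "Isilon_SQL_Chi 824,210.78 Successful stu_disk_cpchibk002")
    print(clean_row(row))
```

Run on the first data row of the report, this should give the same comma-separated line as in the cleaned copy of the data.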
I am not able to extract the header fields correctly. It only extracts EXTRA_FIELD_X.
Hello. Can you confirm that the field names are 'Client Name Job Duration Job File Count Throughput (KB/sec) Job Primary ID Policy Name Post Deduplication Size(MB) Job Status Storage Unit Name'? If so, why are there no commas on this line? Also, why not set FIELD_HEADER_REGEX to ^Client?
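To check outside Splunk whether the header line and the data rows agree, something like the following minimal sketch may help. It only uses Python's csv module; the function name check_header and the shortened sample text are made up for illustration. A row with more values than the header has names can be a sign of where fields like EXTRA_FIELD_X come from.

```python
import csv
import io


def check_header(text: str, header_line: int = 4):
    """Return the header fields and the rows whose field count differs.

    header_line is 1-based, like HEADER_FIELD_LINE_NUMBER.
    Mismatches are reported as (line number, number of fields).
    """
    rows = list(csv.reader(io.StringIO(text)))
    header = [name.strip() for name in rows[header_line - 1]]
    mismatches = [
        (number, len(row))
        for number, row in enumerate(rows[header_line:], start=header_line + 1)
        if row and len(row) != len(header)  # skip empty lines
    ]
    return header, mismatches


if __name__ == "__main__":
    # Shortened, illustrative sample; not the real report.
    sample = (
        "Daily Status Splunk\n"
        "Description line\n"
        "Report Time Frame: Previous 24 Hours\n"
        "Client Name,Job Duration,Job Status\n"
        "cpmasisi04,9:01:01,Failed\n"
        "cpmasfs01,1:30:53,Failed,unexpected\n"
    )
    header, mismatches = check_header(sample)
    print(header)
    print(mismatches)
```

If the header comes back as a single field, the header line has no commas in it, which matches the question above.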
I copied it out of Excel, so it sees the values as separate cells rather than comma-separated text. Those are the headers. I thought FIELD_HEADER_REGEX pointed to the last line I wanted removed. I am still learning about props.conf files.