Getting Data In

Multiple lines in a CSV being combined into a single event

chrishartsock
Path Finder

Hello all,

I am pulling a simple CSV file. It only has two fields: a url and an identification number. For example:
https://google.com, 531
https://amazon.com, 9849

The problem is that Splunk is combining multiple lines into a single event, which I do not want. I suspect this is because the events have no timestamps (I would like _time to simply be set to the index time), so Splunk merges lines while looking for a timestamp. However, I tried to correct this by setting DATETIME_CONFIG = CURRENT, with no luck.

The data is pulled from a file on a Universal Forwarder. It then goes to a Heavy Forwarder, which sends it to our indexers. The config files are as follows:
UF:
inputs.conf:
[monitor://C:\UrlFile.csv]
sourcetype = url
index = security
ignoreOlderThan = 7d
disabled = false

props.conf:
[source::C:\UrlFile.csv]
KV_MODE = none
DATETIME_CONFIG = CURRENT
SHOULD_LINEMERGE = false

Search Head:
props.conf:
[source::C:\UrlFile.csv]
KV_MODE = none
MAX_TIMESTAMP_LOOKAHEAD = 1
REPORT-url = url_extract
SHOULD_LINEMERGE = false

transforms.conf:
[url_extract]
DELIMS = ","
FIELDS = "url", "id"

Any help will be greatly appreciated.

Thanks!

PS: This question is very similar to the question at this link, https://answers.splunk.com/answers/123998/issues-with-multiple-lines-in-csv-file-being-treated-as-a-..., but his data has timestamps whereas mine does not.


niketnilay
Legend

@chrishartsock...

If your CSV file has only URL and ID fields, I would expect it to have one row per URL. This seems more like a lookup candidate to me than data to index. How many rows are in this file, and how frequently does it change? If you add URL and ID as column headings, you can upload UrlFile.csv as a lookup table instead.

Can you try the following props.conf at the Universal forwarder level?

[url_file_csv]
INDEXED_EXTRACTIONS=csv
FIELD_NAMES=url,id
DATETIME_CONFIG=CURRENT
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
CHARSET=AUTO
KV_MODE=none
category=Custom
description=URL and ID Comma-separated value format.
disabled=false
pulldown_type=true
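
One thing to note (an assumption based on the stanza name above): [url_file_csv] is a sourcetype stanza, so the monitor input on the UF would need to assign that sourcetype instead of url for these settings to apply. Something along these lines:

```
[monitor://C:\UrlFile.csv]
sourcetype = url_file_csv
index = security
ignoreOlderThan = 7d
disabled = false
```

This also explains why the props.conf goes on the Universal Forwarder: INDEXED_EXTRACTIONS is applied to structured data at the forwarder, before the events reach the Heavy Forwarder or indexers.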
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"


arkonner
Path Finder

I presently have the same problem, but all I am looking for is to recognize the CR as the start of a new event in files with the structure below, because the field extraction allows only two delimiters [space and comma].

05/25/2018 15:21:55,INFO,10.3.140.197,j.brown,User logged in
05/25/2018 15:29:36,INFO,10.3.7.254,j.smith,User logged in
05/25/2018 15:29:59,INFO,10.3.7.254,j.smith,Temp Token Request
05/25/2018 15:29:59,INFO,,j.smith,Message sent:Backup Token Assigned
05/25/2018 15:33:12,INFO,10.3.7.254,j.smith,User logged in
05/25/2018 17:25:58,INFO,10.3.7.254,j.smith,User logged in
05/25/2018 17:26:23,ERROR,10.3.7.254,j.smith,Smart Token Request
05/25/2018 17:26:23,INFO,,j.smith,Message sent:Smart Token Request Failed
05/25/2018 17:27:10,ERROR,10.3.7.254,j.smith,Smart Token Request
05/25/2018 17:27:10,INFO,,j.smith,Message sent:Smart Token Request Failed
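
Not a definitive answer, but for data shaped like the sample above, a props.conf along these lines (the sourcetype name token_log is an assumption) would break events on each newline and anchor the timestamp at the start of the line:

```
[token_log]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %m/%d/%Y %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
```

MAX_TIMESTAMP_LOOKAHEAD = 19 covers exactly the "05/25/2018 15:21:55" prefix, so timestamp recognition never reaches the comma-separated fields that follow.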



chrishartsock
Path Finder

niketnilay,

This worked beautifully for me.

Thanks!


niketnilay
Legend

@chrishartsock... Glad it worked. Do consider the option to use a lookup instead of indexing the data if:
1) There are not too many rows
2) The data does not update frequently
3) If URL and ID are present in your existing data, you can explore the outputlookup command to perform periodic updates through scheduled searches.
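
For point 3, a scheduled search along these lines could rebuild the lookup periodically (the index, sourcetype, and lookup file names here are assumptions to adapt to your environment):

```
index=security sourcetype=url_file_csv
| dedup url
| table url, id
| outputlookup url_id_lookup.csv
```

Saved on a schedule matching the file's update interval, this keeps the lookup current without searching the raw events each time.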


chrishartsock
Path Finder

@niketnilay
I would love to be able to use a lookup, but it is updated every fifteen minutes and there are around 5,000 new events every hour. However, if there is a better way to do it I am definitely open to suggestions.


gcusello
SplunkTrust
SplunkTrust

Hi chrishartsock,
try configuring your props.conf using the sourcetype instead of the source, and it should work:
props.conf:

[url]
KV_MODE = none
MAX_TIMESTAMP_LOOKAHEAD = 1
REPORT-url = url_extract
SHOULD_LINEMERGE = false

Bye.
Giuseppe


chrishartsock
Path Finder

gcusello,

Thanks for the reply. I changed the props.conf on both the UF and the SH to use the sourcetype, but I am still seeing the issue.

Chris


gcusello
SplunkTrust
SplunkTrust

Hi Chris,
Sorry, but last time I did not notice a mistake in the configuration: delete MAX_TIMESTAMP_LOOKAHEAD = 1.

Anyway, to be sure of your props.conf, take a sample of your UrlFile.csv and follow the web procedure for Data Input [Settings -- Add Data]: it's not important to complete the upload, but you can use it to identify and save the correct props.conf.
You could also upload the file with the candidate props.conf into a test index to verify your configuration on the fly.

When it's OK, remember to copy it to both your indexers and forwarders.

Bye.
Giuseppe
