Hi Team,
We have onboarded csv data into Splunk and each row in csv is ingested into _raw field . I need to bring this back to tabular format and run query against it. Kindly assist.
Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.
Regards,
Sid
I assume that each CSV file is ingested as one source. In that case, a search by source would retrieve all entries in that file. All you need is to add back headers. But keep in mind:
Here, I will use an ordinary CSV as example:
c, b, a
1,2,3
4,,5
,6,7
8,,
You can use the following
index=myindex sourcetype=mysourcetype source=mycsv
| tojson
| appendpipe
[foreach *
[eval header = mvappend(header, "<<FIELD>>")]]
| fields header
| where isnotnull(header)
| stats values(header) as header values(_raw) as json
| foreach json mode=multivalue
[ eval csv = mvappend(csv, "\"" . mvjoin(mvmap(header, if(isnull(spath(<<ITEM>>, header)),"", spath(<<ITEM>>, header))), "\",\"") . "\"")]
| eval csv = mvjoin(header, ",") . "
" . mvjoin(csv, "
")
The mock data will give a csv field with the following value:
a,b,c
"3","2","1"
"5","","4"
"7","6",""
"","","8"
As explained above, much depends on the original CSV. For example,
You can play with the following data emulation and compare with real data.
| makeresults format=csv data="c, b, a
1,2,3
4,,5
,6,7
8,,"
| foreach *
[eval _raw = if(isnull(_raw), "", _raw . ",") . if(isnull(<<FIELD>>), "", <<FIELD>>)]
``` the above emulates
index=myindex sourcetype=mysourcetype source=mycsv
```
@inventsekar There are multiple csv files from which data gets loaded into Splunk. So, the _raw will have column headers and other rows for each file. All I need is to convert back into rows and columns format just like what we see in csv. "table" command will not serve the purpose for my scenario.
Hi @Splunk_sid .. We may need more details from your side.
Your current search query, what table format you are looking for, ...
@inventsekar We have onboarded the data coming from csv files in inputs.conf as below and the data is loaded every day as new csv is created with date stamp.
6/17/24 3:07:26.000 AM | col1,col2,col3,col4,col5,col6 host = <host> |
6/17/24 3:07:26.000 AM | data1,data2,data3,data4,data5,data6 host = <host> |
We need the output in below format when we run query:
col1 | col2 | col3 | col4 | col5 | col6 |
data1 | data2 | data3 | data4 | data5 | data6 |
Hi @Splunk_sid
>>> Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.
so you have onboarded the CSV file or not yet?
if you have onboarded the CSV file, then, just use the table command..
index=yourCSVindex source=someSource sourcetype=some | table *