Splunk Search

How to convert raw csv data into table format

Splunk_sid
Loves-to-Learn Lots

Hi Team,

We have onboarded csv data into Splunk and each row in csv is ingested into _raw field . I need to bring this back to tabular format and run query against it. Kindly assist.

Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.

Regards,

Sid

Labels (4)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

I assume that each CSV file is ingested as one source.  In that case, a search by source would retrieve all entries in that file.  All you need is to add back headers.  But keep in mind:

  1. You cannot get the original order of headers back.  Headers will be in ASCII order.
  2. You cannot get the original row order back.  Rows will be in matrix ASCII order.
  3. If your original headers contain patterns incompatible with Splunk header standards, those headers will be changed to Splunk headers.

Here, I will use an ordinary CSV as example:

 

c, b, a
1,2,3
4,,5
,6,7
8,,

 

You can use the following

 

index=myindex sourcetype=mysourcetype source=mycsv
| tojson
| appendpipe
    [foreach *
        [eval header = mvappend(header, "<<FIELD>>")]]
| fields header
| where isnotnull(header)
| stats values(header) as header values(_raw) as json
| foreach json mode=multivalue
    [ eval csv = mvappend(csv, "\"" . mvjoin(mvmap(header, if(isnull(spath(<<ITEM>>, header)),"", spath(<<ITEM>>, header))), "\",\"") . "\"")]
| eval csv = mvjoin(header, ",") . "
" . mvjoin(csv, "
")

 

The mock data will give a csv field with the following value:

 

a,b,c
"3","2","1"
"5","","4"
"7","6",""
"","","8"

 

As explained above, much depends on the original CSV.  For example,

  • I took precaution to quote each "cell" even though your original CSV may not use quotation marks.
  • I did not quote headers, even though your original CSV may have used quotation marks.
  • With certain column names and/or cell values, additional coding is needed.

You can play with the following data emulation and compare with real data.

 

| makeresults format=csv data="c, b, a
1,2,3
4,,5
,6,7
8,,"
| foreach *
    [eval _raw = if(isnull(_raw), "", _raw . ",") . if(isnull(<<FIELD>>), "", <<FIELD>>)]
``` the above emulates
index=myindex sourcetype=mysourcetype source=mycsv
```

 

0 Karma

Splunk_sid
Loves-to-Learn Lots

@inventsekar There are multiple csv files from which data gets loaded into Splunk. So, the _raw will have column headers and other rows for each file. All I need is to convert back into rows and columns format just like what we see in csv. "table" command will not serve the purpose for my scenario.

0 Karma

inventsekar
SplunkTrust
SplunkTrust

Hi @Splunk_sid .. We may need more details from your side. 

Your current search query, what table format you are looking for, ...

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
0 Karma

Splunk_sid
Loves-to-Learn Lots

@inventsekar We have onboarded the data coming from csv files in inputs.conf as below and the data is loaded every day as new csv is created with date stamp.

[monitor:<path>/file_*.csv]
disabled = false
sourcetype = <sourcetype>
index=<index>
 
With this config, we are getting the data into splunk and each row in csv is loaded as separate event.
Query:  index=<index> sourcetype=<sourcetype>.
 
All we need is to see the data similar to csv ie. we need to have a single line header and corresponding data for those columns (just how we see when we load the csv from Add inputs via Splunk GUI). 
 
As of now, events are like as shown below and it repeats every day for new csv files
 
6/17/24
3:07:26.000 AM
 
col1,col2,col3,col4,col5,col6  

host = <host>
source =<source>
sourcetype =<sourcetype> 

6/17/24
3:07:26.000 AM
 
data1,data2,data3,data4,data5,data6

host = <host>
source =<source>
sourcetype =<sourcetype> 

 

We need the output in below format when we run query:

col1col2col3col4col5col6
data1data2data3data4data5data6

 

Regards,
Sid
0 Karma

inventsekar
SplunkTrust
SplunkTrust

Hi @Splunk_sid 

>>> Note- We are not supposed to add csv files directly into the Splunk via "Add inputs" option.
so you have onboarded the CSV file or not yet?

 

if you have onboarded the CSV file, then, just use the table command.. 

index=yourCSVindex source=someSource sourcetype=some | table *

 

thanks and best regards,
Sekar

PS - If this or any post helped you in any way, pls consider upvoting, thanks for reading !
0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...