Hi
I got a file like this:
"No.","time",Header1,Header2,...,Header128
"1","2013/10/18 14:59",Value1,Value2,...,Value128
"2","2013/10/18 15:00",Value1,Value2,...,Value128
"3","2013/10/18 15:01",Value1,Value2,...,Value128
"4","2013/10/18 15:02",Value1,Value2,...,Value128
"5","2013/10/18 15:03",Value1,Value2,...,Value128
"6","2013/10/18 15:04",Value1,Value2,...,Value128
"7","2013/10/18 15:05",Value1,Value2,...,Value128
"8","2013/10/18 15:06",Value1,Value2,...,Value128
"9","2013/10/18 15:07",Value1,Value2,...,Value128
"10","2013/10/18 15:08",Value1,Value2,...,Value128
"11","2013/10/18 15:09",Value1,Value2,...,Value128
"12","2013/10/18 15:10",Value1,Value2,...,Value128
"13","2013/10/18 15:11",Value1,Value2,...,Value128
"14","2013/10/18 15:12",Value1,Value2,...,Value128
"15","2013/10/18 15:13",Value1,Value2,...,Value128
"No.","time",Header129,Header130,...,Header256
"1","2013/10/18 14:59",Value129,Value130,...,Value256
"2","2013/10/18 15:00",Value129,Value130,...,Value256
"3","2013/10/18 15:01",Value129,Value130,...,Value256
"4","2013/10/18 15:02",Value129,Value130,...,Value256
"5","2013/10/18 15:03",Value129,Value130,...,Value256
"6","2013/10/18 15:04",Value129,Value130,...,Value256
"7","2013/10/18 15:05",Value129,Value130,...,Value256
"8","2013/10/18 15:06",Value129,Value130,...,Value256
"9","2013/10/18 15:07",Value129,Value130,...,Value256
"10","2013/10/18 15:08",Value129,Value130,...,Value256
"11","2013/10/18 15:09",Value129,Value130,...,Value256
"12","2013/10/18 15:10",Value129,Value130,...,Value256
"13","2013/10/18 15:11",Value129,Value130,...,Value256
"14","2013/10/18 15:12",Value129,Value130,...,Value256
"15","2013/10/18 15:13",Value129,Value130,...,Value256
etc...
In fact it's a simple CSV with ONE header row and 15 rows of values, but the columns are split into blocks of 128.
Hence I get one header row followed by 15 rows of data.
If the number of headers is greater than 128 (let's take 200 in this example), then I get 128 columns for the first 1+15 rows. After that I get 1 header row followed by 15 data rows that have 200-128=72 columns.
Each header is unique.
I need to get, for each header, the 15 data points of the last 15 minutes into Splunk.
What's the best way to index this?
Thanks a lot for your help,
EDIT 1 :
Headers can look like:
"No.","time","00:00:2B","00:00:2C","00:00:2D","00:00:2E","00:00:2F","00:00:30","00:00:31","00:00:32","00:00:33","00:00:34","00:00:35","00:00:36","00:00:37","00:00:38","00:00:39","00:00:3A","00:00:3B","00:00:8B","00:00:8C","00:00:8D","00:00:8E","00:00:8F","00:00:90","00:00:91","00:00:92","00:00:93","00:00:94","00:00:95","00:00:96","00:00:97","00:00:98","00:00:99","00:00:9A","00:00:9B","00:00:9C","00:00:9D","00:00:9E","00:00:9F","00:00:A0","00:00:A1","00:00:A2","00:00:A3","00:00:A4","00:00:A5","00:00:A6","00:00:A7","00:00:A8","00:00:A9","00:00:AA","00:00:AD","00:00:AE","00:00:AF","00:00:B0","00:00:B1","00:00:B2","00:00:B3","00:00:B4","00:00:B5","00:00:B6","00:00:B7","00:00:B8","00:00:BA","00:00:BB","00:00:BC","00:00:BD","00:00:BE","00:00:BF","00:00:C0","00:00:C1","00:00:C2","00:00:C3","00:00:C4","00:00:C5","00:00:C6","00:00:C7","00:00:C8","00:00:C9","00:00:CA","00:00:CC","00:00:CD","00:00:CE","00:00:CF","00:00:D0","00:00:D1","00:00:D2","00:00:D3","00:00:D4","00:00:D6","00:00:D7","00:00:D8","00:00:D9","00:00:DA","00:00:DB","00:00:DC","00:00:DD","00:00:DE","00:00:DF","00:00:E0","00:00:E1","00:00:E2","00:00:E3","00:00:E4","00:00:E5","00:00:E6","00:00:E7","00:00:E8","00:00:E9","00:00:EA","00:00:EB","00:00:EC","00:00:ED","00:00:EE","00:00:EF","00:00:F0","00:00:F1","00:00:F2","00:00:F3","00:00:F4","00:00:F7","00:00:F8","00:00:FA","00:00:FB","00:00:FC","00:00:FD","00:00:FE","00:00:FF"
Or
"No.","time","CL3-A.01(Zoe01).0001","CL3-A.01(Zoe01).0002","CL3-A.01(Zoe01).0003","CL3-A.01(Zoe01).0004","CL3-A.01(Zoe01).0005","CL3-A.01(Zoe01).0006","CL3-A.01(Zoe01).0007","CL3-A.01(Zoe01).0008","CL3-A.01(Zoe01).0009","CL3-A.01(Zoe01).000A","CL3-A.01(Zoe01).000B","CL3-A.01(Zoe01).000C","CL3-A.01(Zoe01).000D","CL3-A.01(Zoe01).000E","CL3-A.01(Zoe01).000F","CL3-A.01(Zoe01).0010","CL3-A.01(Zoe01).0011","CL3-A.01(Zoe01).0012","CL3-A.01(Zoe01).0013","CL3-A.01(Zoe01).0014","CL3-A.01(Zoe01).0015","CL3-A.01(Zoe01).0016","CL3-A.01(Zoe01).0017","CL3-A.01(Zoe01).0018","CL3-A.01(Zoe01).0019","CL3-A.01(Zoe01).001A","CL3-A.01(Zoe01).001B","CL3-A.01(Zoe01).001C","CL3-A.01(Zoe01).001D","CL3-A.01(Zoe01).001E","CL3-A.01(Zoe01).001F","CL3-A.02...
Hi,
I have exactly the same case as you, with multi-header CSV files. Have you had any success with your configuration?
This is quite important for me, so I would really appreciate your answer 🙂
Thanks in advance
Guilhem
The best way is to either 1) set LINE_BREAKER = ([\r\n]+)"No\.
and set SHOULD_LINEMERGE = false, so each chunk produced by the breaker stays one multi-line event,
or 2) put all 256 columns across the CSV and use props/transforms to do header extraction:
"No.","time",Header1,...,Header256
For 1 above, you will be splitting the incoming text at the header, so each event is 16 rows, including the header row. You might need to play with the REGEX above to get it to split properly. You will need to restart the indexer to get this done.
So now set your LINE_BREAKER as above to split the event out per header line, and then use the indexed extractions csv config.
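In props.conf, option 1 might look something like the stanza below. This is a sketch only: the sourcetype name vspco1 is borrowed from the config quoted elsewhere in this thread, and the TRUNCATE value is an assumption on my part, raised because a 16-row event with 128 columns can easily exceed the 10000-byte default.

```ini
[vspco1]
# Break at every repeated header row so each event is
# 1 header row plus the 15 data rows that follow it
LINE_BREAKER = ([\r\n]+)"No\."
SHOULD_LINEMERGE = false
# 16 rows x 128 columns can exceed the 10000-byte default
TRUNCATE = 100000
```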
I saw this and tried:
[vspco1]
INDEXED_EXTRACTIONS = CSV
HEADER_FIELD_LINE_NUMBER = 1
But obviously Splunk considers only the first line as a header, not the subsequent header lines (the 17th, the 33rd, and so on). I can't figure out whether Splunk can do this with these new params in v6.
So with Splunk 6, it will read the file and look for a header. Check here: http://docs.splunk.com/Documentation/Splunk/6.0/Data/Extractfieldsfromfileheadersatindextime
If you have 5: http://docs.splunk.com/Documentation/Splunk/5.0.5/Data/Extractfieldsfromfileheadersatindextime
Any advice?
No, I don't know anything about the headers in advance. I edited my post to show you.
Are the headers the same values, just different positions?
Yes exactly
So the header is not consistent?
I think I'll use a script to put all the columns across the CSV. My problem is that the headers change all the time, as each one is in fact an ID that I need to associate with the 15 following values. I've seen that Splunk 6 can now index files with headers, but I must stay on 5 for the moment.
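The "script to put all the columns across the CSV" step could be sketched like this. It is a minimal sketch under two assumptions: every block is exactly one header row plus a fixed number of data rows, and the "No." and time columns repeat identically in every block. The merge_split_csv name is hypothetical, not anything from Splunk.

```python
import csv
import io

def merge_split_csv(text, rows_per_block=15):
    """Merge a CSV whose columns were split into repeated
    header+data blocks back into one wide, single-header CSV.

    Each block is 1 header row followed by `rows_per_block` data
    rows; the repeated "No." and "time" columns are the join key.
    """
    rows = list(csv.reader(io.StringIO(text)))
    block = 1 + rows_per_block
    merged_header = []
    merged = {}   # (No., time) -> accumulated wide row
    order = []    # preserve the original row order
    for start in range(0, len(rows), block):
        header = rows[start]
        if not merged_header:
            merged_header = header[:2]      # keep "No.","time" once
        merged_header += header[2:]         # append this block's columns
        for row in rows[start + 1:start + block]:
            key = (row[0], row[1])
            if key not in merged:
                merged[key] = list(row[:2])
                order.append(key)
            merged[key] += row[2:]          # append this block's values
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(merged_header)
    for key in order:
        writer.writerow(merged[key])
    return out.getvalue()
```

Run it on each file before Splunk picks up the output; with a single header row, a plain CSV sourcetype then works on Splunk 5, and INDEXED_EXTRACTIONS = CSV works as-is on 6.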