PROPS Configuration for HTML data source

SplunkDash
Motivator

Hello,

How should I write my props configuration (Time Prefix, Time Format, LINE/EVENT Breaker, etc.) for the following HTML data source? A segment of the HTML data from the source file is provided below. Any help will be highly appreciated. Thank you so much.

<HTML><META HTTP-EQUIV="expires" CONTENT="0">

<HEAD><TITLE></TITLE></HEAD>

<STYLE type=text/css>

td , th { white-space:nowrap;font-family: sans-serif; font-size: 10px }

html,body { height:100% }

.qtw100 td,.qthw100 td { padding:0px;}

.qtw100 { width:100%; }

.qthw100 { width:100%;height:100%; }

.spnode,.nspnode { text-align:center;border-style:inset; }

.spnode { border-left-width:10px;border-bottom-width:10px; }

.hd   { background-color:#FFFFFF;text-align:right; }

.hdw { background-color:#FFFFFF;width:1%; }

.CD0D0D0 { background-color:#D0D0D0; color:#D0D0D0; }

.C00CC00 { background-color:#00CC00; color:#00CC00; }

.CCCCC00 { background-color:#CCCC00; color:#CCCC00; }

.CFFFFFF { background-color:#FFFFFF; color:#FFFFFF; }

.C66FFFF { background-color:#66FFFF; color:#66FFFF; }

.CFF0000 { background-color:#FF0000; color:#FF0000; }

.CFFFF00 { background-color:#FFFF00; color:#FFFF00; }

.C00FF00 { background-color:#00FF00; color:#00FF00; }

.CFF00FF { background-color:#FF00FF; color:#FF00FF; }

.HFFFFFF { background-color:#FFFFFF; text-align:center; }

.H66FFFF { background-color:#66FFFF; text-align:center; }

.HFF0000 { background-color:#FF0000; text-align:center; }

.HFFFF00 { background-color:#FFFF00; text-align:center; }

.H00FF00 { background-color:#00FF00; text-align:center; }

.HFF00FF { background-color:#FF00FF; text-align:center; }

.condtiming { display: none; position: absolute; width: 100% }

.cpu_us { background-color:#00FF00;color:#00FF00;font-size:1px; }

.cpu_ss { background-color:#FF0000;color:#FF0000;font-size:1px; }

.cell_1px { background-color:#FFFFFF;font-size:1px; }

.a_html { background-color:#FFFFFF;color:#FFFFFF;border:1px solid #FFFFFF; }

</STYLE>

<SCRIPT type="text/javascript" language="JavaScript"><!--

function HideDIV(d) { document.getElementById(d).style.display = "none"; }

function ShowDIV(d) { document.getElementById(d).style.display = "block"; }

//--></SCRIPT>

<BODY LINK=BLACK VLINK=BLACK>

<B>SAP </B>&reg;<B> IQ </B>Query Plan<BR>

<B>Query: </B><BR>

<B>Version: </B>16.1.040.1549/14760/P/SP04.08/Sun_Sparc/OS 5.11/64bit/2020-11-24 01:09:36

<P ALIGN=LEFT><B>Query Tree</B>

<TABLE class="qtw100" BORDER=0 CELLSPACING=0 ALIGN=CENTER>

<TR><TD ALIGN=CENTER COLSPAN=3><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>|||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>3,677,556,487,906 rows (est.)</TD></TR></TABLE></TD></TR>

<TR VALIGN=TOP>

  <TD COLSPAN=3 ALIGN=CENTER>

   <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#CCAACC class="nspnode"><A NAME=TREE07><A HREF=#07>#07</A> Root of an UPDATE</TD></TR></TABLE>

  </TD>

</TR>

<TR><TD ALIGN=CENTER COLSPAN=3><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>|||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>3,677,556,487,906 rows (est.)</TD></TR></TABLE></TD></TR>

<TR VALIGN=TOP>

  <TD COLSPAN=3 ALIGN=CENTER>

   <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#AAFFFF class="nspnode"><A NAME=TREE40><A HREF=#40>#40</A> Parallel Combiner (ordered)</TD></TR></TABLE>

  </TD>

</TR>

<TR><TD ALIGN=CENTER COLSPAN=3><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>|||||</TD><TD>&nbsp;</TD><TD BGCOLOR=BLACK>|||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>3,677,556,487,906 rows (est.)</TD></TR></TABLE></TD></TR>

<TR VALIGN=TOP>

  <TD COLSPAN=3 ALIGN=CENTER>

   <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#CCFFFF class="nspnode"><A NAME=TREE135><A HREF=#135>#135</A> Order By</TD></TR></TABLE>

  </TD>

</TR>

<TR><TD ALIGN=CENTER COLSPAN=3><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>|||||</TD><TD>&nbsp;</TD><TD BGCOLOR=BLACK>|||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>3,677,556,487,906 rows (est.)</TD></TR></TABLE></TD></TR>

<TR VALIGN=TOP>

  <TD COLSPAN=3 ALIGN=CENTER>

   <TABLE BORDER CELLSPACING=0 WIDTH=100%><TR><TD BGCOLOR=#CCCCAA class="nspnode"><A NAME=TREE03><A HREF=#03>#03</A> Join (Sort-Merge)</TD></TR></TABLE>

  </TD>

</TR>

<TR VALIGN=TOP>

  <TD ALIGN=CENTER>

   <TABLE class="qtw100" BORDER=0 CELLSPACING=0 ALIGN=CENTER>

    <TR><TD ALIGN=CENTER COLSPAN=1><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>247,522,712 rows (est.)</TD></TR></TABLE></TD></TR>

    <TR VALIGN=TOP>

     <TD COLSPAN=1 ALIGN=CENTER>

      <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#CCFFFF class="nspnode"><A NAME=TREE168><A HREF=#168>#168</A> Order By</TD></TR></TABLE>

     </TD>

    </TR>

    <TR><TD ALIGN=CENTER COLSPAN=1><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>247,522,712 rows (est.)</TD></TR></TABLE></TD></TR>

    <TR VALIGN=TOP>

     <TD COLSPAN=1 ALIGN=CENTER>

      <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#FFCCFF class="nspnode"><A NAME=TREE01><A HREF=#01>#01</A> Leaf &lt;cdwsa.IRDBM_F1095B_17 AS a&gt;</TD></TR></TABLE>

     </TD>

    </TR>

   </TABLE>

  </TD>

  <TD>&nbsp;&nbsp;</TD>

  <TD ALIGN=CENTER>

   <TABLE class="qtw100" BORDER=0 CELLSPACING=0 ALIGN=CENTER>

    <TR><TD ALIGN=CENTER COLSPAN=1><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>193,759,886 rows (est.)</TD></TR></TABLE></TD></TR>

    <TR VALIGN=TOP>

     <TD COLSPAN=1 ALIGN=CENTER>

      <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#CCFFFF class="nspnode"><A NAME=TREE201><A HREF=#201>#201</A> Order By</TD></TR></TABLE>

     </TD>

    </TR>

    <TR><TD ALIGN=CENTER COLSPAN=1><TABLE class="qthw100" BORDER=0 CELLSPACING=0><TR><TD WIDTH=50%></TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD BGCOLOR=BLACK>||||</TD><TD>&nbsp;</TD><TD WIDTH=50%>193,759,886 rows (est.)</TD></TR></TABLE></TD></TR>

    <TR VALIGN=TOP>

     <TD COLSPAN=1 ALIGN=CENTER>

      <TABLE BORDER CELLSPACING=0><TR><TD BGCOLOR=#FFCCFF class="nspnode"><A NAME=TREE02><A HREF=#02>#02</A> Leaf &lt;brlpb.temp_CVR_MONTH_B AS b&gt;</TD></TR></TABLE>

     </TD>

    </TR>

   </TABLE>

  </TD>

</TR>

</TABLE>

<P ALIGN=LEFT><B>Query Text</B>

<TABLE BORDER=1 ALIGN=CENTER CELLPADDING=2 CELLSPACING=0 WIDTH=100%><TR><TD><PRE> <FONT SIZE=-1>update &quot;cdwsa&quot;.&quot;IRDBM_F1095B_17&quot; as &quot;a&quot;

  set &quot;a&quot;.&quot;DEP4_COV_IND_M1&quot; = &quot;b&quot;.&quot;COVERED_IND&quot; from

  &quot;cdwsa&quot;.&quot;IRDBM_F1095B_17&quot; as &quot;a&quot;,&quot;temp_CVR_MONTH_B&quot; as &quot;b&quot;

  where(&quot;a&quot;.&quot;INFO_RETURN_OTH_ENTITY_ID4&quot; = &quot;b&quot;.&quot;INFO_RETURN_OTH_ENTITY_ID&quot;)</FONT></PRE></TD></TR></TABLE><P>

<P ALIGN=LEFT><B>Query Detail</B>

<TABLE BORDER=0 ALIGN=CENTER CELLSPACING=2 CELLPADDING=2>

<TR><TD>

<TABLE BGCOLOR=#CCAACC BORDER=1 CELLSPACING=0>

<TR><TH COLSPAN=2><A NAME=07><A HREF=#TREE07>#07 Root of an UPDATE</A></TH></TR>

<TR><TD><B>Child Node 1</B></TD><TD><A HREF=#40>#40</A></TD></TR>

<TR><TD><B>Estimated Result Rows</B></TD><TD>3,677,556,487,906</TD></TR>

<TR><TD><B>User Name</B></TD><TD>brlpb   (SA connHandle: 12123  SA connID: 35)</TD></TR>

<TR><TD><B>Est. Temp Space Used (Mb)</B></TD><TD>56140712.3</TD></TR>

<TR><TD><B>Requested attributes</B></TD><TD>No Scroll Hold Chained </TD></TR>

<TR><TD><B>Effective Number of Users</B></TD><TD>1</TD></TR>

<TR><TD><B>Number of CPUs</B></TD><TD>32</TD></TR>

<TR><TD><B>Executed on</B></TD><TD>SunOS/mtb1120plcdwstg/5.11/11.3/sun4v</TD></TR>

<TR><TD><B>IQ Main Cache Size (Mb)</B></TD><TD>275000</TD></TR>

<TR><TD><B>IQ Temp Cache Size (Mb)</B></TD><TD>250000</TD></TR>

<TR><TD><B>IQ Large Memory Size (Mb)</B></TD><TD>275000</TD></TR>

<TR><TD><B>Threads used for executing local invariant predicates</B></TD><TD>1</TD></TR>

<TR><TD><B>Number of CPUs (actual)</B></TD><TD>256</TD></TR>

<TR><TD><B>Option CREATE_HG_WITH_EXACT_DISTINCTS</B></TD><TD>OFF</TD></TR>

<TR><TD><B>Option CORE_Options125</B></TD><TD>4096  (default: 0)</TD></TR>

<TR><TD><B>Option Query_Plan_As_HTML</B></TD><TD>ON</TD></TR>

<TR><TD><B>Option Max_Hash_Rows</B></TD><TD>2500000  (default: 30000000)</TD></TR>

<TR><TD><B>Option Max_Temp_Space_Per_Connection</B></TD><TD>3000000  (default: 0)</TD></TR>

<TR><TD><B>Option Infer_Subquery_Predicates</B></TD><TD>OFF</TD></TR>

<TR><TD><B>Option Prefetch_Sort_Percent</B></TD><TD>50  (default: 20)</TD></TR>

<TR><TD><B>Option Ase_Binary_Display</B></TD><TD>ON</TD></TR>

<TR><TD><B>Option String_rtruncation</B></TD><TD>OFF</TD></TR>

<TR><TD><B>Output Vector</B></TD><TD>2 entries (9 data bytes)</TD></TR>

<TR><TD><B>Output 1</B></TD><TD>a._RowId</TD></TR>

<TR><TD><B>Output 1     Data Type</B></TD><TD>unsigned bigint (20, 0)</TD></TR>

<TR><TD><B>Output 1     Base Distincts</B></TD><TD>247,522,712</TD></TR>

<TR><TD><B>Output 1     Note</B></TD><TD>Declared Primary Key</TD></TR>

<TR><TD><B>Output 2</B></TD><TD>b.COVERED_IND</TD></TR>

<TR><TD><B>Output 2     Data Type</B></TD><TD>varchar(1)</TD></TR>

<TR><TD><B>Output 2     Base Distincts</B></TD><TD>3</TD></TR>

</TABLE>

</TD></TR>

<TR><TD>

<TABLE BGCOLOR=#AAFFFF BORDER=1 CELLSPACING=0>

<TR><TH COLSPAN=2><A NAME=40><A HREF=#TREE40>#40 Parallel Combiner (ordered)</A></TH></TR>

<TR><TD><B>Parent Node</B></TD><TD><A HREF=#07>#07</A></TD></TR>

<TR><TD><B>Child Node 1</B></TD><TD><A HREF=#135>#135</A></TD></TR>

<TR><TD><B>Estimated Result Rows</B></TD><TD>3,677,556,487,906</TD></TR>

<TR><TD><B>Max. Possible Parallel Arms</B></TD><TD>32</TD></TR>

<TR><TD><B>Optimization Note</B></TD><TD>Input Ordering Preserved</TD></TR>

<TR><TD><B>Output Vector</B></TD><TD>2 entries (9 data bytes)</TD></TR>

<TR><TD><B>Output 1</B></TD><TD>a._RowId</TD></TR>

<TR><TD><B>Output 1     Data Type</B></TD><TD>unsigned bigint (20, 0)</TD></TR>

<TR><TD><B>Output 1     Base Distincts</B></TD><TD>247,522,712</TD></TR>

<TR><TD><B>Output 1     Note</B></TD><TD>Declared Primary Key</TD></TR>

<TR><TD><B>Output 2</B></TD><TD>b.COVERED_IND</TD></TR>

<TR><TD><B>Output 2     Data Type</B></TD><TD>varchar(1)</TD></TR>

<TR><TD><B>Output 2     Base Distincts</B></TD><TD>3</TD></TR>

</TABLE>

</TD></TR>

<TR><TD>

<TABLE BGCOLOR=#CCFFFF BORDER=1 CELLSPACING=0>

<TR><TH COLSPAN=2><A NAME=135><A HREF=#TREE135>#135 Order By</A></TH></TR>

<TR><TD><B>Parent Node</B></TD><TD><A HREF=#40>#40</A></TD></TR>

<TR><TD><B>Child Node 1</B></TD><TD><A HREF=#03>#03</A></TD></TR>

<TR><TD><B>Estimated Result Rows</B></TD><TD>3,677,556,487,906</TD></TR>

<TR><TD><B>Optimization Note</B></TD><TD>Parallel sort load</TD></TR>

<TR><TD><B>Optimization Note</B></TD><TD>Parallel sort retrieval</TD></TR>

<TR><TD><B>Max. Possible Parallel Arms</B></TD><TD>32</TD></TR>

<TR><TD><B>Metadata Column Count</B></TD><TD>1</TD></TR>

<TR><TD><B>Ordering Expression 1</B></TD><TD>a._RowId`(1)</TD></TR>

<TR><TD><B>Output Vector</B></TD><TD>2 entries (9 data bytes)</TD></TR>

<TR><TD><B>Output 1</B></TD><TD>a._RowId`(1)</TD></TR>

<TR><TD><B>Output 2</B></TD><TD>b.COVERED_IND`(1)</TD></TR>

</TABLE>

</TD></TR>

richgalloway
SplunkTrust

You didn't say what you want to use as a timestamp and I didn't see anything obvious, so I'll assume there is no timestamp in the HTML. 

This is a long post and it's only a portion of an event so the TRUNCATE setting is important to ensure you don't lose data.  Along the same lines, consider stripping out any unneeded parts to save storage and license usage.

[mysourcetype]
DATETIME_CONFIG = CURRENT
LINE_BREAKER = ([\r\n]+)\<HTML>
TRUNCATE = 100000
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)\<HTML>
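
On the point above about stripping unneeded parts: the sample starts with a large <STYLE> block and a <SCRIPT> block that carry no query-plan data. Below is a minimal Python sketch for checking offline how much such a strip would save. The helper name `strip_presentation` and the shortened `SAMPLE` text are made up for illustration, and Python's `re` module is only standing in for Splunk's regex engine here.

```python
import re

# Hypothetical pattern: remove whole <STYLE>...</STYLE> and <SCRIPT>...</SCRIPT>
# blocks, case-insensitively, with "." allowed to span newlines.
STRIP_BLOCKS = re.compile(r'(?is)<(STYLE|SCRIPT)\b.*?</\1>')


def strip_presentation(raw):
    """Return raw with STYLE and SCRIPT blocks removed."""
    return STRIP_BLOCKS.sub('', raw)


# Shortened stand-in for the head of the posted HTML (not the full file).
SAMPLE = (
    '<HTML><HEAD><TITLE></TITLE></HEAD>\n'
    '<STYLE type=text/css>\n'
    'td , th { white-space:nowrap; }\n'
    '</STYLE>\n'
    '<SCRIPT type="text/javascript"><!--\n'
    'function HideDIV(d) { document.getElementById(d).style.display = "none"; }\n'
    '//--></SCRIPT>\n'
    '<BODY LINK=BLACK VLINK=BLACK>\n'
    '<B>Query: </B><BR>\n'
)

cleaned = strip_presentation(SAMPLE)
print(len(SAMPLE), 'characters before,', len(cleaned), 'after')
```

If the result looks right on a real file, a similar pattern could probably be moved into a SEDCMD setting in props.conf. That translation is not shown here and would need testing against the actual data.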
---
If this reply helps you, Karma would be appreciated.

SplunkDash
Motivator

Thank you so much. Yes, you are correct: there is no date/time stamp, and the current time is the right option to use, as you did.

I tried your configuration, but I'm not sure why I am getting just one event. Any help will be highly appreciated. Thank you again.


richgalloway
SplunkTrust

How many events are you expecting?  Is the one event a single HTML element or multiple?  Did you restart the indexer/HF after making the changes to props.conf?


SplunkDash
Motivator

Yes, I restarted. Thank you so much. Yes, there is more than one event in the file, and each event is separated by a TABLE tag. Please feel free to let me know if you have any questions. Thank you again, any help will be highly appreciated.


richgalloway
SplunkTrust

Ah, so it would have been great to mention in the OP that events are separated by <TABLE> rather than <HTML>.  That changes the config a little.

[mysourcetype]
DATETIME_CONFIG = CURRENT
LINE_BREAKER = ([\r\n]+)\<TABLE>
TRUNCATE = 100000
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)\<TABLE>
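
A breaker pattern like this can be sanity-checked outside Splunk before restarting anything. The sketch below is a rough emulation, not Splunk's own code: it assumes the text matched by the first capture group is dropped and that each match starts a new event. The function name `split_events` and the trimmed `SAMPLE` are made up for illustration, and Python's `re` stands in for Splunk's regex engine. In the sample posted in this thread every TABLE tag carries attributes (for example `<TABLE BGCOLOR=#CCAACC ...>`), so the sketch also tries a looser pattern, `<TABLE\b`, which is an assumption and not part of the posted config.

```python
import re


def split_events(raw, line_breaker):
    """Roughly emulate LINE_BREAKER: break at each match, dropping group 1."""
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])  # text before the breaker
        start = match.end(1)                      # next event begins after it
    events.append(raw[start:])
    return [event for event in events if event]


# Trimmed stand-in for the "Query Detail" part of the posted HTML.
SAMPLE = (
    '<P ALIGN=LEFT><B>Query Detail</B>\n'
    '<TABLE BGCOLOR=#CCAACC BORDER=1 CELLSPACING=0>\n'
    '<TR><TH COLSPAN=2>#07 Root of an UPDATE</TH></TR>\n'
    '</TABLE>\n'
    '<TABLE BGCOLOR=#AAFFFF BORDER=1 CELLSPACING=0>\n'
    '<TR><TH COLSPAN=2>#40 Parallel Combiner (ordered)</TH></TR>\n'
    '</TABLE>\n'
)

posted = split_events(SAMPLE, r'([\r\n]+)\<TABLE>')   # pattern as posted
looser = split_events(SAMPLE, r'([\r\n]+)<TABLE\b')   # assumed variant

print('posted pattern:', len(posted), 'event(s)')
print('looser pattern:', len(looser), 'event(s)')
```

On this trimmed sample the posted pattern finds no break point, while the looser one breaks before each line that starts with a TABLE tag. That may be one reason for still seeing a single event, though the real file could differ. Separately, SHOULD_LINEMERGE defaults to true in props.conf, so when LINE_BREAKER alone is meant to define events it is common to also set SHOULD_LINEMERGE = false; that setting is not part of the config posted above.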