Getting Data In

How to index only part of an HTML log file?

gnanaraj_mcc
Loves-to-Learn Lots

Hi,
I need to index a log file that is actually an HTML file. It is an access log, and it looks complex to me. The HTML file has several sections; I want only the actual access log part (section 4) to be indexed.

Can someone help me come up with props.conf/transforms.conf files for this?

Section 1: (general information about the log)

XXX XXX Server log

Log file format version 12.20 [debug]
Log started at 03:57:42 Mar 13 2017
DB Time when log started 03:57:42 Mar 13 2017
Server name XXX
Machine name XXX
Machine address XXX
Build Number 3424
Server Node started at 03:57:02 Mar 13 2017

Section 2: Java Properties (in table format). I am not able to reproduce the table here; it is a big table, so I give only a sample.
Java Properties

OS os.name Linux
os.version 2.6.32-642.11.1.el6.x86_64
os.arch amd64
os.home null
DB Oracle.JDBC.version 5.1.x.000150
Mssql.JDBC.version 5.1.x.000085
user.language en

Section 3: Filter section. This has a few text boxes and buttons (to click and filter)

Log Filter

Thread: (text box)
Login: (text box)
IP: (text box)
Method: (text box)

Filter (button) Clear Filter (Button)

Section 4: Actual log file, in a table format

DB Date and Time | ALM Node Date and Time | Thread | Request Type | Login | IP | Level | Method | Message

Mar 13 03:57:48.942 | Mar 13 03:57:49.613 | WrapperSimpleAppMain | N/A | N/A | N/A | ERR | FTPServerInitializerImpl.startFTPServer(145) | FTP server failed to start

Mar 13 04:00:01.692 | Mar 13 04:00:02.363 | newjob:alm.qc.job.automail[939e4baa-c814-40f1-834b-fe46b83336e3] | N/A | N/A | N/A | WRN | AutoMailLogicImpl.getDefaultFromAddress(862)

DalJeanis
Legend

If the event is in HTML, then please post the html with <tags> and all, for us to be able to help you. You will need to mark it as code by using the 101 010 button, or by putting a grave accent (`) before and after the code.

gnanaraj_mcc
Loves-to-Learn Lots

Hi Dal, I have replied to @SloshBurch; is that the same thing you are asking for? Do let me know.
Thank you.

niketn
Legend

@gnanaraj_mcc ... if you select the HTML code and click the code button (the button labeled 101010) while posting, it will not escape any characters.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

sloshburch
Splunk Employee

Yeah, I'm not seeing any HTML 😞
When you say "open from my desktop", what program are you viewing the file in? If it's a web browser, then you need to open it in a text editor (like Notepad or, better yet, Notepad++) to see the file the way Splunk will see it.

Ultimately, we'll need to write some conf file stanzas to make this work. Alternatively, if the source machine is able to log to a plain text file (no HTML markup), then you can save a TON of work.
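
If the application cannot write a plain text log itself, one possible workaround is to flatten the HTML before Splunk reads it. The Python sketch below only illustrates that idea; it is preprocessing outside Splunk, not a props.conf/transforms.conf solution. `flatten_row` is a hypothetical helper, the sample row is a shortened version of one log entry, and it assumes each entry is a single `<tr>` whose cells are `<td>` elements, with `<br>` and `<p>` used as line breaks inside a cell.

```python
import html
import re

CELL_RE = re.compile(r"<td[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)
BREAK_RE = re.compile(r"<\s*(?:br|p)\s*/?\s*>", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def flatten_row(row_html: str, sep: str = " | ") -> str:
    """Turn one <tr>...</tr> fragment into a single line of plain text."""
    cells = []
    for raw in CELL_RE.findall(row_html):
        text = BREAK_RE.sub(" ", raw)      # real <br>/<p> tags become spaces
        text = TAG_RE.sub("", text)        # drop any other real tags
        text = html.unescape(text)         # &lt;br/&gt; becomes <br/>
        text = BREAK_RE.sub(" ", text)     # escaped line breaks become spaces
        cells.append(" ".join(text.split()))  # collapse runs of whitespace
    return sep.join(cells)


# Shortened sample row; the real message column carries a long stack trace.
row = ('<tr bgcolor="yellow"><td>Mar 28<br>14:05:07.513</td>'
       '<td>Mar 28<br>14:05:08.329</td><td>qtp2094206515-455</td>'
       '<td>N/A</td><td>N/A</td><td>N/A</td><td>WRN</td>'
       '<td>CAbsServlet.doPost(378)</td>'
       '<td>Failed setting thread information</td></tr>')
print(flatten_row(row))
```

Each table row comes out as one line with the nine columns separated by ` | `, which is much easier to line-break and timestamp than the raw markup. Tags are stripped before entities are unescaped so that escaped text such as `&lt;init&gt;` in a Java stack trace is kept rather than mistaken for a tag.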

sloshburch
Splunk Employee

Would you clarify whether this is what you see on the webpage? What's on the filesystem that you have access to? How the file looks in a text editor is going to be important.

gnanaraj_mcc
Loves-to-Learn Lots

Thanks @SloshBurch. The details I posted above are from the file as I open it from my desktop. I have posted the actual content of the HTML file below. I can't paste the full content because of the character limit on Splunk Answers submissions.
/qc/data/logs/sa/ and /data/logs/qc/qc are the locations.
FYI, there are two different HTML files, and both need to be indexed. Let me know if any more details are required.

gnanaraj_mcc
Loves-to-Learn Lots
<html>
<style>
     h1 { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 18px; font-weight: normal; color: navy;} 
     h2 { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: bold; color: navy;} 
     tr { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: #000000;} 
     td { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: #000000; border: 0 solid dimgray; border-top-width: 1pt; border-right-width: 1pt;vertical-align:text-top;} 
     hr { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: navy;} 
     body { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: #000000;} 
     table { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: #000000; border: 0 solid dimgray;} 
      td.navy {color: navy;} 
     tr.filter { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: #000000;} 
     td.filter { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12px; font-weight: normal; color: #000000; border: 0 solid dimgray;} 
    </style>
     <script type="text/javascript"> 
        <!-- 
        function JSTrim(p_strToBeTrimmed) 
     { 
         var vChar 
         var vLength 
         var i 
         var vFirstNotSpace 
         var vLastNotSpace 

            vLength = p_strToBeTrimmed.length 
            for (i = 0; i < vLength;i++) 
         { 
             vChar = p_strToBeTrimmed.charAt(i) 
             if (vChar != " ") 
             { 
                 vFirstNotSpace = i 
                 i = vLength 
             } 
         } 
         for (i = vLength-1 ; i>=0;i--) 
         { 
             vChar = p_strToBeTrimmed.charAt(i) 
             if (vChar != " ") 
             { 
                 vLastNotSpace = i 
                 i = -1 
             } 
         } 
         return p_strToBeTrimmed.substring(vFirstNotSpace,vLastNotSpace+1); 
     } 


     function toggle(f_level, f_thread, f_reqType, f_method, f_message, f_login, f_IP){ 

            mybody=document.getElementsByTagName("body").item(0); 
            mytable= mybody.getElementsByTagName("table").item(3); 
            mytablebody=mytable.getElementsByTagName("tbody").item(0); 
            trArray = mytablebody.getElementsByTagName("tr"); 
            numOfRows =mytablebody.getElementsByTagName("tr").length; 

            var levels = "ERR"; 
            if(f_level != "ERR"){ 
                levels+="WRN"; 
                if(f_level != "WRN"){ 
                    levels+="FLW"; 
                    if(f_level != "FLW"){ 
                        levels+="DBG"; 
                    } 
                } 
            } 

            // go over all the row and show/hide them 
            for (i=1;i<numOfRows;i++) { 
                var  tdarr = trArray.item(i).getElementsByTagName("td"); 
                thread   =   tdarr.item(2).childNodes.item(0).data;  
                 reqType  =   tdarr.item(3).childNodes.item(0).data;  
                login    =   tdarr.item(4).childNodes.item(0).data; 
                IP       =   tdarr.item(5).childNodes.item(0).data; 
                logLevel =   tdarr.item(6).childNodes.item(0).data; 
                method   =   tdarr.item(7).childNodes.item(0).data;  
                message  =   tdarr.item(8).childNodes.item(0).data;  

                logLevel = JSTrim(logLevel); 

                if((levels.search(logLevel)   !=-1) && 
                   (thread.search(f_thread)   !=-1) && 
                    (f_reqType.search(reqType) !=-1) && 
                   (login.search(f_login)     !=-1) && 
                   (IP.search(f_IP)           !=-1) && 
                   (method.search(f_method)   !=-1) && 
                   (message.search(f_message) !=-1)){ 
                    trArray.item(i).style.display="inline"; 
                }else{ 
                    trArray.item(i).style.display="none"; 
                } 
            } 
        } 

        function clearFilter(){ 
            document.filterForm.level.selectedIndex = 0; 
             document.filterForm.reqType.selectedIndex = 0; 
            document.filterForm.thread.value=""; 
            document.filterForm.Method.value=""; 
            document.filterForm.Message.value=""; 
            showAll(); 
        } 
        function showAll(){ 
            mybody=document.getElementsByTagName("body").item(0); 
            mytable= mybody.getElementsByTagName("table").item(3); 
            mytablebody=mytable.getElementsByTagName("tbody").item(0); 
            trArray = mytablebody.getElementsByTagName("tr"); 
            numOfRows =mytablebody.getElementsByTagName("tr").length; 
            for (i=1;i<numOfRows;i++) { 
                trArray.item(i).style.display="inline"; 
            } 
        } 
        function filter(){ 
            var w = document.filterForm.level.selectedIndex; 
            var logLevel = document.filterForm.level.options[w].text; 
             var reqIdx   = document.filterForm.reqType.selectedIndex; 
             var reqType  = document.filterForm.reqType.options[reqIdx].value; 
            var thread   = document.filterForm.thread.value; 
            var method   = document.filterForm.Method.value; 
            var message  = document.filterForm.Message.value; 
            var login    = document.filterForm.Login.value; 
            var IP       = document.filterForm.IP.value; 
            toggle(logLevel,JSTrim(thread),JSTrim(reqType),JSTrim(method),JSTrim(message), JSTrim(login), JSTrim(IP)); 
        } 
    --></script> 
<body bgcolor="seashell">
<h2>XXXX Server log</h2><table>
<tr><td class ="filter">Log file format version</td><td class ="filter">12.20 [debug]</td></tr>
<tr><td class ="filter">Log started at</td><td class ="filter">10:08:43 Mar 28 2017</td></tr>
<tr><td class ="filter">DB Time when log started</td><td class ="filter">10:08:43 Mar 28 2017</td></tr>
<tr><td class ="filter">Server name</td><td class ="filter">XXX</td></tr>
<tr><td class ="filter">Machine name</td><td class ="filter">XXX</td></tr>
<tr><td class ="filter">Machine address</td><td class ="filter">XXX</td></tr>
<tr><td class ="filter">Build Number</td><td class ="filter">XXX</td></tr>
<tr><td class ="filter">Server Node started at</td><td class ="filter">10:08:43 Mar 28 2017</td></tr>
</table>
<h2>Java Properties</h2>
<table cellSpacing="0" style="table-layout:fixed;word-break:break-all;border-width:1pt">
<tr><td width="30%"><b>OS</b></td><td> </td></tr>
<tr><td>os.name</td><td>Linux</td></tr>
<tr><td>os.version</td><td>2.6.32-642.11.1.el6.x86_64</td></tr>
<tr><td>os.arch</td><td>amd64</td></tr>
<tr><td>os.home</td><td>null</td></tr>
<tr><td width="30%"><b>DB</b></td><td> </td></tr>
<tr><td>Oracle.JDBC.version</td><td>5.1.x.000150</td></tr>
<tr><td>Mssql.JDBC.version</td><td>5.1.x.000085</td></tr>
<tr><td width="30%"><b>User</b></td><td> </td></tr>
<tr><td>user.name</td><td>almadm</td></tr>
<tr><td>user.home</td><td>/export/home/almadm</td></tr>
<tr><td>user.dir</td><td>/qc/app/HP/ALM/server</td></tr>
<tr><td>user.language</td><td>en</td></tr>
<tr><td width="30%"><b>Java</b></td><td> </td></tr>
<tr><td>java.vm.vendor</td><td>Oracle Corporation</td></tr>
<tr><td>java.version</td><td>1.7.0_51</td></tr>
<tr><td>java.vm.version</td><td>24.51-b03</td></tr>
<tr><td>java.home</td><td>/qc/app/HP/java/jre</td></tr>
<tr><td>java.class.path</td><td>../wrapper/wrapper.jar:../server/lib/annotations/:../server/lib/ext/:../server/lib/jaspi/:../server/lib/jetty-annotations-9.1.4.v20140401.jar:../server/lib/jetty-client-9.1.4.v20140401.jar:../server/lib/jetty-continuation-9.1.4.v20140401.jar:../server/lib/jetty-deploy-9.1.4.v20140401.jar:../server/lib/jetty-http-9.1.4.v20140401.jar:../server/lib/jetty-io-9.1.4.v20140401.jar:../server/lib/jetty-jaas-9.1.4.v20140401.jar:../server/lib/jetty-jaspi-9.1.4.v20140401.jar:../server/lib/jetty-jmx-9.1.4.v20140401.jar:../server/lib/jetty-jndi-9.1.4.v20140401.jar:../server/lib/jetty-plus-9.1.4.v20140401.jar:../server/lib/jetty-proxy-9.1.4.v20140401.jar:../server/lib/jetty-rewrite-9.1.4.v20140401.jar:../server/lib/jetty-schemas-3.1.jar:../server/lib/jetty-security-9.1.4.v20140401.jar:../server/lib/jetty-server-9.1.4.v20140401.jar:../server/lib/jetty-servlet-9.1.4.v20140401.jar:../server/lib/jetty-servlets-9.1.4.v20140401.jar:../server/lib/jetty-util-9.1.4.v20140401.jar:../server/lib/jetty-webapp-9.1.4.v20140401.jar:../server/lib/jetty-xml-9.1.4.v20140401.jar:../server/lib/jndi/:../server/lib/jsp/:../server/lib/launcher-sources.jar:../server/lib/launcher.jar:../server/lib/monitor/:../server/lib/servlet-api-3.1.jar:../server/lib/setuid/:../server/lib/spdy/:../server/lib/websocket/:../server/lib/annotations/asm-4.1.jar:../server/lib/annotations/asm-commons-4.1.jar:../server/lib/annotations/javax.annotation-api-1.2.jar:../server/lib/jndi/javax.activation-1.1.0.v201105071233.jar:../server/lib/jndi/javax.mail.glassfish-1.4.1.v201005082020.jar:../server/lib/jndi/javax.transaction-api-1.2.jar:../server/lib/jsp/javax.el-3.0.0.jar:../server/lib/jsp/javax.servlet.jsp-2.3.2.jar:../server/lib/jsp/javax.servlet.jsp-api-2.3.1.jar:../server/lib/jsp/javax.servlet.jsp.jstl-1.2.0.v201105211821.jar:../server/lib/jsp/javax.servlet.jsp.jstl-1.2.2.jar:../server/lib/jsp/jetty-jsp-jdt-2.3.3.jar:../server/lib/jsp/org.apache.taglibs.standard.glassfish-1.2.0.v201112081803.jar
:../server/lib/jsp/org.eclipse.jdt.core-3.8.2.v20130121.jar:../server/lib/monitor/jetty-monitor-9.1.4.v20140401.jar</td></tr>
<tr><td>java.specification.version</td><td>1.7</td></tr>
<tr><td>java.specification.vendor</td><td>Oracle Corporation</td></tr>
<tr><td>java.specification.name</td><td>Java Platform API Specification</td></tr>
<tr><td>java.vendor.url</td><td>http://java.oracle.com/</td></tr>
<tr><td>java.vm.specification.version</td><td>1.7</td></tr>
<tr><td>java.vm.specification.vendor</td><td>Oracle Corporation</td></tr>
<tr><td>java.vm.specification.name</td><td>Java Virtual Machine Specification</td></tr>
<tr><td>java.class.version</td><td>51.0</td></tr>
<tr><td>java.library.path</td><td>../wrapper</td></tr>
<tr><td>java.io.tmpdir</td><td>/tmp</td></tr>
<tr><td>java.compiler</td><td>null</td></tr>
<tr><td>java.ext.dirs</td><td>/qc/app/HP/java/jre/lib/ext:/usr/java/packages/lib/ext</td></tr>
<tr><td width="30%"><b>Other</b></td><td> </td></tr>
<tr><td>Total memory</td><td>3960MB</td></tr>
<tr><td>Free memory</td><td>3326MB</td></tr>
<tr><td>Max memory to be used</td><td>7919MB</td></tr>
<tr><td>Available Processors</td><td>4</td></tr>
<tr><td>Using config file</td><td>null</td></tr>
</table>

<form NAME ="filterForm">          
<TABLE> 
<tr class ="filter"></TD><B>Log Filter</B><TD></TR> 
<TR class ="filter">         
    <TD class ="filter">Thread:</TD> 
    <TD class ="filter"><INPUT NAME="thread" SIZE=30 TYPE=TEXT  VALUE=""> </TD> 
     <TD class ="filter">Request Type: 
        <SELECT NAME="reqType"> 
             <OPTION VALUE="FREC,REST">ALL</OPTION> 
             <OPTION VALUE="FREC">FREC</OPTION> 
             <OPTION VALUE="REST">REST</OPTION> 
         </SELECT> 
    </TD> 
    <TD class ="filter">Level:</TD> 
    <TD class ="filter"><SELECT NAME="level"> 
            <OPTION VALUE="DBG">DBG</OPTION> 
            <OPTION VALUE="FLW">FLW</OPTION> 
            <OPTION VALUE="WRN">WRN</OPTION> 
            <OPTION VALUE="ERR">ERR</OPTION> 
      </SELECT> 
    </TD> 
</TR> 
<TR class ="filter">         
    <TD class ="filter">Login:</TD> 
    <TD class ="filter"><INPUT NAME="Login" SIZE=30 TYPE=TEXT  VALUE=""> </TD> 
</TR> 
<TR class ="filter">         
    <TD class ="filter">IP:</TD> 
    <TD class ="filter"><INPUT NAME="IP" SIZE=30 TYPE=TEXT  VALUE=""> </TD> 
</TR> 
<TR class ="filter"> 
    <TD class ="filter">Method:</TD> 
    <TD class ="filter"><INPUT NAME="Method" SIZE=30 TYPE=TEXT  VALUE=""></TD> 
</TR> 
<TR class ="filter"> 
    <TD class ="filter">Message:</TD> 
    <TD class ="filter"><INPUT NAME="Message" SIZE=30 TYPE=TEXT  VALUE=""></TD> 
    <TD class ="filter"><BUTTON name="filterB" type="button" onClick="filter()" > Filter </BUTTON>  
        <BUTTON name="clearDilterB" type="button" onClick="clearFilter()">Clear Filter</BUTTON> 
    </TD> 
</TR> 

</TABLE> 
</FORM> 
<table width="100%" cellPadding="4" cellSpacing="0" align="right" style="table-layout:fixed;word-break:break-all;border-width:1pt">
<tr bgcolor="gray">
<td width="7%" style="color: Yellow"><b>DB Date&lt;br/&gt;and Time</b></td>
<td width="7%" style="color: Yellow"><b>ALM Node&lt;br/&gt;Date&lt;br/&gt;and Time</b></td>
<td width="17%" style="color: Yellow"><b>Thread</b></td>
<td width="5%" style="color: Yellow"><b>Request&lt;br/&gt;Type</b></td>
<td width="7%" style="color: Yellow"><b>Login</b></td>
<td width="7%" style="color: Yellow"><b>IP</b></td>
<td width="5%" style="color: Yellow"><b>Level</b></td>
<td width="17%" style="color: Yellow"><b>Method</b></td>
<td width="28%" style="color: Yellow"><b>Message</b></td>
</tr>
<tr bgcolor="yellow"><td>Mar 28<br>14:05:07.513</td><td>Mar 28<br>14:05:08.329</td><td>qtp2094206515-455</td><td>N/A</td><td>N/A</td><td>N/A</td><td>WRN</td><td>CAbsServlet.doPost(378)</td><td>Failed setting thread information<p>com.hp.alm.platform.exception.NoSuchSessionException<p>Messages:<br>The session authentication has failed.;<br><p>Stack Trace:<br>com.hp.alm.platform.exception.NoSuchSessionException: The session authentication has failed.<br>at com.hp.alm.platform.connection.authentication.CLoginSessionDirectory.getItem(CLoginSessionDirectory.java:85)<br>at com.hp.alm.platform.connection.authentication.CLoginSessionDirectory.getItem(CLoginSessionDirectory.java:61)<br>at com.hp.alm.platform.server.web.CAbsServlet.setThreadInfo(CAbsServlet.java:1029)<br>at com.hp.alm.platform.server.web.CAbsServlet.doPost(CAbsServlet.java:376)<br>at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)<br>at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)<br>at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:738)<br>at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1651)&lt;br/&gt;at

sloshburch
Splunk Employee

OK, so this is going to get very messy: HTML carries so much markup that parsing it could prove very fragile. This appears to be from a DB monitoring system. What software produces this? Is it a vendor solution or something written in-house?
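
To give a sense of what pulling only the log table out of this markup involves, here is a rough Python sketch using the standard library's tolerant `html.parser`. It runs outside Splunk and is not a props.conf/transforms.conf answer. The class name `LogTableRows` and the shortened `sample` document are made up for illustration, and it assumes the log rows live in the fourth `<table>` of the file (index 3, the same table the page's own filter script selects) and that tables are not nested.

```python
from html.parser import HTMLParser


class LogTableRows(HTMLParser):
    """Collect the cell text of every row in one table, chosen by index."""

    def __init__(self, table_index: int = 3):
        super().__init__(convert_charrefs=True)
        self.table_index = table_index
        self.tables_seen = 0      # number of <table> start tags so far
        self.in_target = False    # True while inside the chosen table
        self.in_cell = False
        self.rows = []            # list of rows, each a list of cell strings
        self._row = None
        self._cell = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.in_target = (self.tables_seen == self.table_index)
            self.tables_seen += 1
        elif not self.in_target:
            return
        elif tag == "tr":
            self._flush_row()
            self._row = []
        elif tag == "td":
            self._flush_cell()
            self.in_cell = True
        elif tag in ("br", "p") and self.in_cell:
            self._cell.append(" ")   # in-cell line breaks become spaces

    def handle_endtag(self, tag):
        if not self.in_target:
            return
        if tag == "td":
            self._flush_cell()
        elif tag == "tr":
            self._flush_row()
        elif tag == "table":
            self._flush_row()
            self.in_target = False

    def handle_data(self, data):
        if self.in_target and self.in_cell:
            self._cell.append(data)

    def _flush_cell(self):
        if self.in_cell and self._row is not None:
            text = "".join(self._cell).replace("<br/>", " ")
            self._row.append(" ".join(text.split()))
        self._cell = []
        self.in_cell = False

    def _flush_row(self):
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def close(self):
        super().close()
        self._flush_row()   # keep a last row that was cut off mid-file


# Shortened stand-in for the real file; it ends mid-row on purpose.
sample = """<html><body>
<table><tr><td>Server name</td><td>XXX</td></tr></table>
<table><tr><td>os.name</td><td>Linux</td></tr></table>
<form><TABLE><TR><TD>Thread:</TD></TR></TABLE></form>
<table>
<tr bgcolor="gray"><td><b>Level</b></td><td><b>Message</b></td></tr>
<tr bgcolor="yellow"><td>WRN</td><td>Failed setting thread information<p>Stack Trace:<br>at
"""

parser = LogTableRows(table_index=3)
parser.feed(sample)
parser.close()
for row in parser.rows[1:]:   # rows[0] is the header row
    print(" | ".join(row))
```

Counting tables is itself fragile: if the vendor adds or removes a table in a later version, the index has to change, which is part of why a plain text log from the application would be preferable.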

gnanaraj_mcc
Loves-to-Learn Lots

Burch, these are built-in log files from the HP Quality Center application.
