All Apps and Add-ons

ROW_NUMBER() works in DBX 2.4.1 to create a rising column for DB input, so why can't I get it to work in DBX 3.1.3?

anthonyfry
Explorer

The SQL editor has the following:-
Select

     srp_ordernum,
         srp_source,
         srp_hof_seqno,
         srp_seqno_in_hand_off,
         srp_entry_type,
         srp_entry_type_desc,
         srp_date,
         srp_initials,
         srp_surname,
         srp_location,
         srp_extn,
         srp_text,
         ROW_NUMBER()
    OVER (ORDER BY srp_date,srp_ordernum,srp_seqno_in_hand_off,srp_hof_seqno) Sequence_no
FROM sword.service_request_progress
WHERE srp_date > '01-Aug-2018'
ORDER BY  srp_date

The SQL works in Batch mode, but when i switch to rising column, I get the error java.sql.SQLException: Invalid column index

DBX 3.1.3 instructs me to use the Sequence_no in the WHERE , but when i do i get the following error:

 java.sql.SQLSyntaxErrorException: ORA-00904: "SEQUENCE_NO": invalid identifier
0 Karma

tmuth_splunk
Splunk Employee
Splunk Employee

row_number() is defined at runtime of the query. If you only do inserts, never updates or deletes, then you can just use srp_date as your rising column.

anthonyfry
Explorer

I had thought about that, but the SRP_DATE is only accurate to the minute and there could be thousands of entries in a minute. This led me to assume that there would be duplications in the ingestion every time the cron triggered. I will give it a go and let you know the outcome

0 Karma

anthonyfry
Explorer

I have just checked the data in SPLUNK for yesterday, unfortunately the DBX input is missing 10-15% of events when I compare to the source table on the SQL server. Any other suggestions ?

0 Karma

tmuth_splunk
Splunk Employee
Splunk Employee

The only way I know of would be to use >= instead of > so you won't miss events. The problem now is that you will have duplicate events. If you have some columns that can identify a unique record, you can just use "dedup" in SPL.

I've thought about this problem...a lot.

jcoates
Communicator

http://docs.splunk.com/Documentation/DBX/3.1.3/DeployDBX/Createandmanagedatabaseinputs#Choose_input_...

Note that timestamps are not ideal for rising columns, though they often are the best available choice. Using a timestamp for rising column can produce the following problem conditions:
A high rate of event generation can cause data duplication or loss, because checkpointing in a stream of timestamp-distinguished records assumes there is never more than one row created in a given time. If you set the time to a one second level of resolution and get five records per second, you lose or duplicate four records from every run.
Clock skew, NTP skew corrections, physical moves between timezones, and daylight savings events can cause data mis-ordering, duplication, or loss. If the skew is towards the future, then the resulting checkpoint value may temporarily or permanently stop data collection.
Non-numeric datetime values cannot be evaluated numerically, and lexical sorting can produce unpredictable results. If time series data is ordered lexically, then the resulting checkpoint value may temporarily or permanently stop data collection.

0 Karma

anthonyfry
Explorer

I have changed my SQL to read:
FROM sword.service_request_progress
WHERE srp_date >= ?
ORDER BY srp_date asc

and performed a dedup in the search

however I am still facing an issue with a large number of missing events. I cannot understand why the Row() function worked on the old version of DBX but not the new, seems a backward step if you cannot add calculated fields to your data at this stage
Any other ideas welcome.

0 Karma

tmuth_splunk
Splunk Employee
Splunk Employee

Where does srp_date get set? Is it a database trigger or in some other app? Can events come into the db out of order?

You can change your order by to be the same as the order by in the row_number() analytic function if that does in fact give a definitive sort order:
ORDER BY srp_date,srp_ordernum,srp_seqno_in_hand_off,srp_hof_seqno
However, I don't see that would materially change the behavior in terms of losing events.

0 Karma
Get Updates on the Splunk Community!

Splunk Training for All: Meet Aspiring Cybersecurity Analyst, Marc Alicea

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...

Observability Highlights | January 2023 Newsletter

 January 2023New Product Releases Splunk Network Explorer for Infrastructure MonitoringSplunk unveils Network ...