31 May 2016

Top N- queries: using the 12c syntax.

One of the new features with Oracle database 12c is the new syntax for Top N queries and pagination. Did we really need this? Should you choose for the new syntax over the way we used to do it, with an inline view? I think so, it simply adds syntactic clarity to the query, and in this blogpost I will show the difference between the "old" and the "new".

For the examples I will use my all time favourite demo data: the EMP table.

SQL> select ename
  2        ,sal
  3    from emp
  4   order by sal desc
  5  /

ENAME             SAL
---------- ----------
KING             5000
FORD             3000
SCOTT            3000
JONES            2975
BLAKE            2850
CLARK            2450
ALLEN            1600
TURNER           1500
MILLER           1300
WARD             1250
Widlake          1250
ADAMS            1100
JAMES             950
SMITH             800

14 rows selected.
As you can tell from the output above, KING has the highest salary and FORD and SCOTT have the same salary which is the second highest.

If you wanted to write a Top N query before Oracle database 12c, let's say the Top 2 of most earning EMP, you would probably have written something with an inline view.

SQL> select ename
  2        ,sal
  3    from (select ename
  4                ,sal
  5            from emp
  6           order by sal desc
  7         )
  8   where rownum <= 2
  9  /

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
This query might do the job most of the time, but the results might not be what you were looking for. In this case the requirement is "the Top 2 of most earning EMP", SCOTT and FORD should have both been in the results as they have the same salary.
To resolve this, you would have to rewrite your query using an Analytic Ranking Function in the inline view:
SQL> select ename
  2        ,sal
  3    from (select ename
  4                ,sal
  5                ,rank() over (order by sal desc) rn
  6            from emp
  7         )
  8   where rn <= 2
  9  /

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000

Using the new Top N syntax in Oracle database 12c, the query is a lot easier to understand.

SQL> select ename
  2        ,sal
  3    from emp
  4   order by sal desc
  5   fetch first 2 rows with ties
  6  /

ENAME             SAL
---------- ----------
KING             5000
SCOTT            3000
FORD             3000
On line 4 the results are sorted based on the salary (highest on top) and line 5 instructs to get the first two rows. Because of the addition "with ties" both SCOTT and FORD are shown in the results.
To see what happens under the covers, an explain plan is created for this query.
SQL> explain plan for
  2  select ename
  3        ,sal
  4    from emp
  5   order by sal desc
  6   fetch first 2 rows with ties
  7  /

Explained.

SQL> select *
  2    from table (dbms_xplan.display())
  3  /

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3291446077

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    14 |   644 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |      |    14 |   644 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |    14 |   154 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | EMP  |    14 |   154 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rank"<=2)
   2 - filter(RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SAL") DESC )<=2)

16 rows selected.
The output above shows that the Analytic Function RANK is used.

18 May 2016

Rounding amounts, divide cents over multiple lines

In previous articles I wrote about dealing with a missing cent when you need to divide a certain amount over multiple lines. In these articles, links are at the bottom, I described a method to calculate the difference on the last row.
Then a question arose (as a comment):
What if for example i have 42 records and i wish to divide 100 by 42. I would get a rounded value of 2.38. If i multiply this by 42 it would amount to just 99.96. What if want to spread the .04 difference on 4 records, not just the last record. In effect i'll be having 4 records with 2.39. Right now i'm doing this via cursor. Im kinda hoping i can do this using sql or analytic functions
Let's create a table first, called T:
create table t
as
select rownum id
  from dual
 connect by level <= 42
In order to determine the number of rows, we need three pieces of information:
  1. The amount that we need to divide
  2. The total number of rows in the set
  3. The difference between the rounded amount and the amount that we need to divide
 select id
      ,100 amount
      ,count(*) over () entries
      ,round (100 / count(*) over (), 2) rounded
    from t
To calculate how many rows will have the extra cent added or subtracted, the following formula is used.
abs (amount - (entries * rounded)) * 100
When you want the rounding done on the "first" rows, a simple CASE expression can be used to mark the rows
case 
when rownum <= abs ((amount - (entries * rounded)) * 100)
then 'x'
end as indicator
The query now looks like the following, with the first ten rows:
with amounts
as
( select id
      ,100 amount
      ,count(*) over () entries
      ,round (100 / count(*) over (), 2) rounded
    from t
)
select id
      ,amount
      ,entries
      ,rounded
      ,case 
       when rownum <= abs ((amount - (entries * rounded)) * 100)
       then 'x'
       end as indicator
 from amounts
;
 ID     AMOUNT  ENTRIES    ROUNDED F
---------- ---------- ---------- ---------- -
  1   100       42       2.38 x
  2   100       42       2.38 x
  3   100       42       2.38 x
  4   100       42       2.38 x
  5   100       42       2.38
  6   100       42       2.38
  7   100       42       2.38
  8   100       42       2.38
  9   100       42       2.38
 10   100       42       2.38
The last piece of the puzzle is to determine if we need to add or subtract the cent. Using the SIGN function is an easy way to determine this.
sign (amount - (entries * rounded)) as pos_neg
Putting everything together will give you the following query (with the first 10 rows)
with amounts
as
( select id
      ,100 amount
      ,count(*) over () entries
      ,round (100 / count(*) over (), 2) rounded
    from t
)
,indicators as 
(
select id
      ,amount
      ,entries
      ,rounded
      ,case 
       when rownum <= abs ((amount - (entries * rounded)) * 100)
       then 'x'
       end as indicator
      ,sign (amount - (entries * rounded)) as pos_neg
 from amounts
)
select id
      ,rounded +
       (nvl2 (indicator, 0.01, 0) * pos_neg) final_amount
  from indicators
;
 ID FINAL_AMOUNT
---------- ------------
  1    2.39
  2    2.39
  3    2.39
  4    2.39
  5    2.38
  6    2.38
  7    2.38
  8    2.38
  9    2.38
 10    2.38

And there it is. The rounding is divided over the first four rows.


Don't want to use the first rows, but the last rows instead? Use the following expression to set the indicator
case 
when rownum >
 case sign ((amount - (entries * rounded)))
   when -1 then entries - abs ((amount - (entries * rounded)) * 100)
   else entries - (amount - (entries * rounded)) * 100
   end
then 'x'
end as indicator

Links

18 March 2016

Temporal validity, multiple end dates

Recently I got involved in a question on Temporal Validity Periods together with Chris Saxon, one of the askTom-answer team.

The question was along the lines of: "What if I have a single start date but two possible end dates. One of the end dates is filled automatically by a background proces (could be a job) while the other one is to signal that the end date is set manually by the user. Could you use Temporal Validity to get the correct rows?"

My first reaction was: "Yes, you can have multiple validity periods on a single table... but wouldn't it be better to have a single end date and another column to signal if it was set by an automatic proces or by a manual process?". In the latter case you would only need a single validity period.

It got me thinking though... would it be possible to define a single validity period when three (one start and two end dates) are involved?

Let's start by creating a table and populate it with some sample data.

   create table t
   (id              number       primary key
   ,name            varchar2(50) not null
   ,start_date      date         not null
   ,end_date        date
   ,manual_end_date date
   );

The table is fairly straight forward: a surrogate primary key, a name, a start date and two end dates: end_date and manual_end_date.

Next populate the table with some sample data. The name column is used to signal what the row represent, being a row that is current or not (based on sysdate).

   insert into t (id, name, start_date) values (1, 'no end, current', sysdate - 10);
   insert into t (id, name, start_date) values (2, 'no end, not current', sysdate + 10);

   insert into t (id, name, start_date, end_date) values (3, 'current', sysdate - 10, sysdate + 10);
   insert into t (id, name, start_date, end_date) values (4, 'not current, past', sysdate - 10, sysdate -9);
   insert into t (id, name, start_date, end_date) values (5, 'not current, future', sysdate + 10, sysdate +20);

   insert into t (id, name, start_date, manual_end_date) values (6, 'current', sysdate - 10, sysdate + 10);
   insert into t (id, name, start_date, manual_end_date) values (7, 'not current, past', sysdate - 10, sysdate -9);
   insert into t (id, name, start_date, manual_end_date) values (8, 'not current, future', sysdate + 10, sysdate +20);

   commit;

select *
  from t;

        ID NAME                                               START_DAT END_DATE  MANUAL_EN ABSOLUTE_
---------- -------------------------------------------------- --------- --------- --------- ---------
         1 current, no end                                    08-MAR-16
         2 not current, no end                                28-MAR-16
         3 current                                            08-MAR-16 28-MAR-16           28-MAR-16
         4 not current, past                                  08-MAR-16 09-MAR-16           09-MAR-16
         5 not current, future                                28-MAR-16 07-APR-16           07-APR-16
         6 current                                            08-MAR-16           28-MAR-16 28-MAR-16
         7 not current, past                                  08-MAR-16           09-MAR-16 09-MAR-16
         8 not current, future                                28-MAR-16           07-APR-16 07-APR-16

8 rows selected.

Now that the table is populated with the sample data, we can start the experiment. Can you have a temporal validity period based on an expression?

   SQL> alter table t
     2  add period for regular (start_date, coalesce (end_date, manual_end_date))
     3  /
   add period for regular (start_date, coalesce (end_date, manual_end_date))
                                                *
   ERROR at line 2:
   ORA-02000: missing ) keyword

Uhm,... no.
What about moving the expression over to a virtual column and use the virtual column in the temporal validity period?

   SQL> alter table t
     2  add absolute_end_date generated always
     3  as (coalesce (end_date, manual_end_date))
     4  /

   Table altered.


   SQL> alter table t
     2  add period for virt_col (start_date, absolute_end_date)
     3  /

   Table altered.

So far, so good. When playing around with temporal validity periods I ran into a bug by extracting the DDL using DBMS_METADATA, so let's see what Oracle makes of this combination of virtual column and the validity period.

   SQL> set long 9000
SQL> select dbms_metadata.get_ddl ('TABLE', 'T')
  2    from dual
  3  /

DBMS_METADATA.GET_DDL('TABLE','T')
--------------------------------------------------------------------------------

  CREATE TABLE "ALEX"."T"
   ( "ID" NUMBER,
 "NAME" VARCHAR2(50) NOT NULL ENABLE,
 "START_DATE" DATE NOT NULL ENABLE,
 "END_DATE" DATE,
 "MANUAL_END_DATE" DATE,
 "ABSOLUTE_END_DATE" DATE GENERATED ALWAYS AS (COALESCE("END_DATE","MANUAL_END_D
ATE")) VIRTUAL ,
 PERIOD FOR "VIRT_COL"("START_DATE","ABSOLUTE_END_DATE") ,
  PRIMARY KEY ("ID")
  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"  ENABLE
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"

As you can see from the output above, all seems to be just fine.
But now for the real test and query the table using the temporal validity.

   SQL> select *
     2    from t as of period for virt_col sysdate
     3  /

           ID NAME                                               START_DAT END_DATE  MANUAL_EN ABSOLUTE_
   ---------- -------------------------------------------------- --------- --------- --------- ---------
            1 current, no end                                    08-MAR-16
            3 current                                            08-MAR-16 28-MAR-16           28-MAR-16
            6 current                                            08-MAR-16           28-MAR-16 28-MAR-16

That worked! Awesome!.. Virtual Columns with a Temporal validity period, wonderful.

I ran these test on my virtual box, using the pre-built developer appliance, so it might not be the latest patch release or anything.

   SQL> select *
  2    from v$version
  3  /

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE 12.1.0.2.0 Production                                                                0
TNS for Linux: Version 12.1.0.2.0 - Production                                            0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

Links

22 November 2015

APEX 5.1: features shown at DOAG

On the last day of the DOAG conference, in the last time slot Patrick Wolf from the APEX development team did a session on the next release of APEX.
For the developer the most significant change in the Page Designer is the component view. This will be a tab in the center pane instead of a completely separate page.

Most time was spent showing the new Interactive Grid. The Interactive Grid will be a, loosley speaking, combination of an Interactive Report and a Tabular form. The menu from the Interactive Report also gets a make over. The functionality to format the result set (like pivot, group by) will be taken out of the action-menu and will get its own button next to the search bar.

Initially the Interactive Grid will be read only (like the Interactive Report) and can be set to allow data changes. As a developer you can specify which operations are allowed and add authorization to each of the DML. What is really neat is that you can also specify which column can be edited.
From a right click menu in the Interactive Grid several actions can be performed like adding or removing a row.

It is also possible to specify what item type will need to be used when you edit the field.

After this demo, Patrick showed the master-detail functionality, which also allows editing in the master and the detail. He described that only the changes are sent to the database when the save button is pressed.

Of course the new charting engine was also shown and emphasized that not all chart-types available in Oracle JET will be created in APEX through a declaritive wizard.
In case you hadn't heard: the new charting engine for APEX will be using Oracle JET.
What surprised me was that Patrick mentioned that some charts in the JET toolkit don't offer the functionality which is currently provided in AnyChart (the current charting engine) and that in some cases they will continue to use AnyChart for those components. If I remember correctly this was the case for the maps. The European maps aren't as detailed as the APEX team would like them to be, but who knows by the time that APEX 5.1 is ready for release the JET charts might be up to par.

Declarative column group headers, the ability to move columns and creating an overflow report (just like the ones that you know from Oracle Forms) were also briefly demonstrated by Patrick.
All in all, a great last session showing a glimpse into the future of Oracle APEX.
Of course all of what is shown might not be in APEX 5.1, the first slide (after the title slide) was Oracle's safe harbor statement.

16 October 2015

Updating Identity Columns

During my presentation "Oracle 12c for Developers" at the Sloveninan and Croatian User Groups I got the same question twice about Identity Columns:

Is it possible to update an Identity Column?
During the presentation I show how it is not possible to insert a value for a "Generated Always Identity" column.
Let's take a look at an example:
SQL> create table t
  2  (id number  generated as identity
  3  ,name varchar2(35)
  4  );

Table created.

SQL> insert into t(name) values ('hello')
  2  /

1 row created.
   
In the first section the table is created with a "Generated Always Identity" column.
The second part shows that you can insert into the table, as long as the identity column isn't used in the insert statement. Trying to do so will lead to an exception.
SQL> insert into t values (1, 'World')
  2  /
insert into t values (1, 'World')
*
ERROR at line 1:
ORA-32795: cannot insert into a generated always identity column
   
The same is true when you attempt to insert a record type:
SQL> declare
  2     r t%rowtype;
  3  begin
  4     r.id := 42;
  5     r.name := 'world';
  6     insert into t values r;
  7  end;
  8  /
declare
*
ERROR at line 1:
ORA-32795: cannot insert into a generated always identity column
ORA-06512: at line 6
   
Now back to the question, what about an update? Let's try some different methods:
SQL> update t
  2     set id = id
  3  /
   set id = id
       *
ERROR at line 2:
ORA-32796: cannot update a generated always identity column
SQL> update t
  2     set id = id + 1
  3  /
   set id = id + 1
       *
ERROR at line 2:
ORA-32796: cannot update a generated always identity column
SQL> update t
  2     set id = id + 0
  3  /
   set id = id + 0
       *
ERROR at line 2:
ORA-32796: cannot update a generated always identity column
   
In short: updating a "Generated Always Identity" column is not allowed either.
One final remark: an identity column is not the same as a primary key. If you want to use a "Generated Always Identity" column as your primary key, then you will have to specify that explicitly.
SQL> create table t
  2  (id number  generated as identity primary key
  3  ,name varchar2(35)
  4  );

Table created.
   

12 October 2015

Busy October

October is going to be a busy month, and it already started.
The First one is the Slovenian User Group immediately followed by the Kroatian User Group.

For the Sloveninan User Group (SIOUG) I will do the following presentations:

    • Oracle 12c for Developers
      Analytic Functions: Unleash the Power
      APEX, Node.js and HTML5: Magic!
  • On top of that I was asked to join the "Ask the Experts" panel on APEX.

    In Kroatia (for the HrOUG) these presentations are on the agenda:

    • Oracle 12c for Developers
      Analytic Functions: Unleash the Power
  • Next week wednesday Oracle Open World kicks off with the ACE Director Briefing and of course OOW itself.
    On the Usergroup day (Sunday) I will participate in two presentations. First "APEX, Node.js and HTML5: Magic!" and later that day I have a 7-minute session during the EOUC "12 more things about Oracle 12c".
    The last session is going to be a fun one. As the title claims there are going to be more than 12 topics in one session with a lot of excellent speakers.

    Jonathan Lewis Less well-known enhancements of the 12c Optimizer
    Julian Dontcheff Oracle Database 12c In-Memory Advisor
    Gurcan Orhan Adapting DB 12C In-Memory to ODI 12c
    Osama Mustafa How to plugin a non-CDB database to a Container Database (CDB)
    Bjoern Rost How ASM has reached maturity in 12c
    Alex Nuijten JSON in the database
    Brendan Tierney Running R in the Database using Oracle R Enterprise
    Douwe Pieter van den Bos Maximum security architecture
    Christian Antognini Real-time Monitoring of Composite Database Operations
    Martin Widlake 12C - Clustered Data aware TABLE_CACHED_BLOCKS
    Heli Helskyaho Design your 12c Databases using Oracle SQL Dev Data Modeler
    Oded Raz Oracle 12c Privileges Analysis
    Tim Hall
    Alex Zaballa
    Maria Colgan
    Lucas Jemella

    I believe you can still register for this session.
    More Than Another 12 on Oracle Database 12c [UGF3190] Moscone South Room 306 time 13:30 - 15:15

    By the time OOW is done, so is October.

    07 August 2015

    Rounding Amounts, the missing cent: with the MODEL clause

    Rounding amounts may lead to rounding-issues, I have written how this may be resolved in a previous blogpost using some analytic functions.
    Playing around a little bit, I also came up with a method to resolve the rounding difference with the MODEL clause.

    To create an example, first let's create a table with only three records in it.

       SQL> create table t
         2  as
         3  select rownum + 42 id
         4    from dual
         5  connect by level <= 3
         6  ;
    
       Table created.
       SQL> select *
          2    from t
          3  /
    
                    ID
            ----------
                    43
                    44
                    45
    

    In the code below the DIMENSION is the ID from the table. The measures (the "cells" that we can manipulate in the MODEL clause) are used as follows

    Amount The amount that is the result of the division per record
    Total The amount that we want to divide over the records
    Diff Placeholder for the difference between the sum of the rounded amounts and the amount that needs to be divided.
    Indicator We need to know the highest value for the DIMENSION (to add the difference to). Since we can't determine the highest value of the DIMENSION, this MEASURE is used for that.
    When we execute the query we get the following results, except for the DIMENSION column (ID) and the MEASURE Indicator all columns are empty.
          SQL> select *
            2    from t
            3  model
            4  dimension by (id)
            5    measures (cast (null as number) amount
            6    ,cast (null as number) total
            7    ,cast (null as number) diff
            8    ,id indicator
            9    )
           10  rules (
           11  )
           12  /
    
           ID           AMOUNT      TOTAL       DIFF  INDICATOR
          ---------- ---------- ---------- ---------- ----------
                  43                                          43
                  44                                          44
                  45                                          45
    
    It is time to add some rules to the MODEL clause.
    The first rule is to add the amount which needs to be divided.
    total [0] = 100   
    
    This will add a "magic" row to the resultset with Dimension 0 where the measure column total is filled with 100, the amount that we want to divide.
    The reason I call it a "magic" row is, is because it is not in the table and made up. In this row I will store some values that I need to do my calculations and such.
    The second rule is
     indicator [0] = max (indicator) [any]
     
    In this rule I will determine which row is the "last" row, the one with the highest ID.
    Next rule is to do the actual calculation: divide the amount by the number of rows in the resultset. Of course don't count the "magic" row, hence the condition id > 0.
     amount[id > 0] = round (total[0]/(count(*)[id > 0]), 2)
     
    To determine the total of the rounded values, we will use the following rule:
     amount[0] = sum (amount)[id > 0]
     
    This total amount is also placed on the "magic" row.
    Calculating the difference between the amount that we want to divide and the actual divided amount is done in the following rule:
     diff[0] = total[cv()] - amount[cv()]
     
    The difference is added to the "last" row in the last rule:
     amount[indicator[0]]  = amount[cv()] + diff[0]
     
    To see the complete query in action:
       SQL> select *
         2    from t
         3  model
         4  dimension by (id)
         5    measures (cast (null as number) amount
         6    ,cast (null as number) total
         7    ,cast (null as number) diff
         8    ,id indicator
         9    )
        10  rules (
        11    total [0] = 100
        12   ,indicator [0] = max (indicator) [any]
        13   ,amount[id> 0] = round (total[0]/(count(*)[id>0]), 2)
        14   ,amount[0] = sum (amount)[id>0]
        15   ,diff[0] = total[cv()] - amount[cv()]
        16   ,amount[indicator[0]]  = amount[cv()] + diff[0]
        17  )
        18  /
    
           ID     AMOUNT      TOTAL       DIFF  INDICATOR
       ---------- ---------- ---------- ---------- ----------
               43      33.33                               43
               44      33.33                               44
               45      33.34                               45
                0      99.99        100        .01         45
       
    As you can see in the output above the values are rounded (in the AMOUNT column) and the last row takes the difference.
    But also our "magic" row is added to the output, to filter that one out of the resultset simply add a where clause.
          SQL> select id
            2      ,amount
            3    from (select *
            4    from t
            5  model
            6  dimension by (id)
            7    measures (cast (null as number) amount
            8    ,cast (null as number) total
            9    ,cast (null as number) diff
           10    ,id indicator
           11    )
           12  rules (
           13    total [0] = 100
           14   ,indicator [0] = max (indicator) [any]
           15   ,amount[id> 0] = round (total[0]/(count(*)[id>0]), 2)
           16   ,amount[0] = sum (amount)[id>0]
           17   ,diff[0] = total[cv()] - amount[cv()]
           18   ,amount[indicator[0]]  = amount[cv()] + diff[0]
           19  ))
           20  where id> 0 order by id
           21  /
    
                  ID     AMOUNT
          ---------- ----------
                  43      33.33
                  44      33.33
                  45      33.34
          

    Links