Analytic Functions in Oracle 8i and 9i

Analytic Functions in Oracle 8i and 9i

宿題    1、用SQL得出表a和表b的纪录数的和。    2、在表EMP中,根据每个部门,统计每个部门中所有员工的薪水,并列出员工的累计薪水总和。如    ENAME DEPT        SAL       RUNNING_SAL            TOTAL_SAL    CLERK 10       1000       1000                     4000    WARD 10       3000       3000                     4000    SAM 20       2000       2000                     5000    JOHN 20       2000       4000                     5000

JULIA   20              1000              5000                                          5000

1.    select sum(t.n)    from    (    select count(*) as n    from a    union all    select count(*) as n    from b    ) t      2.    select T2.ENAME,T2.DEPTNO as DEPT ,nvl(T2.SAL,0) as SAL,nvl(T1.TOTAL_SAL,0) as TOTAL_SAL    from    (    select deptno as DEPT,sum(sal) as TOTAL_SAL    from scott.emp    group by deptno    ) T1 ,    scott.emp T2    where T2.DEPTNO=T1.DEPT(+)    order by T2.DEPTNO        select ename, deptno,sal, sum(sal) over(partition by deptno order by ename) as running_sal,    sum(sal) over(partition by deptno) as total_sal from scott.emp      select t1.empno,t1.ename,t1.deptno,t1.sal, sum(t2.sal) running_sal,t3.total    from scott.emp t1,scott.emp t2,(select deptno,sum(sal) total from scott.emp group by deptno) t3    where t1.deptno=t2.deptno and t1.empno>=t2.empno and t1.deptno=t3.deptno    group by t1.deptno,t1.empno,t1.ename,t1.sal,t3.total    order by t1.deptno,t1.empno      select t1.deptno,t1.ename,t1.sal,    decode(t1.deptno,(select min(empno) from scott.emp t3 where t3.deptno=t2.deptno),t1.sal,t1.empno>t2.empno,(select sum(t2.sal) from scott.emp t2 where t1.empno>t2.empno and t1.deptno=t2.deptno)+t1.sal))      from scott.emp t1    order by t1.deptno      select T2.ENAME,T2.DEPTNO as DEPT ,nvl(T2.SAL,0) as SAL,   nvl((select sum(t3.sal) from scott.emp t3 where t2.empno<=t3.empno and t2.deptno=t3.deptno),0) as RUNNING_SAL,   nvl(T1.TOTAL_SAL,0) as TOTAL_SAL    from   (   select deptno as DEPT,sum(sal) as TOTAL_SAL   from scott.emp   group by deptno   ) T1 ,   scott.emp T2    where T2.DEPTNO=T1.DEPT(+)    order by T2.DEPTNO asc,T2.EMPNO desc       select ename, sal , deptno ,    (select sum(sal) from emp t2 where t2.deptno=emp.deptno and t2.ename<=emp. empno) running_sal ,   (select sum(sal) from emp t3 where t3.deptno=emp.deptno) total_sal   from emp order by deptno, empno

Contents






   
   

   
   
   
   





Analytic Functions, which have been available since Oracle 8.1.6, are designed to address such problems as "Calculate a running total", "Find percentages within a group", "Top-N queries", "Compute a moving average" and many more. Most of these problems can be solved using standard PL/SQL, however the performance is often not what it should be. Analytic Functions add extensions to the SQL language that not only make these operations easier to code; they make them faster than could be achieved with pure SQL or PL/SQL. These extensions are currently under review by the ANSI SQL committee for inclusion in the SQL specification.

?

Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic clause. For each row, a "sliding" window of rows is defined. The window determines the range of rows used to perform the calculations for the "current row". Window sizes can be based on either a physical number of rows or a logical interval such as time.
Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in the select list or ORDER BY clause.
The Syntax of analytic functions is rather straightforward in appearance
Analytic-Function(<Argument>,<Argument>,...)
OVER (
 <Query-Partition-Clause>
 <Order-By-Clause>
 <Windowing-Clause>
)

Analytic-Function

Specify the name of an analytic function, Oracle actually provides many analytic functions such as AVG, CORR, COVAR_POP, COVAR_SAMP, COUNT, CUME_DIST, DENSE_RANK, FIRST, FIRST_VALUE, LAG, LAST, LAST_VALUE, LEAD, MAX, MIN, NTILE, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC, RANK, RATIO_TO_REPORT, STDDEV, STDDEV_POP, STDDEV_SAMP, SUM, VAR_POP, VAR_SAMP, VARIANCE.

Arguments

Analytic functions take 0 to 3 arguments.

Query-Partition-Clause

The PARTITION BY clause logically breaks a single result set into N groups, according to the criteria set by the partition expressions. The words "partition" and "group" are used synonymously here. The analytic functions are applied to each group independently, they are reset for each group.

Order-By-Clause

The ORDER BY clause specifies how the data is sorted within each group (partition). This will definitely affect the outcome of any analytic function.

Windowing-Clause

The windowing clause gives us a way to define a sliding or anchored window of data, on which the analytic function will operate, within a group. This clause can be used to have the analytic function compute its value based on any arbitrary sliding or anchored window within a group. More information on windows can be found .
This example shows the cumulative salary within a departement row by row, with each row including a summation of the prior rows salary.
set autotrace traceonly explain
break on deptno skip 1
column ename format A6
column deptno format 999
column sal format 99999
column seq format 999
SELECT ename "Ename", deptno "Deptno", sal "Sal",
 SUM(sal)
   OVER (ORDER BY deptno, ename) "Running Total",
 SUM(SAL)
   OVER (PARTITION BY deptno
         ORDER BY ename) "Dept Total",
 ROW_NUMBER()
   OVER (PARTITION BY deptno
         ORDER BY ENAME) "Seq"
FROM emp
ORDER BY deptno, ename
/
Ename  Deptno    Sal Running Total Dept Total  Seq
------ ------ ------ ------------- ---------- ----
CLARK      10   2450          2450       2450    1
KING            5000          7450       7450    2
MILLER          1300          8750       8750    3

ADAMS      20   1100          9850       1100    1
FORD            3000         12850       4100    2
JONES           2975         15825       7075    3
SCOTT           3000         18825      10075    4
SMITH            800         19625      10875    5

ALLEN      30   1600         21225       1600    1
BLAKE           2850         24075       4450    2
JAMES            950         25025       5400    3
MARTIN          1250         26275       6650    4
TURNER          1500         27775       8150    5
WARD            1250         29025       9400    6Execution Plan --------------------------------------------------- 0      SELECT STATEMENT Optimizer=CHOOSE 1    0   WINDOW (SORT) 2    1     TABLE ACCESS (FULL) OF 'EMP' Statistics --------------------------------------------------- 0  recursive calls 0  db block gets 3  consistent gets 0  physical reads 0  redo size 1658  bytes sent via SQL*Net to client 503  bytes received via SQL*Net from client 2  SQL*Net roundtrips to/from client 1  sorts (memory) 0  sorts (disk) 14  rows processed
The example shows how to calculate a "Running Total" for the entire query. This is done using the entire ordered result set, via SUM(sal) OVER (ORDER BY deptno, ename).
Further, we were able to compute a running total within each department, a total that would be reset at the beginning of the next department. The PARTITION BY deptno in that SUM(sal) caused this to happen, a partitioning clause was specified in the query in order to break the data up into groups.
The ROW_NUMBER() function is used to sequentially number the rows returned in each group, according to our ordering criteria (a "Seq" column was added to in order to display this position).
The execution plan shows, that the whole query is very well performed with only 3 consistent gets, this can never be accomplished with standard SQL or even PL/SQL.
How can we get the Top-N records by some set of fields ?
Prior to having access to these analytic functions, questions of this nature were extremely difficult to answer.
There are some problems with Top-N queries however; mostly in the way people phrase them. It is something to be careful about when designing reports. Consider this seemingly sensible request:
I would like the top three paid sales reps by department
The problem with this question is that it is ambiguous. It is ambiguous because of repeated values, there might be four people who all make the same salary, what should we do then ?
Let's look at three examples, all use the well known table EMP.
Sort the sales people by salary from greatest to least. Give the first three rows. If there are less then three people in a department, this will return less than three records.
set autotrace on explain
break on deptno skip 1
SELECT * FROM (
 SELECT deptno, ename, sal, ROW_NUMBER()
 OVER (
   PARTITION BY deptno ORDER BY sal DESC
 ) Top3 FROM emp
)
WHERE Top3 <= 3
/
DEPTNO ENAME             SAL       TOP3
---------- ---------- ---------- ----------
       10 KING             5000          1
          CLARK            2450          2
          MILLER           1300          3

       20 SCOTT            3000          1
          FORD             3000          2
          JONES            2975          3

       30 BLAKE            2850          1
          ALLEN            1600          2
          TURNER           1500          3

9 rows selected.

Execution Plan
--------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE
  1    0   VIEW
  2    1     WINDOW (SORT)
  3    2       TABLE ACCESS (FULL) OF 'EMP'
This query works by sorting each partition (or group, which is the deptno), in a descending order, based on the salary column and then assigning a sequential row number to each row in the group as it is processed. The use of a WHERE clause after doing this to get just the first three rows in each partition.
Give me the set of sales people who make the top 3 salaries - that is, find the set of distinct salary amounts, sort them, take the largest three, and give me everyone who makes one of those values.
SELECT * FROM (
 SELECT deptno, ename, sal,
        DENSE_RANK()
 OVER (
   PARTITION BY deptno ORDER BY sal desc
 ) TopN FROM emp
)
WHERE TopN <= 3
ORDER BY deptno, sal DESC
/

   DEPTNO ENAME             SAL       TOPN
---------- ---------- ---------- ----------
       10 KING             5000          1
          CLARK            2450          2
          MILLER           1300          3

       20 SCOTT            3000          1  <--- !
          FORD             3000          1  <--- !
          JONES            2975          2
          ADAMS            1100          3

       30 BLAKE            2850          1
          ALLEN            1600          2
       30 TURNER           1500          3


10 rows selected.

Execution Plan
--------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE
  1    0   VIEW
  2    1     WINDOW (SORT PUSHED RANK)
  3    2       TABLE ACCESS (FULL) OF 'EMP'
Here the DENSE_RANK function was used to get the top three salaries. We assigned the dense rank to the salary column and sorted it in a descending order.
The DENSE_RANK function computes the rank of a row in an ordered group of rows. The ranks are consecutive integers beginning with 1. The largest rank value is the number of unique values returned by the query. Rank values are not skipped in the event of ties. Rows with equal values for the ranking criteria receive the same rank.
The DENSE_RANK function does not skip numbers and will assign the same number to those rows with the same value. Hence, after the result set is built in the inline view, we can simply select all of the rows with a dense rank of three or less, this gives us everyone who makes the top three salaries by department number.
The windowing clause gives us a way to define a sliding or anchored window of data, on which the analytic function will operate, within a group. The default window is an anchored window that simply starts at the first row of a group an continues to the current row.
We can set up windows based on two criteria: RANGES of data values or ROWS offset from the current row. It can be said, that the existance of an ORDER BY in an analytic function will add a default window clause of RANGE UNBOUNDED PRECEDING. That says to get all rows in our partition that came before us as specified by the ORDER BY clause.
Let's look at an example with a sliding window within a group and compute the sum of the current row's SAL column plus the previous 2 rows in that group. If we need a report that shows the sum of the current employee's salary with the preceding two salaries within a departement, it would look like this.
break on deptno skip 1
column ename format A6
column deptno format 999
column sal format 99999

SELECT deptno "Deptno", ename "Ename", sal "Sal",
 SUM(SAL)
   OVER (PARTITION BY deptno
         ORDER BY ename
         ROWS 2 PRECEDING) "Sliding Total"
FROM emp
ORDER BY deptno, ename
/
Deptno Ename     Sal Sliding Total
------ ------ ------ -------------
   10 CLARK    2450          2450
      KING     5000          7450
      MILLER   1300          8750

   20 ADAMS    1100          1100
      FORD     3000          4100
      JONES    2975          7075  ^
      SCOTT    3000          8975  |
      SMITH     800          6775  /-- Sliding Window

   30 ALLEN    1600          1600
      BLAKE    2850          4450
      JAMES     950          5400
      MARTIN   1250          5050
      TURNER   1500          3700
      WARD     1250          4000
The partition clause makes the SUM (sal) be computed within each department, independent of the other groups. Tthe SUM (sal) is ' reset ' as the department changes. The ORDER BY ENAME clause sorts the data within each department by ENAME; this allows the window clause: ROWS 2 PRECEDING, to access the 2 rows prior to the current row in a group in order to sum the salaries.
For example, if you note the SLIDING TOTAL value for SMITH is 6 7 7 5, which is the sum of 800, 3000, and 2975. That was simply SMITH's row plus the salary from the preceding two rows in the window.
Range windows collect rows together based on a WHERE clause. If I say ' range 5 preceding ' for example, this will generate a sliding window that has the set of all preceding rows in the group such that they are within 5 units of the current row. These units may either be numeric comparisons or date comparisons and it is not valid to use RANGE with datatypes other than numbers and dates.
Example
Count the employees which where hired within the last 100 days preceding the own hiredate. The range window goes back 100 days from the current row's hiredate and then counts the rows within this range. The solution ist to use the following window specification:
COUNT(*) OVER (ORDER BY hiredate ASC RANGE 100 PRECEDING)
column ename heading "Name" format a8
column hiredate heading "Hired" format a10
column hiredate_pre heading "Hired-100" format a10
column cnt heading "Cnt" format 99
SELECT ename, hiredate, hiredate-100 hiredate_pre,
      COUNT(*)
      OVER (
         ORDER BY hiredate ASC
         RANGE 100 PRECEDING
      ) cnt
 FROM emp
ORDER BY hiredate ASC
/
Name     Hired      Hired-100  Cnt
-------- ---------- ---------- ---
SMITH    17-DEC-80  08-SEP-80    1
ALLEN    20-FEB-81  12-NOV-80    2
WARD     22-FEB-81  14-NOV-80    3
JONES    02-APR-81  23-DEC-80    3
BLAKE    01-MAY-81  21-JAN-81    4
CLARK    09-JUN-81  01-MAR-81    3
TURNER   08-SEP-81  31-MAY-81    2
MARTIN   28-SEP-81  20-JUN-81    2
KING     17-NOV-81  09-AUG-81    3
JAMES    03-DEC-81  25-AUG-81    5
FORD     03-DEC-81  25-AUG-81    5
MILLER   23-JAN-82  15-OCT-81    4
SCOTT    09-DEC-82  31-AUG-82    1
ADAMS    12-JAN-83  04-OCT-82    2
We ordered the single partition by hiredate ASC. If we look for example at the row for CLARK we can see that his hiredate was 09-JUN-81, and 100 days prior to that is the date 01-MAR-81. If we look who was hired between 01-MAR-81 and 09-JUN-81, we find JONES (hired: 02-APR-81) and BLAKE (hired: 01-MAY-81). This are 3 rows including the current row, this is what we see in the column "Cnt" of CLARK's row.
As an example, compute the average salary of people hired within 100 days before for each employee. The query looks like this:
column ename heading "Name" format a8
column hiredate heading "Hired" format a10
column hiredate_pre heading "Hired-100" format a10
column avg_sal heading "Avg-100" format 999999
SELECT ename, hiredate, sal,
      AVG(sal)
      OVER (
         ORDER BY hiredate ASC
         RANGE 100 PRECEDING
      ) avg_sal
 FROM emp
ORDER BY hiredate ASC
/
Name     Hired             SAL Avg-100
-------- ---------- ---------- -------
SMITH    17-DEC-80         800     800
ALLEN    20-FEB-81        1600    1200
WARD     22-FEB-81        1250    1217
JONES    02-APR-81        2975    1942
BLAKE    01-MAY-81        2850    2169
CLARK    09-JUN-81        2450    2758
TURNER   08-SEP-81        1500    1975
MARTIN   28-SEP-81        1250    1375
KING     17-NOV-81        5000    2583
JAMES    03-DEC-81         950    2340
FORD     03-DEC-81        3000    2340
MILLER   23-JAN-82        1300    2563
SCOTT    09-DEC-82        3000    3000
ADAMS    12-JAN-83        1100    2050
Look at CLARK again, since we understand his range window within the group. We can see that the average salary of 2758 is equal to (2975+2850+2450)/3. This is the average of the salaries for CLARK and the rows preceding CLARK, those of JONES and BLAKE. The data must be sorted in ascending order.
Row Windows are physical units; physical number of rows, to include in the window. For example you can calculate the average salary of a given record with the (up to 5) employees hired before them or after them as follows:
set numformat 9999
SELECT ename, hiredate, sal,
AVG(sal)
 OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) AvgAsc,
COUNT(*)
 OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) CntAsc,
AVG(sal)
 OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) AvgDes,
COUNT(*)
 OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) CntDes
FROM emp
ORDER BY hiredate
/
ENAME      HIREDATE    SAL AVGASC CNTASC AVGDES CNTDES
---------- --------- ----- ------ ------ ------ ------
SMITH      17-DEC-80   800    800      1   1988      6
ALLEN      20-FEB-81  1600   1200      2   2104      6
WARD       22-FEB-81  1250   1217      3   2046      6
JONES      02-APR-81  2975   1656      4   2671      6
BLAKE      01-MAY-81  2850   1895      5   2675      6
CLARK      09-JUN-81  2450   1988      6   2358      6
TURNER     08-SEP-81  1500   2104      6   2167      6
MARTIN     28-SEP-81  1250   2046      6   2417      6
KING       17-NOV-81  5000   2671      6   2392      6
JAMES      03-DEC-81   950   2333      6   1588      4
FORD       03-DEC-81  3000   2358      6   1870      5
MILLER     23-JAN-82  1300   2167      6   1800      3
SCOTT      09-DEC-82  3000   2417      6   2050      2
ADAMS      12-JAN-83  1100   2392      6   1100      1
The window consist of up to 6 rows, the current row and five rows " in front of " this row, where " in front of " is defined by the ORDER BY clause. With ROW partitions, we do not have the limitation of RANGE partition - the data may be of any type and the order by may include many columns. Notice, that we selected out a COUNT(*) as well. This is useful just to demonstrate how many rows went into making up a given average. We can see clearly that for ALLEN's record, the average salary computation for people hired before him used only 2 records whereas the computation for salaries of people hired after him used 6.
Frequently you want to access data not only from the current row but the current row " in front of " or " behind " them. For example, let's say you need a report that shows, by department all of the employees; their hire date; how many days before was the last hire; how many days after was the next hire.
Using straight SQL this query would be difficult to write. Not only that but its performance would once again definitely be questionable. The approach I typically took in the past was either to " select a select " or write a PL/SQL function that would take some data from the current row and " find " the previous and next rows data. This worked, but introduce large overhead into both the development of the query and the run-time execution of the query.
Using analytic functions, this is easy and efficient to do.
set echo on

column deptno format 99 heading Dep
column ename format a6 heading Ename
column hiredate heading Hired
column last_hire heading LastHired
column days_last heading DaysLast
column next_hire heading NextHire
column days_next heading NextDays

break on deptno skip 1

SELECT deptno, ename, hiredate,
LAG(hiredate,1,NULL)
 OVER (PARTITION BY deptno
       ORDER BY hiredate, ename) last_hire,
hiredate - LAG(hiredate,1,NULL)
 OVER (PARTITION BY deptno
       ORDER BY hiredate, ename) days_last,
LEAD(hiredate,1,NULL)
 OVER (PARTITION BY deptno
       ORDER BY hiredate, ename) next_hire,
LEAD(hiredate,1,NULL)
 OVER (PARTITION BY deptno
       ORDER BY hiredate, ename) - hiredate days_next
FROM emp
ORDER BY deptno, hiredate
/
Dep Ename  Hired     LastHired DaysLast NextHire  NextDays
--- ------ --------- --------- -------- --------- --------
10 CLARK  09-JUN-81                    17-NOV-81      161
   KING   17-NOV-81 09-JUN-81      161 23-JAN-82       67
   MILLER 23-JAN-82 17-NOV-81       67

20 SMITH  17-DEC-80                    02-APR-81      106
   JONES  02-APR-81 17-DEC-80      106 03-DEC-81      245
   FORD   03-DEC-81 02-APR-81      245 09-DEC-82      371
   SCOTT  09-DEC-82 03-DEC-81      371 12-JAN-83       34
   ADAMS  12-JAN-83 09-DEC-82       34

30 ALLEN  20-FEB-81                    22-FEB-81        2
   WARD   22-FEB-81 20-FEB-81        2 01-MAY-81       68
   BLAKE  01-MAY-81 22-FEB-81       68 08-SEP-81      130
   TURNER 08-SEP-81 01-MAY-81      130 28-SEP-81       20
   MARTIN 28-SEP-81 08-SEP-81       20 03-DEC-81       66
   JAMES  03-DEC-81 28-SEP-81       66
The LEAD and LAG routines could be considered a way to " index into your partitioned group ". Using these functions you can access any individual row. Notice for example in the above printout, it shows that the record  for KING  includes the data (in bold red font) from the prior row (LAST HIRE) and the next row (NEXT-HIRE). We can access the fields in records preceding or following the current record in an ordered partition easily.
LAG ( value_expr [, offset] [, default] )
  OVER ( [query_partition_clause] order_by_clause )
LAG provides access to more than one row of a table at the same time without a self join. Given a series of rows returned from a query and a position of the cursor, LAG provides access to a row at a given physical offset prior to that position.
If you do not specify offset, then its default is 1. The optional default value is returned if the offset goes beyond the scope of the window. If you do not specify default, then its default value is null.
The following example provides, for each person in the EMP table, the salary of the employee hired just before:SELECT ename,hiredate,sal, LAG(sal, 1, 0) OVER (ORDER BY hiredate) AS PrevSalFROM emp WHERE job = 'CLERK'; Ename  Hired       SAL PREVSAL ------ --------- ----- ------- SMITH  17-DEC-80   800       0 JAMES  03-DEC-81   950     800 MILLER 23-JAN-82  1300     950 ADAMS  12-JAN-83  1100    1300
LEAD ( value_expr [, offset] [, default] )
  OVER ( [query_partition_clause] order_by_clause )
LEAD provides access to more than one row of a table at the same time without a self join. Given a series of rows returned from a query and a position of the cursor, LEAD provides access to a row at a given physical offset beyond that position.
If you do not specify offset, then its default is 1. The optional default value is returned if the offset goes beyond the scope of the table. If you do not specify default, then its default value is null.
The following example provides, for each employee in the EMP table, the hire date of the employee hired just after:
SELECT ename, hiredate,
LEAD(hiredate, 1)
 OVER (ORDER BY hiredate) AS NextHired
FROM emp WHERE deptno = 30;
Ename  Hired     NEXTHIRED
------ --------- ---------
ALLEN  20-FEB-81 22-FEB-81
WARD   22-FEB-81 01-MAY-81
BLAKE  01-MAY-81 08-SEP-81
TURNER 08-SEP-81 28-SEP-81
MARTIN 28-SEP-81 03-DEC-81
JAMES  03-DEC-81
The FIRST_VALUE and LAST_VALUE functions allow you to select the first and last rows from a group. These rows are especially valuable because they are often used as the baselines in calculations.
Example
The following example selects, for each employee in each department, the name of the employee with the lowest salary.
break on deptno skip 1

SELECT deptno, ename, sal,
 FIRST_VALUE(ename)
 OVER (PARTITION BY deptno
       ORDER BY sal ASC) AS MIN_SAL_HAS
FROM emp
ORDER BY deptno, ename;
DEPTNO ENAME             SAL MIN_SAL_HAS
---------- ---------- ---------- -----------
       10 CLARK            2450 MILLER
          KING             5000 MILLER
          MILLER           1300 MILLER

       20 ADAMS            1100 SMITH
          FORD             3000 SMITH
          JONES            2975 SMITH
          SCOTT            3000 SMITH
          SMITH             800 SMITH

       30 ALLEN            1600 JAMES
          BLAKE            2850 JAMES
          JAMES             950 JAMES
          MARTIN           1250 JAMES
          TURNER           1500 JAMES
          WARD             1250 JAMES
The following example selects, for each employee in each department, the name of the employee with the highest salary.SELECT deptno, ename, sal, FIRST_VALUE(ename)  OVER (PARTITION BY deptno ORDER BY sal DESC) AS MAX_SAL_HAS FROM emp ORDER BY deptno, ename;
DEPTNO ENAME             SAL MAX_SAL_HAS
---------- ---------- ---------- -----_-----
       10 CLARK            2450 KING
          KING             5000 KING
          MILLER           1300 KING

       20 ADAMS            1100 FORD
          FORD             3000 FORD
          JONES            2975 FORD
          SCOTT            3000 FORD
          SMITH             800 FORD

       30 ALLEN            1600 BLAKE
          BLAKE            2850 BLAKE
          JAMES             950 BLAKE
          MARTIN           1250 BLAKE
          TURNER           1500 BLAKE
          WARD             1250 BLAKE
The following example selects, for each employee in department 30 the name of the employee with the lowest salary using an inline view
SELECT deptno, ename, sal,
 FIRST_VALUE(ename)
 OVER (ORDER BY sal ASC) AS MIN_SAL_HAS
FROM (SELECT * FROM emp WHERE deptno = 30)
DEPTNO ENAME             SAL MIN_SAL_HAS
---------- ---------- ---------- -----------
       30 JAMES             950 JAMES
          MARTIN           1250 JAMES
          WARD             1250 JAMES
          TURNER           1500 JAMES
          ALLEN            1600 JAMES
          BLAKE            2850 JAMES
A crosstab query, sometimes known as a pivot query, groups your data in a slightly different way from those we have seen hitherto. A crosstab query can be used to get a result with three rows (one for each project), with each row having three columns (the first listing the projects and then one column for each year) -- like this:
Project        2001        2002
    ID         CHF         CHF
-------------------------------
   100      123.00      234.50
   200      543.00      230.00
   300      238.00      120.50
Example
Let's say you want to show the top 3 salary earners in each department as columns. The query needs to return exactly 1 row per department and the row would have 4 columns. The DEPTNO, the name of the highest paid employee in the department, the name of the next highest paid, and so on. Using analytic functions this almost easy, without analytic functions this was virtually impossible.
SELECT deptno,
 MAX(DECODE(seq,1,ename,null)) first,
 MAX(DECODE(seq,2,ename,null)) second,
 MAX(DECODE(seq,3,ename,null)) third
FROM (SELECT deptno, ename,
      row_number()
      OVER (PARTITION BY deptno
            ORDER BY sal desc NULLS LAST) seq
      FROM emp)
WHERE seq <= 3
GROUP BY deptno
/
DEPTNO FIRST      SECOND     THIRD
---------- ---------- ---------- ----------
       10 KING       CLARK      MILLER
       20 SCOTT      FORD       JONES
       30 BLAKE      ALLEN      TURNER
Note the inner query, that assigned a sequence (RowNr) to each employee by department number in order of salary.
SELECT deptno, ename, sal,
row_number()
OVER (PARTITION BY deptno
      ORDER BY sal desc NULLS LAST) RowNr
FROM emp;
DEPTNO ENAME             SAL      ROWNR
---------- ---------- ---------- ----------
       10 KING             5000          1
       10 CLARK            2450          2
       10 MILLER           1300          3
       20 SCOTT            3000          1
       20 FORD             3000          2
       20 JONES            2975          3
       20 ADAMS            1100          4
       20 SMITH             800          5
       30 BLAKE            2850          1
       30 ALLEN            1600          2
       30 TURNER           1500          3
       30 WARD             1250          4
       30 MARTIN           1250          5
       30 JAMES             950          6
The DECODE in the outer query keeps only rows with sequences 1, 2 or 3 and assigns them to the correct "column". The GROUP BY gets rid of the redundant rows and we are left with our collapsed result. It may be easier to understand if you see the resultset without the aggregate function MAX grouped by deptno.
SELECT deptno,
 DECODE(seq,1,ename,null) first,
 DECODE(seq,2,ename,null) second,
 DECODE(seq,3,ename,null) third
FROM (SELECT deptno, ename,
      row_number()
      OVER (PARTITION BY deptno
            ORDER BY sal desc NULLS LAST) seq
      FROM emp)
WHERE seq <= 3
/
DEPTNO FIRST      SECOND     THIRD
---------- ---------- ---------- ----------
       10 KING
       10            CLARK
       10                       MILLER
       20 SCOTT
       20            FORD
       20                       JONES
       30 BLAKE
       30            ALLEN
       30                       TURNER
The MAX aggregate function will be applied by the GROUP BY column DEPTNO. In any given DEPTNO above only one row will have a non-null value for FIRST, the remaining rows in that group will always be NULL. The MAX function will pick out the non-null row and keep that for us. Hence, the group by and MAX will collapse our resultset, removing the NULL values from it and giving us what we want.
This new set of functionality holds some exiting possibilities. It opens up a whole new way of looking at the data. It will remove a lot of procedural code and complex or inefficient queries that would have taken a long tome to develop, to achieve the same result.
Further articles about Analytic Functions can be found in:

Read more

【MySQL入门·查询进阶】8.8 分页查询

【MySQL入门·查询进阶】8.8 分页查询

【MySQL入门·查询进阶】8.8 分页查询 * * MySQL 分页查询详解 在数据库查询中,分页查询是一种非常常见的操作,尤其是在处理大量数据时。MySQL 提供了多种方式来实现分页查询,其中最为常见的是使用 LIMIT 和 OFFSET 子句。本文将详细解析 MySQL 中的分页查询技术,并探讨其在实际应用中的最佳实践。 一、LIMIT 和 OFFSET 的基本用法 在 MySQL 中,LIMIT 子句用于限制查询结果返回的记录数,而 OFFSET 则用于指定从结果集的哪一条记录开始返回。其基本语法如下: SELECT * FROM table_name LIMIT number_to_return OFFSET starting_position; 或者,更常见的简写形式: SELECT * FROM

By Ne0inhk
【MySQL入门·内置函数】9.1 锁相关

【MySQL入门·内置函数】9.1 锁相关

【MySQL入门·内置函数】9.1 锁相关 * * * * * * * 锁相关 MySQL锁机制是数据库管理系统中的关键部分,它确保了多个事务在并发执行时能够保持数据的一致性和完整性。锁的种类和用法直接影响到数据库的性能和并发能力。下面我们将详细探讨MySQL中的锁机制。 一、锁的基本概念 在MySQL中,锁是用于管理对共享资源的并发访问的一种机制。当一个事务需要访问某个资源时,它会先尝试获取对该资源的锁。如果锁已经被其他事务持有,那么当前事务就需要等待,直到锁被释放。 二、MySQL的锁类型 MySQL支持多种类型的锁,以满足不同场景下的需求。主要的锁类型包括: 1. 共享锁(S锁) 共享锁(Shared Lock,简称S锁)允许多个事务同时读取同一份数据,但不允许任何事务对数据进行修改(写操作)。这种锁类型主要用于提高并发读取的效率,因为在不修改数据的情况下,多个事务可以共享同一份数据资源。当事务需要对数据进行读取时,会先尝试获取共享锁,如果数据已被其他事务加上排他锁(X锁),则当前事务需要等待直到排他锁释放。 2. 排他锁(X

By Ne0inhk
【MySQL入门·内置函数】9.2 JSON函数

【MySQL入门·内置函数】9.2 JSON函数

【MySQL入门·内置函数】9.2 JSON函数 * * * * * * * JSON函数 MySQL数据库管理系统提供了丰富的JSON函数,使得在数据库中直接处理JSON数据变得更加方便和高效。这些函数允许用户在MySQL中查询、修改和验证JSON数据,无需将数据提取到应用程序中进行处理。下面我们将详细解析MySQL中常用的JSON函数及其用法。 一、JSON创建与解析函数 1. JSON_OBJECT() 创建一个JSON对象。 SELECT JSON_OBJECT('name', 'John', 'age', 30); 2. JSON_ARRAY() 创建一个JSON数组。 SELECT JSON_ARRAY(1, 2, 3, 'four'); 3. JSON_PARSE() 将JSON格式的字符串解析为JSON值。 SELECT JSON_

By Ne0inhk
【MySQL入门·内置函数】9.3 窗口函数

【MySQL入门·内置函数】9.3 窗口函数

【MySQL入门·内置函数】9.3 窗口函数 * * MySQL 窗口函数详解 在MySQL中,窗口函数(Window Functions)为数据分析提供了强大的功能,它们允许用户在结果集的每一行上执行计算,这些计算是基于与当前行相关的行的子集(称为窗口)进行的。窗口函数在处理诸如排名、累计、移动平均等复杂查询时非常有用。 MySQL 8.0及更高版本开始支持窗口函数,为数据分析和报表制作提供了极大的便利。下面我们将详细解析MySQL中的窗口函数及其用法。 1. 窗口函数的基本概念 窗口函数通常与OVER()子句一起使用,用于定义窗口的范围和排序方式。窗口可以是整个结果集,也可以是结果集中的一部分行。 2. 窗口函数的基本语法 窗口函数的基本语法如下: SELECT column1, column2, window_function(column3) OVER ( [PARTITION BY partition_expression, ... ] ORDER BY sort_expression [ASC

By Ne0inhk