More efficient SQL for returning the last record

Saturday 21 November 2015 @ 4:57 pm

I wrote up a method I called the “wormhole” years ago. Normally when you calculate the maximum of a column you can refer to that maximum in any formula in the report. The wormhole technique takes advantage of this behavior to transport other values from the same row as the maximum value. For instance, I can move several values from the last record of the report and show them at the top of the report. To do this I append the other fields to the original, take the maximum of the combined string and then split the maximum value back into pieces. Recently I realized that this method can also be used in SQL expressions to return multiple values, and in SQL commands to simplify certain queries.

Say you have a table of transactions by customer. You want to generate a list that shows the customer name and three fields from their last order (date, order ID and amount). This is simple to do in Crystal without resorting to SQL, but it requires that Crystal bring back ALL of the orders. You then either sort and display only the group footer, or you do a group selection to select the maximum order date for each group.

But if you only want the maximum record to come back from the database then you need to do the work in SQL. There are several methods to return just the last record. The most common involves running two queries and joining them. The first query would pull a list of ALL of the orders for each customer and would return the four fields. The second query would use a “Group By” on the customer and return only the last or Max() date, for each customer. Then you would inner join these two results on both the Customer ID and the Order Date to have all the fields from the first query but limited to the date from the second query. Here are is an example of this query that can be run as a SQL command in the Xtreme sample database:

SELECT `Customer`.`Customer ID`, `Customer`.`Customer Name`, `Orders`.`Order ID`, 
`Orders`.`Order Amount`, `Orders`.`Order Date`
FROM (`Customer` `Customer` 
INNER JOIN `Orders` `Orders` ON `Customer`.`Customer ID`=`Orders`.`Customer ID`)
INNER JOIN ( 
   SELECT `Customer`.`Customer ID`, max(`Orders`.`Order ID`) as LastOrder
   FROM `Customer` `Customer` INNER JOIN `Orders` `Orders` 
   ON `Customer`.`Customer ID`=`Orders`.`Customer ID`
   Group By `Customer`.`Customer ID`) LastOrd
ON `Orders`.`Order ID`=`LastOrd`.`LastOrder` 
and `Customer`.`Customer ID`=`LastOrd`.`Customer ID`

This works fine, but it has to do two separate queries.  This can slow things down, especially if you have to do this several different times. The alternate approach would be:

SELECT `Customer`.`Customer ID`, `Customer`.`Customer Name`, 
max ( 
Format (`Orders`.`Order Date`, 'yyyy/mm/dd') & '-' & 
format (str(`Orders`.`Order ID`) , '00000') & '-' & 
format (str(`Orders`.`Order Amount`), '000000.00') 
) as MaxStr
FROM (`Customer` `Customer` 
INNER JOIN `Orders` `Orders` ON `Customer`.`Customer ID`=`Orders`.`Customer ID`)
Group by `Customer`.`Customer ID`, `Customer`.`Customer Name`

Lines 2 to 6 calculate the maximum of a combined string for each customer. The string includes the Date, the ID and the Amount of the transaction. The string starts with the date formatted as Year-Month-Day. The maximum of that string will grab the last order, and will also include the values of the other three fields. It returns the same values as the fist query and only takes one pass. The only downside is that you get your values inside a combined string.  To split them apart will take a few extra formulas in Crystal, but you might find this worth the extra effort because the query should run faster.

This technique is especially useful when you do a subquery inside a SQL Expression field. SQL Expression fields can only ever return a single “value”, but with this method that one value can include a combination of several fields from the same row of data.








One Response to 'More efficient SQL for returning the last record'

  1. Ken Hamady - November 26th, 2015 at 11:16 pm

    Adam Butt of APB Reports in Trondheim, Norway just reminded me of another approach for getting the first or last row from a query. It only works in databases that support the Row_Number() function (Oracle/SQL Server). It also allows you to select the TopN rows. Here is his example:

    SELECT *
    FROM
    (
    SELECT
    ROW_NUMBER() OVER(PARTITION BY Customer.CustomerID ORDER BY Orders.Order_Date DESC,Orders.Order_ID DESC, Orders.Order_Amount DESC) AS Row_Sort,
    Customer.Customer_ID,
    Customer.Customer_Name,
    Orders.Order_ID,
    Orders.Order_Amount,
    Orders.Order_Date

    FROM
    Customer
    INNER JOIN Orders ON
    Customer.Customer_ID = Orders.Customer_ID
    )X

    WHERE
    X.Row_Sort = 1
    — OR SELECT TOP 5: X.Row_Sort <= 5, etc.
    — JUST CHANGE THE ORDER BY AND PARTITION RULES
    — IN THE Row_Sort COLUMN


Leave a Reply

Recrystallize Pro