Case study for UNION of daily files

Wednesday 11 May 2011 @ 11:22 am

I recently found a way to mix several concepts and solve a problem that others might be facing. The challenge came from a company who stores several hundred thousand log transactions each day. The way the system is set up, each day’s transactions are stored in a separate SQL table. The table name is the date of the transaction.  What they wanted to do was to run reports that summarize these transactions for an entire month, with subtotals by category. This would require combining a month’s worth of daily tables on the fly, and then getting subtotals by category.  At first I recommended a stored procedure because I didn’t think we could calculate the table names on the fly in a SQL command.  But we aren’t allowed to add SPs to the database.  Then I realized that by adding the report as a subreport we would have the chance to do the calculations we needed within Crystal.

So, my first solution was a 2-step process.   We created a summary report for one day with subtotals by category.  The report used a SQL command and we put a parameter field into the FROM clause where the table name would go.  That allowed us to run any day we wanted by entering the date into that parameter.

Then we created a container report with up to 31 details based on the days in the chosen month. It had a formula that calculated the table names for all of the dates in that month. We put the subreport in at the detail level and linked the formula to the parameter so that the report ran one subreport for each day and generated the category summary rows for each day. We exported this to a spreadsheet, creating a handful of rows for each day in the month, each row showing category totals.  Last we created a second report to read the summary rows in the spreadsheet and combine them to get category totals for the entire month.  It worked great, but the 2-step process wasn’t ideal.

Then I thought through another approach that would eliminate the need for two steps – although it might be stretching SQL to make it work. We could create a single SQL command that had 31 separate select statements, with 30 UNIONs to connect them.  We would then need 31 individual parameter fields, one for each FROM statement. This report would be a subreport and the main report would have 31 formulas to calculate the the 31 daily values and pass them down to the 31 parameters.  You can see why I hesitate.

One issue I see is adjusting the SQL when the month has less than 31 days. I figure in short months I could pass down the table of the 28th day a few extra times to fill up the 31 queries.  Then I would add a separate parameter in the WHERE clause of those last few queries to make the criteria return FALSE for any day that didn’t exist in that month.  So the UNION adds nothing for those dates.

The second issue is the size of the returned data set.   I think the back end we are working on is Sybase so I don’t know how well it will do millions of  records, or maybe we can use a GROUP BY in each of the 31 queries so that the dataset coming back to CR is only daily totals.

Either way I have a plan and the theory seems sound.  I will post if they decide to try it this method.  And if anyone else needs help implementing something like this, give me a call.  That is what I do.

One Response to 'Case study for UNION of daily files'

1. steve053 - May 14th, 2011 at 11:56 pm

My first thought was to run a daily sp to summarized the prior day’s table, and insert this into a second summarized table. Then I read that SP’s weren’t an option.

My second thought was to create a view that calculates the prior 31 days. A crystal report could hit the summarized view and pull the correct number of days based on the month. But if a SP isn’t allowed, a view most likely wouldn’t be either. Plus you would have to run the crystal report on the 1st of the month to ensure the proper calculation.

My last thought was to run a daily crystal report to summarized the prior day and output the summary to a file. A second crystal report could be set up to query the summarized files based on the file name and days of the month. Both crystal reports would run more efficiently than trying to summarize a whole month of transactions.