开发者

Why is this SQL SUM statement correct?

开发者 https://www.devze.com 2023-01-10 14:56 出处:网络
In my schema, I have a table Projects, and a table Tasks.Each project is comprised of tasks.Each task has Hours and PercentComplete.

In my schema, I have a table Projects, and a table Tasks. Each project is comprised of tasks. Each task has Hours and PercentComplete.

Example table:

ProjectID    TaskID    Hours   PercentComplete
1            1         100     50
1            2         120     80

I am trying to get the weighted percentage complete for the project. I am doing this using the following SQL statement:

SELECT P.ProjectID, P.ProjectName, SUM(T.Hours) AS Hours, 
SUM(开发者_StackOverflowT.PercentComplete * T.Hours) / 100 AS CompleteHours, 
SUM(T.PercentComplete * T.Hours) / SUM(T.Hours) AS PercentComplete
FROM Projects AS P INNER JOIN
        Tasks AS T ON T.ProjectID = P.ProjectID
WHERE     (P.ProjectID = 1)

My question is about this part of that statement:

SUM(T.PercentComplete * T.Hours) / SUM(T.Hours) AS PercentComplete

This gives me the correct weighted percentage for this project (in the case of the sample data above, 66%). But I cannot seem to wrap my head around why it does this.

Why does this query work?


  • SUM(T.PercentComplete * T.Hours) / 100 is the number of complete hours.
  • SUM(T.Hours) is the total number of hours.
  • The ratio of these two amounts, i.e.:

    (SUM(T.PercentComplete * T.Hours) / 100) / SUM(T.Hours)
    

    is the proportion of hours complete (it should be between 0 and 1).

  • Multiplying this by 100 gives the percentage.

I prefer to keep percentages like this out of the database and move them to the presentation layer. It would be much easier if the database stored "hours completed" and "hours total" and did not store the percentages at all. The extra factors of 100 in the calculations confuse the issue.


Basically you are finding the number of hours completed over the number of hours total.

SUM(T.PercentComplete * T.Hours) computes the total number of hours that you have completed. (100 * 50) = 50 * 100 + (120 * 80) = 146 * 100 is the numerator. 146 hours have been completed on this job, and we keep a 100 multiplier for the percent (because it is [0-100] instead of [0-1])

Then we find the total number of hours worked, SUM(T.Hours), which is 100 + 120 = 220.

Then dividing, we find the weighted average. (146 * 100) / 220 = 0.663636364 * 100 = 66.4%

Is this what you were wondering about?


It calculates the two sums individually by adding up the value for each row then divides them at the end

SUM(T.PercentComplete * T.Hours)

50*  100 +
80 * 120
-------
14,600

SUM(T.Hours)

100 +
120
---
220

Then the division at the end

14,600 / 220
------------
66.3636

Edit As per HLGEM's comment it will actually return 66 due to integer division.


Aggregate functions, such as SUM(), work against the set of data defined by the GROUP BY clause. So if you group by ProjectID, ProjectName, the functions will break things down by that.


The SUM peratiorn first multiply the columns than add

( 100* 50+ 120* 80)  / (100+ 120)
0

精彩评论

暂无评论...
验证码 换一张
取 消