How to create a sql query to return the number of occurrences of sick days.
Where one occurrence is defined as consecutive days out sick. It would also be considered one occurrence if the a weekend fell within that time period or a holiday fell in that time period.
Examples of what is consider one occurrence:
- A person is out sick Monday and Tuesday.
- A person is out sick on a Friday and the following Monday.
- A person is out sick Thursday, Friday is a Holiday, and Monday they are sick.
For these 开发者_如何学运维examples it would be considered three occurrences.
There is a table that contains the sick days (date) (one row for every sick day) and a table that contains the observed holiday dates.
To simplify the tables and fields:
tbl_emp
empid
empname
tbl_sick
empid
sickdate
tbl_holiday
holiday
Logically, this should work, but it probably isn't the most elegant solution.
After creating a bit of sample data, I gather all of the 'suspect' days including known sick days, known holidays and weekends that are adjacent to either a sick day or a holiday. Then I identify the start and end of each group of consecutive days and for each employee count the start dates of ranges that contain a sick day.
/***** SAMPLE DATA *****/
declare @sick table (
empid int,
sick datetime
)
declare @holiday table (
holiday datetime
)
/* Example 1 */
insert into @sick values (1,'2010/01/04'); /* Mon */
insert into @sick values (1,'2010/01/05'); /* Tue */
/* Example 2 */
insert into @sick values (1,'2010/01/15'); /* Fri */
insert into @sick values (1,'2010/01/18'); /* Mon */
/* Example 3 */
insert into @sick values (1,'2010/01/21'); /* Thu */
insert into @holiday values('2010/01/22'); /* Fri */
insert into @sick values (1,'2010/01/25'); /* Mon */
/* Extra Examples */
insert into @sick values (3,'2010/01/08');
insert into @sick values (2,'2010/01/08');
insert into @holiday values ('2010/01/11');
insert into @sick values (3,'2010/01/20');
insert into @sick values (3,'2010/01/21');
/* Extra Holiday */
insert into @holiday values ('2010/02/05');
/***** SAMPLE DATA *****/
/* First a CTE to gather all of the 'suspect' days together
including known sick days, known holidays and weekends
that are adjacent to either a sick day or a holiday */
with suspectdays as (
/* Start with all Sick days */
select
empid,
sick dt,
'sick' [type]
from
@sick
/* Add all Saturdays following a sick Friday */
union
select
empid,
DATEADD(day,1,sick) dt,
'weekend' [type]
from
@sick
where
(DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 6
/* Add all Sundays following a sick Friday */
union
select
empid,
DATEADD(day,2,sick) dt,
'weekend' [type]
from
@sick
where
(DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 6
/* Add all Sundays preceding a sick Monday */
union
select
empid,
DATEADD(day,-1,sick) dt,
'weekend' [type]
from
@sick
where
(DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 2
/* Add all Saturdays preceding a sick Monday */
union
select
empid,
DATEADD(day,-2,sick) dt,
'weekend' [type]
from
@sick
where
(DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 2
/* Add all Holidays */
union
select
empid,
holiday dt,
'holiday' [type]
from
@holiday,
(select distinct empid from @sick) as a
/* Add all Saturdays following a holiday Friday */
union
select
empid,
DATEADD(day,1,holiday) dt,
'weekend' [type]
from
@holiday,
(select distinct empid from @sick) as a
where
(DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 6
/* Add all Sundays following a holiday Friday */
union
select
empid,
DATEADD(day,2,holiday) dt,
'weekend' [type]
from
@holiday,
(select distinct empid from @sick) as a
where
(DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 6
/* Add all Sundays preceding a holiday Monday */
union
select
empid,
DATEADD(day,-1,holiday) dt,
'weekend' [type]
from
@holiday,
(select distinct empid from @sick) as a
where
(DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 2
/* Add all Saturdays preceding a holiday Monday */
union
select
empid,
DATEADD(day,-2,holiday) dt,
'weekend' [type]
from
@holiday,
(select distinct empid from @sick) as a
where
(DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 2
),
/* Now a CTE to identify the start and end of each
group of consecutive days for each employee */
suspectranges as (
select distinct
sd.empid,
( select
max(dt)
from
suspectdays
where
empid = sd.empid and
DATEADD(day,-1,dt) not in (select dt from suspectdays where empid = sd.empid) and
dt <= sd.dt
) rangeStart,
( select
min(dt)
from
suspectdays
where
empid = sd.empid and
DATEADD(day,1,dt) not in (select dt from suspectdays where empid = sd.empid) and
dt >= sd.dt
) rangeEnd
from
suspectdays sd
)
/* For each employee count the start dates of ranges that contain a sick day */
select
empid,
COUNT(rangeStart) SickIncidents
from
suspectranges sr
where
exists (select * from suspectdays where dt between sr.rangeStart and sr.rangeEnd and empid=sr.empid and type='sick')
group by
empid
For the sample data I created, here's the result.
empid SickIncidents
----------- -------------
1 3
2 1
3 2
If i had a little more time, i would try to get a query out for you. But this really reminds me of a sql Challenge that is close enough to your issue.
Different Approaches
Explanation of Approaches
You are probably going to have to end up with a form of recursion or a tricky joining method to figure out if a sequence of days are what you prescribe. Hopes this helps you in your quest for the query.
Here's my attempt. Quite a lot of this query would lend itself to precalculation though rather than doing it all every time. Particularly having a table of working days with an incrementing sequence number.
--Base Tables
WITH tbl_holiday AS
(
SELECT CAST(2010-05-27 AS DATETIME) AS holidate UNION ALL
SELECT CAST(2010-05-28 AS DATETIME)
),
tbl_emp AS
(
SELECT 1 AS empid, 'Bob' AS empname UNION ALL
SELECT 2 AS empid, 'Dave' AS empname
),
tbl_sick AS
(
SELECT 1 AS empid, '2010-04-01' as sickdate UNION ALL
SELECT 1, '2010-04-02' UNION ALL
SELECT 1, '2010-04-09'
),
--Calculated Tables
tbl_WorkingDays AS
(
SELECT dateadd(day,number,'2010-01-01') AS workdate
FROM master.dbo.spt_values
WHERE Type='P' AND number <= DATEDIFF(day,'2010-01-01',getdate())
AND (@@datefirst + datepart(weekday, dateadd(day,number,'2010-01-01'))) % 7 not in (0, 1)
EXCEPT
SELECT * FROM tbl_holiday
),
tbl_NumberedWorkingDays AS
(
SELECT ROW_NUMBER() OVER (ORDER BY workdate) AS N,
workdate
FROM tbl_WorkingDays
)
SELECT e.empid, e.empname, COUNT(nwd.N) AS Absences
FROM tbl_emp e
LEFT JOIN tbl_sick s ON e.empid = s.empid
LEFT JOIN tbl_NumberedWorkingDays nwd ON nwd.workdate = s.SickDate
WHERE (s.empid IS NULL) OR (nwd.N = 1) OR
NOT EXISTS (SELECT * FROM tbl_sick s2 WHERE s.empid = s2.empid AND sickdate = (SELECT
workdate FROM tbl_NumberedWorkingDays WHERE N = nwd.N-1))
GROUP BY e.empid, e.empname
First, the simplest way to do analysis against dates is to have a calendar table which is a sequential list of dates. The advantage is that it makes it easy to indicate which days are not workdays (for whatever reason) and which days are holidays.
Create Table dbo.Calendar (
[Date] Date not null primary key clustered
, IsHoliday bit not null default(0)
, IsWorkday bit not null default(1)
, Constraint CK_Calendar Check ( Case
When IsHoliday = 1 And IsWorkDay <> 1 Then 0
Else 1
End = 1 )
)
The only additional restriction I made here is that if something is a holiday, then it must also be marked as not being a workday. Now let's fill our calendar table:
;With Numbers As
(
Select Row_Number() Over( Order By C1.object_id ) As Value
From sys.columns As C1
Cross Join sys.columns As C2
)
, CalendarItems As
(
Select N.Value, DateAdd(d, N.Value, '2000-01-01')As [Date]
From Numbers As N
Where DateAdd(d, N.Value, '2000-01-01') <= '2100-01-01'
)
Insert Calendar( [Date], IsWorkDay )
Select [Date], Case When DatePart(dw, [Date]) In(1,7) Then 0 Else 1 End As IsWorkDay
From CalendarItems
I'm using a CTE to generate a sequential list of integers which I can then use to populate my table. Often, it is useful to have this table be static. In addition, I've marked days that are on Sunday or Saturday as not being workdays. In the above query, I arbitrarily filled my calendar table with dates from the year 2000 to 2100 however you can easily expand the range if you wish.
We can even update the table to account for your tbl_holiday
table:
Update Calendar
Set IsHoliday = 1
From Calendar
Join tbl_Holiday
On tbl_Holiday.holiday = Calendar.[Date]
Finally, we have our query to get the number of occurrances:
;With WorkDayNums As
(
Select C1.[Date]
, Row_Number() Over ( Order By C1.[Date] ) As Seq
From dbo.Calendar As C1
Where C1.IsWorkDay = 1
)
Select S.empid, Count(*) As SickOccurances
From WorkDayNums As WDN
Join (
Select Min(S1.sickdate) MinSickDate, Max(S1.sickdate) As MaxSickDate
From tbl_sick As S1
) As MinMax
On WDN.[Date] Between MinMax.MinSickDate And MinMax.MaxSickDate
Join tbl_sick As S
On S.sickdate = WDN.[Date]
Where Exists (
Select 1
From WorkDayNums As WDN2
Join tbl_sick As S2
On S2.sickdate= WDN2.[Date]
Where WDN2.Seq = WDN.Seq + 1
)
Group By S.empid
What I did here is to create a sequence for each workday. So if Friday was 1, Monday would be 2 barring any holidays. With this, I can easily see if the next workday was a sick day.
You did not make it clear in your OP about whether more than two sequential days counted as a single occurrence or multiple occurrences. In this approach, each two day combination would count as a single occurrence. So if someone where out Monday, Tuesday and Wednesday, the above query should count that as two occurrences.
Just came across this old question and thought I might be able to offer an elegant solution.
Assuming you have a utility function in your database similar to
create function dbo.Fn_Number (@Start int, @N int)
returns @Number table
(
N int not null primary key
)
as
begin
declare @i int
set @i = @Start
while @i <= @N
begin
insert into @Number values (@i)
set @i = @i + 1
end
return
end
go
And using JC's example data you can execute the following
/* We need to consider the following range */
declare @From datetime
declare @To datetime
select @From = min(sick), @To = max(sick)
from @sick
/* Ignoring holidays and Saturday & Sunday create a table of workdays
for the given range */
declare @Workdays table (workday datetime not null primary key)
insert into @Workdays
select dateadd(day, n.N, @From) as Workdays
from dbo.Fn_Number(0, datediff(day, @From, @To) - 1) n
where -- ignore Saturday and Sunday
datepart(weekday, dateadd(day, n.N, @From)) not in (1, 7) and
-- ignore holidays
dateadd(day, n.N, @From) not in (select holiday from @holiday)
The core of the solution is here. It is saying, by empid, get be a count of the days sick, but ignore any days where the preceding workday was also a sick day (because we want don't want to count that twice)
select empid, count(*) as sick_events
from @sick s
where
not exists
( -- make sure there wasn't a sick day prior to this one for this employee
select *
from @sick sa
where sa.empid = s.empid and sa.sick < s.sick and
not exists
( -- and if there was ensure that there wasn't an intervening workday
select *
from @Workdays w
where w.workday < s.Sick and w.workday > sa.Sick
)
)
group by empid
精彩评论