现在有张加班表,一个人同一天只允许申请一次加班,就是加班日期和userid相同的数据应该只有一条,但是现在由于之前没有做限制,导致很多数据重复
怎么通过sql删掉重复数据
思路:先查询重复数据的最大id(唯一一条),然后查询出所有重复数据,并且id不等于最大id的数据,然后删除即可
分解成几步,这样思路就相对清晰了
1.首先要知道怎么查询重复数据 就是SPONSOR_ID和datetime相同的数据
SELECT
id
FROM
t_yszg_grsq q
WHERE
q.FORM_CODE = 303
AND q.FINISH_STATUS = 3
AND q.DELFLAG = 1
GROUP BY
q.SPONSOR_ID,
q.datetime
HAVING
count(*) > 1
先查询重复数据的id
在查询最大的id就简单了,稍微改下
SELECT
max( ID ) AS maxid
FROM
t_yszg_grsq q
WHERE
q.FORM_CODE = 303
AND q.FINISH_STATUS = 3
AND q.DELFLAG = 1
GROUP BY
q.SPONSOR_ID,
q.datetime
HAVING
count(*) > 1
2.查询出重复的数据的SPONSOR_ID和datetime,
SELECT
t.SPONSOR_ID,
t.datetime
FROM
(
SELECT
q.SPONSOR_ID,
q.datetime
FROM
t_yszg_grsq q
WHERE
q.FORM_CODE = 303
AND q.FINISH_STATUS = 3
AND q.DELFLAG = 1
GROUP BY
q.SPONSOR_ID,
q.datetime
HAVING
count( 1 ) > 1
) AS t;
3.然后查询出SPONSOR_ID和datetime等于上面的数据,那么就可以查询出所有重复的数据,然后再删除不是最大id的数据
SELECT
id
FROM
t_yszg_grsq
WHERE
( SPONSOR_ID, datetime ) IN (
SELECT AT
.SPONSOR_ID,
AT.datetime
FROM
(
SELECT
SPONSOR_ID,
datetime
FROM
t_yszg_grsq
WHERE
FORM_CODE = 303
AND FINISH_STATUS = 3
AND DELFLAG = 1
GROUP BY
SPONSOR_ID,
datetime
HAVING
count(*) > 1
) AT
)
4.现在就简单了,只要删除在3中不在1中的数据即可,逻辑删除DELFLAG=0
SELECT
id
FROM
t_yszg_grsq
WHERE
( SPONSOR_ID, datetime ) IN (
SELECT AT
.SPONSOR_ID,
AT.datetime
FROM
(
SELECT
SPONSOR_ID,
datetime
FROM
t_yszg_grsq
WHERE
FORM_CODE = 303
AND FINISH_STATUS = 3
AND DELFLAG = 1
GROUP BY
SPONSOR_ID,
datetime
HAVING
count(*) > 1
) AT
)
AND ID NOT IN (
SELECT
mt.maxid
FROM
(
SELECT
max( ID ) AS maxid
FROM
t_yszg_grsq q
WHERE
q.FORM_CODE = 303
AND q.FINISH_STATUS = 3
AND q.DELFLAG = 1
GROUP BY
q.SPONSOR_ID,
q.datetime
HAVING
count(*) > 1
) AS mt
);
sql看起来复杂,分解成几步,思路清晰,其实也很简单
|