Donate. I desperately need donations to survive due to my health

Get paid by answering surveys Click here

Click here to donate

Remote/Work from Home jobs

Remove duplicate rows in sql with multiple filters

Firstly Very new to Stack Overflow & from a non IT background. I've 3 tables and each contains different information. Just to demonstrate here is an example.
Table 1
C_ID    C_NAME
10      ABC Co
20      XYZ Co
Table 2
C_ID    S_ID    FromDate    ToDate      EvalDate
10      101     1-Jan-00    31-Dec-12   6-Jan-12
10      102     1-Jan-05    31-Dec-15   15-Jan-15
10      103     1-Jan-07    31-Dec-17   21-Jan-17
20      104     1-Jun-10    30-May-15   1-Jun-15
20      105     6-Jan-11    30-May-17   1-Jun-17
Table 3
S_ID    L_ID    EvalDate    OccDate     Type        Amt
105     22      1-Jun-17    11-Apr-13   Theft       500
104     20      1-Jun-15    11-Apr-13   Theft       500
104     19      1-Jun-15    1-Feb-13    Snow/Ice    2000
104     18      1-Jun-15    1-Jan-13    Fire        1500
105     21      1-Jun-17    20-Dec-12   ManMade     100
104     17      1-Jun-15    20-Dec-12   ManMade     100
103     15      21-Jan-17   15-Aug-11   NatCat      900
102     9       15-Jan-15   15-Aug-11   NatCat      900
101     4       6-Jan-12    15-Aug-11   NatCat      900
103     16      21-Jan-17   28-Feb-11   Fire        2500
102     10      15-Jan-15   28-Feb-11   Fire        2500
102     11      15-Jan-15   11-Jun-10   Collapse    100
103     13      21-Jan-17   10-Sep-09   Water       450
102     7       15-Jan-15   10-Sep-09   Water       500
101     2       6-Jan-12    10-Sep-09   Water       500
103     14      21-Jan-17   10-Sep-08   Flood       6000
102     8       15-Jan-15   10-Sep-08   Flood       6000
101     3       6-Jan-12    10-Sep-08   Flood       6000
103     12      21-Jan-17   6-Jun-07    Fire        1150
102     6       15-Jan-15   6-Jun-07    Fire        1100
101     1       6-Jan-12    6-Jun-07    Fire        1000
What I've been doing so far….
  1. Join all these tables & Sort by C_ID, OccDate (latest first), EvalDate (Latest first) to make sure I get the latest data on top.
  2. copy data in Excel then find out duplicate with the below formula & Take out all which is "Duplicate"
  3. Keep data in the table which has "No" - in Excel
After Joins
C_ID    S_ID    L_ID    EvalDate    OccDate     Type        Amt   Check
20      105     22      1-Jun-17    11-Apr-13   Theft       500   No
20      104     20      1-Jun-15    11-Apr-13   Theft       500   Duplicate
20      104     19      1-Jun-15    1-Feb-13    Snow/Ice    2000  No
20      104     18      1-Jun-15    1-Jan-13    Fire        1500  No
20      105     21      1-Jun-17    20-Dec-12   ManMade     100   No
20      104     17      1-Jun-15    20-Dec-12   ManMade     100   Duplicate
10      103     15      21-Jan-17   15-Aug-11   NatCat      900   No
10      102     9       15-Jan-15   15-Aug-11   NatCat      900   Duplicate
10      101     4       6-Jan-12    15-Aug-11   NatCat      900   Duplicate
10      103     16      21-Jan-17   28-Feb-11   Fire        2500  No
10      102     10      15-Jan-15   28-Feb-11   Fire        2500  Duplicate
10      102     11      15-Jan-15   11-Jun-10   Collapse    100   No
10      103     13      21-Jan-17   10-Sep-09   Water       450   No
10      102     7       15-Jan-15   10-Sep-09   Water       500   Duplicate
10      101     2       6-Jan-12    10-Sep-09   Water       500   Duplicate
10      103     14      21-Jan-17   10-Sep-08   Flood       6000  No
10      102     8       15-Jan-15   10-Sep-08   Flood       6000  Duplicate
10      101     3       6-Jan-12    10-Sep-08   Flood       6000  Duplicate
10      103     12      21-Jan-17   6-Jun-07    Fire        1150  No
10      102     6       15-Jan-15   6-Jun-07    Fire        1100  Duplicate
10      101     1       6-Jan-12    6-Jun-07    Fire        1000  Duplicate
Note: formula in Excel =if(and(c5r2=c5r1,c6r2=c6r1),"Yes","No")) which keeps only latest row as "No"

Output
C_ID    S_ID    L_ID    EvalDate    OccDate     Type      Amt   Check
20      105     22      1-Jun-17    11-Apr-13   Theft     500   No
20      104     19      1-Jun-15    1-Feb-13    Snow/Ice  2000  No
20      104     18      1-Jun-15    1-Jan-13    Fire      1500  No
20      105     21      1-Jun-17    20-Dec-12   ManMade   100   No
10      103     15      21-Jan-17   15-Aug-11   NatCat    900   No
10      103     16      21-Jan-17   28-Feb-11   Fire      2500  No
10      102     11      15-Jan-15   11-Jun-10   Collapse  100   No
10      103     13      21-Jan-17   10-Sep-09   Water     450   No
10      103     14      21-Jan-17   10-Sep-08   Flood     6000  No
10      103     12      21-Jan-17   6-Jun-07    Fire      1150  No

Question? 1. Can all these steps be done in mySQL? if Yes, What the query would be?
Note: i've morethan 800k rows in the DB and it will help me if i break this code.

Comments