Assistance with a duplication query

Hello

When a new customer contacts us, they are allocated a reference number.

Unfortunately our contact centre sometimes logs the same person without checking if they have contacted us before and the customer ends up with two reference numbers. We want to cleanse this, so:

I would like to output instances where the customer's surname, address1 and zipcode are duplicated but only if the customer has different reference numbers.

Eg:

        Reference   Surname          Address1                       Zip

        1875         Faulkner             10 Smith Street             08540

        1876         Faulkner              10 Smith Street            08540

I have tried a few ideas, the latest being:

with Duplicates as
(
	select r.LastName
		, a.Address1
		, a.ZipCode
		, COUNT(*) as DuplicateCount
	FROM Reference r
	INNER JOIN Address a ON a.ReferenceNumber = r.ReferenceNumber 
	LEFT OUTER JOIN Telephone t ON r.ReferenceNumber = t.ReferenceNumber
	LEFT OUTER Join Email e ON r.ReferenceNumber = e.ReferenceNumber
	group by r.LastName
		, a.Address1
		, a.ZipCode
	having COUNT(*) > 1
)

SELECT
	r.ReferenceNumber
	, r.LastName
	, r.FirstName
	,a.ReferenceNumber 
	, a.Address1
	, a.Address2
	, a.Address3
	, a.Address4
	, a.ZipCode
	,t.ReferenceNumber
	, t.TelephoneNumber
	,e.ReferenceNumber
	, e.EmailAddress
	, d.DuplicateCount
FROM Reference r
INNER JOIN Address a ON a.ReferenceNumber = r.ReferenceNumber 
LEFT OUTER JOIN Telephone t ON r.ReferenceNumber = t.ReferenceNumber
LEFT OUTER Join Email e ON r.ReferenceNumber = e.ReferenceNumber
join Duplicates d on d.LastName = r.LastName
					AND d.Address1 = a.Address1
					AND d.ZipCode = a.ZipCode

Unfortunately this returns all duplicates, not those with the same surname, address1 and zipcode and different reference numbers.

Do you have any advice on how I can achieve this?

Many thanks.



March 26th, 2015 5:28pm

you did not provide DDL or sample data..

i think this might work...modify the columns,if needed...

;with cte as (select R.referencenumber,r.LastName, a.Address1, a.ZipCode
	FROM Reference r
	INNER JOIN Address a ON a.ReferenceNumber = r.ReferenceNumber)

	Select A.*
	 from CTE A INNER JOIN CTE B on A.lastname=B.lastname and A.Adress1=B.Address1 and A.zipCode=B.zipCode and
	A.referencenumber<>.B.Referencenumber
	order by A.lastname,A.adress1,A.zipcode
Free Windows Admin Tool Kit Click here and download it now
March 26th, 2015 6:00pm

Thanks for getting back Stan.

I'm, fairly new to this, so with the code you suggested, do you mean adding this on to the code I already have, or running it separately?

March 26th, 2015 6:25pm

Thanks for getting back Stan.

I'm, fairly new to this, so with the code you suggested, do you mean adding this on to the code I already have, or running it separately?

try this in your test server..it will create the table and populate some data into the table and run the code to find duplicates per your criteria and then DROPS the table. do not run this in your production..

basically, the way duplicates are calculated is : it looks for same lastname, address1,zipcode but different referencenumber. in short, at first, i am getting all the referencenumber,lastname,address1,zipcode data into a table. then, i am self joining to the same table that matches with adress1,zipcode,lastname but mismatches referencenumber.

use this code instead of what your code.

create table Reference(referencenumber int primary key identity(1,1),lastname varchar(20),firsname varchar(20)) create table [Address](addressid int primary key identity(1,1),referencenumber int references reference(referencenumber),address1 varchar(20), city varchar(20),state varchar(2),country char(2),zipcode varchar(20)) go insert into reference values('lincoln','sam'),('wright','mike'),('lincoln','will'),('lincoln','sam') go insert into [Address] values(1,'east avenue','atlanta','GA','US',12345), (2,'south avenue','atlanta','GA','US',12345), (3,'west avenue','atlanta','GA','US',12345), (4,'east avenue','atlanta','GA','US',12345)

;with cte as (select R.referencenumber,r.LastName, a.Address1, a.ZipCode FROM Reference r INNER JOIN Address a ON a.ReferenceNumber = r.ReferenceNumber) Select distinct A.* from CTE A INNER JOIN CTE B on A.lastname=B.lastname and A.Address1=B.Address1

and A.zipCode=B.zipCode and A.referencenumber<>B.Referencenumber order by A.lastname,A.address1,A.zipcode drop table [Address] drop table [REFERENCE]

Free Windows Admin Tool Kit Click here and download it now
March 26th, 2015 9:16pm

Thanks Stan - that has really helped. I should have posted code and I really appreciate your taking the time to do so.
March 27th, 2015 3:22am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics