StartDate and EndDate values are multiplied, Can't get the right values

Building: Im trying to build a report where I can calculate the difference between two dates (Example between 1-1-2001 and 31-12-2001) but also showing (per name per identifier) the current balance of a customer per date(As shown in the Report attachment).

Problem: The problem is that the totals are not correct, they get multiplied on random numbers (Checking it with Count and CountDistinct I learned that I probably made a mistake in my SQL, so I copied SQL in this question as well).

I have tried plenty myself and tried to find solutions on the interwebs, but none of them seem to work.  So I am desperate enough to ask for help :).

SELECT
	CASE WHEN t1.identifier IS NULL 		THEN t2.identifier 			ELSE t1.identifier
	END as identifier
,	CASE WHEN t1.firstName IS NULL 		THEN t2.firstName			ELSE t1.firstName
	END as firstName
,	CASE WHEN t1.lastName IS NULL		THEN t2.lastName			ELSE t1.lastName
	END as lastName
,	CASE WHEN t1.initials IS NULL			THEN t2.initials				ELSE t1.initials
	END as initials
,	CASE WHEN t1.name IS NULL			THEN t2.name				ELSE t1.name
	END as name
,	CASE WHEN t1.employmentNumber IS NULL	THEN t2.employmentNumber		ELSE t1.employmentNumber
	END as employmentNumber
,	CASE WHEN t1.externalIdentifier IS NULL	THEN t2.externalIdentifier		ELSE t1.externalIdentifier
	END as externalIdentifier
,	t1.balance 	AS S_balance
,	t1.unvested	AS S_unvested
,	t2.balance	AS V_unvested
,	t2.unvested	AS V_balance
,	SUM(t1.TotalBalance + t1.TotalUnvested) Total1
FROM
(
SELECT
  par.identifier
,par.firstName
  ,par.lastName
  ,par.initials
  ,pos.name
  ,pos.balance
  ,pos.unvested
,par.employmentNumber
,SUM(pos.balance) TotalBalance 
,sum(pos.unvested) TotalUnvested
,Right(par.externalIdentifier,(Len(par.externalIdentifier)-4)) as externalIdentifier
FROM
  participant par
  INNER JOIN position pos
    ON par.identifier = pos.participantIdentifier
WHERE
 (pos.balance != 0 OR pos.unvested != 0)
AND pos.name NOT LIKE (@Euro)
 AND par.Metadata_start_validity <=@Datum
  AND (par.Metadata_end_validity >=@Datum
	OR par.Metadata_end_validity IS NULL)
  AND pos.Metadata_start_validity <=@Datum
  AND (pos.Metadata_end_validity >=@Datum
	OR pos.Metadata_end_validity IS NULL)
GROUP BY
  par.identifier
,par.firstName
  ,par.lastName
  ,par.initials
  ,pos.name
  ,pos.balance
  ,pos.unvested
,par.employmentNumber
,par.externalIdentifier
) as t1
INNER JOIN
(
SELECT
  par.identifier
,par.firstName
  ,par.lastName
  ,par.initials
  ,pos.name
  ,pos.balance
  ,pos.unvested
,par.employmentNumber
,Right(par.externalIdentifier,(Len(par.externalIdentifier)-4)) as externalIdentifier
FROM
  participant par
  INNER JOIN position pos
    ON par.identifier = pos.participantIdentifier
WHERE (pos.balance != 0 OR pos.unvested != 0)
AND pos.name NOT LIKE (@Euro)
 AND par.Metadata_start_validity <=@V_Datum
  AND (par.Metadata_end_validity >=@V_Datum
	OR par.Metadata_end_validity IS NULL)
  AND pos.Metadata_start_validity <=@V_Datum
  AND (pos.Metadata_end_validity >=@V_Datum
	OR pos.Metadata_end_validity IS NULL)
) as t2
ON
t1.identifier = t2.identifier
GROUP BY
	CASE WHEN t1.identifier IS NULL 		THEN t2.identifier 			ELSE t1.identifier
	END 
,	CASE WHEN t1.firstName IS NULL 		THEN t2.firstName			ELSE t1.firstName
	END 
,	CASE WHEN t1.lastName IS NULL		THEN t2.lastName			ELSE t1.lastName
	END 
,	CASE WHEN t1.initials IS NULL			THEN t2.initials				ELSE t1.initials
	END 
,	CASE WHEN t1.name IS NULL			THEN t2.name				ELSE t1.name
	END 
,	CASE WHEN t1.employmentNumber IS NULL	THEN t2.employmentNumber		ELSE t1.employmentNumber
	END 
,	CASE WHEN t1.externalIdentifier IS NULL	THEN t2.externalIdentifier		ELSE t1.externalIdentifier
	END 
,	t1.balance 	
,	t1.unvested	
,	t2.balance	
,	t2.unvested	
,	t1.TotalBalance
,	t1.TotalUnvested


January 9th, 2014 1:41pm

Most likely you're suffering from the problem described in this blog post

Aggregates with multiple tables

That blog also offers solutions which you should be able to apply to your situation.

Free Windows Admin Tool Kit Click here and download it now
January 13th, 2014 8:31pm

The following is dubious. Can you describe in plain English what are you trying to do with these dates:

AND par.Metadata_start_validity <=@V_Datum
  AND (par.Metadata_end_validity >=@V_Datum
	OR par.Metadata_end_validity IS NULL)
  AND pos.Metadata_start_validity <=@V_Datum
  AND (pos.Metadata_end_validity >=@V_Datum
	OR pos.Metadata_end_validity IS NULL)

January 13th, 2014 9:01pm

Thanks Naomi,

That blog fixed my problem.

Free Windows Admin Tool Kit Click here and download it now
January 14th, 2014 7:55am

Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules (you have no idea, do you?). Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. 

This is minimal polite behavior on SQL forums. Did you know that camelCase does not work? Google the research. There are no generic <nothing in particular>_name, <magic universal>_identifier, etc in a valid data model. 

This codes tells us you have attribute splitting in the schema:
 T1.balance AS S_balance, T1.unvested AS S_unvested, 
 T2.balance AS V_unvested, T2.unvested AS V_balance, 

I am sorry you have only one Participant and one Position, which is what your singular table name tell the world. 

Your vested and unvested are a status value, but you put them into separate tables. This is as silly as having Male_Personnel and Female_Personnel tables instead of the correct Personnel table. This why you are re-creating the name-address columns every time! 

What kind of entity is an external? You have an external_identifier, so there has to be such an entity. Even worse you have:

RIGHT(PAR.external_identifier, (LEN(PAR.external_identifier)-4))

This tells us that this vague column is a concatenation of two or more data elements. This is not a valid schema design. And when you write:

AR.identifier = POS.participant_identifier

Your magical generic identifier changes to a participant. In RDBMS, a data element has one and only one name, and doer not change from table to table.  Can it also become a squid? An automobile? 

I see you are an old punch card programmer. Back in the old days, we put one expression per card (line) so we could re-arrange the deck. This also meant that the comma went at the front line. And we even parentheses and keywords on single cards! WOW! 

You do not know about COALESCE() and use CASE the way that you would use IF-THEN in COBOL. Look at this code. Why did you think that @datum was a clear, precise data element name? It is the most generic name for a single data element value (not even the element itself). 

 AND PAR.metadata_start_validity <= @vague_generic_something
 AND (PAR.metadata_end_validity >= @vague_generic_something
 OR PAR.metadata_end_validity IS NULL)

COBOL, C and the other procedural languages that you are trying to write in SQL do no have the SQL shorthand, so you use Boolean operators. We would write:

@vague_generic_something
 BETWEEN PAR.metadata_start_validity 
   AND COALESCE(PAR.metadata_end_validity, @vague_generic_something)

But we never put metadata into a schema. NEVER. Validity makes no sense. Is it a date? A physical location? What? It has to be on a scale on which theta operators apply. Your validity hs no ordering! 

You do not know that SQL uses <> and not != from the C family. 

We do not use bit flags; that was assembly language, not SQL. The term Euro is a monetary unit of measurement, not value. 

You are getting garbage data because you have a garbage schema. Post the DDL and someone might be able to help you replace it. 
January 14th, 2014 12:36pm

First of all, thank you for the feedback "--Celko--" , I will use your advise in the future and on future posts.

It's no problem that you never learned the difference between constructive feedback and destructive feedback, but people tend to focus on one knowledge field, yours is SQL so it seems.

Furthermore, I will look into your feedback and see into ISO-11179 and ISO-8601 for writing my SQL. I admit it's sloppy code, but I'm still learning and relatively new to this, I hope you can accept / see that.

I have ordered Kalman's book about Exam 70-461 to learn more about SQL and to take the exam for the certifiacte. If you have any suggestions on how I can obtain more knowledge about SQL and it's ISO certfications I'm open for advise. It's a learning world and everyone has to start somewhere (I have the feeling you wrote perfect code from birth, I'm not that gifted and therefor I apoligize).

Again thanks for the feedback and hope I won't make you mad when I ask a question again.

Free Windows Admin Tool Kit Click here and download it now
January 15th, 2014 3:33am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics