Query regarding FOR XML PATH (Network Steve Forum)

Query regarding FOR XML PATH

I have found a script that contains the following:

stuff(

(

select ', ' + fielda

from

tablea

for xml path (''),

type).value

('.', 'nvarchar(max)')

,1,2, '')

anotherfield

This will concatenate field a and the stuff will remove the leading comma.

I have amended the script to the following and it still works:

Stuff(

(

select ', ' + fielda

from

tablea

for xml path (''))

,1,2, '')

Afield,

Please could somebody tell me why the following has been inserted after the for xml path ('') section

type).value

('.', 'nvarchar(max)')

March 21st, 2015 4:34am

Hi,

value() Method (xml data type)

Performs an XQuery against the XML and returns a value of SQL type. This method returns a scalar value.

You typically use this method to extract a value from an XML instance stored in an xml type column, parameter, or variable. In this way, you can specify SELECT queries that combine or compare XML data with data in non-XML columns.

In your scenario both queries return tha same results.

You can find more information here

https://msdn.microsoft.com/en-us/library/ms178030.aspx

Free Windows Admin Tool Kit Click here and download it now

March 21st, 2015 5:22am

Hi,

Here is one example

with set1 as
(Select idno,accountno
    from tab1)
select idno,STUFF( (SELECT DISTINCT ', ' + accountno
                FROM set1 b WHERE b.idno=a.idno order by accountno FOR XML PATH('')
                ), 1, 2, '')
from set1 a;

on which it is to get concatenation of all Account Number, per one IDNO, and the query is actually grouped by IDNO.

March 21st, 2015 5:24am

Your question is certainly well-founded, because from what you want to achieve, a comma-separated list, the XML thing can only be described as mumbo-jumbo. Removing ,TYPE and .value reduces the mumbo-jumbo factor, and it's understandable that you feel more comfortable.

However, this script illustrates, why you may want the longer thing:

CREATE TABLE tablea(fielda nvarchar(200) NOT NULL)
INSERT tablea(fielda)
   VALUES ('Kalle Anka & co')
INSERT tablea(fielda)
   VALUES ('1 <> 0')
INSERT tablea(fielda)
   VALUES ('"quote"')

SELECT stuff((select', ' + fielda
from tablea
for xml path (''),type).value('.','nvarchar(max)'),1,2,'')

SELECT stuff((select', ' + fielda
from tablea
for xml path ('')),1,2,'')
go
DROP TABLE tablea

If you don't have ,TYPE and .value, characters that are special to XML will be replaced with sequences starting with &. This happens because, well, FOR XML PATH produces an XML document. But for legacy reasons the data type of the XML document is nvarchar(MAX). Adding ,TYPE changes the data type to xml and then we can apply the .value method to extract the value as nvarchar(MAX) with those &-sequences converted back to the original character.

(If you at this point runs from the computer screaming, you have my full sympathy.)

Free Windows Admin Tool Kit Click here and download it now

March 21st, 2015 6:03am

Great info Erland! I'd noticed both syntax's myself but never gave it much thought.

One thing I did notice, however, is that adding the "type" syntax in EXPENSIVE! Like orders of magnitude expensive!

When I looked at the execution plans for your example I was shocked to see just how different the two plans were.

The non-typed version produced a very simple plan with 100% of the cost being a table scan of tablea.
The total subtree cost = 0.0032867)

The the typed version produced a more complex plan that included the "XML Reader" table valued function (98% of the cost).
The total subtree cost = 1.07997)

March 21st, 2015 11:06am

But those numbers are only estimates. Don't pay too much attention to them. To find the actual cost, you would have to run a benchmark.

I would expect that the longer version with TYPE and .value to give you longer execution times. After all, it is doing more work. But how much slower it would be, I don't want to venture to guess.

Free Windows Admin Tool Kit Click here and download it now

March 21st, 2015 11:36am

Erland - I just did a quick test. The differences weren't as bad as the subtree estimates would have indicated but they weren't exactly close either. The typed version (on average) took just over twice as long to execute.

Test conditions were as follows:

AdventureWorks2012 database

On average, the non-typed version executed in ~190 ms and the typed version executes in ~405 ms.

Test bed was the following:
@@VERSION =
Microsoft SQL Server 2012 - 11.0.2218.0 (X64)
Jun 12 2012 13:05:25
Copyright (c) Microsoft Corporation
Developer Edition (64-bit) on Windows NT 6.2 <X64> (Build 9200: )
Hardware =
Processor: Core i7-4770K @ 3.50GHz (4 physical cores / 8 logical cores)
Ram: 16.0 GB @ 2400 MHz
HD: SSD

Here are the two test scripts that I used... (each run in it own SSMS tab)

-- Typed version --
DECLARE @BegDT DATETIME2(7) = SYSDATETIME()

	SELECT 
		sod1.SalesOrderID,
		STUFF((
			SELECT ', ' + CAST(sod2.ProductID AS VARCHAR(8))
			FROM Sales.SalesOrderDetail sod2
			WHERE sod1.SalesOrderID = sod2.SalesOrderID
			FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 2, '') AS csv
	FROM 
		Sales.SalesOrderDetail sod1
	GROUP BY 
		sod1.SalesOrderID

SELECT DATEDIFF(ms, @BegDT, SYSDATETIME()) AS ExecTimeMS

-- Non-typed version --
DECLARE @BegDT DATETIME2(7) = SYSDATETIME()

	SELECT 
		sod1.SalesOrderID,
		STUFF((
			SELECT ', ' + CAST(sod2.ProductID AS VARCHAR(8))
			FROM Sales.SalesOrderDetail sod2
			WHERE sod1.SalesOrderID = sod2.SalesOrderID
			FOR XML PATH('')), 1, 2, '') AS csv
	FROM 
		Sales.SalesOrderDetail sod1
	GROUP BY 
		sod1.SalesOrderID

SELECT DATEDIFF(ms, @BegDT, SYSDATETIME()) AS ExecTimeMS

If you see holes in my test approach, please let me know.

Thanks,

Jason

March 21st, 2015 1:10pm

Bigger difference that I expected, but I guess it depends on the total context. I don't have AdventureWorks, but I tried this on Northgale, an inflated version of Northwind:

DECLARE @d datetime2(3),
@str nvarchar(MAX)

SELECT @d = sysdatetime()

SELECT @str = (SELECT ShipName + ','
FROM Orders
FOR XML PATH(''))

SELECT datediff(ms, @d, sysdatetime())

SELECT @d = sysdatetime()

SELECT @str = (SELECT ShipName + ','
FROM Orders
FOR XML PATH(''), TYPE).value('.', 'nvarchar(MAX)')

SELECT datediff(ms, @d, sysdatetime())

The difference in my case was not equally startling. Around 240 ms in the first case, and 390 ms for the second.

Free Windows Admin Tool Kit Click here and download it now

March 21st, 2015 2:35pm

In either case, I'd say use the "type" syntax only when working with string data types where there is a possibility that you'll encounter special characters. For everything else, use the the abbreviated syntax.

March 21st, 2015 7:31pm

This topic is archived. No further replies will be accepted.