Nice question.
I was able to reproduce the basic issue easily with the script below.
CREATE TABLE #Orders
  (
     EmpID  INT,
     Filler CHAR(1000)
  )

CREATE CLUSTERED INDEX ix
  ON #Orders(EmpID)

INSERT INTO #Orders
            (EmpID)
SELECT TOP 1000000 (ROW_NUMBER() OVER (ORDER BY @@SPID) - 1) / 50
FROM   master..spt_values v1,
       master..spt_values v2

SET STATISTICS IO ON

--Scan count 1, logical reads 27, CPU time = 0 ms, elapsed time = 4 ms.
SELECT TOP 1 EmpID,
             COUNT(*)
FROM   #Orders
GROUP  BY EmpID
OPTION (MAXDOP 1)

--Scan count 1, logical reads 143,283, CPU time = 281 ms, elapsed time = 287 ms.
SELECT COUNT(*)
FROM   #Orders
OPTION (MAXDOP 1)

--Scan count 7, logical reads 83,996 - DOP 6. CPU time = 375 ms, elapsed time = 115 ms.
SELECT TOP 1 EmpID,
             COUNT(*)
FROM   #Orders
GROUP  BY EmpID
ORDER  BY EmpID
OPTION (querytraceon 8649)
The above example creates 20,000 distinct EmpID values with 50 rows per group. The full table scan took 143,283 logical reads; the parallel TOP 1 plan took a variable number of reads per execution, with around 85,000 being typical; the serial TOP 1 plan took just 27 logical reads.
I could easily get the parallel TOP 1 plan to read the whole table by reducing the number of groups, but I wanted to demonstrate that this isn't always the case.
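As a sketch of that degenerate case (this variation wasn't part of my original tests, and the exact read counts will vary by machine and DOP): refilling the table with only 20 groups of 50,000 rows each means the first packet reaching the gather streams can require a large fraction of the table to be read, so the parallel TOP 1 plan ends up scanning close to all 1,000,000 rows.

```sql
-- Hypothetical variation: same table, but only 20 groups (50,000 rows each).
TRUNCATE TABLE #Orders

INSERT INTO #Orders
            (EmpID)
SELECT TOP 1000000 (ROW_NUMBER() OVER (ORDER BY @@SPID) - 1) / 50000
FROM   master..spt_values v1,
       master..spt_values v2

-- With so few groups, expect the parallel plan's logical reads to
-- approach the full-scan figure rather than a small fraction of it.
SELECT TOP 1 EmpID,
             COUNT(*)
FROM   #Orders
GROUP  BY EmpID
ORDER  BY EmpID
OPTION (querytraceon 8649)
```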
The TOP iterator can stop requesting rows as soon as it has received a single one (as this is TOP 1). Because there is an index on EmpID in the desired sort order, the serial plan simply processes the index in key order, and no more rows are requested from the index scan after the Stream Aggregate emits its first row. That is, the serial plan only needs to read the 50 rows for the first EmpID, plus 1 more to know the group has changed, and then it stops.
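One way to make that early termination explicit is a logically equivalent rewrite (my own illustration, not a query from the tests above) that seeks the lowest EmpID first and then aggregates only that group — this assumes EmpID has no NULLs, which holds for the data loaded here:

```sql
-- Equivalent to the TOP 1 ... ORDER BY EmpID query: find the first
-- key, then count only that group instead of aggregating everything.
SELECT TOP 1 EmpID,
             COUNT(*)
FROM   #Orders
WHERE  EmpID = (SELECT MIN(EmpID)
                FROM   #Orders)
GROUP  BY EmpID
```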
When I forced the parallel plan, the actual execution plan showed that 519,730 rows (i.e. more than 10,000 groups) had been read from the index scan. The parallel page supplier processed the index in order and distributed the rows among 6 threads, each of which calculated partial aggregates. The parent operator is an order-preserving Repartition Streams exchange that uses hash partitioning to distribute rows across a different 6 threads on its consumer side, guaranteeing that all rows from the same group end up on the same thread. The consumer is a second Stream Aggregate operator that calculates the final aggregate totals per group. 73 rows were emitted from that into the (again order-preserving) Gather Streams operator, which in turn output a single row to the TOP.
Rows are pushed across parallel exchanges in packets rather than individually, so some lumpiness is expected. The TOP 1 iterator calls GetRow, but the Gather Streams operator must wait for its first packet to arrive from each thread (as it is order preserving here) before it can output the first row. Even so, the subtree below appears to have done a lot of unneeded work in outputting the top 73 groups from which the top 1 was finally chosen. There is a known bug where some threads can fail to stop at all, but I wasn't hitting it here, so I just have to attribute this to the "stopping distance" mentioned in my first link.
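If you want to watch how far each thread gets while the parallel query is still running (rather than inspecting row counts afterwards), one option on SQL Server 2014 and later is the sys.dm_exec_query_profiles DMV — a sketch, assuming the target session has query profiling enabled (e.g. it is running with SET STATISTICS XML ON):

```sql
-- Run from a second session while the parallel TOP 1 query executes.
-- Shows per-thread, per-operator row counts as the plan progresses.
SELECT session_id,
       node_id,
       thread_id,
       physical_operator_name,
       row_count
FROM   sys.dm_exec_query_profiles
ORDER  BY thread_id,
          node_id
```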