OpenQuery() to Linked Server hangs, leaving SPID with open tran, then templog.ldf grows without limit.

Hi,

We have a customer that is using SQL Server 2000, and experiencing the following intermittent problem -- occasionally templog.ldf will grow and grow until it fills the entire disk.

This is a rare problem, it has happened less than10 times in the past two years, for a job that runs once a minute. But it has happened on 3 different SQL Servers, (two production servers, and one test server).

Our suspicion is that the root cause is an OPENQUERY() to pull data from a linked server (Oracle database on Unix). We have seen that these OPENQUERY() statements occasionally hang and cannot be killed. The OPENQUERY() is used to populate a local table, and when we originally populated permanent tables with OPENQUERY(), then there would be a lock on this permanent table and nothing could be done until SQL Server was stopped and restarted.

To workaround the immediate problem, last year we modified all OPENQUERY() statements so that the local table was a #temp table. In that case, cancelling the job left the hung process, but the hung process had a lock on a #temp table and so that didn't keep the next execution of the job from creating a new #temp table and populating it.

However... it appears that the hung process, because it leaves an SPID with an open transaction, keeps the tempdb log file from being truncated. And eventually you run out of disk space, no matter how much disk space is on the server.

We have tried using "kill" on the SPID with the open transaction, but that doesn't work - the process never dies (we waited several days) and the transactions are never rolled back.

We would like to resolve this issue, because it has started coming back with some regularity, and of course it usually happens on nights or weekends. Plus there is the problem of scheduling a stop/restart of SQL Server on a production server, which we would like in all cases to avoid.

Here is some data from the most recent instant of this problem (test server):

Product version: 8.00.760
Product level: SP3
Edition: Standard Edition
Version: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17 2002
14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Standard Edition
on Windows NT 5.0 (Build 2195: Service Pack 4)

SELECT * FROM master.dbo.sysprocesses returns this suspicious SPID:

spid: 51
kpid: 1876
blocked: 0
waittype: 0x0042
waittime: 135452422
lastwaittype: OLEDB
waitresource:_name of Oracle linked server_
dbid: 9
uid: 0
cpu: 188
physical_io: 0
memusage: 12
login_time: 2006-08-07 18:53:04
last_batch: 2006-08-07 18:53:04
ecid: 0
open_tran: 2
status: runnable
sid:
hostname:
program_name: SQLAgent - TSQL JobStep ...
hostprocess: 1500
cmd: INSERT
nt_domain: NT AUTHORITY
nt_username: SYSTEM
net_address: 000CF1C7A509
net_library: TCP/IP
loginame: NT AUTHORITY\SYSTEM
context_info:
sql_handle:
stmt_start: 0
stmt_end: -1

Note 1: This data is from 2006-08-09, so the SPID has been hanging around for almost 2 days.
Note 2: We did not attempt to KILL this SPID. In the past when we have done so that changes the status of the SPID but even after waiting several days does not get rid of the SPID& open transaction.
Note 3: These OPENQUERY() statements normally run in 1-5 seconds, so it does not seem likely that it takes more than 2 days to roll this transaction back when we attempt to kill it.
Note 4: From a web search, I have found other people with similar problems but no resolution.
Note 5: We have consciously not installed SQL Server 2000 SP4, because it is our understanding that there is a problem in SP4 with online defragmentation holding locks.

Any suggestions on how to:

1) Kill an SPID like this, when it is waiting for OLEDB?

or

2) Keep templog.ldf from growing without limit when we get a hung transaction like this?

Thanks!
Frank.

August 10th, 2006 2:18am

I think you can get more definite answer in the engine forum for such issue.
Free Windows Admin Tool Kit Click here and download it now
August 10th, 2006 11:03pm

Hi Nan,

Thanks, I wasn't sure where it would go, there didn't seem to be a perfect spot for linked server questions like this.

-Frank.

August 11th, 2006 12:25am

We are running SQL Server 2005 and are experiencing a similar problem. After restarting the service we are able to pull back results from 1 openquery statement. After that, each subsequent openquery statement locks and we are unable to kill them. The only solution we can find is restarting the service. Also, unlike fdc2005, our templog.ldf is not growing.

Report Services is calling reports using the openqueries and therefore this is a production impacting issue. We were using a link with msdaora previously which did not have this issue. However, after a reboot of the server, we were unable to restore that link and had to move to Oracle odbc. So there are two questions:

1) How can we prevent the linked server from hanging?

and/or

2) How can we re-initialize the MSDAORA linked server?

linked server script that did work prior to reboot and does not work after reboot is:

EXEC sp_addlinkedserver @server='GLOG',

@provider='MSDAORA',

@srvproduct = 'Oracle',

@datasrc='CRSTPGC3'

EXEC sp_addlinkedsrvlogin

@rmtsrvname ='GLOG',

@useself ='false',

@locallogin =NULL,

@rmtuser = 'glogowner',

@rmtpassword = 'glogowner'

Any help is appreciated.

Free Windows Admin Tool Kit Click here and download it now
August 11th, 2006 8:10pm

Hi Sloppypoet,

When you check sysprocesses after the openquery() hangs, does it show open transactions? If so, does the database ID refer to tempdb or another database (perhaps one of the Reporting services databases?)

After a reboot and running the first openquery(), if you check sysprocesses does it show a hanging process? Perhaps the very first openquery() is holding a lock on a destination table.

When this happens, if you check the Oracle side, does it show a hanging connection?

Ithink our customer isrunning now with the setup you are using that is hanging after the first query (Oracle ODBC). I will ask them to run sp_linkedservers and post the results.

What version of the Oracle drivers are installed? Can you use the Oracle query tool to query the database (Start | Programs | Oracle - OraHome | Application Development | SQL Plus)?

Our customer also had a problem once with the Oracle drivers install - if you accept the default folder location, then backup and specify a different location, the installation is toasted and cannot be fixed. The only thing we found that works in this case is to uninstall the Oracle drivers, search the registry for all references to Oracle, delete them, then reinstall, being very careful to specify the installation folder the first time when it asks. I thought this was crazy but then found a web posting documenting the same behavior.

Best of luck,

Frank.

August 12th, 2006 7:45pm

Hi Microsoft,

This problem [Openquery() hanging, leaving open connection with open tran, causing templog to blow up] has started happening more frequently. While this is good for troubleshooting, it is bad for the customer, who has to stop and restart SQL Server every time it happens.

Here is the output from sp_linkedservers for the suspect linked server:

SRV_NAMEPROD2
SRV_PROVIDERNAMEMSDAORA
SRV_PRODUCTOracle
SRV_DATASOURCEPROD2
SRV_PROVIDERSTRING
SRV_LOCATIONNULL
SRV_CATNULL

After this latest incident, the customer tried to KILL the hanging SPID, but it doesn't go away, sysprocesses lists the same SPID with cmd = KILLED/ROLLBACK, and continues to show open_tran = 2. Thus templog will continue to grow until they stop/restart SQL Server.

I would like to get this issue resolved - telling a customer that they must stop/restart SQL Server on a production machine whenever something odd happens with an OPENQUERY() connection to Oracle is unacceptable for us. It seems to me that no matter what odd state the OPENQUERY() connection gets into, it should be possible to kill the process and move forward without having to stop/restart SQL Server.

Thanks for any help you can provide.

Regards,
-Frank.

Free Windows Admin Tool Kit Click here and download it now
August 14th, 2006 7:09pm

Suffered the same problem during a migration of system from a foxpro backend to a mssql backend.

if you ran ANY openrowset() command on the sql server(ie mssqlsvr2k) through queryanalyser it wouldn't allow any new processes to connect. Any existing ones would be left alone but once they finished/disconnected nothing else would be allowed to happen. A reset of the server was the only thing that fixed it.

The ONLY way to run an openrowset command like:

INSERT INTO [mssqlsvr2k].[proddb].[dbo].[client]([clientno], [dob], [gname], [surname])
SELECT dbf.[episodeno], dbf.[clientno], dbf.[epitype], dbf.[refdate], dbf.[epi_group]
FROM OPENROWSET('MSDASQL',
'Driver=Microsoft Visual Foxpro Driver;SourceType=DBF;SourceDB=C:\Foxpro Data Copy\',
'SELECT * FROM client') AS dbf
go

through queryanalyser was to install mssql on your local PC and set up a linkserver relationship with the production server (mssqlsvr2k) and run the command on the localhost server.This totally killed the tempdb on the local machine though so i'm guessing this your main problem (ie, your openrowset runs 'locally' on the server).

I tried a bunch of different ways to get it to run on the remote machine but anything I tried that used openrowset locked up the server.

Tried the same code on the localhost server and it ran but tempdb grew until it had taken up all the physical mem & swapfile on my local pc. The only way I could migrate "larger" tables (500meg-1gig +) was using DTS. Have you guys considered setting your transfers up as a DTS?

August 15th, 2006 9:31am

Hi teeth777,

Thanks for the info, it seems like this is a more common problem than I thought, independent of the back-end data source... searching for "openquery to linked server hangs", here's just a few examples:

Oracle: http://groups.google.com/group/microsoft.public.sqlserver.odbc/browse_thread/thread/4c6a57ddf22aa5a9/3893e142c56f2306%233893e142c56f2306

ASE: http://groups.google.com/group/microsoft.public.sqlserver.connect/browse_thread/thread/7a2df55eaa3c63fe/f7562f954e3726d7%23f7562f954e3726d7

Sybase: http://groups.google.com/group/microsoft.public.sqlserver.odbc/browse_thread/thread/e1270ce025dfc4ad/52cb71b66c31a33b

In general: http://groups.google.com/groups?q=openquery+to+linked+server+hangs&start=0&scoring=d&hl=en&lr=&

I've shied away from DTS after using it in the late 90s with SQL 7. From a maintenance point of view, having these separate packages outside of the code is another item to keep track of, and most of our transfers run once a minute and are parameterized to pull just the new activity in a table or set of tables. Using openquery to do so (if it works reliably) is very clean because then all the code is in one place.

I wonder if it's like sqlmail - looks good in concept, but unreliablein practice (when I last used it, sqlmail also requireda stop/restart of SQL Server if the slightest thing went wrong with the downstream mail server... I couldn't believe it at the time but it was confirmed by a Microsoft tech, they said there was no other way to fix sqlmail if it became unhappy).

Can we get some input from Microsoft on this???

Thanks,
Frank.

Free Windows Admin Tool Kit Click here and download it now
August 15th, 2006 6:02pm

We were able to get the link working again, though we do not have a clear resolution. Ultimately we attempted to reinstall the Oracle Client. That process hung but we were able to back out of it.We thenre-started SQL Services to clear the deadlocked SPIDs. Miraculously, after these stepswere done, the link began working again. Our gut feeling is that the root cause was the Oracle client but we do not have any evidence to support that conclusion. We will be testing this theory soon to determine a clearcut resolution.

Thanks fdc2005 for the insight on the Oracle client itself.

August 15th, 2006 6:39pm

I am having the same issue with a light modification. I am using OpenRowset() to open a connection with Visual FoxPro and I cannot see any lingering spid's or anything in sql server to indicate that a connection with FoxPro exists. One the server where FoxPro runs there are a .dbc, .dbt and .dbx file that remain locked. The problem is once these lockes are manually broken (via Computer Management -> System Tools -> Shared Folders -> Open Files) the SQL Server can no longer establish a connection to the files.

Rebooting servers fixes the problem, but rebooting production regularly servers just doesn't cut it.

The error I get when trying to connect via SQL is:

Server: Msg 7399, Level 16, State 1, Procedure sprcFoxProUpdates_TEST, Line 74
OLE DB provider 'MSDASQL' reported an error.
[OLE/DB provider returned message: [Microsoft][ODBC Visual FoxPro Driver]Cannot open file \\crius\tpe\test2\.]
OLE DB error trace [OLE/DB Provider 'MSDASQL' IDBInitialize::Initialize returned 0x80004005: ].

Any help on this issue would be greatly appreciated, that means YOU MICROSOFT.

Free Windows Admin Tool Kit Click here and download it now
August 18th, 2006 11:34pm

Hi All,

It has been two weeks now without a response from Microsoft. In that time this problem has happened again and my customer is becoming quite frustrated with having to stop and restart SQL Server at random times because a hung SPID cannot be killed.

I have opened this issue as a bug:

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=187192

Please visit this bug report and vote on it if you are also having trouble with this issue.

Thanks,
-Frank.

August 27th, 2006 7:45pm

Hi,

I am having a similar problem with the hanging connection being left on Sybase which I am Linked Server to from SQL Server. I tried your above link but no luck, would love to vote.

Thanks,

Free Windows Admin Tool Kit Click here and download it now
August 30th, 2006 7:25am

Hi JuanitaG,

That is puzzling, I do not have much experience with the connect site. I tried it from a different machine where I was not logged into Windows Live and can still jump to the page with bug report.

Try google for"openquery linked server hangs", the first link that comes up (August30 2006, 1am Eastern) is the bug report, the second link that comes up is this thread on MSDN forums.

Here is the cached version of the bug report:

http://72.14.203.104/search?q=cache:FSBfT5Alxa8J:https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx%3FFeedbackID%3D187192+openquery+linked+server+hangs&hl=en&gl=us&ct=clnk&cd=1

Regards,
-Frank.

August 30th, 2006 8:27am

Thanks Frank,

New link worked. I notice that you are on SP3 I am having our server upgraded to SP4 following advice on this link

http://www.kbalertz.com/822668/results.another.error.message.occurs.linked.server.query.aspx

It seems that I have a couple of error messages but all giving me the same problem- left over dead(Sybase) conenctions from linked server.

Free Windows Admin Tool Kit Click here and download it now
August 30th, 2006 9:14am

Hi Juanita,

Thanks for the link (and the vote on the bug report).

We considered SP4 but have been wary of it due to a problem with DBCC INDEXDEFRAG holding locks and causing problems on production servers that are under high load... my understanding is that this problem was introduced by SP4. See for example this posting: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=63194

"paulrandalMicrosoft SQL Server Product Team: ... Yup - unfortunately SP4 had a bug in the lock manager that means the short duration page locks that DBCC INDEXDEFRAG takes are not dropped when it drops them, leading to eventual escalation to an exclusive table lock..."

Looks like there is a post-SP4 hot fix for the DBCC INDEXDEFRAG problem, however:

http://support.microsoft.com/kb/907250/
"FIX: You may experience concurrency issues when you run the DBCC INDEXDEFRAG statement in SQL Server 2000"

If you find that SP4 fixes your problem with Sybase I will be interested to hear about it, perhaps we should install SP4 and this hotfix.

Thanks,
-Frank.

August 30th, 2006 7:16pm

Hi all. Wehave the same problem with sql server 2005 when using OPENQUERY() or EXECUTE() AT.

Connection hangs and process may not be killed (status changes to KILLED/ROLLBACK but connection does not terminate). Terminating the client connection to the sql server does not help.

If the blocking connection is using a transaction holding locks on the local sql server database these locks will be kept by the blocking connection and thus prevent other connections from accessing the locked objects. Depending on the locked object this is equal to a completely hanging sql server!

It looks like the linked server is able to terminate the hanging connection (by setting a timeout). But we are still testing this, so this information might be wrong.

Lukas

Free Windows Admin Tool Kit Click here and download it now
September 14th, 2006 5:45pm

Hi Lukas,

Are you able to reliably reproduce the behavior? I got frustrated when there was no reply from Microsoft so called and opened a tech support case. Thesuggestion so far has been to run some diag tools and send the output to the support tech the next time the problem occurs, but for our client the problem is intermittent and hasn't happened now for several weeks. Perhaps if you can reproduce it, we can get something to the MS tech that will show where the problem lies.

Thanks,
Frank.

September 17th, 2006 5:06am

Dear All,

I followed your discussion for some time now. Actually I have the same problem using the [SPSS 32-Bit Data Driver (*.sav)] with a Microsoft SQL Server 2005 Developer Edition. I can reproduce the behaviour as much as you want. Here the description:

For those who do not know what SPSS is: It is a very powerful analysis tool that comes with a propriatary file format that can only be read by the SPSS application itself, the programmable API and (for the data) the above mentioned driver. SPSS has a table where the data is stored (data view) and a table where the variable definitions for each column in the "data view" are stored (variable view). The special thing about SPSS is that one can have far more than 1000 columns in the data table - some of my files have 3200 columns...

A stored procedure using the OPENROWSET method in combination with MSDASQL as provider can therefore not access the 3200 columns at once - hence, each column is accessed one by one and then inserted into the a temporary table first (before being written to a final table depending on its data type). The query looks more or less like this:

INSERT INTO ##t_tempspss (S_String, QID)
SELECT ColumnCode, '1234
'
FROM OPENROWSET('MSDASQL','DRIVER={SPSS 32-BIT Data Driver (*.sav)};DBQ=C:\;SERVER=NotTheServer', 'SELECT '+ @SPSSVarName +' AS ColumnCode FROM"SPSSFileWithoutSAVExtention
")

This works fine for depending on the memory available on a computer or server ( I tested with SQL Server Express on my laptop and on a ProLiant Server ) and of course on the number of columns.

For big SPSS files (let's say 3200 variables (columns) and 8000 respondents (rows)) the server will hang an not retrieve any data anymore from the SPSS file when the limits of memory/Page Filehave beenreached. This means, only 2700 variables with their 8000 rows each have been imported for example and the others stall. Even after completion of the stored procedure the memory/page file will not be emptied and stay at max - leaving no memory for a new connection.

As you all wrote before, restarting the server is the only way to clean out the memory/page file and to have an operational server again.

I actually thought that this was a memory leak on the driver side supplied from SPSS but now I think that it is a problem with the SQL server.

Looking forward to some help from Microsoft!
Cheers!


Free Windows Admin Tool Kit Click here and download it now
September 19th, 2006 6:13pm

Hi Gigiga,

Thanks for the info. When it hangs, do you get a process in master.dbo.sysprocesses that cannot be killed?

I will ask the Microsoft tech we are working with on the case if we can use data from your server to help troubleshoot. Will post what he says.

Thanks!
Frank.

September 20th, 2006 12:10am

Hello fdc2005,

Could you actually post how I can check for these hanging processes? I am not advanced yet regarding SQL Server.

What I can tell you is that for example the "dllhost.exe" grows up to 2 GBand then crashes before thestored procedure has finished loading all columns from the SPSS file (e.g. after 2700 columns) - this is very well visible on a screenshot that I did from the Task Manager - Unfortunately I cannot attach that screenshot here in the forum.

Please let me know what I should look for and I will start a retry with your diagnostics.

Cheers,
gigiga

Free Windows Admin Tool Kit Click here and download it now
September 20th, 2006 12:52pm

Here is the picture from the Memory/dllhost.exe crash:

September 20th, 2006 1:11pm

If it does not show here the url:http://www.andikrupp.gmxhome.de/images/MemoryCrash-SPSSImport.JPG

Sry for all the posts, but when I edit in this forum an error occurs each time...
Free Windows Admin Tool Kit Click here and download it now
September 20th, 2006 1:16pm

Hi Gigiga,

I think you may have a related but slightly different issue. If dllhost.exe is exploding in memory usage, then it sounds like something outside of SQL Server, e.g. a web application that is talking to SQL Server.

To check this out, try executing your stored procedure directly from a SQL Server Management Studio query window.

The behavior we are seeing is that occasionally, the use of OpenQuery() will hang -- there is no memory or CPU usage on the SQL Server box, but the query never returns and cannot be killed.

To see items in sysprocesses, use the following in a query window:

SELECT * FROM master.dbo.sysprocesses

The output is one row for each open process (spid) talking to SQL Server.

To help with troubleshooting, try it this way:

(query window 1): SELECT @@SPID
--> record the resulting value -- this is the process ID for the query window 1 connection to SQL Server.

(query window 1): EXEC _YourFailingProcedureName_
--> Let it run

(query window 2): SELECT * FROM master.dbo.sysprocesses where SPID = _YourValueFromWindow1_
--> This gives you info about the process for query window 1.

What we are seeing is that after the OpenQuery hangs, even if we try 'KILL _YourValueFromWindow1_', the process continues to appear in sysprocesses. It changes to rollback, but the rollback never finishes and the process continues to exist until SQL Server is restarted. When the OpenQuery() is used as the source for an insert to a #temp table, then this keeps an open transaction in tempdb, and that means templog.ldf continues to grow without limit until SQL server is restarted or the drive where the file is located runs out of space.

Are you getting similar behavior?

-Frank.

September 20th, 2006 11:46pm

Hello Frank,

I followed your instructions above but the outcome is a little bit different - and maybe I have to give some more information about the server configuration regarding the "dllhost.exe".

Referring to the SQL statement posted above I am using the Provider "MSDASQL" with the SPSS Driver. In the properties of this provider you can configure the option "Allow inprocess". This option determines, if I understand correctly, that either MSDASQL is executed inside the SQL-Serverprocess or outside.

On SQLServerExpress Edition one cannot set this option (limitation of Microsoft of this version) and hence the process is attached to the SQL-Server process by default. Unfortunately, the same stored procedure on SQL Server Developer Edition does not work when I try to attach the process to the SQL-Server process ("Allow inprocess" activated). Therefore, on my server "Allow inprocess" is deactivated,hencecreating the "dllhost.exe" (I assume) to execute all MSDASQL tasks outside the SQL-Server process.

To resume, on my laptop (SQLServerExpress, Allow inprocess activated) there is no "dllhost.exe" and the stored proc crashes with a memory overload, and on the server (SQLServerDeveloperEdition, Allow inprocess deactivated) there is a "dllhost.exe" which crashes when it reaches ~2GB in memory. Please see the screenshot of the server processes just before the crash:

http://www.andikrupp.gmxhome.de/images/Processes.JPG

And almost the same screenshot as before after the crash (no more dllhost.exe exists in the processes):

http://www.andikrupp.gmxhome.de/images/DLLHostCrash.JPG

These pictures are from the tests of today...

---------------------

Having that said, I did exactly what you described above.

The stored procedure was executed on the server and stopped afterthrowing the following errors:

Msg 7399, Level 16, State 1, Line 1

The OLE DB provider "MSDASQL" for linked server "(null)" reported an error. The provider ran out of memory.

Msg 7320, Level 16, State 2, Line 1

Cannot execute the query "SELECT ir4_50 FROM "Russia 2005Q1.1"" against OLE DB provider "MSDASQL" for linked server "(null)".

At that point there was still the process id in the sysprocesses table (because the stored procedure window is open) but after using the "kill" command the process was sucessfully deleted from the sysprocesses table.

I also logged the SQL Server activities using the Performance Tool. The results do not really talk to me but maybe someone of you can do something with these findings:

When the dllhost.exe crashes the following items change radically:
- SQLServerMemoryManagerLockBlocks from ~2500 to 2
- SQLServerMemoryManagerLockOwnerBlocks from ~2500 to 6

So here are my questions to those who have the "almost" same problem:

1) How are your providers configured? (Allow inprocess activated or not?)
2) If somebody else can reproduce this behaviour, how does it look like if you activate "allow inprocess"?
3) Did anybody try already to import a huge dummy text or csv file using MSDASQL or some other provider to see what happens?

Maybe there is a memory leak in MSDASQL and other providers - it is strange that dllhost.exe always crashes when it reaches 2GB no? And why would there be physical limitation on a server with 4GB RAM and at least 10GB of free HD for a page file?

Cheers,
gigiga

Free Windows Admin Tool Kit Click here and download it now
September 21st, 2006 5:53pm

Dear all,

I think I have an answer to one of my questions above. I just tried to use MSDASQL to import a very large *.txt file (4000 columns x 6000 rows) and the "dllhost.exe" is constantly at 8MB and not moving. So I guess, I will have to ask SPSS for a memory leak in their driver... what do you think? Maybe, this is the same case for some of your external drivers...???

Cheers,
gigiga

September 21st, 2006 7:09pm

Hi Gigiga,

Yes, I would try to get the latest and greatest SPSS driver. In our case, the driver doesn't blow up or exhibit any strange behavior, we are just left with a hanging SPID in SQL Server that cannot be killed, and because this SPID is holding an open transaction in tempdb, templog.ldf cannot be truncated and eventually fills up all available disk space and SQL Server comes crashing down unless it is restarted first...

Best of luck!

-Frank.

Free Windows Admin Tool Kit Click here and download it now
September 21st, 2006 7:54pm

I'm having the same issue connecting a linked server via OLEDB provider for ODBC to a MySQL database. EExact same symptoms. (Incidentally, typing in this forum is interesting in Safari, where one cannot see the textarea at all. Totally blind typing. Fun.
October 13th, 2006 6:09am

meca wrote:
I'm having the same issue connecting a linked server via OLEDB provider for ODBC to a MySQL database. EExact same symptoms. (Incidentally, typing in this forum is interesting in Safari, where one cannot see the textarea at all. Totally blind typing. Fun.


OK, let's try Firefox. Better :-). What I was trying to convey is that I am seeing the same behavior as Lukas Ith: SQL Server 2005; OpenQuery worked once, then stopped working leaving "unkillable" processes in the SQL Server process list. Those were stuck there until restarting the service. I did not notice the log growing, though.
Free Windows Admin Tool Kit Click here and download it now
October 13th, 2006 6:17am

Hi Meca,

I've opened this issue as a bug:

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=187192

If you haven't already, please visit this bug report and vote on it, perhaps we can get this fixed.

I've also opened a support case with Microsoft. We had to wait until the problem happened again at the client site (for us it is intermittent, and only happens once out of every few thousand executions of openquery()). But now we have an instance of it on a test server, and we are working with the Microsoft tech to gather data.

You may not see templog.ldf growing at a fast rate unless you are doing lots of things that require tempdb space, e.g. #temp table usage. In our case templog.ldf kept growing because we are using the openquery as the data source for an insert to a#tempxyz table, and when the openquery() hangs, it holds a transaction open on the #tempxyz table that never goes away until sql server is restarted. With an open transaction in tempdb, the templog cannot be truncated, and will grow until the disk runs out of space or sql server is restarted. That is my understanding, anyway :-)

Regards,
-Frank.

October 13th, 2006 6:52am

Wow,

I am having a similar issue.

My tempdb.ldf continues to grow, will not shrink. When I try to look at the properties of my tempdb it hangs. When I query sysprocesses for open_tran=1, I have 5 out there under the tempdb. They have a status of sleeping. Two have a lastwaittype of pagelatch_up, one with pagelatch_sh and 2 with miscellaneous. when i execute dbcc inputbuffer(spid) i get event types of rpc event and eventinfo of "sp_cursorunprepare;1" Has anybody had any contact with microsoft yet on resolving this issue. Thanks - Jen

Free Windows Admin Tool Kit Click here and download it now
October 24th, 2006 8:35pm

Hi Jen,

Is this behavior after you have used openquery() or another method to access a linked server?

If you try to kill the SPIDs from sysprocesses with open transactions for tempdb, do they go into cmd=KILLED/ROLLBACK and stay that way forever?

If so you may be running into the same problem we have. Are you running SQL Server 2000 sp3 or sp4?

I have opened a support case with Microsoft and our client has a machine hung in this configuration so we are sending pssdiag etc output to Microsoft for their review. I will post whatever the ultimate resolution is for the case.

Thanks,
-Frank.

October 24th, 2006 10:34pm

Hi all,

We've experienced this same issue today.

We use a linked server connection to a MySQL databsase through a DSN connection.

Products involved:
MS SQL Sever 2000 SP4
MSDASQL
MySQL ODBC 3.51 Driver
MySQL 4.1.21

Everything has been working fine until today when I tried to do the following:
select * from openquery(LinkName, 'Select * From products')

As you can see, it is a very complicated query ;-)

I get the SPID hanging around in the sysprocesses and if I kill it it says KILLED/ROLLBACK and won't go away.

Also, if I try to look at the tables in Enterprise Manager everything hangs.

We have exactly the same set up at a client site which is still working fine, with a connection to the same database. So, it is only stuffed on the development server at the moment (touch wood).

As I can see there are no actual resolutions to this at the moment I am really just adding fuel to the fire.

Rob.
Free Windows Admin Tool Kit Click here and download it now
October 25th, 2006 6:09pm

We have a web server that once the service starts, itopens up 5 sessions. The web server connects and calls the sql server via a jdbc connection. When the service opens the 5 sessions they do not have open transactions, but when a report is called through the web service it opensa transaction and does not end it. If I kill the process, it does not hang in a rollback, however it does not let go of the tempdb either.

If I end the web service it lets go of the tempdb, but it does not go back down to size.

When the service has the open transactions I cannot view the properties of tempdb, and I cannot see any current activity through enterprise manager.

SP4

Thank you for your response.

October 25th, 2006 6:36pm

Hi Rob,

Thanks for posting, that is useful that you are running SP4 because one of the suggestions from Microsoft is to upgrade to SP4 in case the problem may have been fixed there. But it sounds doubtful based on your experience.

Probably Enterprise Manager hangs because the killed/rollback SPID has a lock on the table. Have you tried sp_lock to see what that SPID has in its hands?

A practice we use is to always insert from openquery() into #temp tables. That way, if the query hangs, you can kill the job, and when it starts up again it will create a new #temp table (because #temp tables are scoped inside a process) and unless the whole connection to the linked server is dead, you will be able to continue working. But then you have open transactions in tempdb and eventually you will need to stop/restart sql server or the templog file will consume all your disk space. But at least you can do this at a time of your choosing rather than immediately when the problem occurs.

We are working with our client to get a full memory dump on the sql server box that is hung, will post if anything comes out of that.

Regards,
-Frank.

Free Windows Admin Tool Kit Click here and download it now
October 26th, 2006 1:00am

Hi Jen,

Sorry to not be more help, it is odd to me that you can kill the process and yet it continues to hold onto tempdb. Have you used sp_lock to see which SPID has the locks on tempdb (dbid=2 I think, you can confirm with select * from master.dbo.sysdatabases).

Regards,
-Frank.

October 26th, 2006 1:02am

Hi All,

After many phone calls and consultations and sending of memory dumps etc, here is mysummary of the feedback I received from SQL Server support:

- When our openquery() request hangs,SQL Server is waiting on the other side to do something and there is no known way to kill the SPID. If the SPID is holding locks in tempdb, there is no known way to get the SPID to release these locks. Thus once the openquery() request hangs while holding locks in tempdb, there is nothing that can be done, SQL Server must be stopped and restarted, otherwise templog will grow without limit and eventually that will force you to restart SQL Server.

- I received verbal agreement from a Microsoft tech that this is not good behavior for a production database. My position that I stated frequently to Microsoftis that SQL Server should never get itself into a state that requires aforced restart. But I never received confirmation fromMicrosoft that they are planning to modify the behavior of SQL server in regards to this issue.

- Microsoft's suggestions:

1) Add timeouts for the linked server connection, to see if that will force a graceful exit when it hangs (we are currently testing this), e.g.:

EXEC sp_serveroption '_linkedservername_', 'connect timeout', 300
EXEC sp_serveroption '_linkedservername_', 'query timeout', 300

2) Upgrade the Oracle driver used for the linked server connection, to see if that eliminates the hang.

3) Upgrade to the latest service pack for SQL Server, to see if that eliminates the hang.

Personally I hope that option (1) [timeouts] solves the problem -- we won't ever know for sure, but if it runs for the next few months without hanging that will be a good indication that it does. If that is the case then we will be religious about specifying timeouts for every linked server connection that we create.

Regards,
-Frank.

Free Windows Admin Tool Kit Click here and download it now
November 28th, 2006 8:38am

Hi All,

I spoke too soon. Turns out the openquery() hang has happened again *after* applying the linked server timeouts, so I have requested that Microsoft reopen the case.

Regards,
-Frank.

November 28th, 2006 5:17pm

We have the same exact hanging issues on the linked server to both DB2 and Oracle, and we have doneMicrosoft 's suggestions 1) and 2). That means none of those three suggestionswork. We even restarted the MSDTC service, the hanging processstill can not be killed.

The hanging happens whenever theremote server is not available, a SQL server process tries to connect to the unavailable linked server.

No wonder SQL Server shouldnot be considered to be an Enterprise database server.

Free Windows Admin Tool Kit Click here and download it now
January 4th, 2007 2:49am

Experiencingsame type periodic behavior using SQL 2000SP4 linked server connection to Oracle 10g. Not sure I agree with the previous statement but then everyone has their favorites.

Can anyone suggesta way to test the query process time from within a stored proc or sharehow to create thatkill job job? When it happens to us the Query and/or Connection timeout settings for the link do not get applied. Is there a timeout setting in the Oracle SQL*Net alias config (TNS settings)?

Only clue we have as to the cause is it seems to only happen when the Oracle server is online and available but the table/view being queried is in some state of being changed. i.e. the 'Other Side' is making changes in production without notifying us.

Please post the resolution if MS determines one.

Thanks!

January 18th, 2007 9:53pm

Have you tried setting the remote query timeout option using sp_configure? Both 'remote query timeout' and 'remote login timeout' values can be specified using sp_configure.

When 'remote query timeout' option is set, SQL Server will use this value toset theDBPROP_COMMANDTIMEOUT rowset property in the provider while executing a command. If the provider supports this property, then the query should automatically timeout after the specific time limit is reached and the local server should abort/rollback the transaction. This should prevent the server leaving a SPID with open transaction.

Free Windows Admin Tool Kit Click here and download it now
February 2nd, 2007 4:44am

Hi All,

Quite surprised to see the issue i am facing with mssql 2005 sp2 is happing with other peoples....

Here's is the simulatable steps to see this is happening...

My Environment:

client: Windows 2003 with sp1 and all latest updates

client: Mssql 2005 with sp2 (using named instance)

Target server: mysql 4.026 on unix cent 3.8 box (ip: 173.5.3.167)

I have created a dsn (deathstar23) and able to test connection successfully with UDL...so connection to mysql is not an issue...mysql admin interface is also connecting without any issue...

I have set mssql remote login and timeout properties to 60

here's the command i used to create linked server...

EXEC master.dbo.sp_addlinkedserver

@server = N'deathstar23'

, @srvproduct=N'MySQL'

, @provider=N'MSDASQL'

, @datasrc=N'deathstar23'

, @provstr=N'DRIVER={MySQL ODBC 3.51 Driver} ;SERVER=173.5.3.167;DATABASE=mysql; USER=ms33; PASSWORD=track21'

go

Just after creating this connection below query runs fine....

SELECT * FROM OPENQUERY (DEATHSTAR23,'SELECT * FROM mysql.test') M

But after10 secondsI changed the table name to another table this query gave this error ...

Msg 7399, Level 16, State 1, Line 1

The OLE DB provider "MSDASQL" for linked server "DEATHSTAR23" reported an error. The provider reported an unexpected catastrophic failure.

Msg 7303, Level 16, State 1, Line 1

Cannot initialize the data source object of OLE DB provider "MSDASQL" for linked server "DEATHSTAR23".

and now when i run this query it goes in the hang mode.....only way to come out is to kill sql server exe and start it again and then for the first time this query works and after 10 seconds same behaviour happen.....

I have followed almost all the suggestion available in last few days in almost all the forums with no success....test multiple sysql providers with no luck only this one worked...

When i used MySQLProv as provider to create linked server this is the error i got...

Msg 7302, Level 16, State 1, Line 1

Cannot create an instance of OLE DB provider "MySQLProv" for linked server "DEATHSTAR23".

and in the same forum i learned the cherry's provider is also not working.....

So it looks to me that there is some problem in creating linked server as the communication protocols for MSDASQL arehaving some mismatch whether its oracle or mysql or sybase....I am not sure if any other protocol is having any issue casue from this boxI have created alinkedserver tooracle with MSDAORA provider and no issues...

I hope by now MS must have aware of the issue and able to figure out a quick solution....if not then above steps sud help them to recreate the problem....

If anybody got any solution please email me at writeme@hedaconsulting.com....I wud be happy to test.....

Thanks & Regards...

Prakash Heda

March 18th, 2007 11:32pm

We have a similar issue. We are running SQL Server 2000 SP4 8.00.2187. I have a linked server set up to a Foxpro database. I am using the Microsoft OLE DB Provider for ODBC Drivers. The problem will occur when performing an OPENQUERY. Once it becomes hung up the spid cannot be killed and the only way I have been able to make it go away is to reboot the server. I have tried restarting the SQL Server service but that hangs as well. I don't know if our tempdb grows or not. We have a connect and query timeout set on the linked server and a remote connect and query timeout set as well.
Free Windows Admin Tool Kit Click here and download it now
June 11th, 2007 11:22pm

We have the same problem with processes hanging while using OPENQUERY with Microsoft OLE DB Provider for ODBC Drivers to get data from Visual FoxPro legacy databases. Most of the time it works great but on occasion processes start to hang and eventually the server will need to be rebooted.

I know this is an old thread that started in 8/9/2006 but did anybody ever get an answer from Microsoft that worked? Our workaround has been to isolate the application that calls FoxPro which in turn can cause the server to need rebooting, pretty pitiful huh?

Apparently, this is still an issue in SS2005 as stated earlier in this thread, so how about some help MS?

Anybody from Microsoft have an answer?

BillN
November 9th, 2007 1:42am

I have a similar issue to the above with SQL Server 2000 SP4. We run severalopenrowsets to an Oracle DB; and if a connection fails we need to stop the other calls, so we can retry. The only way we've been able to resolve the problem is to either shut the SQL Server down - not good as it's production; or we use TCPView from sysinternals to locate the remote IP addresses and kill the individual TCP connections. We're now looking at coding this into our Scheduler to kill the TCP connections on failure.

However, it looks like we might be moving towards a managed environment, and they will never give us this kind of permission.

Surely there is a fix/workaround to the issue. I've seen people mention using the timeout command, but if you don't know how long the query should run for how can you put a limit on it. Our problem is that we might get a failure on one call, but the calls then appear stuck, and only by killing the TCP connection can we close the SPIDs.

Free Windows Admin Tool Kit Click here and download it now
November 15th, 2007 6:38pm

It's funny that this post is still open and accumulating comments, I found it while searching for some other OLEDB problem we're having with sql server 2005. I was the original poster and no, Microsoft never came up with a solution. We went through many rounds of escalation and many suggestions from Microsoft and nothing helped.

To the moderator who suggested the following:

"Have you tried setting the remote query timeout option using sp_configure? Both 'remote query timeout' and 'remote login timeout' values can be specified using sp_configure."


This does not help - it was one of the things suggested by Microsoft tech support that we tried which had no impact.


Best regards and best of luck to all of you who are having this problem.

-Frank.

May 5th, 2008 9:55pm

Hi Frank,

I've still been playing with this. We tested SQL Server 2000 SP4 and SQL Server 2005 with Oracle 10g instances with the 10g client drivers on our server. We tried Openrowset, openquery with linked servers, tried all of the recommended connection timeouts, we placed timeouts on the asp.net app that calls the Stps that call the openrowset (this by the way just ends the connection between asp.net and sql server, but leaves the queries running still), tried DTS package, same problem.

If we open 10 connections to the oracle environment and 1 failed the other 9 would stay locked in the processes (KILL will not work). I have found examples with just one call to the simpliest of tables (holds 1 row, 5 columns - tells us whether we need to pull any data from the oracle instance) this can also hang.

Now, I don't know which component is at fault: Oracle; Oracle driver, SQL Server. I know for a fact the Oracle instance we call, is heavily used and is suffering, so maybe the connection dropping is Oracles fault? However, SQL Server doesn't seem to pick this up. Reading Microsoft docs on the problem, suggests that SQL Server is just sitting waiting for a response back from Oracle (it never gets one) so it hangs, and the spid will never check back whether it has been killed. Microsoft recommends you reboot the SQL instance if this occurs - but we can't do this, because our website runs of the same instance, and we'd see dropped connections; our users would not be happy, it has to be up 24x7

So as a workaroundwe found some code on TCP Connections (I believe the Sysinternals guy wrote in C# for his TCPView) we've built an exe that just terminates connections based on an destination/source IP/IP+Port. Now there is no clear way of targetting the exact connections to the destination server, so we just target all conenctions to this server, and will kill these if we detect our last load failed. This appears to do the trick, and our server has been up ever since with no problems.

The only other way I can think of safely building this is to drop openrowset etc and code only in .net and pull the data into a file and then bulk insert that into a holding table. I'm not exactly sure if this is the best situation, as the data has to be built, and then imported, where as openrowset is pulling the data directly into the tables.

Paul

Free Windows Admin Tool Kit Click here and download it now
May 6th, 2008 12:55pm

Hello,

We have been experiencing this problem as well, where our linked server is pointing to an instance of MS Analysis Services. So it is likely that the problem is with SQL Server itself, since other of you have had the problem whilst linking to Oracle.

The fact that this issue has existed for so long and that MS has not provided a solution to the loads of people complaining about it in this forum is indeed very disappointing. We will open a support request with them today and if it goes anywhere useful, I'll post back.

June 19th, 2008 12:10pm

Hi

Has anyone gotten a workable solution from MS?

We had asimilar situation with distributed transactions hanging on SQL 2005 between three instances on three separate servers ( It was a single script accessing the three instances that left the open transaction). Restarting services nor re-booting seems to solve the situation. The spids remain. In our case the issue is that the Tempdb on all three instances remain with open transactions. Killing SPID's does not do anything either.

Free Windows Admin Tool Kit Click here and download it now
October 5th, 2008 6:38pm

Were in 2009 and yet no solutions for this?

All that I had perceived is that sometimes my linked server connection is locked with a KPID that isnt running at all in the server.

I think the question is "How to kill such a SPID process?".
July 31st, 2009 4:16pm

We faced a strange error related to a linked server connection to an Informix database.

When trying to execute a OpenQuery statement with a -very long(It was a in clause)-commented line (starting with "--") just before the last line (which has single quote and closing bracket), the sql server hangs. It turns out it has actually re-started. If we remove this comment line out of openquery statement, than it runs fine and returns the results.

Anybody faced a similar or relatedproblem? Basically, how do I prevent SQL Server from re-starting and allow it to exit gracefully from this error?

Free Windows Admin Tool Kit Click here and download it now
September 17th, 2009 2:44pm

I am currently experiencing this problem in both a SQL 2005 and SQL 2008 environment. My servers point to an Advantage Database OLE DB provider linked server. Every once in a while some process hangs and nobody can access the linked server until we reboot the SQL server.

I propose this theory:

Create a separate SQL Server Instance (with very limited memory allocation) on the production server. Or a different machine (maybe a Virtual Machine) with a SQL install where you will allow a server reboot.

There could be licensing issues with this workaround. But Microsoft isnt giving any working solution to this problem

Place your code where you talk to your non-SQL linked server on this new instance of SQL.

Change your code in production to point to the new instance instead of talking to the non-SQL linked server directly. The new instance will be the one to pull in data or push data to the non-SQL linked server.

When the linked server fails (and Im sure it will) on your second SQL instance, you only need to restart the second instance NOT your production server instance.

I have not (knock on wood) had an issue with SQL linked servers pointing to other SQL servers.

September 30th, 2009 5:17pm

We are having exactly the same problem and implemented carbonc's suggested on our own as well. It's sad that we have to move the process to another server that can easily be rebooted without serious production issues instead of actually fixing the problem.

 

It's 2010 and still no solution on this? Comeon Microsoft.

Free Windows Admin Tool Kit Click here and download it now
August 26th, 2010 6:55pm

Instead of using OPENQUERY to linked server Alpha, consider the application connecting directly to remote server Alpha (push up the problem into the app layer for resolution).

Related link: http://www.sqlusa.com/articles2005/linkedserver/

SSIS Import/Export Wizard is another alternative: http://www.sqlusa.com/bestpractices/ssis-wizard/

September 2nd, 2010 10:31am

It's 2010 and still no solution on this? Comeon Microsoft.

It's 2011 and still no solution on this? Looking at my locked up server now that's going to cost me two days of outdated reports before I can restart it.
Free Windows Admin Tool Kit Click here and download it now
May 6th, 2011 7:28pm

Also having this problem occasionally... openquery, linked oracle server, sql server 2008 R2.

 

When is this ever going to get solved?

November 17th, 2011 4:00pm

It's 2012.  Same problem on SQL2005 with Microsoft.ACE.OLEDB.12.0.  Grrr.
Free Windows Admin Tool Kit Click here and download it now
February 3rd, 2012 2:33am

I had the same problem, a session to Oracle linked server was hanging and blocking other sessions. I tried to kill the the session but it stayed in KILLED/ROLLBACK state.

Kill with statusonly showed no progress: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds. The session consumed no cpu.

Solution:

1) Download TCPView http://technet.microsoft.com/en-us/sysinternals/bb897437

2) Run TCPView, locate the connection to linked server and close the connection.

Worked for me, didn't have to bounce SQL Server. After closing connection the blocking session disappeared and things got back to normal.

March 2nd, 2012 3:36pm

I had the same problem, a session to Oracle linked server was hanging and blocking other sessions. I tried to kill the the session but it stayed in KILLED/ROLLBACK state.

Kill with statusonly showed no progress: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds. The session consumed no cpu.

Solution:

1) Download TCPView http://technet.microsoft.com/en-us/sysinternals/bb897437

2) Run TCPView, locate the connection to linked server and close the connection.

Worked for me, didn't have to bounce SQL Server. After closing connection the blocking session disappeared and things got back to normal.

Is there a way to use TCPView to solve this type of problem when caused by query like this?

SELECT *
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',
    'Excel 12.0;Database=\\myNetworkPath\myExcelFile.xlsx;HDR=YES;IMEX=1;',
    'SELECT * FROM [Sheet1$]'
)



Free Windows Admin Tool Kit Click here and download it now
May 8th, 2012 5:28pm

That has worked for me - select from linked Oracle 11g server.
January 23rd, 2013 6:14pm

Hi All,

Just a quick followup note -- after nearly 7 years this problem has not been solved, and if I had to bet, I don't think it will ever be solved. In 2011 I again spent many frustrating hours on the phone with Microsoft techs about this same behavior in SQL Server 2008. We have seen it crop up in SQL Server 2000, 2005, and 2008 so far. And I suspect it exists in SQL Server 2012.

What I heard from Microsoft was that yes, this is a problem. But solving it would require too much delicate programming in the SPID-handling code in SQL Server, and so they choose to not solve it.

If anyone at Microsoft wants to look at case documentation, the 2011 case ID was 111080381910299.

If you want absolute protection against this problem (hung OPENQUERY() to linked server causes tempdb to grow without bounds), then you must put all your code in an external process (e.g. a .net program). From the .net program, open a direct database connection to the source server (e.g. Oracle), and a separate direct database connection to SQL Server. Then perform all your Oracle data extracts in the .net program, and push the data into SQL Server. If the .net program hangs reading from Oracle, it can be safely killed without requiring a restart on SQL Server. Then delete all your linked server entries.

So... linked server connections are nice, but they come with this serious flaw of occasionally requiring a SQL Server restart.

Caveat emptor.

-Frank.

 

Free Windows Admin Tool Kit Click here and download it now
April 10th, 2013 9:10pm

Hi,

We have faced a similar issue in Production database SQL server 2008 R2. Openquery hung and cannot establish a test connection in SSMS. Also could not kill process running. The issue got resolved only after a SQL server reboot. Can anyone advice on what work around can be done?

Thanks,

Preetha

August 1st, 2013 4:35am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics