User profile for physix

SQL 2005 Enterprise x64 SP2 (build 9.00.3152.00) running on Windows Server 2003 x64 Enterprise. I set my recovery interval (checkpoint) to 2 minutes to decrease the checkpoint duration, but it didn't help. Were you suggesting this as a fix for the lack of LSN? Are you not able to replicate this in testing?
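For readers following along, the recovery interval change described above can be made with `sp_configure`. This is a minimal sketch, assuming the setting was changed server-wide through T-SQL rather than through the server properties dialog; the value 2 comes from the post, everything else is standard SQL Server configuration syntax.

```sql
-- 'recovery interval (min)' is an advanced option, so expose advanced options first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Set the target recovery interval to 2 minutes (the value mentioned in the post).
EXEC sp_configure 'recovery interval (min)', 2;
RECONFIGURE;

-- Show the configured and running values to confirm the change took effect.
EXEC sp_configure 'recovery interval (min)';
```

Note that this setting is only a target for how long recovery should take; it influences how often automatic checkpoints fire, not how long a single checkpoint runs.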

0 votes

This new version does indeed correct the retry logic that never seemed to work before. I have been running it for several weeks. It does, however, bring up a new issue. Previously, if backup attempt 1 failed, all 3 retries would immediately fail. No file would be written, and we would cross our fingers that the next scheduled log backup would succeed. Now, if the first backup attempt fails (forcing attempts 2-4 to be made) and one of attempts 2-4 succeeds, no LSN information is written to the created file. This breaks any log shipping or wildcard restores that need to occur. This is a consistent problem and can be replicated across multiple servers: any time one of attempts 2-4 succeeds, there is no LSN information. IMO this is almost worse than the original issue. Please advise. -michael
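One way to cross-check the missing LSN information is to look at what SQL Server itself recorded for each log backup in `msdb` and compare that with what the backup file header reports. This is only a sketch: it assumes the backups are recorded in `msdb.dbo.backupset` as usual, and the database name is a placeholder to be replaced with the affected database.

```sql
-- Placeholder database name (assumption); substitute the affected database.
DECLARE @db sysname;
SET @db = N'MyDatabase';

-- List recent log backups with the LSN range SQL Server recorded for each one.
SELECT TOP (20)
       bs.database_name,
       bs.backup_start_date,
       bs.backup_finish_date,
       bs.first_lsn,
       bs.last_lsn
FROM msdb.dbo.backupset AS bs
WHERE bs.database_name = @db
  AND bs.type = 'L'            -- 'L' = transaction log backup
ORDER BY bs.backup_finish_date DESC;
```

In an unbroken log chain, each row's `first_lsn` matches the `last_lsn` of the log backup taken just before it, so a file with no LSN information in its header can be matched to its row by finish time.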

0 votes

There are definitely no other backups running. I can copy the T-SQL code from my job step and execute it manually, again and again (70% of those runs fail during the middle of the day). At night this process always succeeds.

As for growth: the log file is not growing; however, log size does affect checkpointing. I believe that if a log file is too big (e.g. 20 GB) with only a small amount of used space (300 MB), then a checkpoint will take longer to complete, as it has to scan a larger physical container to write the small amount of dirty pages (little bits of log spread over a large amount of hard drive). Our log is well trimmed and sized according to our use (5 GB container with use between 01-75%). Perhaps this is too much for a busy server to checkpoint efficiently? Still, with a recovery interval of 2, the actual checkpoint process shouldn't be interfering...

It sounds like the idle checkpoint spid is blocking the backup spid regardless of whether the DB is checkpointing or not. The spid is blocked and redgate is forced to wait; then, when it finally does come back, the virtual device that redgate created is no longer there. We then throw an operating system error:

6/27/2007 2:35:33 PM: SQL error 3201: Cannot open backup device 'SQLBACKUP_0D9DFC95-15CC-42A4-8178-C5D7BE5850FE'. Operating system error 0x80070002(The system cannot find the file specified.).

0 votes

An interesting thing to note is that when the backup process is succeeding, it blocks the sys checkpoint spid (10). Does the backup process issue its own checkpoint that is butting heads with the sys checkpoint?

0 votes

Email sent... The backup log statement is indeed being blocked by the checkpoint spid. When this occurs, the backup fails. When it executes and beats the checkpoint spid to runnable, the backup succeeds.

In our case the checkpoint spid is 10, and it cycles through the databases that it is checkpointing. It goes between "SUSPENDED" and "RUNNABLE" every few seconds (about 70% of the time on SUSPENDED). It appears that when the status is "SUSPENDED" and the DBName column is "mydatabase", the log backup process will fail for "mydatabase". This is all being checked via: sp_who2 "active"

I have changed the recovery interval to '2' (2-minute checkpoints), but it still cycles between these status modes every few seconds, and the backup still randomly fails. My guess is that the checkpoint is waiting for I/O to complete and reports SUSPENDED; then, when I/O completes, it changes to RUNNABLE but may not necessarily checkpoint because of the checkpoint interval. If I back up during this RUNNABLE state, it succeeds.

Full backups always succeed. This occurs when the DB is in both the FULL and BULK_LOGGED recovery models.
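The same picture that `sp_who2 "active"` gives can also be pulled from the `sys.dm_exec_requests` DMV (available in SQL Server 2005), which exposes the blocker and the wait type directly. This is a rough sketch, assuming the checkpoint and backup requests show up with `command` values beginning with `CHECKPOINT` and `BACKUP`; the filter can be widened if they appear under different names on a given server.

```sql
-- Show requests that are blocked, plus any checkpoint or backup requests,
-- with their current status, wait type, and the session blocking them.
SELECT r.session_id,
       r.status,                              -- e.g. suspended, runnable, running
       r.command,
       DB_NAME(r.database_id) AS database_name,
       r.blocking_session_id,                 -- 0 when the request is not blocked
       r.wait_type,
       r.wait_time AS wait_time_ms
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
   OR r.command LIKE 'CHECKPOINT%'
   OR r.command LIKE 'BACKUP%'
ORDER BY r.session_id;
```

Sampling this every few seconds while a log backup runs should show whether the backup request's `blocking_session_id` is the checkpoint spid, and which `wait_type` the checkpoint itself is sitting on while suspended.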

0 votes

Bump... Sorry for the delay in my response. This is still happening and is becoming more of an issue each day. Our server is experiencing heavy slowdowns and blocking at random times, and coincidentally the log backup job is the only agent process running. I say coincidentally because I find it hard to believe that a blocked process can cause any problems (the spid is sleeping), but my DBA team is throwing "redgate problems" to management. This last slowdown showed the SQBbackup spid being blocked by the DB checkpoint spid for over 15 minutes. The server tanked throughout this period.

Like charleyevans, I am running SQL 2005 EE x64 on an x64 quad Xeon monster server. At present we have 16 GB of RAM, and looking at buffer cache hit ratio and disk I/O we have very little evidence of needing more memory.

petey: as for your question, no, I cannot definitively say that there were no other backups during that posted log (I was testing). But on a normal day I receive 10-13 failed log backup emails where there is nothing running concurrently. Rather than post those here, can I email you the output? FYI, 2.5 minutes is a short timeout before failing (on our server). I'm surprised you're surprised. Most failures take longer than that.

brian: correct me if I'm wrong, but MAXDATABLOCK and/or MAXTRANSFERSIZE will not apply if the database cannot even be reached. In any case, yes, I have tried these settings to no avail. Currently:

exec master..sqlbackup N'-SQL "BACKUP LOG [MicroTraffic] TO DISK = ''\\mitsql2\e$\BACKUP\MITSQL1\SQL\MicroTraffic\<AUTO>'' WITH NAME = ''<AUTO>'', DESCRIPTION = ''<AUTO>'', ERASEFILES = 15, MAILTO_ONERROR = ''me@mine.com'', MAXDATABLOCK = 1048576, COMPRESSION = 1"'

fails 50% of the time during the day. The backup-log job backs up 4 other databases without problems. However, "microtraffic" is the pig. Thanks again! -m.bee

0 votes

Log backup failing randomly (dreaded VDI 1010)

We have an issue with a particular SQL server and I hope that you can shed some light on it. During the day (busy times) the log backup process fails with a server connection error. (the dreaded ...

7 followers 18 comments 0 votes
