Linked Server performance (2)

Okay, so I was able to devote some time the other day to this. After some digging and noodling, I decided to set up the profiler on both the source and the remote server.

After firing the query the stored procedure runs, I watched the profiler. It ran as I expected with two queries for the remote server as it broke down the joins.

I then fired the stored procedure, and while it was running, I was surprised to see the profiler firing over and over with the same query off the same tables. That lead me to think that we had an issue with a cached execution plan, where the optimizer thought the best results would be gotten by running the query on the large table, and then running a loop on the remote query for each row.

So I queried the plan cache and in the where clause checked for the name of the stored procedure. I found two.

SELECT plan_handle, st.text
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS st
WHERE text LIKE N'%events_by_machine%';

After pulling the xml, and viewing the graphical plans, they confirmed my suspicions that I was facing a bad plan. The question is why did it start and what could I do about it? Parameter peeking is probably an issue here. My guess is that since we’re querying the large table based on a sequential ID, if the stored procedure is first fired off with a very recent ID, then we’re probably going to have the optimizer decide that perhaps looping through the few rows in the large table to the remote server will result in a fast plan. However, when running where we go back in time for an ID that’s a few million rows back in the table, the optimizer decides that pulling the small remote tables and looping against the large table is better.

Using Plan Explorer from SQL Sentry , I could see the parameters that were used to create both plans, and it also confirmed that the “bad” plan had only one row being returned from the large table in the estimated rows of the plan, and the “good” plan had a number of rows being returned.

Oh, and I’ll mention that this happened the day after rebuilding indexes and about an hour after statistics were updated, so I think the estimated plan information would have been pretty accurate.

First steps we’re taking are to hide the parameters by using local parameters inside the query of the stored procedure. Next step will be increase the statistics updates on this large table, which has a growth rate of 10 million rows / month. We’ll also work on possibly forcing a plan, and ultimately rework the architecture to remove the need for a linked server. Whether that includes code changes that allow us to physically move the needed tables to the other server or setting up replication on those articles is to be determined.

Linked Server performance

Ok, this is an interesting issue…

We have, to quickly summarize, an architecture that was developed before I joined… the team moved from a single SQL Server database with 12 DB’s on it, to two DB Servers, and separated the db’s loosely into data processing and data presentation. Unfortunately there is one process that crossed those boundaries and has to do a join query to a database on one server and insert the results after the business logic is processed onto the other server. Both servers are SQL Server 2005 SP3.

Right now, we’re working from the presentation server, where the data is inserted, and have a link to the processing server where a large table is joined to a small table, and also joined to a couple of tables on the presentation server.

We have a SP that holds the SELECT and join. Often times this runs fine, and returns data in an acceptable time considering the large table contains 80 million rows. However, after a time, it seems that the performance drops massively, and even times out the default settings for the linked server connection (10 minutes). When that happens, we actually have to drop and recreate the stored procedure. Yes, setting WITH RECOMPILE on a call to the SP does not help. Setting WITH RECOMPILE inside the SP does not help. The odd thing is that when this happens and the SP query return nosedives, I can pull the query out of the SP, run it with the same parameters, and BAM! 7 seconds later I get the results as expected. We’ve even tried replacing the parameters with local parameters, worried about parameter sniffing – which shouldn’t really matter if we force a RECOMPILE.

So, we’re working out some other options, such as switching the flow, and the linked server direction, since the tables on the Presentation DB Server are very small. The insert it larger, but the code can just connect directly to the server once the in memory processing is done.

I know there’s also issues if you use a login for the linked server that does not have permissions to get the statistics for the tables being queried, but two things tell me that’s not the issue… 1) I’ve since changed the login to be in the SysAdmin role, and 2) When I run just the query instead of executing the SP, the results are lightning fast… so I know the statistics are being used.

WOW, has anyone seen this before?

T-SQL Tuesday #015 – Automation in SQL Server

An interesting topic, challenge, and idea from TSQL Tuesdays here.

Here’s mine. I have a SQL Agent job that I install on every instance that I manage. It’s a simple job that only fires on SQL Agent Start up. It queries sys.databases for pertinent information regarding the databases (multiuser, online) etc. and emails it to an Exchange Distribution list in my company. I get an email every time a SQL Server Agent blinks off and then back on. The email lists each of the databases and tells me if they’re ready to support their application or not.

Here’s the script.

USE [msdb]

SELECT @ReturnCode = 0

IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'Database Maintenance' AND category_class=1)
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'Database Maintenance'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback


EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'Sql Server Agent Restart Notification', 
		@description=N'No description available.', 
		@category_name=N'Database Maintenance', 
		@owner_login_name=N'sa', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Check DB Status', 
		@os_run_priority=0, @subsystem=N'TSQL', 
		@command=N'DECLARE @msgSubject as varchar(100) 
DECLARE @servername as varchar(50)
DECLARE @statusQuery  varchar(500)
Set nocount ON
select @servername =@@servername
Set @msgSubject = ''DB STATUS on SERVER ''  + @servername
set @statusQuery= ''select distinct convert(varchar(35),name) as NAME, 
	convert(varchar(20),convert(sysname,DatabasePropertyEx(name,''''Status''''))) as [STATUS],
	convert(varchar(20),convert(sysname,DatabasePropertyEx(name,''''Updateability''''))) as UPDATEABLE,
	convert(varchar(20),convert(sysname,DatabasePropertyEx(name,''''UserAccess''''))) as ACCESSIBLE
	from master..sysdatabases  ''
EXEC  MSDB.dbo.SP_send_dbmail @profile_name = ''DBA Email'', @importance=''HIGH'', @subject=@msgSubject , @recipients='''', @query=@statusQuery', 
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'1', 
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
GOTO EndSave


So the big part in there is to pass sp_send_dbmail an @query with the results of a status query to sysdatabases.

SQL Server Merge Replication

I have had to set up Merge replication on a small custom database located in the US to a server in China. Our company’s WAN and uplinks at the branch offices leaves a LOT to be desired. Anyways, here’s the steps I took.

Publication database is on a Windows 2008 R2 Highly Available Cluster.
Databases are both on SQL Server 2008 R2 instances.
Create a clustered file share for the replication snapshot files to be located.
Create an AD service account to run the Merge Agent and Snapshot Agent.
Grant the appropriate permissions for the service account to the cluster share.
Run the create merge publication script (I’ll share later).
Log onto the remote server.
Create the database that will be the subscriber.
Run the subscription script (I’ll share later).
Back on the publisher DB, execute sp_addmergesubscription with the details needed.
On the subscriber, add the users and logins as needed and enjoy.

Since this is a small database, and there aren’t a lot of heavy hitting databases on either server, I decided to allow the publisher also act as the distributor. I can always break and change that if needed later on.

Smoked Pork Butt

We’re having my wife’s family over for the Superbowl so I decided that a 5 lb smoked pork shoulder would be a perfect pre-game dinner.

Picked up the butt from the butcher on Saturday. Rubbed in my special rub and wrapped back up tight and placed in the fridge by 2PM on Saturday.

Sunday morning:

8AM – Remove butt from fridge to rest on counter.
Soak applewood chips in water
Fire up smoker
8:30AM – Place wood chips in smoker tray.
Add butt to smoker

Microsoft DPM 2010

Wow, is this released, and are people using it? I figured I’d install this at home on a VM in my HyperV server to pre-test before doing the same at work. We’re entertaining the idea of dropping NetApp’s Snapdrive for Windows and Snapmanager for SQL on our primary datacenter filerheads, and I wanted to find a replacement or possibly an upgrade.

After 3 failed attempts to install, all I can say is “Sloppy”. The failures have been during DPM “Reporting Services configuration”. All the reports are deployed, and I can view the reports via the URL. But I get the “dreaded” 812 error saying some generic reporting services error ocurred. Advice on the web has to do with SSL and RS being configured to use HTTPS. Not in my case. Digging in the logs, I see this error pop up “Mojito error was: PasswordTooShort”. References on the web suggest that the password does not match the domain GPO policy. However, the password is the same that I’m using for my domain account, so forget that idea. No resolutions that I have found, one attempt from some MS guy was to try to do a net user /ADD. Guess what, that works without error. Great one MS.

Trying to install on Windows Server 2008 R2. Oh, and here’s another problem, probably a Server 2008 R2 problem. Try to reinstall, get an error that the database DPMDB on instance MSDPM2010 exists, and to delete prior to reinstall. Guess what, if you open a command window and try to bring up the DAC via sqlcmd -S localhost\MSDPM2010 -A you get “Login Failed for user that did the install”. Oh but forget it – you, open a command window “As Administrator” and it works fine. I absolutely hate running SQL Server on Windows Server 2008 R2.