Performance Issues

The trip to the remote server consists of a large overhead, not only in logging in to the remote server, but in passing the data back across the network. When I ran a SHOWPLAN performance analysis on the above query, it estimated that the trip to the remote server took up more than twice the resources it needed to access the data locally. When I padded out the authors table on both servers with some extra data and run the showplan again, the trip to the remote server then took three times as much resources.

If you pass back more data than you need from the remote server, you will decrease the speed of the query and possibly impact performance on the network as a whole. In extreme cases, your Network Manager may well come and shout at you!

Cutting down on traffic

Let's say for argument's sake that I "suspect" certain rows from the pubs table on the remote server are incorrect. I can improve performance only by bringing these suspect rows back across the network.

By simply coding the row IDs into the Where clause, SQL Server 2000 is smart enough to include the clause in the query it sends to the remote server, so only the rows I specify are sent back, not the entire table.

select r.au_fname as remote_fname, 
  r.au_lname as remote_lname, 
from  authors l 		
join  fuji_PUBS_DB.pubs.dbo.authors r
on    l.au_id = r.au_id
where (
      l.au_fname <> r.au_fname OR
      l.au_lname <> r.au_lname
and   r.au_id in (
      '267-41-2394' )

Note also that I chose to display in the result set the entire row from the local server, but only the au_fname and au_lname columns from the remote server. If I had used "select *" then SQL Server would have had to pull back all the columns, even though I was not particularly interested in looking at them here. SQL Server 2000 was smart enough to realize it only needed to pull back those two columns, plus the au_id column I used in the where statement--a neat bit of programming from the guys at MS!

You can check if your remote query is optimized in this using Query Analyser by displaying the execution plan for your SQL and hovering the cursor over the "remote query" icon.

Calling Stored Procedures

Stored procedures on the remote server can be executed easily, but first the server needs to be configured for RPC, which is just a one-line command:

sp_serveroption fuji_PUBS_DB, [rpc out], true

This only needs to be done once--we can now call stored procedures using the same 4-part naming style we used to reference tables:

exec fuji_PUBS_DB.master.dbo.sp_who2

Updating Remote Data

Updating tables on the linked server is not a problem either. Here is a simple example based on the select statement we used to join the authors tables on the remote servers:

UPDATE fuji_PUBS_DB.pubs.dbo.authors
SET   au_fname = r.au_fname
,     au_lname = r.au_lname
from  fuji_PUBS_DB.pubs.dbo.authors r
join  authors l
      on l.au_id = r.au_id
where (
      l.au_fname <> r.au_fname OR
      l.au_lname <> r.au_lname

With this UPDATE command there needs to be two separate trips made to the linked server: Trip one pulls the data from the linked server back into the local server, where the join takes place. The second trip performs the actual update once SQL Server decides which rows need updating.

More performance issues

In this example I have left out the list of "suspect" rows, but if I had left them in, the query performance would be better because the rows could be filtered at the linked server end, resulting in less data flying over the network, and less work to do in the join.

Like all update transactions, updates to linked servers are done in a transaction--and as the transaction will take longer than a purely "local" transaction, this means more potential for locking and blocking on the remote server, so distributed transactions like this need special care to ensure they are designed efficiently.

You can also wrap up multiple updates in a single distributed transaction--though obviously the concerns about resource utilization still apply, only more so. A full run-down on Distributed Transactions would probably double the size of this article though, so I will let MSDN take over on this issue!