Comments
Hi,
Something we're considering for the future is the ability to compare only certain columns, so you could define a row as being "equal" if a chosen subset of its columns matches. Would that be useful in your situation?
Other than that, I don't think there's any way of reducing the data we send over the network - if you're going to check it's equal, you need to get both copies to the same place somehow.
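If it helps to picture the column-subset idea, the effect would be roughly the following, with made-up table and column names:

-- Hypothetical sketch: rows are matched on the key (CustomerID) and treated as
-- "different" only when the chosen subset of columns (FirstName, LastName) differs.
-- NULL handling is omitted for brevity.
SELECT s.CustomerID
FROM SourceDB.dbo.Customers AS s
JOIN TargetDB.dbo.Customers AS t ON t.CustomerID = s.CustomerID
WHERE s.FirstName <> t.FirstName
   OR s.LastName <> t.LastName;  -- everything outside the subset is ignored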
Rob
Hi Ewan,
We won't create a third database; in fact, we don't write any data to the SQL Server during the comparison at all, only when (or if) you choose to run the synchronisation script. At that point, it's just INSERT / UPDATE / DELETE statements run against the target database, so you're still just using the original two databases.
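For a feel of what that script contains, it's just plain DML run against the target database - along these lines (the table and values here are invented purely for illustration):

-- Example of the kind of statements a synchronisation script runs on the target.
UPDATE dbo.Customers SET Email = 'new@example.com' WHERE CustomerID = 42;
INSERT INTO dbo.Customers (CustomerID, Email) VALUES (43, 'added@example.com');
DELETE FROM dbo.Customers WHERE CustomerID = 44;
-- No other database is created or written to at any point.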
Hope that helps,
Robert
Hi,
Data Compare will work with TB-sized databases, but I think you are going to hit the fundamental problem that reading that amount of data, in any way, is going to take a long time.
Linked Servers aren't going to help you either, I'm afraid - all they'd do is effectively redirect all the traffic from server B through server A before it gets to Data Compare, so that would make things slower if anything.
You could probably improve performance somewhat by running SDC directly on one of the servers, so that you're only pulling half the data over the network (i.e. from one of the servers, rather than both).
Another possibility: if both servers are on the same SAN, and you're also doing backups to that SAN, you could run SDC against the local live server and against a backup of the "remote" server that's stored on a "local" (i.e. much higher bandwidth than Ethernet) SAN disk.
However, even then, I don't think it'll be a quick process on that much data!
I think your best bet is definitely to go for as restrictive a WHERE clause as you can, at which point a much smaller number of rows actually need to be compared.
Rob
Hi again,
Another thought (courtesy of RichardJM) - if you know something more about the data that's likely to be new / changed, such as a timestamp that gets updated when rows are modified, you could maybe use a WHERE clause to limit the number of rows that are compared, which would obviously speed things up.
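For instance, if your tables had a LastModified column of that sort (the column name is just an example), the WHERE clause you set on the table in the project could be as simple as:

-- Only rows touched in the last 24 hours are pulled over and compared.
LastModified >= DATEADD(day, -1, GETDATE())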
Rob
And another couple (you've got us all thinking about this now!)...
1) Try using the /ignoreidentical option. This should stop identical rows being written to disk during the comparison, so it will speed up that part of it. Obviously it won't help if the network / SQL Server is the bottleneck, though.
2) Are you using clustered indexes as the comparison keys? We retrieve the rows in the order defined by the comparison key, so if you use one that matches a clustered index, SQL Server will be able to return the rows much more rapidly.
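As a quick illustration of 2) - with hypothetical object names - if a table is being compared on OrderID, clustering on that column lets SQL Server hand the rows back in comparison-key order without an extra sort step:

-- Cluster the table on the column used as the comparison key.
CREATE CLUSTERED INDEX IX_Orders_OrderID ON dbo.Orders (OrderID);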
Rob
Hi,
At a really quick calculation, that doesn't seem too bad - pulling 160GB of data over the wire in two hours averages out at roughly 180-200Mbps (160GB is about 1.3 terabits spread over 7,200 seconds), which is about all you can hope for on Gigabit Ethernet to a single machine in my experience.
It's also possible that you're limited by hard disk speed on the machine performing the comparison. In that case, if it's a fast (in terms of CPU) machine, you could try enabling compression of temporary files. To do this, use /option:CompressTemporaryFiles on the command line.
Hope that helps,
Robert
Hi,
Are you using the command line version, or the GUI?
In the GUI, you can right-click a row in the grid, and choose "Exclude all different" and "Exclude all on left", which should do the trick.
In the command line, have a look at the /ignoredifferent and /ignoremissing options (you can get more details by running SQLDataCompare.exe /help /verbose).
Hope that helps,
Robert
Hi,
This is actually what we'd expect you to see. ANTS Profiler necessarily introduces some overhead when profiling your application, so we subtract our best guess at that overhead from the times we display in the results.
This generally means you get results closer to what you'd see when the application is running outside the profiler. However, in your example, your stopwatch measures ten seconds including the time spent in the profiler - in other words, the profiler is running for 2-5 seconds, and your application code is running for the 5-8 seconds you see in the results.
The exact overhead of the profiler varies, but very tight loops like you have there tend to be just about the worst case for it.
If you try profiling in fast mode, you'll probably see that the method as a whole does get reported as 10 seconds, or at least much closer to it.
Hope that helps,
Robert
Hi,
I'm guessing from what you describe that you've deselected the 20 non-key columns in the "columns in comparison" part of the project configuration? If so, that'll be why your inserts are only showing the key columns.
When we do a comparison, we only actually compare based on the key columns you select (shown with a key icon in the column list), so there's no need to un-check the other columns there. Un-checking them means we don't pull their data out of the database at all, which is why you're not seeing it in your insert statements.
The main reason for un-checking columns there would be if you didn't want to move certain fields over - for example, passwords from a live system that you don't want pushed into a QA server or similar.
Hope that helps,
Robert
Hi,
We've given this a go on a couple of Vista SP1 machines this morning, but haven't been able to reproduce it, unfortunately. Hopefully we can put a workaround into the next release, but I can't say when that will be at this time.
Regards,
Robert