程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
 程式師世界 >> 編程語言 >> .NET網頁編程 >> C# >> C#入門知識 >> Retrieving failed records after an SqlBulkCopy exception

Retrieving failed records after an SqlBulkCopy exception

編輯:C#入門知識

Let me start by saying that the idea I used in this article is not originally mine, but since I have only heard of it and have not been able to find any actual examples of its implementation anywhere I wrote the code to handle it.

With that out of the way - here's what this is about: Anyone who's worked with .NET's SqlBulkCopy class knows how fast and powerful it is. It beats other mechanisms for pumping large quantities of data into an SQL Server database by huge factors, and one of the reasons it is so fast is that it does not log anything it does.

The lack of logging definitely speeds things up, but when you are pumping hundreds of thousands of rows and suddenly have a failure on one of them because of a constraint, you're stuck. All the SqlException will tell you is that something went wrong with a given constraint (you'll get the constraint's name at least), but that's about it. You're then stuck having to go back to your source, run separate SELECT statements on it (or do manual searches), and find the culprit rows on your own.

On top of that, it can be a very long and iterative process if you've got data with several potential failures in it becauseSqlBulkCopy will stop as soon as the first failure is hit. Once you correct that one, you need to rerun the load to find the second error, etc.

The approach described in this article has the following advantages: 

  • Reports all possible errors that the SqlBulkCopy would encounter
  • Reports all culprit data rows, along with the exception that row would be causing
  • The entire thing is run in a transaction that is rolled back at the end, so no changes are committed.

... and disadvantages:

  • For extremely large amounts of data it might take a couple of minutes.
  • This solution is reactive; i.e. the errors are not returned as part of the exception raised by yourSqlBulkCopy.WriteToServer() process. Instead, this helper method is executed after the exception is raised to try and capture all possible errors along with their related data. This means that in case of an exception, your process will take longer to run than just running the bulk copy.
  • You cannot reuse the same DataReader object from the failed SqlBulkCopy, as readers are forward only fire hoses that cannot be reset. You'll need to create a new reader of the same type (e.g. re-issue the originalSqlCommand, recreate the reader based on the same DataTable, etc).

Background

The main idea is quite simple. Rerun the bulk copy, but only process one row at a time. As the rows are processed, capture the individual exceptions that copying them raises (if any) and add both the message and the row's data to an incremental message, but don't stop copying the data to the server. When all is said and done, your final error message is a nice log showing all the issues and the data that caused them. From that point it's easy to go back to the source, find those records, fix the issues and then reissue the bulk copy.

Using the code

It's important to note that not all failures on a bulk copy happen because of data. You might have connectivity issues, authentication failures, timeouts, etc. None of these cases would be explained by your data, so there's no point in calling this helper method if this is your case. You need to take this into account when calling the helper method, and only call it for specific types of exceptions (the sample code below takes care of this).

Also consider that the exception you're catching may not necessarily be the one raised by SqlServer and could be contained within an inner Exception. So if you plan on calling the helper method only if a data-related issue occurred, the exception (and all inner exceptions) needs to be inspected for this. The sample code below takes care of this, even though the Exception is coming directly from the server; in your case, you might be handling it at a higher level after the it has been wrapped in other exceptions.

Test bulk copy method

TestMethod() below is a simple method that sets up for a bulk copy operation and encloses it in a try/catch block. It is this bulk copy that supposedly fails because of some data issue, so within the catch block we then check the exception (and all inner exceptions) for a message containing the word "constraint" (which is apparently the only way to find a constraint failure, as all exceptions from SqlServer are of type SqlException). If such an exception message is found, we call GetBulkCopyFailedData() in order to get the failed rows. This latter method would ideally reside in a separate helper-type class.

Granted, this checking could have been done within the helper, but I was trying to keep it generic enough so that would show all exceptions and not assume what the caller wanted to filter out.

private void TestMethod()
{
   // new code
   SqlConnection connection = null;
   SqlBulkCopy bulkCopy = null;
   
   DataTable dataTable = new DataTable();
   // load some sample data into the DataTable
   IDataReader reader = dataTable.CreateDataReader();
 
   try 
   {
      connection = new SqlConnection("connection string goes here ...");
      connection.Open();
      bulkCopy = new SqlBulkCopy(connection); 
      bulkCopy.DestinationTableName = "Destination table name";
      bulkCopy.WriteToServer(reader);
   }
   catch (Exception exception)
   {
      // loop through all inner exceptions to see if any relate to a constraint failure
      bool dataExceptionFound = false;
      Exception tmpException = exception;
      while (tmpException != null)
      {
         if (tmpException is SqlException
            && tmpException.Message.Contains("constraint"))
         {
            dataExceptionFound = true;
            break;
         }
         tmpException = tmpException.InnerException;
      }

      if (dataExceptionFound)
      {
         // call the helper method to document the errors and invalid data
         string errorMessage = GetBulkCopyFailedData(
            connection.ConnectionString,
            bulkCopy.DestinationTableName,
            dataTable.CreateDataReader());
         throw new Exception(errorMessage, exception);
      }
   }
   finally
   {
      if (connection != null && connection.State == ConnectionState.Open)
      {
         connection.Close();
      }
   }
}

  

Documenting the errors and faulty data rows

GetBulkCopyFailedData() then opens a new connection to the database, creates a transaction, and begins bulk copying the data one row at a time. It does so by reading through the supplied DataReader and copying each row into an empty DataTable. The DataTable is then bulk copied into the destination database, and any exceptions resulting from this are caught, documented (along with the DataRow that caused it), and the cycle then repeats itself with the next row.

At the end of the DataReader we rollback the transaction and return the complete error message. Fixing the problems in the data source should now be a breeze.

 

View Code

Conclusion

I've certainly wasted more than enough time trying to figure out what was wrong with my data because the bulk copy operation wouldn't help me out there, so I hope this helps avoid wasted time for someone else as well.

As always - comments, questions and suggestions are always welcome. And please don't forget to vote! 

refer to:http://www.codeproject.com/Articles/387465/Retrieving-failed-records-after-an-SqlBulkCopy-exc

  1. 上一頁:
  2. 下一頁:
Copyright © 程式師世界 All Rights Reserved