How to Archive SharePoint List Items to SQL Server

By | November 1, 2013

You might have reached the threshold of a SharePoint List and you’re afraid to increase the limit as you might affect the performance of you server.  Now you are thinking of other solutions and one thing came to mind, why not archive the data? Well that is a good idea but how are you going to achieve this? well there are different ways and here are some suggestions:

  1. Export it manually to Excel and save the Excel document elsewhere – Caveman’s way
  2. Create an identical list and save it there – So you create a workflow to do this but you still reach the threshold for the archive list, do you create another identical list every exceeded threshold limit?
  3. You use a third-party open source data connector for SSIS – now your going somewhere but what if it has bugs? what if newer versions of SharePoint is not supported?
  4. Implement a retention policy and delete old data – too harsh for your users.
  5. Create a new solution! – You’re brave but at least you can freely manipulate what you want for your solution and that’s what we are going to do today, make our own solution in SSIS without any third-party dependence apart from what you already have, Visual Studio, SQL Server and Sharepoint.

Lets start and create a solution that will migrate List Items in a SharePoint List to SQL Server using SharePoint Web Services and SSIS.

Lets assume you have a List named Test on your SharePoint instance which contains the following fields on the image below.

01 Test List

Go to and create a view not using a datasheet format.  This will be the data interface for your SSIS, once created go below and you will see the Web address for mobile view, get the ListGUID and ViewGUID, keep this as you will be using them later.

02 Get List and View ID

Now create a new Business Intelligence – Integration Services Project.  Lets call it ListArchive.  In case you don’t have the option in your Visual Studio 2012 download it here.

03 Create a New BI Project

Now on your Control Flow canvas drag a Script Task

04 Script Task

Create a variable, we only need one for this example.  Name it OutputXML with data type string

05 Create Variables

Double click you Script Task then indicate the variable you created in the ReadWriteVariables of the script

06 Use Variable

Now add a Service Reference to your SharePoint instance.  Right click on your Project References and choose Add Service Reference

07 Add Web Service

When the window open, go to Advanced

08 Advanced

Then to Add Web Reference

09 Web Reference

Now point your Web Reference to URL to the List web service (other services are explained here), in this example we go to http://yoursharepointserver.com/sandbox/_vti_bin/Lists.asmx. click the arrow beside the URL address bar then your service methods should show below.  Give it a Web Reference name then click Add Reference.

10 Add Reference

That’s all you need for now so lets start coding.  Copy and paste this on your ScriptMain.cs replacing whats contained in Main(). Do it now and I will explain the code below, it is commented anyways so its easy to understand.

public void Main()
{
    // Instansiate Sharepoint Web Service
    var sharepointList = new sharepointListWebService.Lists();
    sharepointList.UseDefaultCredentials = true;
    sharepointList.Url = "http://yoursharepointserver.com/sandbox/_vti_bin/lists.asmx";

    // Your GUIDs
    var listGuid = "{cb6aedb3-d549-45a0-b3c5-e831cb0b51d0}";
    var viewGuid = "{dfbcb0dc-a21f-4f71-94e8-b0a1adbd717a}";

    // Get The List Items based on a View
    // You can put a null on the rowLimit parameter and this will get the default items limited by your view
    var outputNode = sharepointList.GetListItems(listGuid, viewGuid, nullnull"10000"nullnull);

    var outputString = new System.Text.StringBuilder();

    // Append additional XML Entries so you can create them Automatically using the Schema Generator
    // Declare z namespace on the <data> element 
    outputString.Append("<data xmlns:z="#RowsetSchema">");

    // removing z namespace from any child elements.
    outputString.Append(outputNode["rs:data"].InnerXml.Replace("<z:row ""<row "));
    outputString.Append("</data>");

    var output = outputString.ToString();

    // Output to file for debugging, you can remove this later or use them as additional backup file
    var xmlResultsDocument = new XmlDocument();
    xmlResultsDocument.LoadXml(output);
    xmlResultsDocument.Save(@"C:UsersRaymundDesktopExport.xml");

    // Perform Delete
    DeleteListItems(output, listGuid, sharepointList);

    // Save result to variables
    Dts.Variables["User::OutputXML"].Value = output;

    Dts.TaskResult = (int)ScriptResults.Success;
}

private void DeleteListItems(string listContents, string listGuid, sharepointListWebService.Lists sharepointList)
{
    var xmlResultsDocument = new XmlDocument();
    xmlResultsDocument.LoadXml(listContents);

    var xmlNamespace = new XmlNamespaceManager(xmlResultsDocument.NameTable);

    // Get all the rows
    XmlNodeList rows = xmlResultsDocument.SelectNodes("//row", xmlNamespace);

    if (rows.Count != 0)
    {
        foreach (XmlNode row in rows)
        {
            // Create an XmlDocument then construct the Batch element together with its attributes
            System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
            System.Xml.XmlElement batchElement = doc.CreateElement("Batch");
            batchElement.SetAttribute("OnError""Continue");
            batchElement.SetAttribute("ListVersion""1");

            // Specify methods for the batch post using CAML
            int id = int.Parse(row.Attributes["ows_ID"].Value);
            batchElement.InnerXml = "<Method ID='1' Cmd='Delete'><Field Name='ID'>" + id + "</Field></Method>";            // Delete your List Item
            sharepointList.UpdateListItems(listGuid, batchElement);
        }
    }
}

First is you need to instansiate that sharepoint webservice and give it credentials to run on, you might want to change this on a live environment.

The two GUIDs that you kept a while ago will be used here, you can do that programatically but that’s not the focus of this guide.

Then execute the GetListItems given your ListID and ViewID, it will presented in an XML format.  That result will then be passed to the variable you created earlier.

Then we delete the items since it will be saved on the SQL Server anyway, that’s the whole point of archiving isn’t it? while may might argue why I am doing it in here and not after saving, well I know the fact and I will just include the codes in one place to easily explain things.  In real scenario you need a lot of try catches here and perform only the delete once the data is safely store on the database.

Now run your code and get that XML Export, you need this to create a schema.  Once it runs successfully, open that exported XML file in your Visual Studio, go to XML tab then choose Create Schema/

11 Create Schema

This then creates a schema for you with XSD extension, save it in your project folder.

12 Schema Created

Now going back to your Control Flow canvas, drag a Data Flow Task, your’e now ready to perform a data dump from your XML variable to SQL Server.

13 Create Dataflow Task

Now before you continue make sure you have the right tables sorted already in SQL Server, I suggest to name the fields exactly the same as the fields you get in the schema generated, take note of data types as well.

Once you got that sorted, go to the Data Flow Tab then drag an XML Source and an ADO NET Destination, connect them together.

14 Data Flow

Configure your XML Source with the following.  Data access mode should come from the variable and XSD will be the one that was autogenerated earlier.

15 Set Parameters

Now go to columns to confirm, it will warn you of the maximum length.  We ignore that first but again on live scenario you need to give this the proper data types and associated lengths.

16 warning

Check if you got all Columns you need.

17 Columns

Then lets Ignore any failure, just for this demo. If we don’t do this there will be a lot of truncation error specially if  you have long column lengths.

18 Ignore Truncate

XML Source is finished.  Now lets configure the destination, create a new connection manager.

19 New Connection

Choose ADO.NET

20 ADO.Net Connection

Then point to your database.

21 My Database

Then the necessary table.

22 Connect

Click mappings, if you have named the database table fields the same as the XSD then it would automap beautifully, otherwise you will manually do it.

23 Mapping

Now you will see the red x icon disappears on ADO NET Destination but still on the XML Source, this is because it validates the External Metadata from the XML and you don’t have it as it is from a variable populated on run time.  Just disable that checking on the Properties of your XML Source.

24 Validate Data

Run your project, then you’re all good to go.

25 Success

To confirm check your database if the data is there.

26 SQL data

Also check your SharePoint list, data should be already deleted at this moment.

This is tested to work using Sharepoint 2013, SQL Server 2012 and Visual Studio 2012.

Recommended

11 thoughts on “How to Archive SharePoint List Items to SQL Server

  1. Pingback: Get Sharepoint List Access from SQL Server - elbsolutions.com Project List & Blog

  2. Chahchi

    You have mentioned “Now add a Service Reference to your SharePoint instance. Right click on your Project References and choose Add Service Reference” But you have not said if this is a new Visual Studio project. Is this a new WCF project?

    Reply
      1. Chahchi

        Thanks for your response. It helped. After running the code, I am getting an error for the line outputString.Append(“”);

        Error says “preprocessor directives must appear as the first non-whitespace character on a line”

        Any idea how I can fix it?

        Reply
        1. Raymund Macaalay Post author

          Is that your own code? outputString.Append(“”);
          Why not try outputString.AppendLine();

          Also are there any preprocessor on your strings? Can you send your output string?

          Reply
          1. Chahchi

            I got all the errors fixed. I setup everything as per your steps. In the XML source editor when I set error output to ignore failure for error or truncation, it shows me a yellow warning triangle on the XML source saying “No Rows will be set to error output. Configure error or truncation dispositions to redirect rows”.

            Even after the warning if I run the project, it finishes successfully but does not load anything in the database. The output message says “”ADO NET Destination” wrote 0 rows.” I also get some runtime warnings like this “The output column “ows_TFS” (151) on output “row” (107) and component “XML Source” (99) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.”

            My script to export an xml is working fine. When I run the code, it generates an xml from which I generated the schema. My database connection is fine. I tested the connection and it was successful. The mapping for columns also went smoothly.

            Any idea why rows are not getting written to the table? (FYI I have not included the code for deleting the data from the SharePoint site. I am simply extracting data and loading it in sql server)

            Thanks much!

  3. Siya

    Hi Raymund, this is a great tutorial but I need some clarity. What type of application did you create before creating this SSIS project? There’s no “Add Reference” in the SSIS project. Are you working on a Console Application or WCF Application or a SharePoint WebPart Application? What do you mean by ” SharePoint instance”?

    Reply
    1. Raymund Macaalay Post author

      Its just and SSIS Project the Add Service Reference comes from the Script Task when you edit it. That will fire another instance of Visual Studio so you can start coding in C#

      Reply
  4. Siya

    Chachi how did you fix that error “preprocessor directives must appear as the first non-whitespace character on a line”? outputString.Append(“”);

    Reply
  5. Dheeraj

    Is this blog still being monitored? I am facing some issues running the code, hence needed some guidance.

    I am stuck at the step where I need to run the script task to generate the XML output. I get an error at the line “var outputNode = sharepointList.GetListItems(listGuid, viewGuid, null, null, “10000”, null, null);” . The error says sharepoint access is denied. (exception from hresult: 0x80070005 (e_accessdenied)). Any idea how I can solve it? I am not using any proxy.

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.