You might have reached the threshold of a SharePoint List and you’re afraid to increase the limit as you might affect the performance of you server. Now you are thinking of other solutions and one thing came to mind, why not archive the data? Well that is a good idea but how are you going to achieve this? well there are different ways and here are some suggestions:
- Export it manually to Excel and save the Excel document elsewhere – Caveman’s way
- Create an identical list and save it there – So you create a workflow to do this but you still reach the threshold for the archive list, do you create another identical list every exceeded threshold limit?
- You use a third-party open source data connector for SSIS – now your going somewhere but what if it has bugs? what if newer versions of SharePoint is not supported?
- Implement a retention policy and delete old data – too harsh for your users.
- Create a new solution! – You’re brave but at least you can freely manipulate what you want for your solution and that’s what we are going to do today, make our own solution in SSIS without any third-party dependence apart from what you already have, Visual Studio, SQL Server and Sharepoint.
Lets start and create a solution that will migrate List Items in a SharePoint List to SQL Server using SharePoint Web Services and SSIS.
Lets assume you have a List named Test on your SharePoint instance which contains the following fields on the image below.
Go to and create a view not using a datasheet format. This will be the data interface for your SSIS, once created go below and you will see the Web address for mobile view, get the ListGUID and ViewGUID, keep this as you will be using them later.
Now create a new Business Intelligence – Integration Services Project. Lets call it ListArchive. In case you don’t have the option in your Visual Studio 2012 download it here.
Now on your Control Flow canvas drag a Script Task
Create a variable, we only need one for this example. Name it OutputXML with data type string
Double click you Script Task then indicate the variable you created in the ReadWriteVariables of the script
Now add a Service Reference to your SharePoint instance. Right click on your Project References and choose Add Service Reference
When the window open, go to Advanced
Then to Add Web Reference
Now point your Web Reference to URL to the List web service (other services are explained here), in this example we go to http://yoursharepointserver.com/sandbox/_vti_bin/Lists.asmx. click the arrow beside the URL address bar then your service methods should show below. Give it a Web Reference name then click Add Reference.
That’s all you need for now so lets start coding. Copy and paste this on your ScriptMain.cs replacing whats contained in Main(). Do it now and I will explain the code below, it is commented anyways so its easy to understand.
public void Main() { // Instansiate Sharepoint Web Service var sharepointList = new sharepointListWebService.Lists(); sharepointList.UseDefaultCredentials = true; sharepointList.Url = "http://yoursharepointserver.com/sandbox/_vti_bin/lists.asmx"; // Your GUIDs var listGuid = "{cb6aedb3-d549-45a0-b3c5-e831cb0b51d0}"; var viewGuid = "{dfbcb0dc-a21f-4f71-94e8-b0a1adbd717a}"; // Get The List Items based on a View // You can put a null on the rowLimit parameter and this will get the default items limited by your view var outputNode = sharepointList.GetListItems(listGuid, viewGuid, null, null, "10000", null, null); var outputString = new System.Text.StringBuilder(); // Append additional XML Entries so you can create them Automatically using the Schema Generator // Declare z namespace on the <data> element outputString.Append("<data xmlns:z="#RowsetSchema">"); // removing z namespace from any child elements. outputString.Append(outputNode["rs:data"].InnerXml.Replace("<z:row ", "<row ")); outputString.Append("</data>"); var output = outputString.ToString(); // Output to file for debugging, you can remove this later or use them as additional backup file var xmlResultsDocument = new XmlDocument(); xmlResultsDocument.LoadXml(output); xmlResultsDocument.Save(@"C:UsersRaymundDesktopExport.xml"); // Perform Delete DeleteListItems(output, listGuid, sharepointList); // Save result to variables Dts.Variables["User::OutputXML"].Value = output; Dts.TaskResult = (int)ScriptResults.Success; } private void DeleteListItems(string listContents, string listGuid, sharepointListWebService.Lists sharepointList) { var xmlResultsDocument = new XmlDocument(); xmlResultsDocument.LoadXml(listContents); var xmlNamespace = new XmlNamespaceManager(xmlResultsDocument.NameTable); // Get all the rows XmlNodeList rows = xmlResultsDocument.SelectNodes("//row", xmlNamespace); if (rows.Count != 0) { foreach (XmlNode row in rows) { // Create an XmlDocument then construct the Batch element together with its attributes System.Xml.XmlDocument doc = new System.Xml.XmlDocument(); System.Xml.XmlElement batchElement = doc.CreateElement("Batch"); batchElement.SetAttribute("OnError", "Continue"); batchElement.SetAttribute("ListVersion", "1"); // Specify methods for the batch post using CAML int id = int.Parse(row.Attributes["ows_ID"].Value); batchElement.InnerXml = "<Method ID='1' Cmd='Delete'><Field Name='ID'>" + id + "</Field></Method>"; // Delete your List Item sharepointList.UpdateListItems(listGuid, batchElement); } } }
First is you need to instansiate that sharepoint webservice and give it credentials to run on, you might want to change this on a live environment.
The two GUIDs that you kept a while ago will be used here, you can do that programatically but that’s not the focus of this guide.
Then execute the GetListItems given your ListID and ViewID, it will presented in an XML format. That result will then be passed to the variable you created earlier.
Then we delete the items since it will be saved on the SQL Server anyway, that’s the whole point of archiving isn’t it? while may might argue why I am doing it in here and not after saving, well I know the fact and I will just include the codes in one place to easily explain things. In real scenario you need a lot of try catches here and perform only the delete once the data is safely store on the database.
Now run your code and get that XML Export, you need this to create a schema. Once it runs successfully, open that exported XML file in your Visual Studio, go to XML tab then choose Create Schema/
This then creates a schema for you with XSD extension, save it in your project folder.
Now going back to your Control Flow canvas, drag a Data Flow Task, your’e now ready to perform a data dump from your XML variable to SQL Server.
Now before you continue make sure you have the right tables sorted already in SQL Server, I suggest to name the fields exactly the same as the fields you get in the schema generated, take note of data types as well.
Once you got that sorted, go to the Data Flow Tab then drag an XML Source and an ADO NET Destination, connect them together.
Configure your XML Source with the following. Data access mode should come from the variable and XSD will be the one that was autogenerated earlier.
Now go to columns to confirm, it will warn you of the maximum length. We ignore that first but again on live scenario you need to give this the proper data types and associated lengths.
Check if you got all Columns you need.
Then lets Ignore any failure, just for this demo. If we don’t do this there will be a lot of truncation error specially if you have long column lengths.
XML Source is finished. Now lets configure the destination, create a new connection manager.
Choose ADO.NET
Then point to your database.
Then the necessary table.
Click mappings, if you have named the database table fields the same as the XSD then it would automap beautifully, otherwise you will manually do it.
Now you will see the red x icon disappears on ADO NET Destination but still on the XML Source, this is because it validates the External Metadata from the XML and you don’t have it as it is from a variable populated on run time. Just disable that checking on the Properties of your XML Source.
Run your project, then you’re all good to go.
To confirm check your database if the data is there.
Also check your SharePoint list, data should be already deleted at this moment.
This is tested to work using Sharepoint 2013, SQL Server 2012 and Visual Studio 2012.
Pingback: Get Sharepoint List Access from SQL Server - elbsolutions.com Project List & Blog
You have mentioned “Now add a Service Reference to your SharePoint instance. Right click on your Project References and choose Add Service Reference” But you have not said if this is a new Visual Studio project. Is this a new WCF project?
Its a script task in SSIS which opens up Visual Studio so you can code on .NET
Thanks for your response. It helped. After running the code, I am getting an error for the line outputString.Append(“”);
Error says “preprocessor directives must appear as the first non-whitespace character on a line”
Any idea how I can fix it?
Is that your own code? outputString.Append(“”);
Why not try outputString.AppendLine();
Also are there any preprocessor on your strings? Can you send your output string?
I got all the errors fixed. I setup everything as per your steps. In the XML source editor when I set error output to ignore failure for error or truncation, it shows me a yellow warning triangle on the XML source saying “No Rows will be set to error output. Configure error or truncation dispositions to redirect rows”.
Even after the warning if I run the project, it finishes successfully but does not load anything in the database. The output message says “”ADO NET Destination” wrote 0 rows.” I also get some runtime warnings like this “The output column “ows_TFS” (151) on output “row” (107) and component “XML Source” (99) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.”
My script to export an xml is working fine. When I run the code, it generates an xml from which I generated the schema. My database connection is fine. I tested the connection and it was successful. The mapping for columns also went smoothly.
Any idea why rows are not getting written to the table? (FYI I have not included the code for deleting the data from the SharePoint site. I am simply extracting data and loading it in sql server)
Thanks much!
Are you running this on your dev machine or SSIS already?
Hi Raymund, this is a great tutorial but I need some clarity. What type of application did you create before creating this SSIS project? There’s no “Add Reference” in the SSIS project. Are you working on a Console Application or WCF Application or a SharePoint WebPart Application? What do you mean by ” SharePoint instance”?
Its just and SSIS Project the Add Service Reference comes from the Script Task when you edit it. That will fire another instance of Visual Studio so you can start coding in C#
Chachi how did you fix that error “preprocessor directives must appear as the first non-whitespace character on a line”? outputString.Append(“”);
Is this blog still being monitored? I am facing some issues running the code, hence needed some guidance.
I am stuck at the step where I need to run the script task to generate the XML output. I get an error at the line “var outputNode = sharepointList.GetListItems(listGuid, viewGuid, null, null, “10000”, null, null);” . The error says sharepoint access is denied. (exception from hresult: 0x80070005 (e_accessdenied)). Any idea how I can solve it? I am not using any proxy.