Friday, March 23, 2012

OPEN XML vs. SQLXMLBulkLoad

Which approach is a faster, better solution to process XML data? I get XML
data from an external web service. To ballpark the general size, an average
data file would be approximately 64K when saved in UTF-8.
A) TEXT parameter/sp_xml_preparedocument/OPENXML
- Passing raw XML as a TEXT parameter into a stored procedure
- sp_xml_preparedocument to parse the XML into an XML handle.
- Process the data using T-SQL with OPENXML
B) SQLXMLBULKLOAD/staging table/stored procedure
- Save the XML to a flat file
- Use the COM object SQLXMLBULKLOAD to bulk load the XML into a staging
table.
- Call a stored procedure which processes the data in the staging table
using T-SQL.
Any insight into this would be greatly appreciated.I prefer option B. This allows me to bulkload (i.e. minimal log) and then
use tsql to do whatever data insert in set. Also, this allows me to hand of
some of the workload to the client (i.e. workstation that does xmlbulkload).
-oj
"AsaMonsey" <AsaMonsey@.discussions.microsoft.com> wrote in message
news:A44A0EB0-2C15-4477-8FD3-93C7379CF9E0@.microsoft.com...
> Which approach is a faster, better solution to process XML data? I get XML
> data from an external web service. To ballpark the general size, an
> average
> data file would be approximately 64K when saved in UTF-8.
> A) TEXT parameter/sp_xml_preparedocument/OPENXML
> - Passing raw XML as a TEXT parameter into a stored procedure
> - sp_xml_preparedocument to parse the XML into an XML handle.
> - Process the data using T-SQL with OPENXML
> B) SQLXMLBULKLOAD/staging table/stored procedure
> - Save the XML to a flat file
> - Use the COM object SQLXMLBULKLOAD to bulk load the XML into a staging
> table.
> - Call a stored procedure which processes the data in the staging table
> using T-SQL.
>
> Any insight into this would be greatly appreciated.|||Thanks oj,
Our performance benchmarking across 1000 files indicates that the bulk load
is about 60% faster.
I was wondering if someone could explain the technical reasons why option B
is faster.
"oj" wrote:

> I prefer option B. This allows me to bulkload (i.e. minimal log) and then
> use tsql to do whatever data insert in set. Also, this allows me to hand o
f
> some of the workload to the client (i.e. workstation that does xmlbulkload
).
> --
> -oj
>
> "AsaMonsey" <AsaMonsey@.discussions.microsoft.com> wrote in message
> news:A44A0EB0-2C15-4477-8FD3-93C7379CF9E0@.microsoft.com...
>
>|||This article should help explain some:
[url]http://msdn.microsoft.com/library/en-us/dnsql90/html/exchsqlxml.asp?frame=true[/ur
l]
<quote>
SQLXML Bulkload enables the loading of input XML into the relational
backend. Internally, it uses the SQL Server bcp process, and is the best
mechanism to efficiently upload large input XML into the server. It is
implemented as a COM object, and it uses SQLOLEDB providers.
</quote>
-oj
"AsaMonsey" <AsaMonsey@.discussions.microsoft.com> wrote in message
news:F989E8E7-9B24-482F-AC85-7D848D028D08@.microsoft.com...
> Thanks oj,
> Our performance benchmarking across 1000 files indicates that the bulk
> load
> is about 60% faster.
> I was wondering if someone could explain the technical reasons why option
> B
> is faster.
> "oj" wrote:
>

No comments:

Post a Comment