Wednesday, March 28, 2012

How to save the " character in a CDATA section in SQL Server 2005

Dear Pals,

I know that special characters, such as <, in a CDATA section are converted properly into entity references before being stored in an XML field in the database.

However, it seems that the " (quotation mark) character does not get converted to &quot;. This would cause problems when the XML document is fetched.

For example, an XML document like:

<A><![CDATA[THis is "John"]]></A>

becomes this when stored:

<A>THis is "John"</A>

This of course causes problems for XML parsers.

Is there any cure for this problem?

Thanks

Feng-Hsu Wang

Feng-Hsu,

According to the XML spec:

The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings " &amp; " and " &lt; " respectively. The right angle bracket (>) may be represented using the string " &gt; ", and MUST, for compatibility, be escaped using either " &gt; " or a character reference when it appears in the string " ]]> " in content, when that string is not marking the end of a CDATA section.

In the content of elements, character data is any string of characters which does not contain the start-delimiter of any markup and does not include the CDATA-section-close delimiter, " ]]> ". In a CDATA section, character data is any string of characters not including the CDATA-section-close delimiter, " ]]> ".

To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as " &apos; ", and the double-quote character (") as " &quot; ".

As I understand the XML spec, single-quote and double-quote characters do not need to be escaped inside element content. They only need to be escaped inside an attribute value, and there only when they match the quote character that delimits the value.

So, this should not be causing any problems for XML parsers.
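To illustrate the difference between element content and attribute values, here is a minimal sketch in Python using the standard library's xml.sax.saxutils module. It is not SQL Server code, and the sample strings and variable names are made up for illustration. It only shows that a typical escaping routine leaves quotes alone in element content and deals with them only for attribute values.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample text containing quotes, an ampersand and a '<'.
text = 'THis is "John" & <friends>'

# Element content: escape() rewrites &, < and >, but leaves " untouched.
element = "<A>" + escape(text) + "</A>"

# Attribute value: quoteattr() adds the surrounding quotes itself and
# escapes a quote character only when it cannot avoid a clash.
both_quotes = 'It\'s "John"'
attribute = "<A name=" + quoteattr(both_quotes) + "/>"

print(element)
print(attribute)
```

In the element, the double quotes come through literally, which matches what the spec excerpt above allows.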

Jimmy Wu

|||

<A>THis is "John"</A>

is well-formed XML. That means it follows all the syntax rules for XML. If you are getting an error from an XML parser reading this, then that parser is broken.
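One quick way to check this outside the database is to feed both forms to a conforming parser. The following is a small sketch using Python's xml.etree.ElementTree, chosen here only as a convenient example parser; the variable names are illustrative. It assumes the stored form is exactly the string shown in the question.

```python
import xml.etree.ElementTree as ET

# The document as submitted, with a CDATA section.
original = '<A><![CDATA[THis is "John"]]></A>'

# The document as it comes back from the XML column.
stored = '<A>THis is "John"</A>'

# Both parse without error; a malformed document would raise ET.ParseError.
original_text = ET.fromstring(original).text
stored_text = ET.fromstring(stored).text

# The element content is the same in both cases.
print(original_text == stored_text)
```

Both strings parse cleanly and yield the same text, so the CDATA wrapper is only a different way of writing the same content.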

Dan
