Utility Functions
=================
These functions provide validation, analysis, and formatting capabilities for XML documents.
Validation
----------
xml_valid
~~~~~~~~~
Check if XML is well-formed.
**Syntax:**
.. code-block:: sql
xml_valid(xml)
**Parameters:**
- ``xml`` (VARCHAR or XML): The XML content to validate
**Returns:** BOOLEAN - ``true`` if the XML is well-formed, ``false`` otherwise.
**Example:**
.. code-block:: sql
SELECT xml_valid('- Valid
'); -- true
SELECT xml_valid('- Invalid
'); -- false
xml_well_formed
~~~~~~~~~~~~~~~
Alias for ``xml_valid``.
**Syntax:**
.. code-block:: sql
xml_well_formed(xml)
xml_validate_schema
~~~~~~~~~~~~~~~~~~~
Validate XML against an XSD schema.
**Syntax:**
.. code-block:: sql
xml_validate_schema(xml, xsd)
**Parameters:**
- ``xml`` (VARCHAR): The XML content to validate
- ``xsd`` (VARCHAR): The XSD schema to validate against
**Returns:** BOOLEAN - ``true`` if the XML is valid according to the schema.
**Example:**
.. code-block:: sql
SELECT xml_validate_schema(
'John',
'
'
);
Analysis
--------
xml_stats
~~~~~~~~~
Get document statistics.
**Syntax:**
.. code-block:: sql
xml_stats(xml)
**Returns:** STRUCT with fields:
- ``element_count`` (INTEGER): Number of elements
- ``attribute_count`` (INTEGER): Number of attributes
- ``text_node_count`` (INTEGER): Number of text nodes
- ``comment_count`` (INTEGER): Number of comments
- ``max_depth`` (INTEGER): Maximum nesting depth
- ``namespace_count`` (INTEGER): Number of namespaces
**Example:**
.. code-block:: sql
SELECT xml_stats('- Text
');
-- Result: {element_count: 3, attribute_count: 2, text_node_count: 1, ...}
-- Access individual stats
SELECT (xml_stats(xml)).element_count FROM read_xml_objects('doc.xml');
xml_namespaces
~~~~~~~~~~~~~~
List all namespaces declared in an XML document.
**Syntax:**
.. code-block:: sql
xml_namespaces(xml)
**Returns:** LIST
**Example:**
.. code-block:: sql
SELECT xml_namespaces(
''
);
-- Result: [{prefix: "", uri: "http://default.com"}, {prefix: "ns", uri: "http://example.com"}]
xml_common_namespaces
~~~~~~~~~~~~~~~~~~~~~
Returns a MAP of well-known namespace prefixes to their URIs (XML, XHTML, SVG, RSS, Atom, SOAP, etc.).
**Syntax:**
.. code-block:: sql
xml_common_namespaces()
**Returns:** MAP(VARCHAR, VARCHAR) - Mapping of common prefixes to URIs.
**Example:**
.. code-block:: sql
SELECT xml_common_namespaces();
-- Result: {xml: "http://www.w3.org/XML/1998/namespace", xhtml: "http://www.w3.org/1999/xhtml", ...}
-- Use with XPath extraction
SELECT xml_extract_text(xml, '//atom:entry/atom:title', xml_common_namespaces());
xml_detect_prefixes
~~~~~~~~~~~~~~~~~~~
Detect namespace prefixes used in an XPath expression.
**Syntax:**
.. code-block:: sql
xml_detect_prefixes(xpath)
**Returns:** LIST(VARCHAR) - List of unique namespace prefixes found in the XPath.
**Example:**
.. code-block:: sql
SELECT xml_detect_prefixes('//gml:pos | //ns:item[@xlink:href]');
-- Result: ["gml", "ns", "xlink"]
xml_mock_namespaces
~~~~~~~~~~~~~~~~~~~
Create mock namespace URIs for a list of prefixes. Useful when namespace URIs are unknown or don't matter.
**Syntax:**
.. code-block:: sql
xml_mock_namespaces(prefixes)
**Parameters:**
- ``prefixes`` (LIST(VARCHAR)): List of namespace prefixes
**Returns:** MAP(VARCHAR, VARCHAR) - Mapping of prefixes to mock URIs (``urn:mock:prefix``).
**Example:**
.. code-block:: sql
SELECT xml_mock_namespaces(['gml', 'ns']);
-- Result: {gml: "urn:mock:gml", ns: "urn:mock:ns"}
-- Use with XPath when you don't know the actual namespace URI
SELECT xml_extract_text(
'1 2 3',
'//gml:pos',
xml_mock_namespaces(['gml'])
);
-- Result: ["1 2 3"]
-- Auto-detect and mock in one step
SELECT xml_extract_text(
xml,
xpath,
xml_mock_namespaces(xml_detect_prefixes(xpath))
);
xml_add_namespace_declarations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add namespace declarations to an XML document's root element.
**Syntax:**
.. code-block:: sql
xml_add_namespace_declarations(xml, namespaces)
**Parameters:**
- ``xml`` (VARCHAR): The XML content
- ``namespaces`` (MAP(VARCHAR, VARCHAR)): Map of prefix to URI pairs to add
**Returns:** VARCHAR - XML with namespace declarations added to the root element.
**Example:**
.. code-block:: sql
-- Add a single namespace declaration
SELECT xml_add_namespace_declarations(
'1 2',
MAP {'gml': 'http://www.opengis.net/gml'}
);
-- Result: 1 2
-- Fix undeclared prefixes using helper functions
SELECT xml_add_namespace_declarations(
xml,
xml_mock_namespaces(xml_detect_prefixes('//gml:pos'))
);
xml_lookup_namespace
~~~~~~~~~~~~~~~~~~~~
Look up the URI for a well-known namespace prefix.
**Syntax:**
.. code-block:: sql
xml_lookup_namespace(prefix)
**Parameters:**
- ``prefix`` (VARCHAR): The namespace prefix to look up (e.g., 'gml', 'svg', 'atom')
**Returns:** VARCHAR - The namespace URI, or NULL if the prefix is not recognized.
**Example:**
.. code-block:: sql
SELECT xml_lookup_namespace('gml');
-- Result: http://www.opengis.net/gml
SELECT xml_lookup_namespace('svg');
-- Result: http://www.w3.org/2000/svg
SELECT xml_lookup_namespace('atom');
-- Result: http://www.w3.org/2005/Atom
SELECT xml_lookup_namespace('unknown');
-- Result: NULL
-- Use with XPath extraction
SELECT xml_extract_text(
'1 2 3',
'//gml:pos',
MAP {'gml': xml_lookup_namespace('gml')}
);
xml_find_undefined_prefixes
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Find namespace prefixes used in an XPath expression that are not declared in the XML document.
**Syntax:**
.. code-block:: sql
xml_find_undefined_prefixes(xml, xpath)
**Parameters:**
- ``xml`` (VARCHAR): The XML content to check
- ``xpath`` (VARCHAR): The XPath expression containing namespace prefixes
**Returns:** VARCHAR[] - List of prefixes used in the XPath but not declared in the XML.
**Example:**
.. code-block:: sql
-- Find undefined prefixes
SELECT xml_find_undefined_prefixes(
'1 2',
'//gml:pos'
);
-- Result: [gml]
-- Document with declared namespace returns empty list
SELECT xml_find_undefined_prefixes(
'1 2',
'//gml:pos'
);
-- Result: []
-- Combine with xml_mock_namespaces for automatic namespace handling
WITH doc AS (SELECT 'value' AS xml)
SELECT xml_extract_text(
xml,
'//ns:item',
xml_mock_namespaces(xml_find_undefined_prefixes(xml, '//ns:item'))
) FROM doc;
Formatting
----------
xml_pretty_print
~~~~~~~~~~~~~~~~
Format XML with indentation for readability.
**Syntax:**
.. code-block:: sql
xml_pretty_print(xml)
**Returns:** VARCHAR - Formatted XML string.
**Example:**
.. code-block:: sql
SELECT xml_pretty_print('- Test
');
-- Result:
--
-- -
-- Test
--
--
xml_minify
~~~~~~~~~~
Remove whitespace from XML.
**Syntax:**
.. code-block:: sql
xml_minify(xml)
**Returns:** VARCHAR - Minified XML string with no unnecessary whitespace.
**Example:**
.. code-block:: sql
SELECT xml_minify('
-
Product
');
-- Result: - Product
xml_wrap_fragment
~~~~~~~~~~~~~~~~~
Wrap an XML fragment with a root element.
**Syntax:**
.. code-block:: sql
xml_wrap_fragment(fragment, wrapper)
**Parameters:**
- ``fragment`` (VARCHAR): XML fragment to wrap
- ``wrapper`` (VARCHAR): Name of the wrapper element
**Returns:** VARCHAR - Wrapped XML.
**Example:**
.. code-block:: sql
SELECT xml_wrap_fragment('- A
- B
', 'items');
-- Result: - A
- B
System Information
------------------
xml_libxml2_version
~~~~~~~~~~~~~~~~~~~
Get libxml2 version information.
**Syntax:**
.. code-block:: sql
xml_libxml2_version(component)
**Parameters:**
- ``component`` (VARCHAR): Component name (e.g., ``'xml'``)
**Returns:** VARCHAR - Version string.
**Example:**
.. code-block:: sql
SELECT xml_libxml2_version('xml');
-- Result: "2.12.6" (or similar version string)