1. Description
This driver allows querying an XML file based on XPath expressions.
2. Connection
2.1. Attributes
Name | Description | Example |
---|---|---|
id | ID of this ETL connection, used as a reference via connection-id in scripts and queries | id="xmlReader" |
driver | use "xpath" in this case | driver="xpath" |
url | XML file URL | see the "<!-- CONNECTIONS -->" section in ETL: XPath for example usage |
2.2. Parameters
Name | Description | Default |
---|---|---|
return_arrays | Value of |
|
3. Query
The XPath driver uses XPath syntax to query XML files.
The query is executed on an XML Document and produces a rowset for matched nodes. The attribute and element values can be referenced from nested scripts/queries. The following example illustrates the querying mechanism:
Example:
XPath: /A selects the root element <A>
<A B="1">
  <B>2</B>
  <C>3</C>
</A>
Available variables for matched element <A>:
Name | Value |
---|---|
A | 2 3 |
B | 1 |
C | 3 |
The value of variable A is the text content of the XML element <A>, the value of variable B is the value of the B attribute, and the value of variable C is the text content of the element <C>.
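As a minimal sketch of how these variables might be consumed, the job below runs /A against the sample document and logs the three values from a nested script. The file name sample.xml and the connection ids are illustrative assumptions; the log connection mirrors the one used in the Examples section.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
  <!-- Hypothetical file containing the sample <A B="1"> document -->
  <connection id="xmlReader" driver="xpath" url="sample.xml"></connection>
  <connection id="logInfo" driver="log">
    level=INFO
  </connection>
  <!-- The script runs once for the matched <A>; $B is the attribute value, $C the text of <C> -->
  <query connection-id="xmlReader">
    /A
    <script connection-id="logInfo">
      A=$A B=$B C=$C
    </script>
  </query>
</etl>
```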
The context node for the XPath query is the selected node of the nearest outer XPath query, or the document root node if there isn't one. This allows relative queries using the ./ prefix.
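The context-node rule could be sketched with a nested query as follows. This assumes the CD catalog file from the Examples section; the connection ids are illustrative. The inner expression ./TITLE is evaluated relative to each <CD> selected by the outer query, so $TITLE in the script should hold the text content of the matched <TITLE> element.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
  <connection id="xmlReader" driver="xpath" url="https://www.w3schools.com/xml/cd_catalog.xml"></connection>
  <connection id="logInfo" driver="log">
    level=INFO
  </connection>
  <!-- Outer query selects every <CD>; the inner query is resolved relative to the current <CD> -->
  <query connection-id="xmlReader">
    /CATALOG/CD
    <query connection-id="xmlReader">
      ./TITLE
      <script connection-id="logInfo">
        $TITLE
      </script>
    </query>
  </query>
</etl>
```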
A special node variable is available to nested queries/scripts; it provides extra methods for querying relative to the current node. The node variable is an instance of NodeVariable. Using ${node.getString("./C")} is equivalent to $C.
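A short sketch of the node variable in use, again assuming the CD catalog file and the illustrative connection ids from the Examples section. Only getString is shown, since it is the one NodeVariable method named in this document; by the equivalence stated above, the first expression should yield the same value as $TITLE.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
  <connection id="xmlReader" driver="xpath" url="https://www.w3schools.com/xml/cd_catalog.xml"></connection>
  <connection id="logInfo" driver="log">
    level=INFO
  </connection>
  <!-- node refers to the current <CD>; ./TITLE is evaluated relative to it -->
  <query connection-id="xmlReader">
    /CATALOG/CD
    <script connection-id="logInfo">
      ${node.getString("./TITLE")} - $ARTIST
    </script>
  </query>
</etl>
```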
Additional notes:
Currently only node sets can be selected in XPath expressions, i.e. attributes or elements, but not String, Boolean, or Number values.
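To illustrate the restriction, the sketch below uses a predicate, which still selects a node set and so should be accepted. It assumes the CD catalog file from the Examples section and that its <CD> elements contain a <COUNTRY> child; that element name is an assumption not confirmed by this document.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
  <connection id="xmlReader" driver="xpath" url="https://www.w3schools.com/xml/cd_catalog.xml"></connection>
  <connection id="logInfo" driver="log">
    level=INFO
  </connection>
  <!-- Selects a node set: <CD> elements filtered by a predicate -->
  <query connection-id="xmlReader">
    /CATALOG/CD[COUNTRY='UK']
    <script connection-id="logInfo">
      $TITLE - $ARTIST
    </script>
  </query>
  <!-- Not selectable per the note on node sets: expressions that evaluate to a Number or a String,
       e.g. count(/CATALOG/CD) or string(/CATALOG/CD[1]/TITLE) -->
</etl>
```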
4. Script
<script> elements are not supported by this driver.
5. Examples
Query example: Executes a query and writes every record from the result to the log at INFO level. The script element is nested in the query element to process every record of the result.
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
  <description>Graphlytic job</description>
  <properties>
    job_name=Graphlytic job
  </properties>
  <!-- CONNECTIONS -->
  <connection id="xmlReader" driver="xpath" url="https://www.w3schools.com/xml/cd_catalog.xml"></connection>
  <connection id="logInfo" driver="log">
    level=INFO
  </connection>
  <!-- JOB STEPS -->
  <query connection-id="xmlReader">
    /CATALOG/CD
    <script connection-id="logInfo">
      $rownum: $TITLE - $ARTIST
    </script>
  </query>
</etl>