1. Description

This driver allows querying an XML file based on XPath expressions.

2. Connection

2.1. Attributes

Name | Description | Example
id | Identifier of this ETL connection, used as a reference via connection-id in scripts and queries | id="xmlReader"
driver | Use "xpath" in this case | driver="xpath"
url | XML file URL | url="https://www.w3schools.com/xml/cd_catalog.xml"

See the "<!-- CONNECTIONS -->" section in ETL: XPath for example usage.

2.2. Parameters

Name | Description | Default
return_arrays | If true, variables return a string array; otherwise a single string is returned. | false
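A minimal sketch of how the parameter might be set, assuming driver parameters are passed as key=value lines in the body of the connection element, the same way level=INFO is passed to the log connection in the Examples section:

```xml
<!-- Assumption: parameters go in the connection body as key=value lines -->
<connection id="xmlReader" driver="xpath" url="https://www.w3schools.com/xml/cd_catalog.xml">
    return_arrays=true
</connection>
```

With this setting, a variable such as TITLE would be expected to hold a string array rather than a single string.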

3. Query

The XPath driver supports XPath syntax to query XML files.

The query is executed on an XML Document and produces a rowset for matched nodes. The attribute and element values can be referenced from nested scripts/queries. The following example illustrates the querying mechanism:

Example:

XPath: /A selects root element <A>

<A B="1">
<B>2</B>
<C>3</C>
</A>

Available variables for matched element <A>:

Name | Value
A | 2 3
B | 1
C | 3

The value of variable A is the text content of the XML element <A>, the value of variable B is the value of the B attribute, and the value of variable C is the text content of the element <C>.
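As a rough sketch, the variables listed for the matched element <A> could be logged as follows. The connection id sampleReader and the file name sample.xml are hypothetical; the file is assumed to contain the sample document shown earlier in this section.

```xml
<!-- Hypothetical connection: sample.xml is assumed to hold the sample document -->
<connection id="sampleReader" driver="xpath" url="sample.xml"/>
<connection id="logInfo" driver="log">
    level=INFO
</connection>
<query connection-id="sampleReader">
    /A
    <!-- Runs once per matched element; A, B and C are the variables listed above -->
    <script connection-id="logInfo">
        A=$A, B=$B, C=$C
    </script>
</query>
```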

The context node for an XPath query is the node selected by the nearest outer XPath query, or the document root node if there is no outer query. This allows relative queries that start with "./".

A special node variable is available to the nested queries/scripts which provides extra methods for querying relative to the current node. The node variable is an instance of NodeVariable. Using ${node.getString("./C")} is equivalent to $C.
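The fragment below sketches both mechanisms against the cd_catalog.xml file used elsewhere on this page. It assumes the xmlReader and logInfo connections from the Examples section are already defined; the log messages are illustrative only.

```xml
<query connection-id="xmlReader">
    /CATALOG/CD
    <!-- Nested query: ./TITLE is resolved relative to the current CD element -->
    <query connection-id="xmlReader">
        ./TITLE
        <script connection-id="logInfo">
            Title: $TITLE
        </script>
    </query>
    <!-- The node variable queries relative to the current CD element -->
    <script connection-id="logInfo">
        Artist via node: ${node.getString("./ARTIST")}, via variable: $ARTIST
    </script>
</query>
```

Both expressions in the last script should yield the same value, since node.getString("./ARTIST") is the counterpart of $ARTIST for the current node.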

Additional notes:

  • Currently, XPath expressions can select only node sets (attributes or elements), not String, Boolean, or Number values.
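To illustrate the restriction, the sketch below contrasts an expression that selects elements with one that evaluates to a Number. It assumes the xmlReader and logInfo connections from the Examples section; the second query is left commented out because, per the note above, it is not expected to be selectable.

```xml
<!-- Selects elements (a node set) -->
<query connection-id="xmlReader">
    /CATALOG/CD/TITLE
    <script connection-id="logInfo">
        $TITLE
    </script>
</query>

<!-- Evaluates to a Number, not a node set, so it is not expected to work:
<query connection-id="xmlReader">
    count(/CATALOG/CD)
</query>
-->
```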

4. Script

<script> elements are not supported.

5. Examples

Query example: Executes a query and writes every record from the result to the log. The script element is nested in the query element to process every record of the result.

<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
<description>Graphlytic job</description>
<properties>
job_name=Graphlytic job
</properties>
<!-- CONNECTIONS -->
<connection id="xmlReader" driver="xpath" url="https://www.w3schools.com/xml/cd_catalog.xml"></connection>
<connection id="logInfo" driver="log">
level=INFO
</connection>
<!-- JOB STEPS -->
<query connection-id="xmlReader">
/CATALOG/CD
<script connection-id="logInfo">
$rownum: $TITLE - $ARTIST
</script>
</query>
</etl>