1. Description
Executes the given query for each row of provided CSV file. Allows to read from or write to CSV files.
2. Connection
Defines a connection to a single CSV file.
As an “url” parameter, please use an absolute path to the CSV file.
2.1. Parameters
Name | Mandatory | Default | Description |
---|---|---|---|
encoding | No | The system default encoding is used. | Specifies charset encoding of CSV files. |
separator | No | The default value is | The delimiter to use for separating entries when reading from or writing to files. |
quote | No | The default value is | The character to use for quoted elements when reading from or writing to files. Use empty string to suppress quoting. |
headers | No | The default value is | Value of true means the first line contains headers. Only valid for <query> elements. |
eol | No | The default value is | End-Of-Line suffix. Only valid for <script> elements. |
trim | No | The default value is | Value of |
skip_lines | No | The default value is | The number of lines to skip before start reading. Please note that if |
null_string | By default strings are preserved, i.e. empty strings are not converted to nulls and null variables references are not expanded in the output, i.e. ${nullvalue}. | Specifies string token to represent Java When querying a CSV file, cells with content equal to null_string are parsed to Java nulls. Specify an empty string ( |
2.2. Connection Example
<properties>
path_to_file=/some/path/file.csv
</properties>
<connection id=
"csv_file"
driver=
"csv"
url=
"$path_to_file"
>
separator=;
quote="
</connection>
3. Query
The driver utilizes a query-by-example approach for CSV content filtering. If you want to read the whole CSV simply use an empty query element, otherwise, specify a set of lines containing comma-separated case-insensitive regular expressions.
Example:
,foo
,,^Bar$
This query-by-example filter selects rows where the third column equals "Bar" or the second column contains the "foo" substring.
The columns of the matched row can be referenced by name in nested queries/scripts. It is also possible to reference columns by an index, i.e. $1, $2, ...
IMPORTANT: Always use commas as a column separator inside <script> and <query> elements, regardless of the separator used in files being parsed or produced. This decision allows switching between different formats like tab or semicolon while keeping scripts and queries unchanged.
3.1. Query example
<!DOCTYPE etl SYSTEM
"https://scriptella.org/dtd/etl.dtd"
>
<etl>
<description>Read csv example</description>
<connection id=
"csv_file"
driver=
"csv"
url=
"example.csv"
/>
<connection id=
"logInfo"
driver=
"log"
>
level=INFO
</connection>
<query connection-id=
"csv_file"
>
<!-- empty query element reads all csv values -->
<script connection-id=
"logInfo"
>
[$rownum]:
'$id'
,
'$username'
,
'$timestamp'
</script>
</query>
<query connection-id=
"csv_file"
>
<!-- given csv structure: id, username, timestamp.
reads all csv lines, where username contains
"foo"
,
or timestamp starts with
2023
-->
,foo
,,^
2023
<script connection-id=
"logInfo"
>
[$rownum]:
'$1'
,
'$2'
,
'$3'
</script>
</query>
</etl>
4. Script
Specify a comma-separated set of columns.
IMPORTANT: Always use commas as a column separator inside <script> and <query> elements, regardless of the separator used in files being parsed or produced. This decision allows switching between different formats like tab or semicolon while keeping scripts and queries unchanged.
4.1. Script example
<!DOCTYPE etl SYSTEM
"https://scriptella.org/dtd/etl.dtd"
>
<etl>
<description>Write csv example</description>
<connection id=
"csv_write_example"
driver=
"csv"
url=
"example.csv"
/>
<script connection-id=
"csv_write_example"
>
id,username,timestamp
1
,admin,
2023
-
01
-
01
00
:
00
:
00
2
,read-only,
2023
-
08
-
02
10
:
07
:
50
</script>
</etl>
This script writes 3 lines to the output file.