ETL: Neo4j
Description
Executes a Cypher query on the configured Neo4j server and returns its result. All Cypher queries in one job run in a single Neo4j transaction, which is committed or rolled back at the end of the job execution.
Do not combine this driver with other drivers that create Neo4j connections, because the transactions can block each other. To run every Cypher query in a separate transaction that is committed after each step, use the ETL: GraphConnection driver instead.
Connection
Defines the connection to a Neo4j endpoint.
Attributes
Name | Description | Required |
---|---|---|
url | URL of the Neo4j server (Bolt or REST) | yes |
user | User name for authorization at Neo4j | yes |
password | Password for authorization at Neo4j | yes |
Parameters
Name | Description | Required | Default |
---|---|---|---|
encrypted | Whether to use encrypted Bolt communication with the Neo4j database. Values are true or false. | no | false |
queryEmptyValue | String returned in place of a property whose value in Neo4j is null. | no | The name of the property |
Query
Executes a Cypher query on the configured Neo4j server and returns its result. Use it for read Cypher statements.
You can nest any <script> in the body of the Neo4j query to process the returned records one by one. Properties in the query result are accessible as "$property_key".
Script
Executes any Cypher statement, mainly "write" statements. All statements of one connection are executed in the same Neo4j transaction, which is committed at the end of the ETL job. If any <script> of a <connection> fails, the Neo4j transaction is rolled back and the ETL job ends. Rollback does not work for statements that use Cypher "USING PERIODIC COMMIT".
If more transactions are needed, create another connection in the same ETL job.
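As a minimal sketch of that setup, the job below declares two Neo4j connections so that each script runs in its own transaction. The connection ids "neo4j-a" and "neo4j-b", the ImportAudit label, and the Cypher statements are illustrative assumptions, not part of the driver. Because both transactions stay open until the job ends, the sketch keeps them on disjoint data so that they should not wait on each other's locks.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Two Neo4j connections, two transactions (sketch)</description>
    <!-- Each connection owns one Neo4j transaction; the ids are hypothetical. -->
    <connection id="neo4j-a" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin"/>
    <connection id="neo4j-b" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin"/>
    <!-- Runs in the transaction of "neo4j-a" and updates existing Ci nodes. -->
    <script connection-id="neo4j-a">
        MATCH (n:Ci) WHERE n.type = '' SET n.type = 'unknown'
    </script>
    <!-- Runs in the transaction of "neo4j-b" and only creates a new node. -->
    <script connection-id="neo4j-b">
        CREATE (:ImportAudit {note: 'type cleanup executed'})
    </script>
</etl>
```

An error in one script is expected to roll back only the transaction of its own connection, which is the main reason to split statements this way.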
USING PERIODIC COMMIT
A Cypher statement that uses "USING PERIODIC COMMIT" has to be the only statement executed in a <script>. Use additional connections to execute further Cypher queries in the same ETL job.
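The following sketch shows one way to arrange this, under stated assumptions: a dedicated connection (hypothetical id "neo4j-load") runs a single periodic-commit statement, and a second connection carries the remaining Cypher. The file name devices.csv, its logical_name column, the batch size of 1000, and the ImportAudit label are made up for illustration, and the CSV file is assumed to already exist where the Neo4j server can read it.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Periodic commit on a dedicated connection (sketch)</description>
    <!-- Hypothetical connection reserved for the periodic-commit statement. -->
    <connection id="neo4j-load" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin"/>
    <!-- Second connection for all other Cypher statements of the job. -->
    <connection id="neo4j" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin"/>
    <!-- The only statement in this script; a rollback cannot undo batches that were already committed. -->
    <script connection-id="neo4j-load">
        USING PERIODIC COMMIT 1000 LOAD CSV WITH HEADERS FROM 'file:///devices.csv' AS row CREATE (n:Ci) SET n.logicalName = row.logical_name
    </script>
    <!-- Further statements use the other connection and its own transaction. -->
    <script connection-id="neo4j">
        CREATE (:ImportAudit {source: 'devices.csv'})
    </script>
</etl>
```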
Examples
Query example: Executes a Cypher query and writes every record of the result to a text file. The script element nested in the query element processes each record of the result.
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Test neo4j query</description>
    <connection id="neo4j" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin">
        encrypted=false
        query_empty_value=
    </connection>
    <connection id="out" driver="text" url="d:\\testfile.txt"/>
    <query connection-id="neo4j">
        MATCH (n:Ci) RETURN id(n) as id, n.logicalName as logicalName LIMIT 25
        <script connection-id="out">$rownum;$id;$logicalName</script>
    </query>
</etl>
Script example: Executes a Cypher statement.
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Test neo4j query</description>
    <connection id="neo4j" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin">
        encrypted=false
    </connection>
    <script connection-id="neo4j">
        CREATE INDEX ON :Ci(logicalName)
    </script>
</etl>
Script example: Executes a Cypher statement for every record returned by an SQL query. The script element nested in the query element processes each record of the result. The Neo4j transaction is committed at the end of the ETL job.
Every record of the SQL result is inserted into Neo4j directly. This performs poorly; the example only demonstrates what the Neo4j driver can do. For better performance, at least write the result of the SQL statement into a CSV file and then use the "LOAD CSV" statement instead (see ETL job examples).
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>DB to Neo4j</description>
    <properties>
        db.driver=postgresql
        db.url=jdbc:postgresql://localhost:5432/cmdb
        db.user=someuser
        db.password=somepassword
    </properties>
    <connection id="db" driver="$db.driver" url="$db.url" user="$db.user" password="$db.password"/>
    <connection id="neo4j" driver="neo4j" url="bolt://localhost:7687" user="neo4j" password="admin"/>
    <query connection-id="db">
        select
            case when id_sm is not null then id_sm else '' end as id_sm,
            logical_name,
            title,
            case when type is not null then type else '' end as type,
            case when subtype is not null then subtype else '' end as subtype
        from device2m1
        <script connection-id="neo4j">
            CREATE (n:Ci) SET n.idSm = '$id_sm', n.logicalName = '$logical_name', n.type = '$type', n.subtype = '$subtype'
        </script>
    </query>
</etl>