1. Description

Executes bulk import of data into Cosmos DB through the Cosmos .NET endpoint.

2. Connection

2.1. Parameters

host (required)
    Cosmos DB .NET endpoint.
    Example: https://my-cosmos.documents.azure.com:443/

database (required)
    Database name.

graph (required)
    Graph name.

throughput (required)
    Maximum allowed throughput.

partition (required)
    Partition key path.
    Example: /Region

key (required)
    Primary key of the database.

commit_size (optional)
    Number of entities committed in one batch, to free resources. The default is 10000.

rewrite (optional)
    When true, entities with the same ID are overwritten. The default is false, meaning only entities with unique IDs are created.
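
For reference, the fragment below sketches a <connection> element that sets these parameters; a complete ETL file is shown in section 5. The endpoint, database, graph, partition path, and key values are illustrative placeholders only.

<!-- Illustrative placeholder values; replace with your own deployment settings. -->
<connection id="import" driver="cosmosBulkImport">
    host=https://my-cosmos.documents.azure.com:443/
    database=MY-DATABASE
    graph=MY-GRAPH
    throughput=10000
    partition=/Region
    key=......primary key
    commit_size=10000
    rewrite=false
</connection>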

3. Query

Not applicable

4. Script

Executes bulk import of vertices or edges. Vertices and edges are pushed to Cosmos DB once the number of processed vertices or the number of processed edges reaches commit_size. If commit_size is not reached, the data is pushed when the whole job is done.
Example: with commit_size = 1000, if only 950 vertices or edges were processed by the <script> elements, the data is pushed at the end of the job.

4.1. Parameters

ENTITY (applies to: vertex, edge)
    Either VERTEX or EDGE.
    Example: EDGE

ID (applies to: vertex, edge)
    ID of the newly created vertex or edge. If not provided, a random UUID is used.
    Example: any unique string

LABEL (applies to: vertex, edge)
    Label for a vertex or an edge.
    Example: Person

PARTITION (applies to: vertex)
    Value of the property used as the partition.
    Example: Asia

VERTEX_TAG (applies to: vertex)
    Saves the vertex information under the given tag. The tag is overwritten when a new vertex is created with the same tag.
    Usage: in one script, create a vertex with VERTEX_TAG=source_vertex; in a subsequent script, create a vertex with VERTEX_TAG=target_vertex. A later script can then create an edge with SOURCE_TAG=source_vertex and TARGET_TAG=target_vertex, without defining SOURCE_VERTEX_ID, SOURCE_VERTEX_LABEL, SOURCE_VERTEX_PARTITION, TARGET_VERTEX_ID, TARGET_VERTEX_LABEL, and TARGET_VERTEX_PARTITION.
    Example: any string

SOURCE_TAG (applies to: edge)
    VERTEX_TAG of the source vertex.
    Example: any string

SOURCE_VERTEX_ID (applies to: edge)
    ID of the source vertex.

SOURCE_VERTEX_LABEL (applies to: edge)
    Label of the source vertex.

SOURCE_VERTEX_PARTITION (applies to: edge)
    Partition value of the source vertex.

TARGET_TAG (applies to: edge)
    VERTEX_TAG of the target vertex.
    Example: any string

TARGET_VERTEX_ID (applies to: edge)
    ID of the target vertex.

TARGET_VERTEX_LABEL (applies to: edge)
    Label of the target vertex.

TARGET_VERTEX_PARTITION (applies to: edge)
    Partition value of the target vertex.

<any_other_parameter> (applies to: vertex, edge)
    Any other parameter is treated as a string property of the vertex or edge being created.
    Example: any valid property value
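
The tag-based workflow described under VERTEX_TAG, SOURCE_TAG, and TARGET_TAG is demonstrated in the full example in section 5. As an additional, purely illustrative sketch, an edge can also reference its endpoints explicitly through the SOURCE_VERTEX_* and TARGET_VERTEX_* parameters; the connection id "import", the IDs, labels, partition values, and the "since" property below are placeholders, not values prescribed by the driver.

<!-- Illustrative only: edge creation with explicit endpoint references instead of tags. -->
<script connection-id="import">
    ENTITY=EDGE
    LABEL=LIVES_IN
    SOURCE_VERTEX_ID=person-001
    SOURCE_VERTEX_LABEL=PERSON
    SOURCE_VERTEX_PARTITION=Asia
    TARGET_VERTEX_ID=address-001
    TARGET_VERTEX_LABEL=ADDRESS
    TARGET_VERTEX_PARTITION=Asia
    since=2015
</script>

Here "since=2015" is an example of <any_other_parameter>: it is stored as a string property on the created edge.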

5. Example

<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>CosmosDB CSV bulk import</description>

    <connection id="logger" driver="log">
        level=WARN
    </connection>
    <connection id="import" driver="cosmosBulkImport">
        host=https://my-cosmos.documents.azure.com:443/
        database=MY-DATABASE
        graph=MY-GRAPH
        throughput=10000
        partition=/partition
        commit_size=10000
        key=......primary key
        rewrite=true
    </connection>
    <connection id="csv" driver="csv" url="/path-to-csv/data.csv">
        separator=,
        quote=
        empty_string=""
    </connection>

    <script connection-id="logger">
        STARTING BULK IMPORT
    </script>

    <!-- Process CSV -->
    <query connection-id="csv">
        <script connection-id="import">
            LABEL=PERSON
            ENTITY=VERTEX
            VERTEX_TAG=source_node
            PARTITION=$6
            name=$3
            city=$2
            phone_number=$7
        </script>
        <script connection-id="import">
            LABEL=ADDRESS
            ENTITY=VERTEX
            VERTEX_TAG=target_node
            PARTITION=$6
            street=$5
            zip=$8
        </script>
        <script connection-id="import">
            LABEL=LIVES_IN
            ENTITY=EDGE
            SOURCE_TAG=source_node
            TARGET_TAG=target_node
        </script>
    </query>

    <script connection-id="logger">
        BULK IMPORT ENDED.
    </script>
</etl>