1. Description
Executes a bulk import of data into Cosmos DB through the Cosmos DB .NET endpoint.
2. Connection
2.1. Parameters
| Name | Description | Required |
|---|---|---|
| host | Cosmos DB .NET endpoint. | yes |
| database | Database name. | yes |
| graph | Graph name. | yes |
| throughput | Maximum allowed throughput. | yes |
| partition | Partition key path. | yes |
| key | Primary key to the database. | yes |
| commit_size | Number of entities to accumulate before they are committed, which frees resources. The default is 10000. | no |
| rewrite | When true, entities with the same IDs are overwritten. The default is false, which means only entities with unique IDs are created. | no |
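As an illustration of the table above, a connection that sets only the required parameters might look like the following sketch. All values are placeholders, and the connection id and driver name are taken from the example in section 5. Because commit_size and rewrite are omitted, the documented defaults (10000 and false) would apply.

```xml
<!-- Sketch only: all values are placeholders, not real endpoints or credentials. -->
<connection id="import" driver="cosmosBulkImport">
    host=https://my-cosmos.documents.azure.com:443/
    database=MY-DATABASE
    graph=MY-GRAPH
    throughput=10000
    partition=/partition
    key=YOUR-PRIMARY-KEY
</connection>
```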
3. Query
Not applicable
4. Script
Executes a bulk import of vertices or edges. Entities are pushed to Cosmos DB once either the number of vertices or the number of edges reaches commit_size. If commit_size is not reached, the data is pushed when the entire job is done.
Example: with commit_size=1000, if only 950 vertices or edges are processed by the <script> element, the data is pushed at the end of the job.
4.1. Parameters
| Name | Applies to | Description | Example |
|---|---|---|---|
| ENTITY | vertex, edge | Either VERTEX or EDGE. | EDGE |
| ID | vertex, edge | ID of the newly created vertex or edge. If not provided, a random UUID is used. | Any unique string |
| LABEL | vertex, edge | Label for a vertex or an edge. | Person |
| PARTITION | vertex | Value of the property used as the partition. | Asia |
| VERTEX_TAG | vertex | Saves the vertex information under a specific tag. The tag is overwritten when a new vertex is created with the same tag. Usage: in one script, create a vertex with VERTEX_TAG=source_vertex; in a subsequent script, create a vertex with VERTEX_TAG=target_vertex. A later script can then create an edge with SOURCE_TAG=source_vertex and TARGET_TAG=target_vertex without defining SOURCE_VERTEX_ID, SOURCE_VERTEX_LABEL, SOURCE_VERTEX_PARTITION, TARGET_VERTEX_ID, TARGET_VERTEX_LABEL, and TARGET_VERTEX_PARTITION. | Any string |
| SOURCE_TAG | edge | VERTEX_TAG of the source vertex. | Any string |
| SOURCE_VERTEX_ID | edge | ID of the source vertex. | |
| SOURCE_VERTEX_LABEL | edge | Label of the source vertex. | |
| SOURCE_VERTEX_PARTITION | edge | Partition value of the source vertex. | |
| TARGET_TAG | edge | VERTEX_TAG of the target vertex. | Any string |
| TARGET_VERTEX_ID | edge | ID of the target vertex. | |
| TARGET_VERTEX_LABEL | edge | Label of the target vertex. | |
| TARGET_VERTEX_PARTITION | edge | Partition value of the target vertex. | |
| <any_other_parameter> | vertex, edge | Any other parameter is treated as a string property of the vertex or edge being created. | Any valid property value |
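Instead of tags, the endpoints of an edge can be identified directly with the SOURCE_VERTEX_* and TARGET_VERTEX_* parameters listed above. The following is a minimal sketch rather than an example from the driver itself. It assumes a hypothetical CSV in which columns 1 and 2 hold the source and target vertex IDs and column 3 holds a partition value shared by both vertices. It reuses the csv and import connection ids and the labels from the example in section 5. Whether all six parameters must be present is not stated here, so the sketch sets all of them.

```xml
<!-- Sketch only: the column positions $1, $2 and $3 are hypothetical. -->
<query connection-id="csv">
    <script connection-id="import">
        ENTITY=EDGE
        LABEL=LIVES_IN
        SOURCE_VERTEX_ID=$1
        SOURCE_VERTEX_LABEL=PERSON
        SOURCE_VERTEX_PARTITION=$3
        TARGET_VERTEX_ID=$2
        TARGET_VERTEX_LABEL=ADDRESS
        TARGET_VERTEX_PARTITION=$3
    </script>
</query>
```

This form suits cases where the vertices already exist in the graph, since no VERTEX_TAG from an earlier script in the same job is needed.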
5. Example
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>CosmosDB CSV bulk import</description>
    <connection id="logger" driver="log">
        level=WARN
    </connection>
    <connection id="import" driver="cosmosBulkImport">
        host=https://my-cosmos.documents.azure.com:443/
        database=MY-DATABASE
        graph=MY-GRAPH
        throughput=10000
        partition=/partition
        commit_size=10000
        key=......primary key
        rewrite=true
    </connection>
    <connection id="csv" driver="csv" url="/path-to-csv/data.csv">
        separator=,
        quote=
        empty_string=""
    </connection>
    <script connection-id="logger">
        STARTING BULK IMPORT
    </script>
    <!-- Process CSV -->
    <query connection-id="csv">
        <script connection-id="import">
            LABEL=PERSON
            ENTITY=VERTEX
            VERTEX_TAG=source_node
            PARTITION=$6
            name=$3
            city=$2
            phone_number=$7
        </script>
        <script connection-id="import">
            LABEL=ADDRESS
            ENTITY=VERTEX
            VERTEX_TAG=target_node
            PARTITION=$6
            street=$5
            zip=$8
        </script>
        <script connection-id="import">
            LABEL=LIVES_IN
            ENTITY=EDGE
            SOURCE_TAG=source_node
            TARGET_TAG=target_node
        </script>
    </query>
    <script connection-id="logger">
        BULK IMPORT ENDED.
    </script>
</etl>