ETL: SolrCsvImport
Description
Uploads a CSV file into the SOLR instance configured in the hybrid data model configuration of the project's graph connection.
The hybrid model and the integrated SOLR client need to be enabled in the graph connection on the Application Settings page.
Connection
Defines a connection to a SOLR instance.
Parameters
Parameter | Description | Default Value | Required |
---|---|---|---|
project_id | ID of the project with hybrid data model configuration to be used. | ||
upload_url | A part of the URL used by the Data Import Handler. | /upload | |
path_to_csv | Path to the CSV file to be uploaded. It must be accessible by GL on the filesystem. | ||
clear_data | Deletes all the documents from the SOLR instance. | false | |
type | Specifies which SOLR configuration this driver should use (either "search" or "graph"). | graph | |
separator | CSV file separator. | , (comma) | |
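trim | If true, removes leading and trailing whitespace from values. | false | |
header | Set to true if the first line of the input contains field names. | false | either header or fieldnames |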
fieldnames | Comma-separated list of field names to use when adding documents. | | either header or fieldnames |
skip | Comma-separated list of field names to skip. | ||
skip_lines | Number of lines to discard in the input stream before the CSV data starts, including the header, if present. | 0 | |
encapsulator | The "encapsulator" character is optionally used to surround values to preserve characters such as the CSV separator or whitespace. This standard CSV format handles the encapsulator itself appearing in an encapsulated value by doubling the encapsulator. | ||
escape | The character used for escaping CSV separators or other reserved characters. If an escape is specified, the encapsulator is not used unless it is also explicitly specified, since most formats use either encapsulation or escaping, not both. | ||
keep_empty | Keep and index zero-length (empty) fields. | false | |
map | Map one value to another. The format is value:replacement (the replacement can be empty). | ||
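overwrite | If true, checks for and overwrites duplicate documents, based on the uniqueKey field declared in the SOLR schema. | true | |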
commit_within | Commit the document within the specified number of milliseconds. | 1000 | |
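rowid | Map the row ID (line number) to the field specified by the value of this parameter. | ||
rowid_offset | Add the given offset (as an integer) to the row ID before adding it to the document. | 0 | |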
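The CSV-format parameters above can be combined in a single connection. The following is a minimal sketch, assuming a semicolon-separated file whose first line holds the field names. The file path and the project_id value are hypothetical, and only parameters listed in the table are used.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Load a semicolon-separated CSV with a header line</description>
    <!-- Assumed input: first line contains field names, values may carry stray whitespace. -->
    <!-- project_id and path_to_csv are placeholder values. -->
    <connection id="solrImport" driver="solrCsvImport">
        project_id=1
        path_to_csv=/some/accessible/path/some-data.csv
        separator=;
        header=true
        trim=true
        keep_empty=false
    </connection>
    <script connection-id="solrImport"/>
</etl>
```

Because header is set here, fieldnames is left out; the table requires one of the two.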
Query
Not available
Script
1) Deletes all documents from the SOLR instance (if clear_data=true).
2) Uploads and imports the CSV file specified by the path_to_csv parameter.
Examples
Script example: Simple import of some-data.csv.
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Load CSV into SOLR instance</description>
    <properties>
        upload_url=/update
        path_to_csv=/some/accessible/path/some-data.csv
    </properties>
    <connection id="solrImport" driver="solrCsvImport">
        project_id=1
        upload_url=$upload_url
        path_to_csv=$path_to_csv
    </connection>
    <script connection-id="solrImport"/>
</etl>
Script example: Clears the SOLR data, then imports some-data.csv. Uses the "search" setting of the SOLR hybrid model.
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Load CSV into SOLR instance</description>
    <connection id="solrImport" driver="solrCsvImport">
        project_id=1
        upload_url=/update
        path_to_csv=/some/accessible/path/some-data.csv
        clear_data=true
        type=search
    </connection>
    <script connection-id="solrImport"/>
</etl>
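Script example (a sketch, not taken from a tested setup): Import with explicit field names instead of the file's header line. It assumes the first line of the file is a header that should be discarded, and that the columns are to be named id, name and comment; these names, the path and the project_id value are hypothetical. Only parameters from the table above are used.

```xml
<!DOCTYPE etl SYSTEM "https://scriptella.org/dtd/etl.dtd">
<etl>
    <description>Load CSV into SOLR instance using explicit field names</description>
    <!-- skip_lines=1 discards the header line; fieldnames supplies the names instead. -->
    <!-- The "comment" column is listed in skip, so it is not added to the documents. -->
    <connection id="solrImport" driver="solrCsvImport">
        project_id=1
        upload_url=/update
        path_to_csv=/some/accessible/path/some-data.csv
        header=false
        skip_lines=1
        fieldnames=id,name,comment
        skip=comment
        commit_within=5000
    </connection>
    <script connection-id="solrImport"/>
</etl>
```

Raising commit_within above its default of 1000 milliseconds is an illustrative choice here, not a recommendation from the source.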