Mystic Help

BISON: Text Importer

The BISON Text Importer is designed to pre-process both formatted text files as well as some proprietary formats which can be interpreted and processed as text files (e.g. the result of an Excel spreadsheet import). To ease understanding, the dialog is designed to mimic the way in which common importers used in third-party spreadsheets have often been displayed.

Initial Format Processing

Different text files, especially if produced on different operating systems (which use different end-of-line characters) result in the underlying text being loaded into the importer in different ways. The first thing to do therefore, is to tweak import settings until the list of results at the bottom of the dialog show a logical data set rather than 'garbage'.

The three main values for the initial understanding of the text are:

Record Separator - The record separator¹⁾ is the (non-visible) character that is used to separate lines in a text file. In a word-processed document, this is effectively the character(s) that is inserted into the file when you press the [Enter] key on the keyboard. Files produced on different operating systems or by different programs however, may have different line separators. Change the line separator until it is clear that a set of clearly defined rows are being presented in the list box.

Field Separator - The field separator is the character which divides different fields within a single row. Simply by definition, for a CSV²⁾ you will need to choose a comma; for a TSV³⁾ you will need to choose a TAB character.

Field Protector - Because it is possible to put characters into a field which confuse the file format (e.g. an actual comma in a field would result in an artificial field-break), fields are often 'protected' by wrapping them in a known affix. Probably the most common used in both CSV and TSV files is the double-quote.

For an Excel spreadsheet import, the Importer already knows how to read the standard internal structure of an Excel file. For this reason, the Record Delimiter, Field Delimiter and Protection character are locked and the data imported without the need to change these values.

Three additional settings may also modify how the data is displayed on the screen:

Internal Delimiter - Most fields that need to be imported are self-contained, single values. On occasion however, an import file may contain a range of data within a single field. An example of this is an Item export from other Library Management systems, which may place a series of Authors into a single field in the file. The Internal Delimiter allows you to define which character should be used within a field to further sub-divide the data into separate values.

Display Sample - Because a file may be of considerable size, by default the Importer will only display the first 50 rows as a sample which can be used to ensure that the settings (above) are producing the correct results. Once the import looks correct, try changing the Display Sample to [All Data] to check that all records can be brought into the Importer without corruption⁴⁾.

Use 1st Row as Headings - Many CSV and TSV files, as well as many Excel files, use the first row to contain headings explaining what the fields mean in the file. By default, this check-box is on. If data seems to appear in the Header row of the results list, turn this check-box off.

Defining The Fields

Once the data sample is reading correctly and the rows in the results list seem correct, you need to define what each field actually means. While there may be a header row with human-readable text, this is not a row which can be interpreted by the importer. You need to match the field up with a known set of fields for the type of import that you are performing.

Beneath the Header row is a second row initially containing the text [> <None>]. These are the field names as allocated to the importer for processing. Right-click each of these cells in turn to display a list of known fields for the import type being performed - select the appropriate field definition from the menu.

If an imported field is redundant and is not required, leave the field definition as <None> to ignore the data.

Editing Minor Mistakes

Once the final, full set of data is loaded and viewed, it is possible to make minor changes to data that is considered to be incorrect.

It is recommended to perform these edits in an external editor or even better on the source application rather than here in the importer so as to ensure that the data is consistent in both applications.

To edit a cell, select [Edit Mode] to fix the current state of the data. Double-click any cell which needs to be edited to display an editable text field.

Processing the File

Once the data is in an acceptable state on screen, select the [Next»] button to process the file using the selected configuration.

In many cases the file can be process immediately and the data written into the database (common for Item imports), but for others you will be returned to the proceeding BISON Import dialog for finalisation.

Using Presets

If you are ensured of a regular import file (unfortunately, this is rare), you can use the Preset system to save the configuration for future re-use.

Once configured, use the [Save As] button to save the preset into the database.

In order to select a pre-saved preset, select it from the drop-down pick-list at the top of the dialog.

When selected, you can edit the settings and use [Update] to update the settings, or delete it using the [Delete] button.

Using Scripts

In some cases, a single final field must be created by combining several fields together. In other cases, you may wish to process a field as it is being imported to change its nature, remove some common text, or change its case. For this purpose you can use a script.

Scripts must be written in XojoScript (similar to VBScript) and can be edited using the [View Script] button.

Scripts are saved with presets when the [Save] button is used.

It is strongly advised to seek guidance when creating scripts, or otherwise request a script from Esferico support to perform the task required.

Note: Scripts in this context are run in a sandbox and have a very limited scope. They can only modify the data being imported.

Mystic Linked Help Files
Pergamon Wiki Home

¹⁾

Note, that the example in the image was produced on an Apple Mac - if your file was produced on a Windows machine, it is more likely to need Carriage-Return/Newline as the record separator rather than Newline by itself

²⁾

Comma-Separated Values

³⁾

Tab-Separated Values

⁴⁾

on occasion, files are exported with foreign characters etc. which can prevent data after that point from reading correctly

Pergamon Wiki

User Tools

Site Tools

Table of Contents