Character data (fixed format)


We recommend using data files in CSV format.

IMPORTANT: If your fixed format data file contains non-ASCII text (anything other than basic English-language characters), ASC files and UNI files behave differently and you must use the correct file name extension.

ASC locations refer to bytes; UNI locations refer to characters (see below).

Character data records consist of characters in a fixed format where the data is stored in specific locations (cols or fields). A data record may be up to 350000 characters long.

ASC data files (extension .asc)

QPSMR CATI currently outputs ASC data files.

ASC data files usually use the local system Locale but can be UTF-8 encoded.

The data locations used for a col or field refer to the bytes within the record that are used to hold the data. A byte holds one ASCII (English language) character.

For cvars with non-English text this can mean that the cvar holds fewer characters than the field width. For example, a field allocated 6 locations uses 6 bytes and can hold at most 3 Far Eastern letters, because each letter uses 2 bytes (using the Locale) or 2 to 3 bytes if UTF-8 encoded. Where each letter uses 3 bytes in UTF-8, the same field holds only 2 letters.
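The byte arithmetic above can be checked with a short Python sketch. It is illustrative only and not part of QPSMR: the sample text, the use of Shift JIS as a stand-in for a Far Eastern Locale, and the helper name `fits_in_asc_field` are all assumptions made for this example.

```python
# Illustrative sketch only; not part of QPSMR.
# Assumptions: Shift JIS stands in for a Far Eastern system Locale,
# and the sample text and helper name are made up for this example.

def fits_in_asc_field(text: str, width: int, encoding: str) -> bool:
    """Return True if text, once encoded, fits in an ASC field of `width` bytes."""
    return len(text.encode(encoding)) <= width

sample = "日本語"  # 3 characters

locale_bytes = len(sample.encode("shift_jis"))  # 2 bytes per character here
utf8_bytes = len(sample.encode("utf-8"))        # 3 bytes per character here

print(locale_bytes, fits_in_asc_field(sample, 6, "shift_jis"))
print(utf8_bytes, fits_in_asc_field(sample, 6, "utf-8"))
```

In this sketch the three letters fit a 6-location ASC field under the Locale encoding but not under UTF-8, which is the situation the example above describes.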

If a data file does not have a specific extension it will be assumed to be an ASC file.

If a data file is from a Unicode program, it is more likely to be formatted as a UNI file.

UNI data files (extension .uni)

These data files can use any encoding (Locale, UTF-8 or UTF-16).

The data locations used for a col or field refer to the character positions within the record that are used to hold the data. Each character will use 1-3 bytes in the data record depending on the language.

Each field can contain as many characters as the locations allocated, regardless of language.

If a data file is from a Classic program, it is more likely to be formatted as an ASC file.

Usage

You specify that your data is fixed format character data by using a CARD CHARACTERS command.

Character data records may vary in length. The CARD CHARACTERS command specifies the number of data locations needed for the data. This will also be the maximum length of any record to be input or output.
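As a rough illustration of the length rule, the Python sketch below checks records against a declared number of data locations. It does not show CARD CHARACTERS syntax and is not part of QPSMR; the function name `record_fits` and the sample values are assumptions, and the 350000 limit is the one stated earlier in this topic.

```python
# Illustrative sketch only; not part of QPSMR.
# Assumptions: the function name and sample records are hypothetical.
from typing import Union

MAX_RECORD_LENGTH = 350000  # maximum record length stated in this topic

def record_fits(record: Union[str, bytes], locations: int) -> bool:
    """Return True if the record is no longer than the declared locations.

    Pass bytes to count byte locations (ASC) or str to count
    character locations (UNI). Shorter records are allowed.
    """
    if not 1 <= locations <= MAX_RECORD_LENGTH:
        raise ValueError("locations must be between 1 and 350000")
    return len(record) <= locations

declared = 10  # hypothetical number of data locations

print(record_fits("12345", declared))        # shorter record
print(record_fits("1234567890", declared))   # exactly the declared length
print(record_fits("12345678901", declared))  # one location too long
```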

For more information on character data, please see the following:

Data locations

Codes

Character data input

Character data output