Fixed format data files


We recommend using CSV data files.

This section describes the various types of fixed format data files.

IMPORTANT: UNI files (files with extension .uni) are treated differently from earlier versions.

There are two basic types of fixed format data files: ASC and UNI.  If the file does not have one of the corresponding extensions (.asc or .uni) then ASC is assumed.
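The extension rule can be sketched in a few lines of Python. This is an illustration only: the function name `data_file_type` is hypothetical, and treating the extension as case-insensitive is an assumption that the text does not confirm.

```python
from pathlib import Path


def data_file_type(filename: str) -> str:
    """Return "UNI" for a .uni file; otherwise fall back to "ASC".

    Assumption: the extension is compared case-insensitively.
    """
    extension = Path(filename).suffix.lower()
    return "UNI" if extension == ".uni" else "ASC"


print(data_file_type("survey.uni"))  # UNI
print(data_file_type("survey.asc"))  # ASC
print(data_file_type("survey.dat"))  # ASC (unrecognised extension)
```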

In a fixed format data file, each question is allocated a specific set of data locations.  The two data file types differ in the way that data locations are interpreted.

ASC (file extension .asc)

Data locations refer to bytes in the data file record.  For example, a character question in data location 12 width 4 will use bytes 12, 13, 14 and 15 in the data record.

This character question can hold up to 4 English (ASCII) characters (including blanks), which can include numbers and normal punctuation.

If this character question contains text that is not English then the number of characters that it can hold will vary depending on the language and the encoding.  For example, with a Korean locale (MBCS) encoding it will hold only 2 Korean characters because each character uses 2 bytes.

If you are not using English then you should allow more data locations for each character question, depending on the encoding used.
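The byte-based addressing described above can be illustrated with a short Python sketch. The helper names `asc_field` and `chars_that_fit` are hypothetical, and the sketch assumes that data locations are counted from 1. The Korean sample text and the `cp949` codec (a Korean locale encoding available in Python) are chosen only for illustration.

```python
def asc_field(record: bytes, location: int, width: int) -> bytes:
    """Return the bytes of a field. Assumption: locations are 1-based."""
    start = location - 1
    return record[start:start + width]


def chars_that_fit(text: str, width: int, encoding: str) -> int:
    """Count how many leading characters of text fit into width bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > width:
            break
        used += size
        count += 1
    return count


record = b"12345678901ABCD567890"
print(asc_field(record, 12, 4))              # b'ABCD' (bytes 12 to 15)

print(chars_that_fit("ABCD", 4, "utf-8"))    # 4: one byte per character
print(chars_that_fit("한국어", 4, "cp949"))   # 2: two bytes per character
print(chars_that_fit("한국어", 4, "utf-8"))   # 1: three bytes per character
```

The last two lines show why the width to allow depends on the encoding as well as the language: the same 4 data locations hold a different number of Korean characters in a locale encoding and in UTF-8.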

If a data file contains only ASCII characters then it does not matter whether it is treated as ASC or UNI, because every character uses exactly one byte.

Encoding in ASC files

The way the information is stored in each record depends on the encoding; see Encoding.

ASC files can use one of the following encoding types:

Locale (MBCS)

UTF-8 (with or without a BOM)
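A UTF-8 BOM is a 3-byte prefix at the start of the file. The text does not say how a BOM interacts with data locations, so the following Python sketch simply assumes that the BOM is removed before bytes are counted. The helper name `strip_utf8_bom` is hypothetical.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> bytes:
    """Remove a leading UTF-8 BOM if one is present.

    Assumption: the BOM is not counted as part of the first record.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


without_bom = b"12345678901ABCD"
with_bom = codecs.BOM_UTF8 + without_bom

# After stripping, byte positions are the same with or without a BOM.
print(strip_utf8_bom(with_bom) == strip_utf8_bom(without_bom))  # True
print(strip_utf8_bom(with_bom)[11:15])                          # b'ABCD'
```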

UNI (file extension .uni)

Data locations refer to characters in the data file record.  For example, a character question in data location 12 width 4 will use characters 12, 13, 14 and 15 in the data record.

This character question can hold up to 4 characters (including blanks) in any language.  The actual number of bytes used in the data record will depend on the encoding used.
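A minimal Python sketch of character-based addressing follows. The helper name `uni_field` is hypothetical, data locations are assumed to be counted from 1, and a "character" is taken to be one Unicode code point, which is how Python indexes strings; the text does not state exactly how the product counts characters.

```python
def uni_field(record: str, location: int, width: int) -> str:
    """Return the characters of a field. Assumption: locations are 1-based."""
    start = location - 1
    return record[start:start + width]


record = "12345678901한국어!XYZ"
field = uni_field(record, 12, 4)
print(field)       # 한국어!
print(len(field))  # 4 characters, whatever the encoding

# The number of bytes needed to store the same 4 characters varies.
for encoding in ("cp949", "utf-8", "utf-16-le"):
    print(encoding, len(field.encode(encoding)))
# cp949 7
# utf-8 10
# utf-16-le 8
```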

Encoding in UNI files

The way the information is stored in each record depends on the encoding; see Encoding.

UNI files can use one of the following encoding types:

Locale (MBCS)

UTF-8 (with or without a BOM)

UTF-16 LE

For some languages, such as Thai and the Indian languages, avoid using UTF-16 data files, because some characters need two UTF-16 code units to represent a single character. UTF-8 CSV data files are strongly recommended for these languages.
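The following Python sketch counts UTF-16 code units for a few sample strings. It assumes that the characters in question are those written as a base letter followed by a combining vowel sign, which is one plausible reading of the note above. The helper name `utf16_units` is hypothetical.

```python
def utf16_units(text: str) -> int:
    """Return the number of 16-bit code units text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


thai = "\u0e01\u0e35"    # Thai letter ko kai + vowel sign sara ii
hindi = "\u0915\u093f"   # Devanagari letter ka + vowel sign i

print(utf16_units("A"))    # 1
print(utf16_units(thai))   # 2: displayed as a single character
print(utf16_units(hindi))  # 2: displayed as a single character
```

In a fixed format record, a field sized for 4 such characters may therefore need more than 4 positions.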