Merge files

<< Click to Display Table of Contents >>

Navigation:  Reference Manual > Files >

Merge files

Merge files are used to update existing records with new data using Data merge. A portable (serial) merge file can be used to create a new data file.

There are four types of merge file:

Portable (serial)

Portable (other)

Coding

Forcing

The first two (portable) types are very similar and are used to update (often large) portions of records.

The last two types are also very similar and are used to update individual questions on selected records.

Merge files can be any CSV type file including Excel (or OpenXML).

Merge files can contain additional columns of information that are not to be used, provided those columns do not have a valid entry name in the header row.

TIP: You may need to set the format of all cells in Excel to "Text" to stop Excel thinking that you are entering a date.

Special data

The data in merge files is what you would expect for the type of question the data is being allocated to, but there are other special values:

In portable files you can use the response text instead of the response number for single-coded and multi-coded questions.

For character and verbatim questions you can use BLANK to clear out any existing content.  For all other question types you can use U or B to clear out existing contents.  This clearing of contents will always be done even if the "Empty cells remove existing contents" is not set.

Multi-coded questions

Multi-coded questions can be in one column or a number of columns:

Single column

If it is not a standard portable file you can choose the separator between response numbers or texts.  Also you can:

Put a B as the first part - this will clear out all existing responses (for example "B 7 11" will remove all existing responses and set responses 7 and 11)

Put a minus sign (-) in front of a response number - this will remove this response (for example "-5 12" will remove response 5 and add response 12)

Put a plus sign (+) in front of a response number - this will add this response to those already there (for example "+5 +6" will add responses 5 and 6 to those already there)

You can use B on its own to make a question empty - the reject response will be used if present

If you use a plus or minus then the existing contents will not be replaced except as specified even if "Multi-coded contents add to existing responses" is not set.

If you use B then the contents will be replaced even if "Multi-coded contents add to existing responses" is set.

More than one column

All the column headers must begin with the question name and be followed by a sequential number preceded by an underscore, for example Q5a_1, Q5a_2, and Q5a_3 and so on.

There are two schemes:

1.Each column for a question can contain a response number or text and empty columns are ignored.

2.You can set "Multi responses have own column".  In this case each column can contain:

A response text

A zero (0) which will be ignored.

A one (1) which will be treated as the relevant response, for example a 1 in the third column for the multi will set response 3.

The relevant response number which will be treated as the relevant response, for example a 4 in the fourth column for the multi will set response 4.

Any values which do not comply with the above will be ignored and listed in an error log.

Portable (serial)

A portable (serial) merge file can contain whole records or parts of records.

This file type is similar to a Classic portable file and starts with a header line.

The first column in the header must be "Serial" and the remaining columns should be the names of questions in the project.

For the rest of the file the first column contains a serial number, and the other columns contain data to replace the contents of the relevant questions in records with the matching serial number.

The data for single-coded and multi-coded questions can be the response texts (instead of the response numbers).

The data for multi-coded questions can be spread over a number columns with a header as the question name followed by an underscore (_) followed by a number.

If a record with the serial number in the merge file cannot be found you can choose to add this record to the main file.

You can create a new main data file from a portable (serial) merge file.

A portable (serial) merge file can be created using Data export.

Portable (other)

A portable (other) merge file contains parts of records.

This file type starts with a header line.

The first column in the header must be an entry (question or variable) name and the remaining columns should be the names of other questions in the project.

For the rest of the file the first column contains the data to be matched for the entry, and the other columns contain data to replace the contents of the relevant questions in all the records that match the data in column 1.

The data for single-coded and multi-coded questions can be the response texts (instead of the response numbers).

The data for multi-coded questions can be spread over a number columns with a header as the question name followed by an underscore (_) followed by a number.

A portable (other) merge  file can be created using Data export and then replacing the first column.

Portable (other) merge files are useful if the record identifiers are not a number.  They can also be used to clean large sections of data based on particular answers to an entry.

Coding

This is a special type of file that is saved when a project with verbatim questions is saved to a raw data file (usually ASC).  A number of columns are output to the file to enable the data to be coded: allocating responses to a multi-coded question based on the textual answer given to the verbatim question.

The only columns used when merging are those headed "Serial" (which must be the first), "Entry", and "Value".

The serial number is set when the file file is created.

The entry name is set when the file is saved if it has been allocated in the verbatim question settings.

The value is the response numbers to be set that are relevant for the verbatim text entered.

A coding merge file can be used to make individual changes to records referencing them by serial number.

Forcing

This file type starts with a header line.  The first column must be "Definition" and the only other columns used are "Entry" and "Value".

For the rest of the file the first column must contain a valid filter (logic) definition.

For every record in the main data file for which the filter is true, the contents of the named question in the Entry column will have its value replaced or altered.

A forcing merge file is used to make changes to a number of records based on the condition specified.

Individual records can also be changed by using, for example, "serial/5" as the definition.

IMPORTANT: changes will be made in turn in the order of the rows in the merge file.  This means that if a question is changed this may affect later matching if it refers to a question that has been altered by an earlier row.