Data merge

From the Main window menu, [Data] [Merge data] merges two sets of data for the same records.

To combine two or more separate data files that contain different records, use Combine data files.

The main data file is the one to be updated. One or more merge files contain data that adds to or replaces the data in the main file.

Common uses

There are many reasons for merging data. Here are some examples:

1. The records for a second wave of data for the same people need to be merged with the first wave. The records are linked using the same serial number for both waves or some other ID. In either case the main data file contains wave 1 and the merge file contains wave 2. The project will need different question names for the two waves.

2. Additional information (questions) needs to be added to each record, for example cluster groups from a statistics program.

3. A question response list needs to be shuffled. You can output the response texts using data export, change the project (QDF), and load the question data back in.

4. A block of questions needs to be cleared depending on the answer to another question. For example, a batch of questions that should only be answered by women can be made blank if the respondent is male. For this, only one data row is needed in the merge file. The first column will be gender and will contain the male response. The other columns will be the questions to change and will be blank in this case. This row will be used for every record in the main file that matches the answer in the first column.

5. The responses to one question need to be forced depending on the answers to another question.

6. Verbatim (or character) questions have been coded and the coded responses need to be put into the data.
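
The logic behind example 4 can be illustrated with a short Python sketch. It is not the program's own code: the record layout (one dictionary per respondent), the function name apply_forcing_row and the question names are all assumptions made for illustration. It shows a single merge row being applied to every main record that matches the value in the first column.

```python
def apply_forcing_row(records, filter_question, filter_value, forced_values):
    """Apply one forcing-style row to every record that matches the filter.

    records         -- list of dicts, one per respondent (hypothetical layout)
    filter_question -- question named in the first column of the merge row
    filter_value    -- response that a record must have to be matched
    forced_values   -- question name -> new value; None stands for a blank cell
    Returns the number of records that were changed.
    """
    changed = 0
    for record in records:
        if record.get(filter_question) == filter_value:
            record.update(forced_values)  # overwrite the listed questions
            changed += 1
    return changed


# Hypothetical main data: q10 and q11 should only be answered by women.
main = [
    {"serial": 1, "gender": "male", "q10": 3, "q11": 1},
    {"serial": 2, "gender": "female", "q10": 2, "q11": 4},
    {"serial": 3, "gender": "male", "q10": 1, "q11": 2},
]

# One merge row: first column is the gender filter, the rest are blank.
n_changed = apply_forcing_row(main, "gender", "male", {"q10": None, "q11": None})
```

Here the single row is used twice, once for each male respondent, and the female record is left untouched.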

Merge files

There are four types of merge file:

Portable (serial)

Portable (other)

Coding

Forcing

See Merge file for details of each type.

Procedure

Step 1 (Main data file)

Open the data file to be updated.  This can be any type of Data file. The file is read and is ready for updating.

You can choose instead to start with an empty file and add records from one or more merge files.  You may want to do this if you have a data file with response texts instead of response numbers.

The screen will show the number of records in the main data file at the start.

Step 2 (Merge file)

Choose the multi treatment and then open the merge file(s). The files will be read in turn and the program will report the type of each one.

Each merge file is scanned and the data in it checked to make sure that it is the right type to be stored in the relevant questions.  For forcing merge files the filter definition in column 1 is checked to make sure it is valid.

The screen will show the number of records in the merge file.

If you are using a Portable (other) merge file and the entry in the first column is a character question then you will be asked whether you want to ignore case (upper or lower) when deciding which records match the first column.
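
The effect of the ignore-case choice can be sketched in Python as follows. This is only an illustration under assumptions: the function find_match, the row layout and the question names are hypothetical, and the program's actual comparison rules may differ.

```python
def find_match(main_value, merge_rows, key, ignore_case=False):
    """Return the first merge row whose key column matches main_value, or None."""

    def norm(text):
        # casefold() gives a case-insensitive form of the text
        return text.casefold() if ignore_case else text

    wanted = norm(main_value)
    for row in merge_rows:
        if norm(row[key]) == wanted:
            return row
    return None


# Hypothetical merge file keyed on a character question called "town".
merge_rows = [{"town": "Leeds", "region": "North"}]

exact = find_match("LEEDS", merge_rows, "town")                      # case matters
relaxed = find_match("LEEDS", merge_rows, "town", ignore_case=True)  # case ignored
```

With case respected, "LEEDS" does not match "Leeds" and no row is found. With case ignored, the same record is matched.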

If more than one merge file is selected they will be merged in turn.

Step 3 (Execute merge)

Choose any options you need using the check boxes:

Add new records for unmatched. For portable merge files, if the program cannot find a record in the main data file that matches a record in the merge file (in the first column) then a new record is added to the merged data file that contains the information in the merge file for that record.

Empty cells to remove existing contents.  This is normally set on for portable files but you can turn it off so that empty cells in the merge file are ignored and any data in the main data file for empty cells is left untouched.  Any cells that are not empty will always replace the contents in the main file.

Multi-coded question contents add to existing responses.  Any responses found for multi-coded questions are added to any existing responses in the main data file.  What happens with empty cells is determined by the previous setting.  For multi-coded questions spread over a number of columns, "empty" means empty in all the columns.
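
How the empty-cell and multi-coded options interact can be sketched in Python. This is a simplified model, not the program's implementation: the function merge_record, its parameter names, and the use of sets for multi-coded responses and None for empty cells are assumptions made for illustration.

```python
def merge_record(main, merge, multi_coded=(), empty_removes=True, multi_adds=False):
    """Merge one merge-file row into one main record and return the result.

    multi_coded   -- names of questions that hold a set of responses
    empty_removes -- models "Empty cells to remove existing contents"
    multi_adds    -- models "Multi-coded question contents add to existing responses"
    """
    result = dict(main)
    for question, value in merge.items():
        is_empty = value is None or value == "" or value == set()
        if is_empty:
            if empty_removes:
                # an empty cell clears whatever the main file held
                result[question] = set() if question in multi_coded else None
            # otherwise the empty cell is ignored and the main data is kept
        elif question in multi_coded and multi_adds:
            # add the new responses to any that already exist
            result[question] = set(result.get(question) or ()) | set(value)
        else:
            # a non-empty cell always replaces the main file contents
            result[question] = value
    return result


main = {"q1": 2, "q2": {1, 3}, "q3": 5}
merge = {"q1": None, "q2": {4}, "q3": 7}

default = merge_record(main, merge, multi_coded={"q2"})
kept = merge_record(main, merge, multi_coded={"q2"},
                    empty_removes=False, multi_adds=True)
```

In this model the default settings clear q1 and replace q2 with {4}. With the empty-cell option off and the multi-coded option on, q1 keeps its value of 2 and q2 becomes {1, 3, 4}. In both cases q3 is replaced by 7 because the merge cell is not empty.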

Then use Merge data now.  This causes the merge to take place:

The main data file is scanned to find matching records in the merge file.  When a match is found the question data from that row is used to replace the contents of the relevant questions in the main data file.  In a coding or forcing merge file the same row may be matched many times.

If requested, any records in the portable merge file that have not been used are added as new records.

At the end of the scan you will be informed of any problems.  You will also be informed of the number of records now in the main file and how many records in the merge file were not used (unmatched) or used more than once.
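
The scan and the figures reported at the end can be pictured with the Python sketch below. It is a minimal model under assumptions: the function merge_scan, the dictionary-based record layout and the report keys are hypothetical and are not taken from the program.

```python
from collections import Counter


def merge_scan(main_records, merge_rows, key, add_unmatched=False):
    """Scan the main records, apply matching merge rows, and report counts."""
    index = {row[key]: row for row in merge_rows}  # merge rows by first column
    uses = Counter()                               # times each merge row was used
    merged = []
    for record in main_records:
        row = index.get(record.get(key))
        if row is None:
            merged.append(dict(record))            # no match: keep as it is
        else:
            merged.append({**record, **row})       # merge values replace main values
            uses[row[key]] += 1
    unused = [k for k in index if uses[k] == 0]
    if add_unmatched:
        # models "Add new records for unmatched"
        merged.extend(dict(index[k]) for k in unused)
    report = {
        "records": len(merged),
        "unmatched": len(unused),
        "used_more_than_once": sum(1 for n in uses.values() if n > 1),
    }
    return merged, report


# Hypothetical data matched on a "town" question rather than the serial number.
main = [
    {"serial": 1, "town": "Leeds"},
    {"serial": 2, "town": "York"},
    {"serial": 3, "town": "York"},
]
merge = [
    {"town": "York", "region": "North"},
    {"town": "Bath", "region": "South West"},
]

merged, report = merge_scan(main, merge, key="town", add_unmatched=True)
```

In this example the York row is used twice, the Bath row matches nothing and is added as a new record, and the merged file ends with four records.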

Step 4 (Save merged data)

You can repeat steps 2 and 3 using different merge files before saving.

The merged data is saved to a new data file.  You can choose the type of file you want to save.