Data drill down

This window is requested from the Main window menu.

This window is used to interrogate a data file.

TIP: No licence is needed to analyse data (tables and distinction) when using a Reflect database.

There are two modes: tables mode and distinction mode.

See also the following Data drill down toolbars and images section.

Global settings that affect all table and distinction analysis can be set with Tables global.

In tables mode, drag the respondent weight, filters, quantity weights, rows, and columns from the entries list into the appropriate lists. If you are using the main program, you can also drag them from the main window.

In distinction mode, drag the respondent weight, filters, quantity weights, sample A filters, and sample B filters from the entries list into the appropriate lists, or from the main window.

Row and column sections are only used in tables mode.

Sample A and Sample B sections are only used in distinction mode.

If entries have been placed into groups, the group names are also listed on the left.  Dragging a group name into the rows or columns section adds all the relevant entries in the group to that list.

More than one quantity weight can be applied; they are multiplied together for the analysis.

More than one filter can be applied and only records that pass all the filters will be shown.
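The two combination rules above can be sketched in a few lines of Python. This is an illustration only, not the program's implementation: the record layout, the entry names, and the helper functions passes_all and combined_quantity are all hypothetical.

```python
from math import prod

# Hypothetical records: each is a dict of entry name -> value.
records = [
    {"region": "North", "age": 34, "units": 2, "price": 5.0},
    {"region": "South", "age": 51, "units": 1, "price": 8.0},
    {"region": "North", "age": 67, "units": 3, "price": 4.0},
]

# Each filter is a predicate; a record is kept only if it passes ALL of them.
filters = [
    lambda r: r["region"] == "North",
    lambda r: r["age"] < 60,
]

# Each quantity weight yields a number; the numbers are multiplied together.
quantity_weights = [
    lambda r: r["units"],
    lambda r: r["price"],
]


def passes_all(record, filters):
    """True only when the record passes every filter (logical AND)."""
    return all(f(record) for f in filters)


def combined_quantity(record, quantity_weights):
    """Product of all quantity weights for one record (1 if none are set)."""
    return prod(q(record) for q in quantity_weights)


kept = [r for r in records if passes_all(r, filters)]
weights = [combined_quantity(r, quantity_weights) for r in kept]
```

Here only the first record passes both filters, and its combined quantity weight is units multiplied by price.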

New filters can be created and applied by:

Using the New filter button

Dragging an entry into the filter or sample A/B sections

Right-clicking an entry in the entries list

In tables mode, dragging one or more selected cells from the table into the filters list.  Only records in the selected cells are then used in the analysis.

Global settings

A number of options control the analysis. See Tables global, which is opened with the relevant toolbar button.

Changed settings will be used for any analysis produced after the changes.

Tables mode

TIP: If the entry used as the rows of a table has filters applied, these will be used in addition to any filters set manually.

Many entries can be applied as rows and columns and a table will be produced from each combination.

The tables are shown one at a time with toolbar buttons to move up and down the row entries or between column entries.

You can request a display of the records that pass all the filters.

You can choose to see the raw (unweighted) figures or weighted figures if respondent weighting or quantity weighting has been used.

You can show column percentages and row percentages.

Hovering the mouse over a cell shows details of that cell, including confidence intervals for any percentages being shown.
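The manual does not say how the confidence intervals are calculated. As a rough illustration only, the sketch below uses the common normal-approximation (Wald) interval at the 95% level; both the method and the level are assumptions, and proportion_ci is a hypothetical helper.

```python
from math import sqrt


def proportion_ci(count, base, z=1.96):
    """Normal-approximation interval for a percentage.

    count: number of records in the cell
    base:  number of records in the column (or row) base
    z:     critical value; 1.96 corresponds to 95% (an assumed default)
    Returns (low, high) as percentages, clipped to the range 0 to 100.
    """
    if base <= 0:
        raise ValueError("base must be positive")
    p = count / base
    half_width = z * sqrt(p * (1 - p) / base)
    low = max(0.0, p - half_width)
    high = min(1.0, p + half_width)
    return 100 * low, 100 * high


# A cell of 40 records on a base of 100 (40%).
low, high = proportion_ci(count=40, base=100)
```

With a base of 100 the interval around 40% is roughly ten percentage points either side, which shows why small bases give wide intervals.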

The following rows and columns are produced in addition to the responses in the tabulated entries:

Missing - records that do not have a response in this entry

Count - the total number of responses given to a multi-coded entry (rows only unless horizontal percentages used)

Unweighted - if the data is weighted the unweighted total figures are also shown

ESS - if the data is weighted then the Effective Sample Size total figures may also be shown

Mean score - if the responses are values or have scores attached then the average will be shown

Standard deviation - optional if mean score produced
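The manual does not give the formula behind the ESS figure. A widely used definition is Kish's effective sample size, the square of the summed weights divided by the sum of the squared weights. The sketch below assumes that definition; effective_sample_size is a hypothetical helper, not a function of the program.

```python
def effective_sample_size(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2).

    Assumed definition; the program may calculate ESS differently.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        return 0.0
    return total * total / total_sq


# Equal weights: the ESS equals the unweighted count.
equal = effective_sample_size([1.0] * 50)

# Uneven weights: the ESS falls below the unweighted count of 3.
uneven = effective_sample_size([1.0, 1.0, 4.0])
```

Under this definition the ESS never exceeds the unweighted count, and the more the weights vary, the smaller it becomes.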

IMPORTANT: If you have applied any weights then you can choose whether to view the unweighted figures or the weighted ones.  Any percentages that you view will then be based on the figures chosen.

You can output individual tables or all the tables to a spreadsheet file (.xlsx).  To view the tables report, first set the program to use in Options settings.

Significance tests

There are two types of testing described below.  All significance testing is done by comparing columns with each other or with an imaginary column derived by subtracting a column from the total column (the "rest").

Against the rest

Each column is tested against "the rest" and significant differences shown.  "The rest" is the total column minus the column being tested.  This methodology makes sure that there are no overlapping samples.

Cells that are significantly different from "the rest" in that row are coloured accordingly (green for higher, red for lower).

Weighting effects are taken into account when doing the testing.

For mean scores (averages) a t test is done.  For normal rows a Z test on proportions is used.

Z tests on a column are not done if the total in that column or "the rest" is less than 30.  If the data is weighted then the ESS (effective sample size) is tested against this minimum value.
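The "against the rest" test on a normal row can be sketched as a two-proportion Z test. This is a minimal sketch for unweighted counts only. The pooled standard error and the 1.96 critical value are assumptions, since the manual states neither, and the function names are hypothetical.

```python
from math import sqrt

MIN_BASE = 30  # minimum base quoted in the text


def z_against_rest(col_count, col_base, total_count, total_base,
                   min_base=MIN_BASE):
    """Z statistic for one cell versus "the rest" of its row.

    "The rest" is the total column minus the column being tested, so the
    two samples never overlap. Returns None when the test is skipped.
    """
    rest_count = total_count - col_count
    rest_base = total_base - col_base
    if col_base < min_base or rest_base < min_base:
        return None  # base too small: no Z test
    p_col = col_count / col_base
    p_rest = rest_count / rest_base
    pooled = total_count / total_base
    se = sqrt(pooled * (1 - pooled) * (1 / col_base + 1 / rest_base))
    if se == 0:
        return None  # every record (or none) gave this response
    return (p_col - p_rest) / se


def colour(z, critical=1.96):
    """Green if significantly higher than the rest, red if lower."""
    if z is None or abs(z) < critical:
        return None
    return "green" if z > 0 else "red"


# Column: 60 of 100. Total: 100 of 300, so the rest is 40 of 200.
z = z_against_rest(60, 100, 100, 300)
cell_colour = colour(z)
```

For weighted data the text says the ESS, not the raw count, is compared with the minimum; that case is not covered here.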

Against other columns

If columns have identifiers (a letter in parentheses at the end of the label) then each column under a header will be tested against all the other columns under the same header.

There is an "All columns" option which means that every column with a marker is tested against all other columns with a marker, not just the ones under the same header.

Significant differences are shown by putting the relevant letters under a cell.  Letters under a cell show that the cell is significantly higher than the columns identified.

For Z tests both columns must have a base (ESS if weighted) of 30 or more.  This minimum value can be set in Tables global.

TIP: There is no minimum limit on the base for t tests because the variance is used in the calculation.

IMPORTANT: Only the higher cell is marked.

IMPORTANT: The tests are not valid if the two columns being tested are "overlapping" (some respondents are in both columns).
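The lettering rule can be illustrated for a single row. The sketch below assumes unweighted counts, non-overlapping columns, a pooled two-proportion Z test, and a 1.96 critical value; none of these details come from the manual, and letters_for_row is a hypothetical helper.

```python
from itertools import combinations
from math import sqrt


def z_two_columns(count_a, base_a, count_b, base_b):
    """Pooled two-proportion Z statistic for column A versus column B."""
    p_a, p_b = count_a / base_a, count_b / base_b
    pooled = (count_a + count_b) / (base_a + base_b)
    se = sqrt(pooled * (1 - pooled) * (1 / base_a + 1 / base_b))
    return 0.0 if se == 0 else (p_a - p_b) / se


def letters_for_row(cells, min_base=30, critical=1.96):
    """cells maps a column letter to (count, base) for one row.

    Returns, per column, the letters of the columns it is significantly
    higher than. Only the higher cell of each pair is marked.
    """
    marks = {letter: "" for letter in cells}
    for a, b in combinations(sorted(cells), 2):
        (count_a, base_a), (count_b, base_b) = cells[a], cells[b]
        if base_a < min_base or base_b < min_base:
            continue  # both columns need a base of at least min_base
        z = z_two_columns(count_a, base_a, count_b, base_b)
        if z >= critical:
            marks[a] += b
        elif z <= -critical:
            marks[b] += a
    return marks


# Column C has a base below 30, so it is never tested.
row = {"A": (60, 100), "B": (30, 100), "C": (12, 20)}
marks = letters_for_row(row)
```

In this example the letter B would appear under the cell in column A, and nothing would appear under columns B or C.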

Excel (OpenXML) output

Using the two toolbar buttons you can build up a spreadsheet file:

The left hand button only builds a sheet for the table being viewed

The right hand button builds a table for every combination of the rows and columns selected

IMPORTANT: Be careful if you have more than one column entry selected and a large number of row entries selected.  The number of tables generated is the number of row entries multiplied by the number of column entries, which can be very large and take a long time to produce.
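The way the number of tables grows can be seen with a short sketch. The entry names below are hypothetical.

```python
from itertools import product

# Hypothetical selections.
row_entries = ["Q1", "Q2", "Q3", "Q4"]
column_entries = ["Gender", "Region", "Age band"]

# One table is built for every (row entry, column entry) combination.
tables = list(product(row_entries, column_entries))
table_count = len(tables)
```

Four row entries and three column entries already give twelve tables; 200 row entries with the same three column entries would give 600.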

At any time you can ask to view the spreadsheet file in which case it is finalised and then opened in the default program for .xlsx files (usually Excel or OpenOffice).  Future tables will be placed in a new file with a different name.

Distinction mode

This scans the whole data file for two samples (A and B) and finds mean scores and responses that are statistically significantly different for the two samples.

TIP: Distinction analysis is much quicker for finding differences between two sub-sets of the data than producing tables.

You can choose to use the raw (unweighted) figures or weighted figures if respondent weighting or quantity weighting has been used.

Only records that pass all the global filters will be inspected.

Filtered questions will only use records that pass the filters.

Sample A is records that pass all the filters in the sample A section.

Sample B is records that pass all the filters in the sample B section. If sample B is not set, it consists of all other records that pass the global filters and are not in sample A ("the rest").
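The rules for building the two samples can be sketched as follows. The record layout and the split_samples helper are hypothetical and only illustrate the logic described above.

```python
def split_samples(records, global_filters, a_filters, b_filters=None):
    """Return (sample_a, sample_b) following the rules in the text.

    Only records passing every global filter are inspected. Sample A is
    the records passing every sample A filter. Sample B is the records
    passing every sample B filter or, when no sample B filters are set,
    all inspected records that are not in sample A ("the rest").
    """
    in_scope = [r for r in records if all(f(r) for f in global_filters)]
    sample_a = [r for r in in_scope if all(f(r) for f in a_filters)]
    if b_filters:
        sample_b = [r for r in in_scope if all(f(r) for f in b_filters)]
    else:
        sample_b = [r for r in in_scope
                    if not all(f(r) for f in a_filters)]
    return sample_a, sample_b


# Hypothetical data.
records = [
    {"id": 1, "age": 25, "region": "North"},
    {"id": 2, "age": 45, "region": "North"},
    {"id": 3, "age": 30, "region": "South"},
    {"id": 4, "age": 70, "region": "North"},
]
global_filters = [lambda r: r["region"] == "North"]
a_filters = [lambda r: r["age"] < 40]

# Sample B is not set, so it becomes "the rest".
sample_a, sample_b = split_samples(records, global_filters, a_filters)
```

Record 3 fails the global filter, so it appears in neither sample even though it would pass the sample A filter.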

The checking looks at these types:

Individual responses to all single-coded and multi-coded entries, using a Z test

The mean scores of all entries that have score values attached to responses, using a t test

The mean values of all integer or float entries, using a t test
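The manual says a t test is used for means but not which variant. The sketch below uses Welch's unequal-variance t test on unweighted values, which is an assumption. It returns only the t statistic and degrees of freedom, and welch_t is a hypothetical helper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    # Per-sample variance of the mean (sample variance divided by n).
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    se_sq = v_a + v_b
    if se_sq == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se_sq ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical scores for sample A and sample B.
scores_a = [4, 5, 5, 4, 5, 3]
scores_b = [2, 3, 2, 3, 1, 3]
t, df = welch_t(scores_a, scores_b)
```

The t statistic would then be compared with the critical value for the chosen significance level at df degrees of freedom.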

TIP: The calculations can ignore any entry for which one of the samples has no responses at all.  This is useful for imported projects where the filters on questions have not been defined.  You can turn off this option in Tables global if you have a project with filters applied to all relevant questions.

The Z test on responses is not applied to any sample smaller than the minimum set in Tables global, which defaults to 30.

The list is shown in descending order of significance percentage level, with all rows for an entry kept together.

The lowest level shown is the lowest level set in Tables global.

If the number of rows to display exceeds the setting in Tables global, the display is cut short and a message appears at the top of the display.
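One possible reading of the ordering and truncation rules is sketched below. It assumes that entries are ordered by their highest significance level and that rows within an entry are also sorted in descending order; the manual does not spell this out. The tuple layout and the order_results helper are hypothetical.

```python
from collections import defaultdict


def order_results(rows, lowest_level, max_rows):
    """Order distinction results for display.

    rows: list of (entry, response, level) tuples, level as a percentage.
    Rows below lowest_level are dropped, rows for the same entry are kept
    together, and the list is cut short after max_rows rows.
    Returns (rows_to_display, was_truncated).
    """
    kept = [r for r in rows if r[2] >= lowest_level]
    by_entry = defaultdict(list)
    for row in kept:
        by_entry[row[0]].append(row)
    # Assumed ordering: entries by their best level, highest first.
    groups = sorted(by_entry.values(),
                    key=lambda g: max(r[2] for r in g), reverse=True)
    ordered = [r for g in groups
               for r in sorted(g, key=lambda r: r[2], reverse=True)]
    return ordered[:max_rows], len(ordered) > max_rows


# Hypothetical results.
rows = [
    ("Q1", "Yes", 99.0),
    ("Q2", "Blue", 99.9),
    ("Q1", "No", 95.0),
    ("Q3", "Often", 80.0),
    ("Q2", "Red", 90.0),
]
shown, truncated = order_results(rows, lowest_level=90.0, max_rows=3)
```

Here the Q3 row falls below the lowest level, four rows remain, and the display limit of three cuts the list short, so a message would be shown.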

Excel (OpenXML) output

Using the toolbar button you can build up a spreadsheet file by adding the current distinction analysis as a new sheet.

At any time you can ask to view the spreadsheet file in which case it is finalised and then opened in the default program for .xlsx files (usually Excel or OpenOffice).  Future tables will be placed in a new file with a different name.

TIP: The data sheet generated will always contain the full list even if the displayed list has been cut short and a warning shown at the top of the analysis.

The colours set in Tables global will be used in the Excel output.  If the lowest level being used is less than the lowest coloured level then tests below this level will not be coloured.