When computers were first used to analyse market research data the most common means of storing data was column binary (card image) format but, with the advent of telephone and web interviewing packages, ASCII data became the favourite. Now, however, more and more data are stored in Excel or CSV files – and there are numerous advantages to this approach including:
MERLIN can read and write data in Excel or CSV files, and the principles described below apply to both. We refer to data fields / cells in such files as freefields (free format fields) to distinguish them from fixed location fields in ASCII and column binary files, which we have traditionally called fields.
Assuming that the first row in each data file is a header containing the names of the variables (which is nearly always the case) you will need to specify one of these RCPs (Run Control Parameters):
The RECORD LENGTH statement needs to specify the number of freefields in your data file(s), rather than the number of characters, e.g. RECORD LENGTH 57 FIELDS. There could, in fact, be more freefields, but this would mean you are only analysing the first 57.
To refer to a freefield, you can use one of three approaches (which can be mixed freely within the same script). In all cases, the reference starts with $* (as opposed to $ on its own, used to refer to ASCII and column binary data locations):
The last approach is the most flexible, since Q25 could be in any location, and its position may vary from file to file (which is often the case when you are analysing different waves of the same survey together).
Another advantage of Excel or CSV files is that multi-coded variables can be stored in a single freefield, as opposed to the traditional approach of entering code 0 or 1 in a series of locations (known as a ‘bitstring’). The bitstring approach may still be used with freefield data but the former is easier to work with, and more flexible if the number of valid codes changes. A single freefield could therefore contain “1,7,10” to signify these 3 values (the quotation marks are optional in Excel, but compulsory in a CSV file). To indicate that a freefield may contain more than one value, an extra asterisk must be added after the freefield reference, e.g.
If your data file is not in Excel or CSV format, you can use MERLIN Toolkit to convert it, e.g. you can import from SPSS, and you can automatically generate a MERLIN script defining the variables.
Any questions? Email support@merlinco.co.uk.