In blog #8, we discussed weighting in general, the terminology used, and how to do some types of weighting in MERLIN. In this blog about rim weighting, we will assume you are familiar with the terminology but, if not, we advise you to read blog #8 first.
How rim weighting works
As an example, we will use a project with two rims – gender and age – and the targets shown below.
Just to be clear, we cannot use target weighting because we don’t know the targets for the individual cells of the interlaced matrix, e.g. we don’t know the target for males aged 16-34. We know only the totals along the ‘edges’ or ‘rims’ of the interlaced matrix.
Some people have approached this problem by calculating the factor needed to achieve the gender targets then, having applied this factor to the data, calculated the factor needed to achieve the age targets – then multiplied both factors together. Although this will achieve the age targets, it is unlikely to achieve the gender targets – and if we do the whole process the other way round, the gender targets will be achieved but not the age ones. Rim weighting, however, is a way of achieving two or more sets of targets at the same time.
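The effect of applying one set of factors after the other can be checked with a few lines of arithmetic. The sketch below is in Python rather than MERLIN script, and the unweighted counts are invented purely for illustration; only the targets (42/58 for gender, 31/33/36 for age) come from the example used later in this blog.

```python
# Hypothetical unweighted counts (invented for illustration):
# rows = gender (2 categories), columns = age (3 bands).
actual = [[120.0, 90.0, 60.0],
          [80.0, 110.0, 140.0]]
gender_target = [42.0, 58.0]       # percentages
age_target = [31.0, 33.0, 36.0]    # percentages


def row_pct(table):
    """Percentage of the grand total falling in each row."""
    total = sum(map(sum, table))
    return [100.0 * sum(row) / total for row in table]


def col_pct(table):
    """Percentage of the grand total falling in each column."""
    total = sum(map(sum, table))
    return [100.0 * sum(col) / total for col in zip(*table)]


# Step 1: one factor per gender, applied to the unweighted data.
g_fac = [t / p for t, p in zip(gender_target, row_pct(actual))]
step1 = [[v * g_fac[i] for v in row] for i, row in enumerate(actual)]

# Step 2: one factor per age band, calculated on the gender-weighted data.
a_fac = [t / p for t, p in zip(age_target, col_pct(step1))]
step2 = [[v * a_fac[j] for j, v in enumerate(row)] for row in step1]

gender_after = row_pct(step2)
age_after = col_pct(step2)
print("gender %:", [round(p, 2) for p in gender_after])
print("age %:   ", [round(p, 2) for p in age_after])
```

With these invented counts the age percentages land exactly on target, while the gender percentages drift away from 42/58, which is the behaviour described above.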
Unlike target weighting, rim weighting is virtually impossible to calculate without a computer, since it involves trying many different factors until we find the ones that achieve all the targets. These attempts to find suitable factors are known as iterations. Sometimes it simply isn’t possible to achieve all the targets, and MERLIN will report this (but see the next section). Another point to understand about rim weighting is that different programs use different methods of calculation, so they may end up with different factors. Although the weighted figures would be the same when tabulating gender or age, they may not be the same when tabulating other variables.
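To give a feel for what an iteration involves, here is a minimal Python sketch of one common approach, often called raking or iterative proportional fitting. It is not a description of MERLIN's own calculation, which this blog does not document; the function name `rake`, its parameters and the unweighted counts are all assumptions made for illustration.

```python
def rake(actual, row_targets, col_targets, max_iter=500, tol=1e-6):
    """Alternately scale rows and columns until both rims match.

    Illustrative only: this is generic raking, not MERLIN's algorithm.
    Returns (factors, iterations_used, converged).
    """
    total = sum(map(sum, actual))
    # Express each rim target as a count out of the unweighted total.
    row_goal = [total * t / sum(row_targets) for t in row_targets]
    col_goal = [total * t / sum(col_targets) for t in col_targets]
    fitted = [row[:] for row in actual]
    converged = False
    used = 0
    for used in range(1, max_iter + 1):
        # Scale each row so that it adds up to its target.
        for i, row in enumerate(fitted):
            s = sum(row)
            if s > 0:
                fitted[i] = [v * row_goal[i] / s for v in row]
        # Scale each column so that it adds up to its target.
        for j, goal in enumerate(col_goal):
            s = sum(row[j] for row in fitted)
            if s > 0:
                for row in fitted:
                    row[j] *= goal / s
        # Columns were scaled last, so only the rows can still be off.
        if max(abs(sum(r) - g) for r, g in zip(fitted, row_goal)) < tol:
            converged = True
            break
    # A factor is the fitted cell divided by the unweighted cell.
    factors = [[f / a if a else 0.0 for f, a in zip(fr, ar)]
               for fr, ar in zip(fitted, actual)]
    return factors, used, converged


# Hypothetical unweighted counts: rows = gender, columns = age.
actual = [[120.0, 90.0, 60.0], [80.0, 110.0, 140.0]]
factors, used, converged = rake(actual, [42.0, 58.0], [31.0, 33.0, 36.0])
weighted = [[a * f for a, f in zip(ar, fr)] for ar, fr in zip(actual, factors)]
grand = sum(map(sum, weighted))
first_row_pct = 100.0 * sum(weighted[0]) / grand
first_col_pct = 100.0 * sum(row[0] for row in weighted) / grand
print(used, converged, round(first_row_pct, 3), round(first_col_pct, 3))
```

The `converged` flag is what lets a caller tell the difference between finding suitable factors and simply running out of iterations.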
Run Control Parameters (RCPs) MAXRIMIT and OKRIM
By default, MERLIN will do 500 iterations before abandoning rim weighting calculations – but this can be increased with MAXRIMIT=n, where n is any number up to 32767. In practice, if MERLIN cannot achieve the targets in 500 attempts, it cannot usually achieve them in more – but it is worth a try. There are two main reasons why rim weighting targets cannot be achieved:
RCP OKRIM specifies that MERLIN will continue with the run even if the targets are not achieved. Sometimes in this situation, inspection of weighted tables shows that the results are close enough or, because of rounding, they may even appear to be what is required.
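One way to judge whether results are close enough is to compare the achieved percentages with the targets, both exactly and after rounding to the precision printed in the tables. The Python sketch below assumes whole-number percentages are printed; the achieved figures and the helper `compare` are hypothetical, not MERLIN output.

```python
def compare(achieved, targets, decimals=0):
    """Return (target, achieved, gap, same_when_rounded) for each category."""
    rows = []
    for a, t in zip(achieved, targets):
        same = round(a, decimals) == round(t, decimals)
        rows.append((t, a, abs(a - t), same))
    return rows


# Hypothetical percentages from a run that did not fully achieve its targets.
gender = compare([42.3, 57.7], [42.0, 58.0])
age = compare([31.4, 32.8, 35.8], [31.0, 33.0, 36.0])

worst = max(r[2] for r in gender + age)
all_match_when_rounded = all(r[3] for r in gender + age)
print("largest gap:", round(worst, 2))
print("looks right in whole percentages:", all_match_when_rounded)
```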
Calculating weighting for two rims
As with target weighting, the calculation is done with MERLIN’s powerful MANIP facilities.
First, in the Data Stage we need to generate a table (#RIMACT) showing the actual (unweighted) figures for gender by age, then three empty tables to be used in MANIP – a ‘total only’ table for each rim variable (#RIMV01 and #RIMV02), and another table of gender by age (#RIMWGT) that will contain the weighting factors calculated.
F=NPTB, !don’t print these tables
T#RIMACT = $GENDER*$AGE,
T#RIMV01 (F=NITB) = $GENDER*,
T#RIMV02 (F=NITB) = $AGE*,
T#RIMWGT (F=NITB) = $GENDER*$AGE,
Then, in the Manip Stage, we specify the target figures for each rim variable in #RIMV01 and #RIMV02. As with target weighting, it doesn’t matter whether these figures are numbers or percentages (and they could even vary between rims) providing the first figure is the total of the others.
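The rule that the first figure must be the total of the others is easy to check before running a job. The Python sketch below uses a hypothetical helper, `target_shares`, to show why numbers and percentages are interchangeable: both reduce to the same proportions.

```python
def target_shares(figures, tol=1e-6):
    """Split (total, part1, part2, ...) into proportions of the total.

    Raises ValueError if the first figure is not the sum of the rest.
    """
    total, parts = figures[0], figures[1:]
    if abs(sum(parts) - total) > tol:
        raise ValueError(
            "first figure %s is not the total of the others (%s)"
            % (total, sum(parts))
        )
    return [p / total for p in parts]


as_percentages = target_shares((100.0, 42.0, 58.0))
as_numbers = target_shares((1500.0, 630.0, 870.0))
print(as_percentages, as_numbers)

try:
    target_shares((100.0, 42.0, 48.0))   # parts add up to 90, not 100
    rejected = False
except ValueError:
    rejected = True
print("inconsistent targets rejected:", rejected)
```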
To get the weighting factors into table #RIMWGT, we use function %RIM. The first argument is the total weighted base required, the second is the name of the table containing the actual figures, and the remainder are the names of the tables containing the rim targets. If the total weighted base is required to be the same as the unweighted, specify the first argument as U.
MT#RIMV01 = (100.0,42.0,58.0),
MT#RIMV02 = (100.0,31.0,33.0,36.0),
MT#RIMWGT = %RIM(1500.0,#RIMACT,#RIMV01,#RIMV02),
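The role of the first argument can be pictured as a final scaling step: once a set of factors reproduces the rim percentages, multiplying every factor by the same constant changes the weighted base without disturbing those percentages. The Python sketch below shows that arithmetic only; how %RIM handles the base internally is not stated here, and the counts, factors and function name are hypothetical.

```python
def scale_to_base(actual, factors, base="U"):
    """Rescale factors so the weighted total equals `base`.

    base="U" means "same as the unweighted total", mirroring the U argument.
    """
    unweighted = sum(map(sum, actual))
    weighted = sum(a * f
                   for a_row, f_row in zip(actual, factors)
                   for a, f in zip(a_row, f_row))
    wanted = unweighted if base == "U" else float(base)
    k = wanted / weighted
    return [[f * k for f in f_row] for f_row in factors]


def weighted_total(actual, factors):
    return sum(a * f
               for a_row, f_row in zip(actual, factors)
               for a, f in zip(a_row, f_row))


# Hypothetical unweighted counts and unscaled factors (gender by age).
actual = [[120.0, 90.0, 60.0], [80.0, 110.0, 140.0]]
raw = [[0.90, 0.95, 1.00], [1.00, 1.05, 1.10]]

to_1500 = scale_to_base(actual, raw, 1500.0)
to_unweighted = scale_to_base(actual, raw, "U")
print(round(weighted_total(actual, to_1500), 6))
print(round(weighted_total(actual, to_unweighted), 6))
```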
Finally, in the Tables Stage, we extract the weight from the relevant Row and Column of table #RIMWGT, and apply it to all tables following.
DW $WT=#RIMWGT(R$GENDER,C$AGE)
SELECT WR $WT,
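Conceptually, the extraction is a table lookup: each respondent's gender code picks the row and their age code picks the column. The Python sketch below illustrates that idea with a hypothetical factor table and a handful of invented respondents; codes are assumed to start at 1.

```python
# Hypothetical factor table: rows = gender code 1..2, columns = age code 1..3.
rimwgt = [[0.90, 0.95, 1.00],
          [1.00, 1.05, 1.10]]

# Invented respondent records.
respondents = [
    {"id": 1, "gender": 1, "age": 1},
    {"id": 2, "gender": 2, "age": 3},
    {"id": 3, "gender": 1, "age": 2},
    {"id": 4, "gender": 2, "age": 1},
]


def weight_for(resp, table):
    """Look up the factor in the row for gender and the column for age."""
    return table[resp["gender"] - 1][resp["age"] - 1]


for r in respondents:
    r["wt"] = weight_for(r, rimwgt)

# A weighted count is the sum of the weights of the respondents in a cell.
weighted_gender_1 = sum(r["wt"] for r in respondents if r["gender"] == 1)
print(round(weighted_gender_1, 2))
```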
A complete example script can be found in item 11.6 of the MERLIN Tips and Examples library.
Using more than two rims
If we have more than two rims, we need to interlace some of them so we still end up with two variables to use in tables #RIMACT and #RIMWGT. If, for example, we have rims for gender, age and area, we could interlace gender and age…
DS $SIDE=$GENDER.BY.$AGE,
… then tabulate $SIDE * $AREA.
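Interlacing two variables amounts to giving every combination of their categories its own code. The Python sketch below shows one way to number the combinations, assuming the first variable changes slowest; the ordering MERLIN's .BY. actually produces is not stated here, so treat both the ordering and the helper `interlace` as assumptions.

```python
def interlace(codes, sizes):
    """Combine 1-based category codes into a single 1-based code.

    codes: the respondent's code on each variable, e.g. (gender, age)
    sizes: the number of categories in each variable, e.g. (2, 3)
    """
    combined = 0
    for code, size in zip(codes, sizes):
        if not 1 <= code <= size:
            raise ValueError("code %d is outside 1..%d" % (code, size))
        combined = combined * size + (code - 1)
    return combined + 1


sizes = (2, 3)                           # 2 genders by 3 age bands = 6 cells
first = interlace((1, 1), sizes)         # first gender, first age band
fourth = interlace((2, 1), sizes)        # second gender, first age band
last = interlace((2, 3), sizes)          # second gender, last age band
print(first, fourth, last)
```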
If we have four rims, we could interlace the first three, and so on, but we might eventually exceed some limits. The maximum number of rows in a table is 32000 and the maximum number of columns is 1500 so, if interlacing all except the last variable gives more than 32000 categories (each of which becomes a row), we must also interlace some variables to be tabulated across the top, e.g.
DS $SIDE = $GENDER.BY.$AGE.BY.$CLASS.BY.$MARITAL.BY.$HHSIZE,
DS $TOP = $AREA.BY.$CARS,
The maximum number of rims is 100, and the maximum number of cells when interlacing them all is 1,228,800.
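Before settling on how to split the rims between side and top, it can help to multiply out the category counts and compare them with the limits quoted above. The Python sketch below does this for invented category counts; it assumes the 1,228,800 limit applies to rows multiplied by columns, and the function name is hypothetical.

```python
from math import prod

MAX_ROWS = 32000        # limits as quoted in this blog
MAX_COLS = 1500
MAX_RIMS = 100
MAX_CELLS = 1_228_800   # assumed to mean rows x columns


def check_layout(side_sizes, top_sizes):
    """Report the table size implied by interlacing the given rims."""
    rows = prod(side_sizes)
    cols = prod(top_sizes)
    ok = (rows <= MAX_ROWS and cols <= MAX_COLS
          and rows * cols <= MAX_CELLS
          and len(side_sizes) + len(top_sizes) <= MAX_RIMS)
    return {"rows": rows, "cols": cols, "cells": rows * cols, "ok": ok}


# Invented category counts: gender 2, age 6, class 6, marital 5,
# household size 6, area 20, cars 4.
all_but_last = check_layout([2, 6, 6, 5, 6, 20], [4])
split = check_layout([2, 6, 6, 5, 6], [20, 4])
print(all_but_last)
print(split)
```

With these invented counts, interlacing everything except the last rim needs 43200 rows and fails the check, while moving area across the top alongside cars brings the table down to 2160 rows by 80 columns.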
Further examples can be found in items 11.7, 11.8 and 11.9 of the Tips and Examples library.
Any questions? Email support@merlinco.co.uk. ls