Weighting

Core weighting functions.

This module provides the main functions for applying weights to survey data using the RIM (Raking) algorithm. It includes functions to weight dataframes, return weight series, and calculate weighting efficiency metrics.

weight(df, scheme, verbose=False)

Weight a dataframe using a Rim scheme and return the weights as a series.

The dataframe must have a column for each dimension in the scheme. String columns are automatically converted to categorical, allowing easier processing.

Parameters:

Name     Type       Description                                           Default
df       DataFrame  The survey dataframe to weight                        required
scheme   Rim        The Rim scheme object defining the weighting targets  required
verbose  bool       If True, prints progress information                  False

Returns:

Type    Description
Series  A series containing the calculated weights
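
To illustrate what raking does conceptually, here is a minimal sketch of iterative proportional fitting in plain Python. It is not this module's implementation: the function name `rake`, the list-of-dicts input, the targets dictionary, and the sample data are all assumptions made for illustration, and the real function takes a DataFrame and a Rim scheme instead.

```python
from collections import defaultdict


def rake(rows, targets, max_iter=100, tol=1e-9):
    """Hypothetical raking sketch (iterative proportional fitting).

    rows:    list of dicts, one per respondent
    targets: {dimension: {category: target share}}, shares summing to 1
    Returns a list of weights with a mean of 1.
    Every category present in the data must appear in the targets.
    """
    n = len(rows)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_diff = 0.0
        for dim, shares in targets.items():
            # Current weighted total for each category of this dimension.
            totals = defaultdict(float)
            for row, w in zip(rows, weights):
                totals[row[dim]] += w
            grand = sum(totals.values())
            # Factor that moves each category onto its target share.
            factors = {cat: shares[cat] * grand / totals[cat] for cat in totals}
            weights = [w * factors[row[dim]] for row, w in zip(rows, weights)]
            max_diff = max(max_diff, max(abs(f - 1.0) for f in factors.values()))
        # Stop once a full pass over all dimensions barely changes anything.
        if max_diff < tol:
            break
    mean = sum(weights) / n
    return [w / mean for w in weights]


# Made-up sample: 6 women / 4 men, 7 young / 3 old respondents.
rows = (
    [{"gender": "f", "age": "young"}] * 4
    + [{"gender": "f", "age": "old"}] * 2
    + [{"gender": "m", "age": "young"}] * 3
    + [{"gender": "m", "age": "old"}] * 1
)
targets = {
    "gender": {"f": 0.5, "m": 0.5},
    "age": {"young": 0.5, "old": 0.5},
}
weights = rake(rows, targets)
```

Each pass rescales the weights one dimension at a time, so fixing one dimension slightly disturbs the others; repeating the passes until the adjustment factors are all close to 1 is what brings every dimension onto its target at once.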

weight_dataframe(df, scheme, weight_column='weights', verbose=False)

Weight a dataframe using a Rim scheme and return the weighted dataframe.

The dataframe must have a column for each dimension in the scheme. String columns are automatically converted to categorical, allowing easier processing.

Parameters:

Name           Type       Description                                           Default
df             DataFrame  The survey dataframe to weight                        required
scheme         Rim        The Rim scheme object defining the weighting targets  required
weight_column  str        Name of the column to store the calculated weights    "weights"
verbose        bool       If True, prints progress information                  False

Returns:

Type       Description
DataFrame  The input dataframe with an additional weight column
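
Once a weight column is attached, a common sanity check is to confirm that the weighted shares of each dimension match the targets. The sketch below shows that check in plain Python, with a list of dicts standing in for dataframe rows. The helper `weighted_shares` and the sample records are hypothetical and not part of this module.

```python
from collections import defaultdict


def weighted_shares(rows, column, weight_column="weights"):
    """Hypothetical helper: share of total weight held by each category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[column]] += row[weight_column]
    grand = sum(totals.values())
    return {category: total / grand for category, total in totals.items()}


# Made-up weighted records: 3 women and 2 men, weighted towards a 50/50 split.
weighted_rows = [
    {"gender": "f", "weights": 0.8},
    {"gender": "f", "weights": 0.8},
    {"gender": "f", "weights": 0.9},
    {"gender": "m", "weights": 1.25},
    {"gender": "m", "weights": 1.25},
]
shares = weighted_shares(weighted_rows, "gender")
```

The unweighted split here is 60/40, while the weighted shares come out at roughly one half each.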

weighting_efficiency(weights)

Calculate the weighting efficiency, based on Kish's effective sample size.

This metric expresses the effective sample size as a percentage of the actual sample size, indicating how much of the sample's statistical power remains after weighting. Higher values (closer to 100%) indicate more efficient weights.

Parameters:

Name     Type    Description                                Default
weights  Series  Series containing the calculated weights   required

Returns:

Type   Description
float  The weighting efficiency as a percentage (0-100)
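
As a worked illustration, Kish's effective sample size is (sum of weights)^2 / (sum of squared weights), and dividing it by the actual sample size gives a percentage. The sketch below applies that textbook formula to a plain list. The function name `weighting_efficiency_sketch` is hypothetical, and the module's own function may differ in details such as input validation.

```python
def weighting_efficiency_sketch(weights):
    """Kish effective sample size as a percentage of the actual sample size."""
    weights = list(weights)
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if n == 0 or total_sq == 0:
        raise ValueError("need at least one non-zero weight")
    n_eff = total * total / total_sq  # Kish's effective sample size
    return 100.0 * n_eff / n


# Equal weights lose nothing; uneven weights shrink the effective sample.
equal = weighting_efficiency_sketch([1.0, 1.0, 1.0, 1.0])
uneven = weighting_efficiency_sketch([0.5, 0.5, 1.5, 1.5])
```

With four equal weights the efficiency is 100%. With the uneven weights the effective sample size is 16 / 5 = 3.2 out of 4 respondents, so the efficiency is about 80%.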