Weighting
Core weighting functions.
This module provides the main functions for applying weights to survey data using the RIM (Raking) algorithm. It includes functions to weight dataframes, return weight series, and calculate weighting efficiency metrics.
weight(df, scheme, verbose=False)
Compute weights for a dataframe using a Rim scheme and return them as a series.
The dataframe must have a column for each dimension in the scheme. String columns are automatically converted to categorical, allowing easier processing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | The survey dataframe to weight | required |
| `scheme` | `Rim` | The Rim scheme object defining the weighting targets | required |
| `verbose` | `bool` | If True, prints progress information | `False` |
Returns:
| Type | Description |
|---|---|
| `Series` | A series containing the calculated weights |
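To make the raking idea concrete, here is a minimal sketch of iterative proportional fitting written with plain pandas. It is not this module's implementation: the function name `rake`, the `targets` dictionary layout, and the convergence settings are all assumptions made for illustration, and a real `Rim` scheme may carry more information than this.

```python
import pandas as pd


def rake(df, targets, max_iter=100, tol=1e-8):
    """Illustrative raking (iterative proportional fitting).

    `targets` is a hypothetical structure mapping each column name to
    {category: target proportion}. Every category present in the data is
    assumed to have a target, and every target category to appear in the data.
    """
    w = pd.Series(1.0, index=df.index)
    for _ in range(max_iter):
        max_gap = 0.0
        for col, props in targets.items():
            # Target share of the category each row belongs to.
            target = df[col].map(props).astype(float)
            # Current weighted share of that same category.
            current = w.groupby(df[col]).transform("sum") / w.sum()
            max_gap = max(max_gap, (current - target).abs().max())
            # Scale each row so this dimension matches its targets.
            w = w * target / current
        if max_gap < tol:
            break
    # Normalise so the weights average to 1.
    return w * len(w) / w.sum()


df = pd.DataFrame(
    {
        "gender": ["m", "m", "m", "m", "m", "f", "f", "f"],
        "age": ["young", "young", "young", "old", "old", "young", "old", "old"],
    }
)
targets = {
    "gender": {"m": 0.5, "f": 0.5},
    "age": {"young": 0.4, "old": 0.6},
}
weights = rake(df, targets)
```

Each pass rescales the weights one dimension at a time, so matching one margin slightly disturbs the others. Repeating the passes until the largest gap falls below the tolerance is what brings all margins into agreement.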
weight_dataframe(df, scheme, weight_column='weights', verbose=False)
Weight a dataframe using a Rim scheme and return the weighted dataframe.
The dataframe must have a column for each dimension in the scheme. String columns are automatically converted to categorical, allowing easier processing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | The survey dataframe to weight | required |
| `scheme` | `Rim` | The Rim scheme object defining the weighting targets | required |
| `weight_column` | `str` | Name of the column to store the calculated weights | `"weights"` |
| `verbose` | `bool` | If True, prints progress information | `False` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | The input dataframe with an additional weight column |
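Once a dataframe carries a weight column, a quick sanity check is to compare the weighted share of each category against the scheme's targets. The sketch below uses only pandas. The helper name `weighted_margin` and the sample data are hypothetical and not part of this module.

```python
import pandas as pd


def weighted_margin(df, column, weight_column="weights"):
    """Weighted share of each category in `column` (hypothetical helper)."""
    totals = df.groupby(column)[weight_column].sum()
    return totals / totals.sum()


# Hand-made example standing in for a weighted dataframe.
weighted = pd.DataFrame(
    {
        "gender": ["m", "m", "f"],
        "weights": [0.75, 0.75, 1.5],
    }
)
shares = weighted_margin(weighted, "gender")
```

Here the two male respondents are weighted down and the single female respondent is weighted up, so both categories end at a weighted share of one half.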
weighting_efficiency(weights)
Calculate the weighting efficiency, based on Kish's effective sample size.
The metric expresses the effective sample size as a percentage of the actual sample size, which shows how much precision is lost to weighting. Higher values (closer to 100%) indicate more efficient weights.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `weights` | `Series` | Series containing the calculated weights | required |
Returns:
| Type | Description |
|---|---|
| `float` | The weighting efficiency as a percentage (0-100) |
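For reference, the sketch below computes the efficiency under the common definition: Kish's effective sample size, `(Σw)² / Σw²`, divided by the number of respondents and expressed as a percentage. The function name `kish_efficiency` is hypothetical, and the module's own calculation is assumed, not confirmed, to follow this formula.

```python
import pandas as pd


def kish_efficiency(weights: pd.Series) -> float:
    """Effective sample size as a percentage of the actual sample size."""
    n = len(weights)
    effective_n = weights.sum() ** 2 / (weights ** 2).sum()
    return 100.0 * effective_n / n


equal = kish_efficiency(pd.Series([1.0, 1.0, 1.0, 1.0]))
uneven = kish_efficiency(pd.Series([1.0, 1.0, 2.0, 2.0]))
```

Equal weights give an efficiency of 100, and the efficiency drops as the weights spread out. In the second call the effective sample size is 36 / 10 = 3.6 out of 4 respondents, or 90%.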