High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. Development started from a fork of scikit-learn, so glum has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
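To make the regularization options above concrete, here is a minimal pure-Python sketch of the elastic net penalty in its conventional parametrization: a strength `alpha` and a mixing weight `l1_ratio`, the same parameter names that appear in the housing example. The helpers `elastic_net_penalty` and `soft_threshold` are hypothetical names used for illustration and are not part of glum; the exact scaling glum applies internally is an assumption here, so check the documentation before relying on it.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Conventional elastic net penalty (assumed parametrization):
    alpha * (l1_ratio * ||b||_1 + 0.5 * (1 - l1_ratio) * ||b||_2^2)."""
    l1 = sum(abs(b) for b in coefs)
    l2 = sum(b * b for b in coefs)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


def soft_threshold(z, threshold):
    """Proximal operator of the L1 penalty: shrink z toward zero and
    return exactly 0.0 when |z| <= threshold."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


coefs = [0.8, -0.05, 0.0, 0.3]

# l1_ratio=1.0 is a pure L1 (lasso) penalty, l1_ratio=0.0 a pure L2 (ridge) penalty.
lasso = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)
ridge = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)

# Soft-thresholding sets small coefficients to exactly zero, which is
# why L1 regularization tends to produce sparse solutions.
shrunk = [soft_threshold(b, 0.1) for b in coefs]
print(lasso, ridge, shrunk)
```

The small coefficient `-0.05` is mapped to exactly zero by the soft-thresholding step, while the larger ones are only shrunk. An L2 penalty, by contrast, shrinks coefficients without zeroing them.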
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are far more observations than predictors), glum is consistently much faster across a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house sold below the median price (y = 1) or not (y = 0)
>>> # via a binomial distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
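One way to sanity-check the first row of this diagnostics table: for a binomial model, the average negative log-likelihood of a model that predicts the same probability for every observation is the binary entropy of that probability, which is at most ln 2 ≈ 0.693147. The sketch below computes this in plain Python. It assumes, without confirmation from the source, that the reported objective is an average negative log-likelihood (plus the penalty term) and that the solver starts near an intercept-only fit. `mean_log_loss` is a hypothetical helper, and the labels are toy data rather than the housing data.

```python
import math


def mean_log_loss(y, p):
    """Average negative log-likelihood of 0/1 labels y when every
    observation is assigned the same probability p of being 1."""
    total = sum(yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y)
    return -total / len(y)


# Perfectly balanced labels: the best constant prediction is p = 0.5,
# and the loss equals ln 2.
y_balanced = [1, 0, 1, 0, 0, 1, 0, 1]
balanced_loss = mean_log_loss(y_balanced, 0.5)

# Unbalanced labels (4 ones out of 10): the best constant prediction is
# the label mean, and the loss drops below ln 2.
y_unbalanced = [1, 0, 0, 0, 1, 0, 1, 0, 0, 1]
p_hat = sum(y_unbalanced) / len(y_unbalanced)
unbalanced_loss = mean_log_loss(y_unbalanced, p_hat)

print(balanced_loss, unbalanced_loss, math.log(2))
```

Under those assumptions, a starting value just under ln 2, as in the table above, is consistent with labels that are close to, but not exactly, evenly split. The later rows then show how much the features improve on that baseline.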
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge