High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. Development started from a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
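To make the relationship between the L1, L2, and elastic net options concrete, here is a minimal sketch of the elastic net penalty in the common parameterization used by scikit-learn's ElasticNet, where `alpha` sets the overall strength and `l1_ratio` mixes the two terms. The helper `elastic_net_penalty` is hypothetical and written only for illustration; it is not part of glum, and glum's exact scaling (and its matrix-valued penalties) should be checked against its documentation.

```python
def elastic_net_penalty(coef, alpha, l1_ratio):
    """Hypothetical helper: alpha * (l1_ratio * ||coef||_1
    + 0.5 * (1 - l1_ratio) * ||coef||_2^2)."""
    l1_norm = sum(abs(c) for c in coef)
    squared_l2_norm = sum(c * c for c in coef)
    return alpha * (l1_ratio * l1_norm + 0.5 * (1.0 - l1_ratio) * squared_l2_norm)


coef = [0.5, -2.0, 0.0]

# l1_ratio=1.0 is a pure L1 (lasso) penalty, l1_ratio=0.0 a pure L2 (ridge)
# penalty, and anything in between is an elastic net mix of the two.
lasso = elastic_net_penalty(coef, alpha=0.1, l1_ratio=1.0)
ridge = elastic_net_penalty(coef, alpha=0.1, l1_ratio=0.0)
mixed = elastic_net_penalty(coef, alpha=0.1, l1_ratio=0.5)

print(lasso, ridge, mixed)
```

Under this parameterization the mixed penalty is simply the average of the pure L1 and pure L2 penalties when `l1_ratio=0.5`.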
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
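One way to read the diagnostics above: the starting objective, 0.693091, is close to log 2 ≈ 0.693147, which is the mean binomial log-loss when every predicted probability is 0.5. With a median split of the target, that is plausibly near where the solver begins, though this is an interpretation of the output rather than something the diagnostics state, and the reported objective also includes the L1 penalty. The sketch below uses a hypothetical helper, `mean_binomial_log_loss`, and toy labels (not the housing data) to show that baseline.

```python
import math


def mean_binomial_log_loss(y, p):
    """Hypothetical helper: average negative Bernoulli log-likelihood
    of 0/1 labels y under predicted probabilities p."""
    total = sum(
        yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p)
    )
    return -total / len(y)


# Toy labels for illustration only; these are not the house price data.
y = [0, 1, 1, 0]

# Predicting 0.5 for every observation gives exactly log(2).
baseline = mean_binomial_log_loss(y, [0.5] * len(y))

# Probabilities that lean toward the correct label give a lower loss.
better = mean_binomial_log_loss(y, [0.2, 0.8, 0.7, 0.1])

print(baseline, better)
```

Each solver iteration in the table lowers the objective from that uninformative baseline until the improvement becomes negligible.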
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge