High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
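To make the regularization options above concrete, here is a minimal pure-Python sketch of an elastic net penalty and of the soft-thresholding step that explains why L1 penalties produce exact zeros. It assumes the common scikit-learn-style parameterization, in which `alpha` scales the whole penalty and `l1_ratio` mixes the L1 and L2 parts. The function names are hypothetical and chosen for illustration; this is not glum's implementation or its solver.

```python
import math


def elastic_net_penalty(coefs, alpha, l1_ratio):
    """alpha * (l1_ratio * ||b||_1 + 0.5 * (1 - l1_ratio) * ||b||_2^2).

    Assumed parameterization (scikit-learn style), for illustration only.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_squared)


def soft_threshold(z, threshold):
    """Proximal operator of the L1 norm.

    Values with |z| <= threshold become exactly zero; larger values are
    shrunk toward zero by `threshold`. This is the source of sparsity.
    """
    return math.copysign(max(abs(z) - threshold, 0.0), z)


coefs = [0.5, -2.0, 0.0]

lasso = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)  # pure L1
ridge = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)  # pure L2
mixed = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.5)  # elastic net

# The small coefficient is set to zero, the large one is only shrunk.
shrunk = [soft_threshold(b, 0.75) for b in coefs]
```

Setting `l1_ratio=1.0` recovers a pure L1 penalty and `l1_ratio=0.0` a pure L2 penalty; values in between give the elastic net.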
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are far more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # .get_formatted_diagnostics returns details about the steps taken by the
>>> # iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
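One way to read the diagnostics above: the objective at iteration 0 is close to ln(2) ≈ 0.6931, which is the mean log-loss of predicting a probability of 0.5 for every observation. The sketch below illustrates that arithmetic in pure Python. It assumes the solver starts from an intercept-only model and that the binomial objective is dominated by the mean log-loss; both are assumptions for illustration, not statements taken from the text above, and `binary_log_loss` is a hypothetical helper.

```python
import math


def binary_log_loss(y, p):
    """Mean negative Bernoulli log-likelihood of a constant prediction p."""
    return -sum(
        yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y
    ) / len(y)


# Perfectly balanced labels: the best constant prediction is 0.5,
# and the resulting log-loss is exactly ln(2).
y_balanced = [0, 1] * 50
start_balanced = binary_log_loss(y_balanced, p=0.5)

# Slightly unbalanced labels: predicting the observed mean gives a
# log-loss a little below ln(2).
y_skewed = [1] * 49 + [0] * 51
p_hat = sum(y_skewed) / len(y_skewed)
start_skewed = binary_log_loss(y_skewed, p=p_hat)
```

Because the target here is defined by a median split, the classes are nearly but not necessarily exactly balanced, which would be consistent with a starting value just under ln(2). Each later iteration then lowers the objective as the coefficients move away from zero.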
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge