High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
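To make the "special cases" concrete: a GLM computes a linear predictor and passes it through an inverse link function to obtain the expected response. The sketch below is plain Python rather than glum code; the names `linear_predictor` and `INVERSE_LINKS` are hypothetical, and the links shown are the conventional choices for each distribution, assumed here purely for illustration.

```python
import math


def linear_predictor(features, coefs, intercept):
    """Return eta = intercept + sum_j features[j] * coefs[j]."""
    return intercept + sum(x * b for x, b in zip(features, coefs))


# Hypothetical lookup: distribution -> inverse link mapping eta to the mean.
INVERSE_LINKS = {
    "normal (identity link)": lambda eta: eta,  # least-squares regression
    "poisson (log link)": math.exp,  # Poisson regression
    "binomial (logit link)": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # logistic regression
}

# The same linear predictor yields a different mean under each model.
eta = linear_predictor([1.0, 2.0], coefs=[0.5, -0.25], intercept=0.0)
means = {name: inverse(eta) for name, inverse in INVERSE_LINKS.items()}

for name, mean in means.items():
    print(f"{name}: {mean}")
```

Only the last step differs between the three models; the linear predictor is shared, which is why a single estimator class can cover all of them.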
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
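The regularization options in the list above can be illustrated with a small sketch. It assumes the scikit-learn-style parameterization, in which `alpha` scales the whole penalty and `l1_ratio` blends the L1 and L2 terms; the glum documentation is the authority on the exact objective. The helper names `elastic_net_penalty` and `soft_threshold` are hypothetical and are not part of glum.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Assumed form: alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)."""
    l1_norm = sum(abs(c) for c in coefs)
    l2_norm_squared = sum(c * c for c in coefs)
    return alpha * (l1_ratio * l1_norm + 0.5 * (1.0 - l1_ratio) * l2_norm_squared)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero and clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


coefs = [1.5, -0.2, 0.0, 0.05]

# l1_ratio=1.0 is a pure L1 (lasso) penalty; l1_ratio=0.0 is a pure L2 (ridge) penalty.
lasso = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)
ridge = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)

# Coefficients whose magnitude is below the threshold become exactly zero.
shrunk = [soft_threshold(c, 0.1) for c in coefs]
print(lasso, ridge, shrunk)
```

The soft-thresholding step is why L1 regularization yields sparse solutions: small coefficients are set exactly to zero rather than merely shrunk, whereas an L2 penalty only scales them down.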
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), glum is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>> # Model whether a house sold below the median price (y = 1) or not (y = 0)
>>> # via a binomial distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
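A note on reading the first row of the diagnostics. As a rough check, and assuming that the objective at iteration 0 is the average binomial negative log-likelihood of an intercept-only starting point (with all coefficients at zero the L1 penalty contributes nothing), the starting value should be close to the binary entropy of the share of positive labels. The helper `binary_entropy` below is hypothetical and is not part of glum.

```python
import math


def binary_entropy(p):
    """Average negative log-likelihood of predicting probability p for labels with mean p."""
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


# With labels split exactly in half, the intercept-only loss is ln(2).
start_objective = binary_entropy(0.5)
print(round(start_objective, 6))  # 0.693147
```

The reported 0.693091 is slightly below ln(2), which would be consistent with the share of positive labels not being exactly one half. Later rows then show how much the features improve on that baseline.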
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge