High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, binomial (logistic regression), gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
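To make the regularization options above concrete, here is a minimal sketch in plain Python of an elastic net penalty and of the soft-thresholding step that lets an L1 penalty set coefficients exactly to zero. The function names are hypothetical and not part of glum. The penalty uses the scikit-learn-style parameterization (alpha, l1_ratio), which is assumed here to match glum's; please check the documentation for the exact objective.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2).

    Hypothetical helper; the parameterization is an assumption.
    """
    l1 = sum(abs(w) for w in coefs)
    l2 = sum(w * w for w in coefs)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


coefs = [0.8, -0.05, 0.0, 1.5]

# l1_ratio=1.0 is a pure L1 (lasso) penalty, l1_ratio=0.0 a pure L2 (ridge) penalty.
pure_l1 = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)
pure_l2 = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)
mixed = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.5)

# Coefficients with magnitude at most 0.1 are set to zero, which is the
# source of the sparsity that L1 regularization produces.
shrunk = [soft_threshold(w, 0.1) for w in coefs]
```

Because the penalty is linear in l1_ratio, the mixed value is the average of the pure L1 and pure L2 penalties in this sketch.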
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional-sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # .get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
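As a sanity check on the diagnostics above, note that the objective at iteration 0 is close to ln 2 ≈ 0.6931. That is the average log loss of a model that predicts probability 0.5 for every observation, which is roughly what one would expect before any coefficients are fitted when about half of the labels are 1. The sketch below computes that baseline in plain Python. The helper name is hypothetical, and the connection to glum's internal objective (which also includes the penalty term) is an assumption rather than something stated in this README.

```python
import math


def mean_log_loss(y_true, p_pred):
    """Average binomial negative log-likelihood for labels in {0, 1}."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Balanced toy labels, similar in spirit to splitting at the median price.
y = [0, 1, 0, 1, 1, 0]

# A model with no information predicts 0.5 everywhere; its loss is ln 2.
baseline = mean_log_loss(y, [0.5] * len(y))

# Predictions that lean toward the correct label lower the loss,
# mirroring the decrease of objective_fct across iterations.
informed = mean_log_loss(y, [0.2, 0.8, 0.3, 0.7, 0.9, 0.1])
```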
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge