High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
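The regularization options listed above can be illustrated with a small standalone sketch. It assumes the scikit-learn-style parametrization in which `alpha` scales the whole penalty and `l1_ratio` mixes the L1 and L2 parts, matching the parameter names used in the example further down. The helper names `elastic_net_penalty` and `soft_threshold` are hypothetical and are not part of glum's API; this is not how glum's solver is implemented.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Elastic net penalty under the assumed alpha / l1_ratio parametrization.

    l1_ratio=1.0 gives a pure L1 (lasso) penalty, l1_ratio=0.0 a pure L2
    (ridge) penalty, and values in between mix the two.
    """
    l1_part = sum(abs(c) for c in coefs)
    l2_part = sum(c * c for c in coefs)
    return alpha * (l1_ratio * l1_part + 0.5 * (1.0 - l1_ratio) * l2_part)


def soft_threshold(z, threshold):
    """Soft-thresholding operator, the standard building block behind L1 sparsity.

    Values within [-threshold, threshold] are set exactly to zero; larger
    values are shrunk toward zero by `threshold`.
    """
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


coefs = [1.0, -2.0]
lasso_penalty = elastic_net_penalty(coefs, alpha=0.5, l1_ratio=1.0)
ridge_penalty = elastic_net_penalty(coefs, alpha=0.5, l1_ratio=0.0)

# A small coefficient is zeroed out, a large one is only shrunk.
shrunk = [soft_threshold(z, 0.5) for z in (0.3, 2.0, -2.0)]
```

The soft-thresholding step is why an L1 penalty yields coefficients that are exactly zero, while an L2 penalty only shrinks them.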
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional-sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # .get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
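The starting objective value of roughly 0.693 is close to ln 2 ≈ 0.693147. A plausible reading, offered here as an assumption rather than a statement about glum's internals, is that the objective at iteration 0 is essentially the average binomial log loss of a model that predicts a probability near 0.5 for every house, which is what an intercept-only fit gives when about half the labels are 1. The sketch below uses toy labels (not the housing data) and a hypothetical helper `mean_log_loss` to show that arithmetic.

```python
import math


def mean_log_loss(y, p):
    """Average binomial negative log-likelihood for 0/1 labels y and probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -total / len(y)


# Toy labels for illustration only: 4 positives out of 10.
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
share = sum(y) / len(y)

# Predicting 0.5 everywhere always gives exactly ln 2, whatever the labels are.
loss_half = mean_log_loss(y, [0.5] * len(y))

# Predicting the observed share of positives gives the binary entropy of that
# share, which is never larger than ln 2.
loss_share = mean_log_loss(y, [share] * len(y))
```

Under that assumption, later iterations lower the objective because the features let the model move its predicted probabilities away from the constant baseline.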
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge
Download files
Built Distributions
Hashes for glum-2.4.1-cp310-cp310-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 38a9b267a998db802be37ed8c4940485bf80f520a3b765b4a4d1d2bcc76a4c79
MD5 | bc3cda6f7718bc089a1f898607038e84
BLAKE2b-256 | 0f6b817114787da130175b11350b6051f334e7ad84b0769d3286a22644efc2db
Hashes for glum-2.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e57721c707dac066609104039c99f8874281bb59e2700f0efdbcca174cffebb1
MD5 | 133c0162b7d2f2bc57a30e95ce1238e2
BLAKE2b-256 | fd8790a8bd37abd17ee55670e314b88de8b04fa81f675efea075aac2ee39e8b3
Hashes for glum-2.4.1-cp310-cp310-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | d6d9e732d379a472c07a590c7517ba1e7e79fa0fda96e363422141209847b384
MD5 | ac76452329dedaff47aa0144185bb218
BLAKE2b-256 | c148e4c667546166b543d897e6f6741c4c860762f977b6b50e56a6183c3bbbd9
Hashes for glum-2.4.1-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 04229717533fd46d01f237378f2a6228f6273e7f7b02f8e2da8a5bcc93919423
MD5 | 06602c8d4e0e418539c11b9299fb75b7
BLAKE2b-256 | 830464ba32182ce257f6a1121b2d02ecaecf795511dff0ca0a4cf9c0c307c92a
Hashes for glum-2.4.1-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | e595dae3fe6f9c64a14e6342944e15296201620f86eafea0287e1e734b64ac02
MD5 | 9ad86d8ce945e7fcf351fc8fc7792ec9
BLAKE2b-256 | 6dc09a70b0a696ce6526383ef31b185de357f0c5acb3f57dcdc25f0e81cd9ec3
Hashes for glum-2.4.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b19a16bbba6663d27d9c61e92b4290abba00be4ce4c396940ef0a832f069231c
MD5 | fb73970adc3235ec90dd9d36028b1324
BLAKE2b-256 | d264182a2eb4507783a08599ce689a0f35173441257abe26043995ae136d5bda
Hashes for glum-2.4.1-cp39-cp39-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 8ceef305a9b043f8f140d622362a5f95061c97775e9ae83280b347d0e26f7ed5
MD5 | 3a092beb68820da8a917433a89a6d194
BLAKE2b-256 | d2f1eed4c20a1471c1184040a3425cbe9f1622839af0847882b6403164a4afbe
Hashes for glum-2.4.1-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 041601e087ec7f2eb966f8a490836b106c4318cd027b3b816111030c1ff523e4
MD5 | a545f6d7a104b38c40888e63d2a3bd00
BLAKE2b-256 | 80b15266046e2fbb0f30c433c745190e7ba625d3dfff723747d0cd602cd1affd
Hashes for glum-2.4.1-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | b85d1b0df25088972c55f43f5ffe117c2ba8a1b2bc7f6931817d75c2188b9e1a
MD5 | 9358cdd0c6a8b52c4544a77003ca3bb5
BLAKE2b-256 | e0dfbe925f078b43565b4a06f6c90f96b69a7ba0dbdd80cb68f91f00b69082b7
Hashes for glum-2.4.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d789639112e69194054d8bb83a84f9322f66c7c916304e163a6407faef85c656
MD5 | bd355e6d268a2c2f256167e584560a3f
BLAKE2b-256 | 439610cd22ba2a1f305bf769ef405c84b030f1f25008b51c7d89d9d0adc3e30b
Hashes for glum-2.4.1-cp38-cp38-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | fed2c0d2414f9731de362d38a67d7b8a411e55e4ddb296fa53d1d3cef13ce47d
MD5 | 3db892072ccf9fdd77576ec6dd3bc3a4
BLAKE2b-256 | 0c6b74ab4f05671dedc1d981254074e97c773ff5f8c45712c73cc73eab711c2b
Hashes for glum-2.4.1-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 5d36aba6f68ceed8ada709ca95bd023b51aecdc92b8cbda42328a72ec517e2c7
MD5 | 006775b34b414bbba455cb16d7b7eba1
BLAKE2b-256 | 3f6058dc61d03570d76b10b336bdc67f93f5fab80fba5431e5ae7d919cbe5ba6
Hashes for glum-2.4.1-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | e5862fdc0ad80553029ae9ccd2730b76354e707fcc80fa2985ff7e6190068b35
MD5 | 880bbd852f7f1934af5735645e31eb2c
BLAKE2b-256 | 25b2a2ae8a7027c5b83423816e51e5a010f023d8dabde3cda2df8f91b5c76fee
Hashes for glum-2.4.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e469ced18431e8d79367bbad3b93a4954018069de970c2287c7a019dcfacfd24
MD5 | ed6699c906fbf5e0e87ac449c8f71e4a
BLAKE2b-256 | f5f3020b4b529ed79774b316409448db01840eb37d9efc9c7bec626b11a5caaf
Hashes for glum-2.4.1-cp37-cp37m-manylinux_2_17_i686.manylinux2014_i686.whl
Algorithm | Hash digest
---|---
SHA256 | 2e2490c5f44abc766c4583fc892a90073e93341318c8fec179bde7221c27cce7
MD5 | 5eda440b578a0bd4cd09bd4bede9f604
BLAKE2b-256 | 92c897ed24ff4c36d7305359158e5b59ae3a7ac71c0c6c034a679d5796df8e46
Hashes for glum-2.4.1-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 8f3ec88051f17e61bfb5d969690b756a47d293c05dc538377eb07ee029d16dfa
MD5 | 8909d368cc8c109b1f337cff17fd5ae9
BLAKE2b-256 | 7166dbbad96ae762cf04e7dfc108dcb1b56311a388867fd615e89af9be7f5e89
Hashes for glum-2.4.1-cp36-cp36m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | c9547228ff49a58d8ec0df34158e8194fe356e4cd950474a2762e6264bc3511b
MD5 | d525cf7b284ce6393e8e34994f21d4ba
BLAKE2b-256 | 50b2bddb3ce8fe37e8d5753cef6e95f0230296336967d8ca866ee690b73796ff
Hashes for glum-2.4.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 8126cbad3df8758936a9465297613d57df0594af1b3239dfe1a11cbe69ab33f9
MD5 | 71a2f2fff98eed9a1e42d751ab27a469
BLAKE2b-256 | 41765df23b179245530ae29f1f1f5e9b2bf4f3169093f327b499bd471622a4fb
Hashes for glum-2.4.1-cp36-cp36m-manylinux_2_17_i686.manylinux2014_i686.whl
Algorithm | Hash digest
---|---
SHA256 | 6e9b6478508683cd8c5409760b3349c4554a43a07f41315828dc4b27b3c9d7d3
MD5 | 16cf615f22ca6299e1366f4fc300ea86
BLAKE2b-256 | 0297b9534509ed2b71633651be62ed5995b4b9b0b919e0943a3bf1e26c89a29e
Hashes for glum-2.4.1-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | cbd1ea0bf1d263dc669bdd3b4bf5b9d6dad7ef62d8554b0d4f88284e96e0c54f
MD5 | 095049ab44b922fe6cf29adf176016c8
BLAKE2b-256 | 5d1a8a45904fbbe4272b310a55c4a36202d6079e72270a3c2013e9f576b577ba