High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
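To make the regularization options above concrete, here is a minimal sketch of an elastic net penalty in plain Python. It assumes the scikit-learn-style parameterization, in which `alpha` scales the overall penalty and `l1_ratio` mixes the L1 and L2 terms. The function name `elastic_net_penalty` is hypothetical and not part of glum, and the exact scaling glum uses should be checked against its documentation.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Assumed form: alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)."""
    l1_norm = sum(abs(w) for w in coefs)      # L1 part: sum of absolute values
    l2_norm_sq = sum(w * w for w in coefs)    # L2 part: sum of squares
    return alpha * (l1_ratio * l1_norm + 0.5 * (1.0 - l1_ratio) * l2_norm_sq)


# Illustrative coefficients (hypothetical values, not fitted by any model).
coefs = [0.5, -2.0, 0.0]

lasso_penalty = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)  # pure L1
ridge_penalty = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)  # pure L2
mixed_penalty = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.5)  # elastic net
```

Under this parameterization, `l1_ratio=1.0` corresponds to the pure L1 case and `l1_ratio=0.0` to the pure L2 case, with values in between giving an elastic net mix.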
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), glum is consistently much faster across a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # .get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
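The first objective value in the output above, about 0.6931, is close to log(2) ≈ 0.693147. That is the mean binomial log-loss when every observation is assigned probability 0.5, which is roughly where one would expect to start when the classes are split at the median. The sketch below checks this in plain Python. It assumes the objective at that point is dominated by the mean log-loss; the L1 term and glum's actual initialization are not modeled, and `mean_log_loss` is a hypothetical helper, not a glum function.

```python
import math


def mean_log_loss(y_true, p_pred):
    """Mean binomial negative log-likelihood for 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Toy labels (hypothetical, not the house sales data).
y = [0, 1, 1, 0]

# An uninformative model predicts 0.5 everywhere: the loss equals log(2).
baseline = mean_log_loss(y, [0.5] * len(y))

# Predictions that lean toward the correct class give a lower loss.
confident = mean_log_loss(y, [0.1, 0.9, 0.9, 0.1])
```

Read this way, the drop from roughly 0.693 to roughly 0.443 over the iterations suggests how much the fitted coefficients improve on an uninformative prediction.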
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge