PyScaffold extension for Data Science projects
pyscaffoldext-dsproject
PyScaffold extension tailored for Data Science projects. This extension is inspired by cookiecutter-data-science and enhanced in many ways. The main differences are that it
- advocates a proper Python package structure that can be shipped and distributed,
- uses a conda environment instead of something virtualenv-based and is thus more suitable for data science projects,
- provides more default configurations for Sphinx, py.test, pre-commit, etc. to foster clean coding and best practices.
Also consider using dvc to version control and share your data within your team.
The final directory structure looks like:
├── AUTHORS.rst <- List of developers and maintainers.
├── CHANGELOG.rst <- Changelog to keep track of new features and fixes.
├── LICENSE.txt <- License as chosen on the command-line.
├── README.md <- The top-level README for developers.
├── configs <- Directory for configurations of model & application.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
├── docs <- Directory for Sphinx documentation in rst or md.
├── environment.yaml <- The conda environment file for reproducibility.
├── models <- Trained and serialized models, model predictions,
│ or model summaries.
├── notebooks <- Jupyter notebooks. Naming convention is a number (for
│ ordering), the creator's initials and a description,
│ e.g. `1.0-fw-initial-data-exploration`.
├── references <- Data dictionaries, manuals, and all other materials.
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated plots and figures for reports.
├── scripts <- Analysis and production scripts which import the
│ actual PYTHON_PKG, e.g. train_model.
├── setup.cfg <- Declarative configuration of your project.
├── setup.py <- Use `python setup.py develop` to install for development or
│ create a distribution with `python setup.py bdist_wheel`.
├── src
│ └── PYTHON_PKG <- Actual Python package where the main functionality goes.
├── tests <- Unit tests which can be run with `py.test`.
├── .coveragerc <- Configuration for coverage reports of unit tests.
├── .isort.cfg <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.
See a demonstration of the initial project structure under dsproject-demo and also check out the documentation of PyScaffold for more information.
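The notebook naming convention from the tree above (a number for ordering, the creator's initials, and a description) can be generated programmatically. The following is a minimal sketch only; the helper `notebook_name` is hypothetical and is not part of this extension or of PyScaffold.

```python
import re


def notebook_name(number: str, initials: str, description: str) -> str:
    """Build a notebook base name such as '1.0-fw-initial-data-exploration'.

    Hypothetical helper for illustration; not provided by the extension.
    """
    # Lowercase the description and collapse every run of characters that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{number}-{initials.lower()}-{slug}"


print(notebook_name("1.0", "FW", "Initial data exploration"))
# prints: 1.0-fw-initial-data-exploration
```

Keeping the ordering number first means notebooks sort in the order they were meant to be read.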
Usage
Just install this package with `pip install pyscaffoldext-dsproject` and note that `putup -h` shows a new option `--dsproject`.
Creating a data science project is then as easy as:
putup --dsproject my_ds_project
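After running the command, it can be useful to confirm that the generated project contains the directories listed in the structure above. The sketch below assumes the layout shown in this README; `EXPECTED_DIRS` and `missing_dirs` are hypothetical names introduced for illustration, and the actual layout may differ between versions of the extension.

```python
from pathlib import Path

# Directories taken from the directory tree in this README (an assumption:
# other versions of the extension may generate a different layout).
EXPECTED_DIRS = [
    "configs",
    "data/external",
    "data/interim",
    "data/processed",
    "data/raw",
    "docs",
    "models",
    "notebooks",
    "references",
    "reports/figures",
    "scripts",
    "src",
    "tests",
]


def missing_dirs(project_root):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # 'my_ds_project' is the project name used in the putup command above.
    print(missing_dirs("my_ds_project"))
```

An empty list means every expected directory is present.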
Note
This project has been set up using PyScaffold 3.2. For details and usage information on PyScaffold see https://pyscaffold.org/.
Hashes for pyscaffoldext-dsproject-0.4.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 680492103d6d69b5b5a768e269d6fb6ad604f2e67dd2612f6dbeeca2e5d601bf
MD5 | ebd5c4752abf2350c49aaa5cf82439f4
BLAKE2b-256 | 9e110f2591583c852c2cbecd19256ab466fda4f7d5ab42a6063ee791aeb6ad8d
Hashes for pyscaffoldext_dsproject-0.4-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 31291ae33faf31549dbdfc6b90d8afe8ac138ff1f5c5d676ace6442fd33fad0f
MD5 | 3db42b7df069dd02fcc1227931881621
BLAKE2b-256 | 402272a03ebc382bfa33791b5403e4f6b85431534885d924273363b2a14969f7