CONCISE (COnvolutional neural Network for CIS-regulatory Elements) is a model for predicting PTR features like mRNA half-life from cis-regulatory elements using deep learning.
CONCISE
CONCISE (COnvolutional neural Network for CIS-regulatory Elements) is a model for predicting any quantitative outcome (for example, mRNA half-life) from cis-regulatory sequences using deep learning.
Developed by the Gagneur Lab (computational biology): http://www.gagneurlab.in.tum.de
Free software: MIT license
Documentation: https://concise-bio.readthedocs.io
Features
Architecture:
Sequence +-> Conv -> reLU -> average pool +-> y <-+ Additional linear features
                      ^
                      |
Positional bias +-----+
Very simple API
Serialization of the model to JSON, so the results can be analyzed in any language of choice
Helper function for hyper-parameter random search
CONCISE uses TensorFlow at its core and can therefore use GPU computing
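The architecture listed above can be read as a single forward pass. The following is a minimal pure-Python sketch of that data flow, not the CONCISE implementation (which is built on TensorFlow). A one-hot encoded sequence is scanned with one motif filter, a positional bias is added, the reLU is applied, the activations are average-pooled, and a linear term for the additional features is added to give y. The function names, the placement of the positional bias before the reLU, the `motif_weight` scaling, and all numbers are illustrative assumptions.

```python
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def conv1d(x, motif):
    """Slide one motif filter (a list of 4-element rows) along the sequence."""
    width = len(motif)
    out = []
    for start in range(len(x) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += x[start + offset][channel] * motif[offset][channel]
        out.append(score)
    return out


def forward(seq, motif, pos_bias, motif_weight, features, feature_weights, intercept):
    """Sequence -> conv -> (+ positional bias) -> reLU -> average pool -> y."""
    scanned = conv1d(one_hot(seq), motif)
    # Assumption: the positional bias is added per scan position before the reLU.
    activated = [max(0.0, s + b) for s, b in zip(scanned, pos_bias)]
    pooled = sum(activated) / len(activated)
    # Additional linear features enter y directly, bypassing the convolution.
    linear = sum(w * f for w, f in zip(feature_weights, features))
    return intercept + motif_weight * pooled + linear


# Hypothetical filter that scores +1 for each base matching the motif "TAT".
motif = one_hot("TAT")
pos_bias = [0.0, 0.0, 0.0]  # one entry per scan position: len("GTATC") - 3 + 1
y_hat = forward("GTATC", motif, pos_bias, motif_weight=2.0,
                features=[5.0], feature_weights=[0.1], intercept=0.5)
print(y_hat)
```

In a trained model the filter weights, positional bias, and feature weights would be fitted rather than set by hand as they are here.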
Installation
After installing the following prerequisites:
Python (3.4 or 3.5) with pip (see Python installation guide and pip documentation)
TensorFlow Python package (see TensorFlow installation guide or Installing Tensorflow on AWS GPU-instance)
install CONCISE using pip:
pip install concise
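To confirm that the prerequisites and the package itself are importable after installation, a small check along the following lines may help. It is a sketch that uses only the Python standard library; `missing_packages` is a hypothetical helper, and the module names `tensorflow` and `concise` are the usual import names of the two packages.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
missing = missing_packages(["tensorflow", "concise"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All prerequisites found.")
```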
Getting Started
import pandas as pd
import concise
# read-in and prepare the data
dt = pd.read_csv("./data/pombe_half-life_UTR3.csv")
X_feat, X_seq, y, id_vec = concise.prepare_data(
    dt,
    features=["UTR3_length", "UTR5_length"],
    response="hlt",
    sequence="seq",
    id_column="ID",
    seq_align="end",
    trim_seq_len=500,
)
######
# Train CONCISE
######
# initialize CONCISE
co = concise.Concise(motif_length = 9, n_motifs = 2, init_motifs = ("TATTTAT", "TTAATGA"))
# train:
# - on a GPU if tensorflow is compiled with GPU support
# - on a CPU with 5 cores otherwise
co.train(X_feat[500:], X_seq[500:], y[500:], n_cores = 5)
# predict
co.predict(X_feat[:500], X_seq[:500])
# get fitted weights
co.get_weights()
# save/load from a file
co.save("./Concise.json")
co2 = concise.Concise.load("./Concise.json")
######
# Train CONCISE in 5-fold cross-validation
######
# initialize
co3 = concise.Concise(motif_length = 9, n_motifs = 2, init_motifs = ("TATTTAT", "TTAATGA"))
cocv = concise.ConciseCV(concise_object = co3)
# train
cocv.train(X_feat, X_seq, y, id_vec,
           n_folds=5, n_cores=3, train_global_model=True)
# out-of-fold prediction
cocv.get_CV_prediction()
# save/load from a file
cocv.save("./Concise.json")
cocv2 = concise.ConciseCV.load("./Concise.json")
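Out-of-fold prediction means that each sample is predicted by a model that did not see it during training. The sketch below illustrates the idea in plain Python with a placeholder "model" that simply predicts the mean of its training responses. It does not show how `ConciseCV` is implemented; `make_folds`, `out_of_fold_predictions`, and the round-robin fold assignment are hypothetical.

```python
def make_folds(n_samples, n_folds):
    """Assign sample indices to folds round-robin; returns a list of index lists."""
    folds = [[] for _ in range(n_folds)]
    for index in range(n_samples):
        folds[index % n_folds].append(index)
    return folds


def out_of_fold_predictions(responses, n_folds):
    """Predict every sample from a model fitted on the other folds only."""
    predictions = [None] * len(responses)
    for held_out in make_folds(len(responses), n_folds):
        held_out_set = set(held_out)
        train = [value for i, value in enumerate(responses) if i not in held_out_set]
        # Placeholder model: the mean of the training responses.
        fitted_mean = sum(train) / len(train)
        for i in held_out:
            predictions[i] = fitted_mean
    return predictions


responses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
oof = out_of_fold_predictions(responses, n_folds=3)
print(oof)
```

Because every prediction comes from a model that excluded that sample, comparing `oof` with `responses` gives an estimate of performance on unseen data.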
Where to go from here:
See the example file scripts/example-workflow.py
Read the API documentation: https://concise-bio.readthedocs.io/en/latest/documentation.html
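Because a trained model is saved as JSON, the saved file can also be opened without CONCISE installed. The following is a minimal sketch using only the Python standard library. The layout of the saved file is not described here, so the helper only reports whatever top-level entries it finds; `summarize_json` is a hypothetical name, and the path is the one used in the Getting Started example.

```python
import json
import os


def summarize_json(path):
    """Map each top-level key of a JSON file to the type name of its value."""
    with open(path) as handle:
        content = json.load(handle)
    if not isinstance(content, dict):
        # The root is not an object, so report its type under a placeholder key.
        return {"<root>": type(content).__name__}
    return {key: type(value).__name__ for key, value in content.items()}


model_path = "./Concise.json"  # file written by co.save(...) in Getting Started
if os.path.exists(model_path):
    for key, kind in sorted(summarize_json(model_path).items()):
        print(key, kind)
else:
    print("No saved model found at", model_path)
```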
History
0.1.0 (2016-09-15)
First release on PyPI.
0.1.1 (2016-09-17)
Minor documentation changes
Renamed some internal variables
0.2.0 (2016-09-21)
Introduced new feature: regress_out_feat
Major renaming of variables for consistency
Hashes for concise-0.2.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 9ca7abf0a5fdae0c5f849ad5298e6d511793b3eb4c03b344645425bed1c1c629
MD5 | 899e5acd0cacabf0c14a4d5a0604fc1e
BLAKE2b-256 | 6d5f391197e23461a70f57e162e7963c08d69126a31f829a4733990891b404b4