jsontableschema-pandas

Generate Pandas data frames, load and extract data, based on JSON Table Schema descriptors.

These details have not been verified by PyPI

Project links

Homepage

Project description

# jsontableschema-pandas

[![Travis](https://img.shields.io/travis/frictionlessdata/jsontableschema-pandas-py/master.svg)](https://travis-ci.org/frictionlessdata/jsontableschema-pandas-py)
[![Coveralls](http://img.shields.io/coveralls/frictionlessdata/jsontableschema-pandas-py.svg?branch=master)](https://coveralls.io/r/frictionlessdata/jsontableschema-pandas-py?branch=master)
[![PyPi](https://img.shields.io/pypi/v/jsontableschema-pandas.svg)](https://pypi.python.org/pypi/jsontableschema-pandas)
[![SemVer](https://img.shields.io/badge/versions-SemVer-brightgreen.svg)](http://semver.org/)
[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)

Generate and load Pandas data frames based on JSON Table Schema descriptors.

> Version `v0.2` contains breaking changes:
- removed `Storage(prefix=)` argument (was a stub)
- renamed `Storage(tables=)` to `Storage(dataframes=)`
- renamed `Storage.tables` to `Storage.buckets`
- changed `Storage.read` to read into memory
- added `Storage.iter` to yield row by row

## Getting Started

### Installation

```
$ pip install datapackage
$ pip install jsontableschema-pandas
```

### Example

You can easily load resources from a data package as Pandas data frames by simply using `datapackage.push_datapackage` function:

```python
>>> import datapackage

>>> data_url = 'http://data.okfn.org/data/core/country-list/datapackage.json'
>>> storage = datapackage.push_datapackage(data_url, 'pandas')

>>> storage.buckets
['data___data']

>>> type(storage['data___data'])
<class 'pandas.core.frame.DataFrame'>

>>> storage['data___data'].head()
Name Code
0 Afghanistan AF
1 Åland Islands AX
2 Albania AL
3 Algeria DZ
4 American Samoa AS
```

Also it is possible to pull your existing data frame into a data package:

```python
>>> datapackage.pull_datapackage('/tmp/datapackage.json', 'country_list', 'pandas', tables={
... 'data': storage['data___data'],
... })
Storage
```

### Storage

Package implements [Tabular Storage](https://github.com/frictionlessdata/jsontableschema-py#storage) interface.

We can get storage this way:

```python
>>> from jsontableschema_pandas import Storage

>>> storage = Storage()
```

Storage works as a container for Pandas data frames. You can define new data frame inside storage using `storage.create` method:

```python
>>> storage.create('data', {
... 'primaryKey': 'id',
... 'fields': [
... {'name': 'id', 'type': 'integer'},
... {'name': 'comment', 'type': 'string'},
... ]
... })

>>> storage.buckets
['data']

>>> storage['data'].shape
(0, 0)
```

Use `storage.write` to populate data frame with data:

```python
>>> storage.write('data', [(1, 'a'), (2, 'b')])

>>> storage['data']
id comment
1 a
2 b
```

Also you can use [tabulator](https://github.com/frictionlessdata/tabulator-py) to populate data frame from external data file:

```python
>>> import tabulator

>>> with tabulator.Stream('data/comments.csv', headers=1) as stream:
... storage.write('data', stream)

>>> storage['data']
id comment
1 a
2 b
1 good
```

As you see, subsequent writes simply appends new data on top of existing ones.

## API Reference

### Snapshot

https://github.com/frictionlessdata/jsontableschema-py#snapshot

### Detailed

- [Docstrings](https://github.com/frictionlessdata/jsontableschema-py/tree/master/jsontableschema/storage.py)
- [Changelog](https://github.com/frictionlessdata/jsontableschema-pandas-py/commits/master)

## Contributing

Please read the contribution guideline:

[How to Contribute](CONTRIBUTING.md)

Thanks!

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

0.5.0

Sep 5, 2017

0.4.1

Jun 9, 2017

0.4.0

Jun 6, 2017

This version

0.3.0

Mar 28, 2017

0.2.0

Oct 26, 2016

0.1.4

Aug 19, 2016

0.1.3

May 24, 2016

0.0.0

May 24, 2016

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

jsontableschema-pandas-0.3.0.tar.gz (6.4 kB view hashes)

Uploaded Mar 28, 2017 Source

Built Distribution

jsontableschema_pandas-0.3.0-py2.py3-none-any.whl (8.9 kB view hashes)

Uploaded Mar 28, 2017 Python 2 Python 3

Hashes for jsontableschema-pandas-0.3.0.tar.gz

Hashes for jsontableschema-pandas-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`fde129fa45d3472788cafae2eaf6a9bf9055195bc7d2dbf413c5d5146bad120c`
MD5	`1505555c588115fcf72361d47cb7a809`
BLAKE2b-256	`9d47daeb3214fd106072efffd2079c60300cc866fd0aa5be2e4ec1df91de76a4`

Hashes for jsontableschema_pandas-0.3.0-py2.py3-none-any.whl

Hashes for jsontableschema_pandas-0.3.0-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`3ca31bb20145d711088f29159edfdf12a63a9a934d6d9a2eef67d83defafc52b`
MD5	`558d9f017c44957f6681bf762161ff37`
BLAKE2b-256	`0722fc6dd79fc72cebd664ff8b2e485b240735cd9b1f44b8ce32c8dcf930597f`