2.4. Read JSON

  • The path argument can also be a URL

  • The file can be compressed as .gz, .bz2, .zip, or .xz
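
The sketch below shows that `pd.read_json()` takes a filesystem path and a URL in the same way. To stay runnable without network access it writes a temporary file and addresses it through a `file://` URL; the records and the file name are made up for illustration, and an `http(s)` URL would be passed in the same position.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample records, used only for this illustration
records = [
    {'sepal_length': 5.1, 'species': 'setosa'},
    {'sepal_length': 7.0, 'species': 'versicolor'},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'sample_file.json'
    path.write_text(json.dumps(records))

    # Read using a plain filesystem path
    from_path = pd.read_json(path)

    # Read the same file addressed as a file:// URL
    from_url = pd.read_json(path.as_uri())

# Both calls should produce the same DataFrame
print(from_path.equals(from_url))
```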

2.4.1. Compressed

  • If the extension is .gz, .bz2, .zip, or .xz, the corresponding compression method is selected automatically (compression='infer', which is the default)

>>> df = pd.read_json('sample_file.zip', compression='zip')  
>>> df = pd.read_json('sample_file.gz', compression='infer')  
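
A self-contained round trip may make the inference rule easier to check. This is a minimal sketch, assuming a small made-up DataFrame and a temporary file named `sample_file.json.gz`: the file is written with `DataFrame.to_json()`, then read back once with the default `compression='infer'` and once with the method named explicitly.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data, used only for this illustration
df = pd.DataFrame({
    'sepal_length': [5.1, 7.0, 6.3],
    'petal_width': [0.2, 1.4, 2.5],
})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'sample_file.json.gz'

    # to_json() also infers gzip from the .gz extension
    df.to_json(path)

    # gzip files start with the magic bytes 0x1f 0x8b
    is_gzip = path.read_bytes()[:2] == b'\x1f\x8b'

    # Compression inferred from the extension (the default)
    inferred = pd.read_json(path)

    # Compression named explicitly
    explicit = pd.read_json(path, compression='gzip')

print(is_gzip, inferred.equals(explicit))
```

Naming the method explicitly is mainly useful when the file name does not carry one of the recognised extensions.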

2.4.2. Assignments

Code 2.43. Solution
"""
* Assignment: Pandas Read JSON
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min

English:
    1. Read data from `DATA` as `result: pd.DataFrame`
    2. Run doctests - all must succeed

Polish:
    1. Wczytaj dane z DATA jako result: pd.DataFrame
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is pd.DataFrame, \
    'Variable `result` must be a `pd.DataFrame` type'

    >>> result.loc[[0,10,20]]
        sepal_length  sepal_width  petal_length  petal_width     species
    0            5.1          3.5           1.4          0.2      setosa
    10           7.0          3.2           4.7          1.4  versicolor
    20           6.3          3.3           6.0          2.5   virginica
"""

import pandas as pd


DATA = 'https://python3.info/_static/iris.json'


# Read DATA from JSON
# type: pd.DataFrame
result = ...

Code 2.44. Solution
"""
* Assignment: Pandas Read JSON OpenAPI
* Complexity: medium
* Lines of code: 3 lines
* Time: 5 min

English:
    1. Import `requests` module
    2. Define `resp` with result of `requests.get()` for `DATA`
    3. Define `data` with conversion of `resp` from JSON to Python dict by calling `.json()` on `resp`
    4. Define `result: pd.DataFrame` from value for key `paths` in `data` dict
    5. Run doctests - all must succeed

Polish:
    1. Zaimportuj moduł `requests`
    2. Zdefiniuj `resp` z rezultatem `requests.get()` dla `DATA`
    3. Zdefiniuj `data` z przekształceniem `resp` z JSON do Python dict wywołując `.json()` na `resp`
    4. Zdefiniuj `result: pd.DataFrame` dla wartości z klucza `paths` w słowniku `data`
    5. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `pd.DataFrame(data)`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is pd.DataFrame, \
    'Variable `result` must be a `pd.DataFrame` type'

    >>> list(result.index)
    ['put', 'post', 'get', 'delete']

    >>> list(result.columns)  # doctest: +NORMALIZE_WHITESPACE
    ['/pet', '/pet/findByStatus', '/pet/findByTags', '/pet/{petId}', '/pet/{petId}/uploadImage',
     '/store/inventory', '/store/order', '/store/order/{orderId}',
     '/user', '/user/createWithList', '/user/login', '/user/logout', '/user/{username}']
"""

import pandas as pd
import requests

DATA = 'https://python3.info/_static/openapi.json'


# Define `resp` with result of `requests.get()` for `DATA`
# type: requests.models.Response
resp = ...

# Define `data` with result of calling `.json()` on `resp` object
# type: dict
data = ...

# Define `result` as a DataFrame built from the value of the `paths` key in `data`
# type: pd.DataFrame
result = ...
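
The hint `pd.DataFrame(data)` can be tried offline on a small nested dict before running it against the real document. This is only a sketch: the `spec` dict below is a hypothetical, heavily trimmed stand-in for an OpenAPI file, not the content of `DATA`. It suggests why the tests expect paths as columns and HTTP methods as the index: outer keys become columns, inner keys become index labels, and missing combinations are filled with NaN.

```python
import pandas as pd

# Hypothetical, trimmed stand-in for an OpenAPI document
spec = {
    'paths': {
        '/pet': {
            'put': {'summary': 'Update a pet'},
            'post': {'summary': 'Add a pet'},
        },
        '/pet/{petId}': {
            'get': {'summary': 'Find a pet by ID'},
        },
    },
}

# Outer keys (paths) become columns, inner keys (methods) become the index
frame = pd.DataFrame(spec['paths'])

print(list(frame.columns))
print(sorted(frame.index))
```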