4.3. Iterable Set¶

Only unique values
Mutable - can add and remove items
Stores only hashable elements (int, float, bool, None, str, tuple of hashable elements)
Set is an unordered data structure and does not record element position or insertion order
Does not support getitem and slice
Membership test (in operator) has O(1) average case complexity [1]
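A minimal sketch of two of the properties listed above: membership testing and the lack of index access. Variable names are illustrative only.

```python
data = {'a', 'b', 'c'}

# Membership test is based on hashing, not on scanning every element
found = 'a' in data
missing = 'x' in data

# Sets have no positions, so getitem raises TypeError
try:
    data[0]
    indexable = True
except TypeError:
    indexable = False
```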

4.3.1. Syntax¶

data = set() - empty set
No short syntax for an empty set, because {} creates an empty dict
Only unique values

Empty set can be defined only with set():

>>> data = set()
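The reason there is no short syntax for an empty set is that empty braces are already taken by dict. A short sketch (variable names are illustrative):

```python
empty_set = set()
empty_braces = {}

# Empty braces create a dict, not a set
type_of_set = type(empty_set)
type_of_braces = type(empty_braces)
```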

Comma after the last element of a one-element set is optional. Braces are required

>>> data = {1}
>>> data = {1, 2, 3}
>>> data = {1.1, 2.2, 3.3}
>>> data = {True, False}
>>> data = {'a', 'b', 'c'}
>>> data = {'a', 1, 2.2, True, None}

Stores only unique values:

>>> {1, 2, 1}
{1, 2}

Compares by values, not types:

>>> {1}
{1}
>>> {1.0}
{1.0}
>>> {1, 1.0}
{1}
>>> {1.0, 1}
{1.0}
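The behavior above follows from equal values having equal hashes, so 1, 1.0 and True count as the same element. A minimal sketch using explicit add() calls, so that insertion order is unambiguous:

```python
# Equal values share the same hash
same_hash = hash(1) == hash(1.0) == hash(True)

data = set()
data.add(1)      # first object stays in the set
data.add(1.0)    # equal to 1, so nothing is added
data.add(True)   # equal to 1, so nothing is added

size = len(data)
kept_type = type(next(iter(data)))
```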

4.3.2. Hashable¶

Can store elements of any hashable types

Hashable (Immutable):

int
float
bool
NoneType
str
tuple (only if all its elements are hashable)

>>> data = {1, 2, 'a'}
>>> data = {1, 2, (3, 4)}

Non-hashable (Mutable):

list
set
dict

>>> data = {1, 2, [3, 4]}
Traceback (most recent call last):
TypeError: unhashable type: 'list'
>>>
>>> data = {1, 2, {3, 4}}
Traceback (most recent call last):
TypeError: unhashable type: 'set'

"Hashable types are also immutable" is true for builtin types, but it's not a universal truth.

More information in OOP Hash.
More information in OOP Object Identity.
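The remark that hashable does not always mean immutable can be illustrated with a user-defined class. This is only a sketch: the Astronaut class is hypothetical and relies on the default identity-based hash that Python gives to instances.

```python
class Astronaut:
    """Hypothetical class, used only for illustration."""

    def __init__(self, name):
        self.name = name


mark = Astronaut('Watney')

# Instances are hashable by default (hash is based on object identity)
crew = {mark}

# The object is mutable, yet it can still be found in the set,
# because changing an attribute does not change its hash
mark.name = 'Lewis'
still_member = mark in crew
```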

4.3.3. Type Conversion¶

set() converts argument to set

>>> data = 'abcd'
>>> set(data) == {'a', 'b', 'c', 'd'}
True

>>> data = ['a', 'b', 'c', 'd']
>>> set(data) == {'a', 'b', 'c', 'd'}
True

>>> data = ('a', 'b', 'c', 'd')
>>> set(data) == {'a', 'b', 'c', 'd'}
True

>>> data = {'a', 'b', 'c', 'd'}
>>> set(data) == {'a', 'b', 'c', 'd'}
True
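set() accepts any iterable, not only the sequence types shown above. A short sketch with a few other built-in iterables (for a dict, only the keys are taken):

```python
from_range = set(range(3))
from_dict = set({'a': 1, 'b': 2})   # iterating a dict yields its keys
from_string = set('hello')          # repeated 'l' is stored once
```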

4.3.4. Deduplicate¶

Works with any iterable of hashable elements, such as str, list, or tuple

>>> data = [1, 2, 3, 1, 1, 2, 4]
>>> set(data)
{1, 2, 3, 4}

Converting to set deduplicates items:

>>> data = [
...     'Watney',
...     'Lewis',
...     'Martinez',
...     'Watney',
... ]
>>>
>>> set(data) == {'Watney', 'Lewis', 'Martinez'}
True
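Deduplicating through a set loses the original order. When order matters, one common alternative (not covered in this chapter) is dict.fromkeys(), which relies on dict preserving insertion order:

```python
data = ['Watney', 'Lewis', 'Martinez', 'Watney']

# Order of elements in a set is not guaranteed
unique_any_order = set(data)

# dict keys are unique and keep insertion order
unique_in_order = list(dict.fromkeys(data))
```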

4.3.5. Add¶

>>> data = {1, 2}
>>>
>>> data.add(3)
>>> data == {1, 2, 3}
True
>>>
>>> data.add(3)
>>> data == {1, 2, 3}
True
>>>
>>> data.add(4)
>>> data == {1, 2, 3, 4}
True

>>> data = {1, 2}
>>> data.add([3, 4])
Traceback (most recent call last):
TypeError: unhashable type: 'list'

>>> data = {1, 2}
>>> data.add((3, 4))
>>> data == {1, 2, (3, 4)}
True

>>> data = {1, 2}
>>> data.add({3, 4})
Traceback (most recent call last):
TypeError: unhashable type: 'set'
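A set cannot be an element of another set, but its immutable counterpart frozenset can. A minimal sketch:

```python
data = {1, 2}

# frozenset is hashable, so it can be stored as a single element
data.add(frozenset({3, 4}))

size = len(data)
nested_present = frozenset({3, 4}) in data
```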

4.3.6. Update¶

>>> data = {1, 2}

>>> data.update({3, 4})
>>> data == {1, 2, 3, 4}
True

>>> data.update([5, 6])
>>> data == {1, 2, 3, 4, 5, 6}
True

>>> data.update((7, 8))
>>> data == {1, 2, 3, 4, 5, 6, 7, 8}
True
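update() takes iterables and adds their elements one by one, while add() stores its argument as a single element. The difference is easiest to see with a str. A short sketch (variable names are illustrative):

```python
data = {1, 2}
data.update([3, 4], (5, 6), {7, 8})   # several iterables in one call

letters = set()
letters.update('abc')   # str is iterated character by character

words = set()
words.add('abc')        # whole string becomes one element
```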

4.3.7. Pop¶

Removes and returns an arbitrary item

>>> data = {1, 2, 3}
>>> value = data.pop()
>>> value in [1, 2, 3]
True
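pop() does not let you choose which item is removed. For removing a specific item the built-in set type also provides remove() and discard(), which are not shown above. A sketch of how they differ, including pop() on an empty set:

```python
data = {1, 2, 3}

data.remove(1)      # removes the given element
data.discard(99)    # missing element is silently ignored

try:
    data.remove(99)     # missing element raises KeyError
    remove_failed = False
except KeyError:
    remove_failed = True

empty = set()
try:
    empty.pop()         # nothing to pop raises KeyError
    pop_failed = False
except KeyError:
    pop_failed = True
```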

4.3.8. Membership¶

Is Disjoint?:

True - if data and x have no common elements
False - if any element of x is in data

>>> data = {1,2}
>>>
>>> data.isdisjoint({1,2})
False
>>> data.isdisjoint({1,3})
False
>>> data.isdisjoint({3,4})
True
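Being disjoint is the same as having an empty intersection. A small sketch of that equivalence, which also shows that isdisjoint() accepts any iterable, not only a set:

```python
a = {1, 2}
b = {3, 4}
c = {2, 3}

disjoint_ab = a.isdisjoint(b)
disjoint_ac = a.isdisjoint(c)

# Equivalent check through intersection
empty_intersection_ab = len(a & b) == 0
empty_intersection_ac = len(a & c) == 0

# Argument can be a list as well
disjoint_with_list = a.isdisjoint([3, 4])
```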

Is Subset?:

True - if x has all elements from data
False - if x is missing any element from data

>>> data = {1,2}
>>>
>>> data.issubset({1})
False
>>> data.issubset({1,2})
True
>>> data.issubset({1,2,3})
True
>>> data.issubset({1,3,4})
False

>>> {1,2} < {3,4}
False
>>> {1,2} < {1,2}
False
>>> {1,2} < {1,2,3}
True
>>> {1,2,3} < {1,2}
False

>>> {1,2} <= {3,4}
False
>>> {1,2} <= {1,2}
True
>>> {1,2} <= {1,2,3}
True
>>> {1,2,3} <= {1,2}
False
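The method and the operator are not fully interchangeable: issubset() accepts any iterable, while the comparison operators need a set on both sides. A minimal sketch:

```python
data = {1, 2}

# Method converts its argument, so a list works
via_method = data.issubset([1, 2, 3])

# Operator does not convert, so comparing set with list raises TypeError
try:
    data <= [1, 2, 3]
    operator_works = True
except TypeError:
    operator_works = False
```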

Is Superset?:

True - if data has all elements from x
False - if data is missing any element from x

>>> data = {1,2}
>>>
>>> data.issuperset({1})
True
>>> data.issuperset({1,2})
True
>>> data.issuperset({1,2,3})
False
>>> data.issuperset({1,3})
False
>>> data.issuperset({2,1})
True

>>> {1,2} > {1,2}
False
>>> {1,2} > {1,2,3}
False
>>> {1,2,3} > {1,2}
True

>>> {1,2} >= {1,2}
True
>>> {1,2} >= {1,2,3}
False
>>> {1,2,3} >= {1,2}
True

4.3.9. Basic Operations¶

Union (returns all elements from data and x, without duplicates):

>>> data = {1,2}
>>>
>>> data.union({1,2})
{1, 2}
>>> data.union({1,2,3})
{1, 2, 3}
>>> data.union({1,2,4})
{1, 2, 4}
>>> data.union({1,3}, {2,4})
{1, 2, 3, 4}

>>> {1,2} | {1,2}
{1, 2}
>>> {1,2,3} | {1,2}
{1, 2, 3}
>>> {1,2,3} | {1,2,4}
{1, 2, 3, 4}
>>> {1,2} | {1,3} | {2,4}
{1, 2, 3, 4}
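union() and | return a new set and leave the original unchanged. The augmented operator |= modifies the set in place instead. A short sketch (variable names are illustrative):

```python
data = {1, 2}

result = data.union({3})    # new set is created
unchanged = data == {1, 2}  # original is not modified

alias = data
data |= {3}                 # in-place update of the same object
same_object = alias is data
```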

Difference (returns elements from data which are not in x):

>>> data = {1,2}
>>>
>>> data.difference({1,2})
set()
>>> data.difference({1,2,3})
set()
>>> data.difference({1,4})
{2}
>>> data.difference({1,3}, {2,4})
set()
>>> data.difference({3,4})
{1, 2}

>>> {1,2} - {2,3}
{1}
>>> {1,2} - {2,3} - {3}
{1}
>>> {1,2} - {1,2,3}
set()

Symmetric Difference (returns elements that are in either data or x, but not in both):

>>> data = {1,2}
>>>
>>> data.symmetric_difference({1,2})
set()
>>> data.symmetric_difference({1,2,3})
{3}
>>> data.symmetric_difference({1,4})
{2, 4}
>>> data.symmetric_difference({1,3}, {2,4})
Traceback (most recent call last):
TypeError: set.symmetric_difference() takes exactly one argument (2 given)
>>> data.symmetric_difference({3,4})
{1, 2, 3, 4}

>>> {1,2} ^ {1,2}
set()
>>> {1,2} ^ {2,3}
{1, 3}
>>> {1,2} ^ {1,3}
{2, 3}
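Because symmetric_difference() takes exactly one argument, combining more than two sets requires chaining either the method or the ^ operator. A minimal sketch:

```python
a = {1, 2}
b = {1, 3}
c = {2, 4}

# Evaluated left to right: (a ^ b) ^ c
chained_operator = a ^ b ^ c
chained_method = a.symmetric_difference(b).symmetric_difference(c)
```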

Intersection (returns elements common to data and x):

>>> data = {1,2}
>>>
>>> data.intersection({1,2})
{1, 2}
>>> data.intersection({1,2,3})
{1, 2}
>>> data.intersection({1,4})
{1}
>>> data.intersection({1,3}, {2,4})
set()
>>> data.intersection({1,3}, {1,4})
{1}
>>> data.intersection({3,4})
set()

>>> {1,2} & {2,3}
{2}
>>> {1,2} & {2,3} & {2,4}
{2}
>>> {1,2} & {2,3} & {3}
set()
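The four operations are related to each other. The sketch below computes all of them for one pair of sets and checks two identities for symmetric difference (variable names are illustrative):

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b
intersection = a & b
difference = a - b
symmetric = a ^ b

# Symmetric difference is union without the common part
identity_one = symmetric == (union - intersection)

# It is also the union of both one-sided differences
identity_two = symmetric == ((a - b) | (b - a))
```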

4.3.10. Cardinality¶

>>> data = {1, 2, 3}
>>> len(data)
3
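len() counts unique elements only, which gives a simple way to detect duplicates in a sequence. The sketch below also checks the inclusion-exclusion rule for the size of a union:

```python
data = [1, 2, 3, 1, 1, 2, 4]

total = len(data)
unique = len(set(data))
has_duplicates = total != unique

a = {1, 2, 3}
b = {3, 4}

# Size of union = sizes of both sets minus size of the common part
inclusion_exclusion = len(a | b) == len(a) + len(b) - len(a & b)
```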

4.3.11. References¶

4.3.12. Assignments¶

Code 4.7. Solution¶

"""
* Assignment: Iterable Set Create
* Type: class assignment
* Complexity: easy
* Lines of code: 6 lines
* Time: 5 min

English:
    1. Create sets:
        a. `result_a` without elements
        b. `result_b` with elements: 1, 2, 3
        c. `result_c` with elements: 1.1, 2.2, 3.3
        d. `result_d` with elements: 'a', 'b', 'c'
        e. `result_e` with elements: True, False
        f. `result_f` with elements: 1, 2.2, True, 'a'
    2. Run doctests - all must succeed

Polish:
    1. Stwórz sety:
        a. `result_a` bez elementów
        b. `result_b` z elementami: 1, 2, 3
        c. `result_c` z elementami: 1.1, 2.2, 3.3
        d. `result_d` z elementami: 'a', 'b', 'c'
        e. `result_e` z elementami: True, False
        f. `result_f` z elementami: 1, 2.2, True, 'a'
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result_a is not Ellipsis, \
    'Assign your result to variable `result_a`'
    >>> assert result_b is not Ellipsis, \
    'Assign your result to variable `result_b`'
    >>> assert result_c is not Ellipsis, \
    'Assign your result to variable `result_c`'
    >>> assert result_d is not Ellipsis, \
    'Assign your result to variable `result_d`'
    >>> assert result_e is not Ellipsis, \
    'Assign your result to variable `result_e`'
    >>> assert result_f is not Ellipsis, \
    'Assign your result to variable `result_f`'

    >>> assert type(result_a) is set, \
    'Variable `result_a` has invalid type, should be set'
    >>> assert type(result_b) is set, \
    'Variable `result_b` has invalid type, should be set'
    >>> assert type(result_c) is set, \
    'Variable `result_c` has invalid type, should be set'
    >>> assert type(result_d) is set, \
    'Variable `result_d` has invalid type, should be set'
    >>> assert type(result_e) is set, \
    'Variable `result_e` has invalid type, should be set'
    >>> assert type(result_f) is set, \
    'Variable `result_f` has invalid type, should be set'

    >>> assert result_a == set(), \
    'Variable `result_a` has invalid value, should be set()'
    >>> assert result_b == {1, 2, 3}, \
    'Variable `result_b` has invalid value, should be {1, 2, 3}'
    >>> assert result_c == {1.1, 2.2, 3.3}, \
    'Variable `result_c` has invalid value, should be {1.1, 2.2, 3.3}'
    >>> assert result_d == {'a', 'b', 'c'}, \
    'Variable `result_d` has invalid value, should be {"a", "b", "c"}'
    >>> assert result_e == {True, False}, \
    'Variable `result_e` has invalid value, should be {True, False}'
    >>> assert result_f == {1, 2.2, True, 'a'}, \
    'Variable `result_f` has invalid value, should be {1, 2.2, True, "a"}'
"""

# Set without elements
# type: set
result_a = ...

# Set with elements: 1, 2, 3
# type: set[int]
result_b = ...

# Set with elements: 1.1, 2.2, 3.3
# type: set[float]
result_c = ...

# Set with elements: 'a', 'b', 'c'
# type: set[str]
result_d = ...

# Set with elements: True, False
# type: set[bool]
result_e = ...

# Set with elements: 1, 2.2, True, 'a'
# type: set[int|float|bool|str]
result_f = ...

Code 4.8. Solution¶

"""
* Assignment: Iterable Set Many
* Type: class assignment
* Complexity: easy
* Lines of code: 9 lines
* Time: 8 min

English:
    1. Non-functional requirements:
        a. Assignment verifies creation of `set()` and usage of methods
           `.add()` and `.update()`
        b. For simplicity, type numerical values as `float`, not `str`
        c. Example: instead of '5.8' just type 5.8
        d. Do not use `str.split()`, `slice`, `getitem`, `for`, `while` or
           any other control-flow statement
    2. Create set `result` representing row with index 1
    3. Add values from row at index 2 to `result` using `.add()` (five calls)
    4. From row at index 3 create `set` and add it to `result` using
       `.update()` (one call)
    5. From row at index 4 create `tuple` and add it to `result` using
       `.update()` (one call)
    6. From row at index 5 create `list` and add it to `result` using
       `.update()` (one call)
    7. Run doctests - all must succeed

Polish:
    1. Wymagania niefunkcjonalne:
        a. Zadanie sprawdza tworzenie `set()` oraz użycie metod `.add()` i
           `.update()`
        b. Dla uproszczenia wartości numeryczne wypisuj jako `float`,
           a nie `str`
        c. Przykład: zamiast '5.8' zapisz 5.8
        d. Nie używaj `str.split()`, `slice`, `getitem`, `for`, `while` lub
           jakiejkolwiek innej instrukcji sterującej
    2. Stwórz zbiór `result` reprezentujący wiersz o indeksie 1
    3. Wartości z wiersza o indeksie 2 dodawaj do `result` używając `.add()`
       (pięć wywołań)
    4. Na podstawie wiersza o indeksie 3 stwórz `set` i dodaj go do `result`
       używając `.update()` (jedno wywołanie)
    5. Na podstawie wiersza o indeksie 4 stwórz `tuple` i dodaj go do
       `result` używając `.update()` (jedno wywołanie)
    6. Na podstawie wiersza o indeksie 5 stwórz `list` i dodaj go do
       `result` używając `.update()` (jedno wywołanie)
    7. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'
    >>> assert type(result) is set, \
    'Variable `result` has invalid type, should be set'
    >>> assert len(result) == 22, \
    'Variable `result` length should be 22'

    >>> assert ('sepal_length' not in result
    ...     and 'sepal_width' not in result
    ...     and 'petal_length' not in result
    ...     and 'petal_width' not in result
    ...     and 'species' not in result)

    >>> assert result >= {5.8, 2.7, 5.1, 1.9, 'virginica'}
    >>> assert result >= {5.1, 3.5, 1.4, 0.2, 'setosa'}
    >>> assert result >= {5.7, 2.8, 4.1, 1.3, 'versicolor'}
    >>> assert result >= {6.3, 2.9, 5.6, 1.8, 'virginica'}
    >>> assert result >= {6.4, 3.2, 4.5, 1.5, 'versicolor'}
"""

DATA = [
    'sepal_length,sepal_width,petal_length,petal_width,species',
    '5.8,2.7,5.1,1.9,virginica',
    '5.1,3.5,1.4,0.2,setosa',
    '5.7,2.8,4.1,1.3,versicolor',
    '6.3,2.9,5.6,1.8,virginica',
    '6.4,3.2,4.5,1.5,versicolor',
]

# Set with row at DATA[1] (manually converted to float and str)
# type: set[float|str]
result = ...

# Add to result float 5.1
...

# Add to result float 3.5
...

# Add to result float 1.4
...

# Add to result float 0.2
...

# Add to result str setosa
...

# Update result with set 5.7, 2.8, 4.1, 1.3, 'versicolor'
...

# Update result with tuple 6.3, 2.9, 5.6, 1.8, 'virginica'
...

# Update result with list 6.4, 3.2, 4.5, 1.5, 'versicolor'
...