Musings, Rants and Ponderings Of A DB Architect: Summer of code 2017: Python, Day 9 Collections: list, dictionary and set

As explained in my Summer of code 2017: Python post, I decided to pick up Python.

This is officially day 9. Today I looked at collections in Python; here are my notes.

Lists

Lists are mutable sequences, typically used to store collections of homogeneous items

Lists may be constructed in several ways:

Using a pair of square brackets to denote the empty list: []
Using square brackets, separating items with commas: [a], [a, b, c]
Using a list comprehension: [x for x in iterable]
Using the type constructor: list() or list(iterable)
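
The four construction styles above can be tried side by side. This is just a small sketch; the variable names are made up for illustration.

```python
# Four ways to build a list; all names here are illustrative only.
empty = []                            # empty pair of square brackets
letters = ['a', 'b', 'c']             # square brackets with comma-separated items
squares = [x * x for x in range(5)]   # list comprehension
from_iterable = list('abc')           # type constructor over an iterable (a string)

print(empty)
print(letters)
print(squares)
print(from_iterable)
```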

Lists implement all of the common and mutable sequence operations. Lists also provide the following additional method: sort

Here is what a list looks like in Python before and after the sort method is called

>>> a = [[2], [4], [1]]
>>> a
[[2], [4], [1]]
>>> a.sort()
>>> a
[[1], [2], [4]]
>>>

I also played around with making copies of lists

>>> a = [[1], [2], [4]]
>>> a
[[1], [2], [4]]

>>> b =a[0:1]
>>> b
[[1]]

>>> b =a[0:2]
>>> b
[[1], [2]]

>>> b=a[:]
>>> b
[[1], [2], [4]]

>>> a is b
False

>>> a == b
True

>>> a[0]
[1]

>>> b[0]
[1]

>>> a[0] is b[0]
True

>>> a[2].append(9)
>>> a
[[1], [2], [4, 9]]

>>> b
[[1], [2], [4, 9]]
>>>

Here is what is going on (line numbers count from the top of the snippet above, blank lines included)
On line 1 I made a new list a
On line 5 I made a list b from the first element of list a
On line 9 I made list b hold the first 2 elements of list a
On line 13 I made list b a copy of list a by supplying just the colon
On line 17 you will see that a and b are not the same object, while line 20 shows they are equal in value. On line 29 you will see that the first element of both lists is the same object
If I now append 9 to the 3rd element of list a, you will see that both list a and b now have the same 3rd element. The reason for this is that the copies are shallow: the slice creates a new outer list, but it still refers to the same inner lists
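
If you need the inner lists copied as well, one option is copy.deepcopy from the standard library. The sketch below contrasts it with the slice copy; the names shallow and deep are only for illustration.

```python
import copy

a = [[1], [2], [4]]
shallow = a[:]             # new outer list, but the same inner lists
deep = copy.deepcopy(a)    # new outer list and new inner lists

a[2].append(9)             # mutate an inner list through a

print(shallow[2] is a[2])  # the shallow copy shares the inner list with a
print(shallow)             # so it sees the appended 9
print(deep)                # the deep copy does not
```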

I decided to try some more things. This time I created a list of words

>>> l ="SpaceX successfully launches and recovers second Falcon 9 in 48 hours".split()
>>> l
['SpaceX', 'successfully', 'launches', 'and', 'recovers', 'second', 'Falcon', '9',
 'in', '48', 'hours']

>>> i=l.index('Falcon')
>>> i
6

>>> 'launches' in l
True

>>> l.reverse()
>>> l
['hours', '48', 'in', '9', 'Falcon', 'second', 'recovers', 'and', 'launches',
 'successfully', 'SpaceX']

>>> l.sort()
>>> l
['48', '9', 'Falcon', 'SpaceX', 'and', 'hours', 'in', 'launches', 'recovers', 
'second', 'successfully']

>>> l.sort(reverse=True)
>>> l
['successfully', 'second', 'recovers', 'launches', 'in', 'hours', 'and', 'SpaceX', 
'Falcon', '9', '48']
>>>

On line 1 I created a list by calling the split method on a string
On line 6 I am grabbing the index of the item with the value Falcon
On line 10 I am checking if the value launches is part of this list
On line 13 I am reversing the list
On line 18 I am sorting the list; digits and uppercase letters sort before lowercase ones because strings are compared by Unicode code point
On line 23 I am sorting the list in reverse order

Make sure to type True and not true; otherwise you will get the error NameError: name 'true' is not defined
Here is what the error looks like in the console

>>> l.sort(reverse=true)
Traceback (most recent call last):
  File "", line 1, in 
    l.sort(reverse=true)
NameError: name 'true' is not defined
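
The sort method changes the list in place. If you want to keep the original order around, the built-in sorted function returns a new list instead, and both accept a key argument. The sketch below uses key=str.lower so that case is ignored when comparing; the variable names are illustrative.

```python
words = "SpaceX successfully launches and recovers second Falcon 9 in 48 hours".split()

# sorted() returns a new list and leaves the original untouched,
# unlike list.sort(), which sorts in place and returns None.
case_insensitive = sorted(words, key=str.lower)

print(case_insensitive)
print(words[0])   # still 'SpaceX', the original list was not reordered
```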

I then decided to play around with deleting items from the list

>>> l ="SpaceX successfully launches and recovers second Falcon 9 in 48 hours".split()
>>> l
['SpaceX', 'successfully', 'launches', 'and', 'recovers', 'second', 'Falcon', '9',
 'in', '48', 'hours']
>>> del l[1]
>>> l
['SpaceX', 'launches', 'and', 'recovers', 'second', 'Falcon', '9', 'in', '48', 
'hours']
>>> l.remove('recovers')
>>> l
['SpaceX', 'launches', 'and', 'second', 'Falcon', '9', 'in', '48', 'hours']

On line 5 I deleted an item by supplying the index
On line 9 I removed an item by supplying the value

What would happen if the same value appears more than once in the list and I remove it by supplying the value?
Let's take a look

>>> x = ['and', 'and', 'and','boo']
>>> x
['and', 'and', 'and', 'boo']
>>> x.count('and')
3
>>> x.remove('and')
>>> x
['and', 'and', 'boo']
>>> x.count('and')
2
>>>

On line 1 I created a list that contains the value and 3 times
On line 4 we count this value and it returns 3
On line 6 we call the remove method
On line 9 we call count again and the count is now 2; only 1 item got removed.

Here is what the docs have to say

s.remove(x)
remove the first item from s where s[i] == x

As you can see, only the first item will be removed, not all of them with that value
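
If you do want every occurrence gone, one common approach is to build a new list with a list comprehension. This is a minimal sketch, not the only way to do it; without_and is just an illustrative name.

```python
x = ['and', 'and', 'and', 'boo']

# Build a new list that keeps only the items that are not 'and'.
without_and = [item for item in x if item != 'and']

print(without_and)
print(x.count('and'))   # the original list is unchanged
```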

You can grab the last element of a list by using -1

>>> l ="SpaceX successfully launches and recovers second Falcon 9 in 48 hours".split()
>>> l
['SpaceX', 'successfully', 'launches', 'and', 'recovers', 'second', 'Falcon', '9',
'in', '48', 'hours']
>>> l[0]
'SpaceX'
>>> l[-1]
'hours'
>>>

As you can see, on line 7 we use [-1] and the last element from the list is returned

To insert an item into a list at a specific place, you can use insert

>>> l ="SpaceX successfully and recovers second Falcon 9 in 48 hours".split()
>>> l
['SpaceX', 'successfully', 'and', 'recovers', 'second', 'Falcon', '9', 'in', 
'48', 'hours']
>>> l.insert(2, 'launches')
>>> l
['SpaceX', 'successfully', 'launches', 'and', 'recovers', 'second', 'Falcon', '9', 
'in', '48', 'hours']
>>>

As you can see from line 5, we are inserting the value launches as the 3rd item (0 based, so index = 2). Now the whole list makes sense again

Dict

I also took a look at dictionaries (dicts). What Python calls a dict is known in other languages as an associative array. Most likely you will think of dicts as key value pairs

Let's take a look at what it looks like. I created a dictionary of stocks with tickers and prices

>>> stocks ={
'TGT' :  '51.33', 
'AAPL': '147.48',
'MSFT':  '71.49',
'F'   :  '11.12', 
'INTC':  '34.37'}

>>> stocks
{'TGT': '51.33', 'AAPL': '147.48', 'MSFT': '71.49', 'F': '11.12', 'INTC': '34.37'}

>>> stocks['AAPL']
'147.48'

>>> stocks.update({'AAPL': '147.25', 'MSFT': '71.45'})

>>> stocks['AAPL']
'147.25'

>>> stocks['MSFT']
'71.45'
>>>

On lines 1 through 6 I created a dictionary with 5 items
On line 8 I displayed the dictionary to see what would be returned
On line 11 I asked for just the value of key AAPL
On line 14 I updated the values of keys AAPL and MSFT
On lines 16 and 19 I asked for the values that I just updated, and you can see the values have changed
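
Something the session above does not show is what happens with a key that is not in the dictionary. The sketch below uses a made-up ticker 'XYZ' and a placeholder price purely for illustration; neither is real data.

```python
stocks = {'TGT': '51.33', 'AAPL': '147.48'}

# Indexing with a missing key raises KeyError ...
try:
    stocks['XYZ']            # 'XYZ' is a hypothetical ticker
except KeyError as err:
    missing = err.args[0]

# ... while get() returns a default instead (None unless you pass one).
price = stocks.get('XYZ', 'n/a')

# Assigning to a new key adds it; 'in' tests whether a key exists.
stocks['XYZ'] = '0.00'       # placeholder value, not a real price

print(missing)
print(price)
print('XYZ' in stocks)
```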

I also decided to print the dictionary. Here is what that looks like

>>> stocks ={
'TGT' :  '51.33', 
'AAPL': '147.48',
'MSFT':  '71.49',
'F'   :  '11.12', 
'INTC':  '34.37'}

>>> for key in stocks:
    print("{key} = {value}".format(key=key, value=stocks[key]))

    
TGT = 51.33
AAPL = 147.48
MSFT = 71.49
F = 11.12
INTC = 34.37
>>>
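
The loop above looks up stocks[key] on every pass. An alternative is the items method, which yields each key together with its value. This is a sketch of the same printout; collecting the text in a list called lines is my own choice so the result is easy to inspect.

```python
stocks = {
    'TGT': '51.33',
    'AAPL': '147.48',
    'MSFT': '71.49',
    'F': '11.12',
    'INTC': '34.37',
}

lines = []
# items() yields (key, value) pairs, so no second lookup is needed.
for ticker, price in stocks.items():
    lines.append("{key} = {value}".format(key=ticker, value=price))

print("\n".join(lines))
```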

Python also includes a module for pretty printing, pprint. I played around with pprint and wrote a separate post about it here: Summer of code 2017: Python, Pretty printing with pprint in Python

Sets

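A set is an unordered collection of unique, hashable objects; the set itself is mutable, but its elements are typically immutable values such as numbers, strings and tuples
A set literal is delimited by { and }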

>>> s ={1, 2, 3, 4, 5}
>>> print(s)
{1, 2, 3, 4, 5}

If you just do something like {} it will actually be a dict

>>> y ={}
>>> type(y)
<class 'dict'>
>>>

To create an empty set use the set() constructor.

>>> x =set()
>>> x
set()
>>> print(x)
set()
>>> type(x)
<class 'set'>

>>> x.add(1)
>>> x
{1}

Printing an empty set just prints set()
As you can see from the output you can use add to add an item to a set

Set elements are unique, so to quickly remove duplicates from a list you can convert it to a set

Here is an example

>>> t =[1, 2, 2, 2, 2, 3, 4, 5]
>>> set(t)
{1, 2, 3, 4, 5}
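
Keep in mind that a set does not promise to keep the original order of the items. If the order matters, one possible approach is dict.fromkeys, assuming Python 3.7 or later, where dicts preserve insertion order. The name unique_in_order is just for illustration.

```python
t = [3, 1, 3, 2, 1]

# dict keys are unique and (in Python 3.7+) keep insertion order,
# so this drops duplicates while keeping the first occurrence of each item.
unique_in_order = list(dict.fromkeys(t))

print(unique_in_order)
print(set(t) == {1, 2, 3})   # the set holds the same values, order not guaranteed
```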

Adding a value that is already in the set is simply ignored

>>> x={1}
>>> x
{1}
>>> x.add(2)
>>> x
{1, 2}
>>> x.add(2)
>>> x
{1, 2}

To remove items from a set you can use remove or discard. remove will throw an error if the item doesn't exist; discard will not

Here is an example

>>> x={1}
>>> x
{1}
>>> x.discard(3)
>>> x.remove(3)
Traceback (most recent call last):
  File "", line 1, in 
    x.remove(3)
KeyError: 3
>>> x.discard(3)
>>>

Set Algebra

If you come from the database/SQL world, the following will look very familiar to you

Here is what sets support

union
Return a new set with elements from the set and all others

intersection
Return a new set with elements common to the set and all others

difference
Return a new set with elements in the set that are not in the others

symmetric_difference
Return a new set with elements in either the set or other but not both

issubset
Test whether every element in the set is in other

issuperset
Test whether every element in other is in the set

isdisjoint
Return True if the set has no elements in common with other. Sets are disjoint if and only if their intersection is the empty set

Here is some code that shows some of these

>>> s1 = {1,2,3,4,5}
>>> s2 = {3,4,5}

>>> s1.union(s2)
{1, 2, 3, 4, 5}

>>> s1.intersection(s2)
{3, 4, 5}

>>> s1.difference(s2)
{1, 2}

>>> s1.symmetric_difference(s2)
{1, 2}

>>> s1.issubset(s2)
False

>>> s1.issuperset(s2)
True

>>> s1.isdisjoint(s2)
False

Here are some more examples for isdisjoint and symmetric_difference with different values so you can see what is returned

>>> s1 = {1,2,3,4,5}
>>> s2 ={0,8}

>>> s1.isdisjoint(s2)
True

>>> s1 = {1,2,3,4,5}
>>> s2 ={4,5,6,7}

>>> s1.symmetric_difference(s2)
{1, 2, 3, 6, 7}
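
Most of these methods also have operator forms, which may feel closer to how you would write set logic in SQL. The sketch below reuses the last pair of sets; the result variable names are my own.

```python
s1 = {1, 2, 3, 4, 5}
s2 = {4, 5, 6, 7}

union = s1 | s2          # same result as s1.union(s2)
intersection = s1 & s2   # same result as s1.intersection(s2)
difference = s1 - s2     # same result as s1.difference(s2)
symmetric = s1 ^ s2      # same result as s1.symmetric_difference(s2)
is_subset = {4, 5} <= s1 # same result as {4, 5}.issubset(s1)

print(union)
print(intersection)
print(difference)
print(symmetric)
print(is_subset)
```

One difference worth knowing: the methods accept any iterable as an argument, while the operators require both sides to be sets.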

That is all for this post... next time I will take a look at exceptions

Monday, June 26, 2017
