8.5. CSV DictWriter
csv.DictWriter: list[dict]
8.5.1. Conversion
list[tuple] to list[dict]
>>> DATA = [
...     ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'),
...     (5.8, 2.7, 5.1, 1.9, 'virginica'),
...     (5.1, 3.5, 1.4, 0.2, 'setosa'),
...     (5.7, 2.8, 4.1, 1.3, 'versicolor')]
>>>
>>> header, *rows = DATA
>>> result = []
>>>
>>> for row in rows:
...     pairs = zip(header, row)
...     result.append(dict(pairs))
>>>
>>> result
[{'SepalLength': 5.8, 'SepalWidth': 2.7, 'PetalLength': 5.1, 'PetalWidth': 1.9, 'Species': 'virginica'},
 {'SepalLength': 5.1, 'SepalWidth': 3.5, 'PetalLength': 1.4, 'PetalWidth': 0.2, 'Species': 'setosa'},
 {'SepalLength': 5.7, 'SepalWidth': 2.8, 'PetalLength': 4.1, 'PetalWidth': 1.3, 'Species': 'versicolor'}]
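The loop above can also be written as a single list comprehension. This is a minimal sketch of the same conversion, reusing the `DATA`, `header` and `rows` names from the example; it is an alternative spelling, not a different technique.

```python
DATA = [
    ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor')]

# First tuple is the header, the remaining tuples are data rows
header, *rows = DATA

# Pair each value with its column name, producing one dict per row
result = [dict(zip(header, row)) for row in rows]

print(result[0]['Species'])  # virginica
```

The resulting `list[dict]` is the shape that `csv.DictWriter` expects in `writerows()`.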
8.5.2. DictWriter
Remember to pass mode='w' to the open() function
The default encoding of open() depends on the platform (commonly UTF-8 on Linux and macOS), so pass encoding='utf-8' explicitly
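Besides `mode` and `encoding`, the `csv` module documentation recommends opening the file with `newline=''`, so that the writer itself controls line endings. The sketch below passes all three arguments explicitly and reads the file back as bytes to show the default `\r\n` line terminator. The temporary directory and the two-column data are assumptions made only for this illustration.

```python
import csv
import os
import tempfile

# Hypothetical two-column data, used only for this sketch
DATA = [{'Species': 'setosa', 'Sepal Length': 5.4},
        {'Species': 'virginica', 'Sepal Length': 5.9}]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'iris.csv')

    # newline='' disables newline translation in the file object
    with open(path, mode='w', encoding='utf-8', newline='') as file:
        writer = csv.DictWriter(file, fieldnames=['Species', 'Sepal Length'])
        writer.writeheader()
        writer.writerows(DATA)

    # Read raw bytes to see exactly what was written
    with open(path, mode='rb') as file:
        raw = file.read()

print(raw)
# b'Species,Sepal Length\r\nsetosa,5.4\r\nvirginica,5.9\r\n'
```

Without `newline=''`, platforms that translate `\n` on write may store an extra `\r` at the end of each line.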
>>> import csv
>>>
>>> FILE = r'/tmp/myfile.csv'
>>>
>>> DATA = [{'Sepal Length': 5.4, 'Sepal Width': 3.9,
...          'Petal Length': 1.3, 'Petal Width': 0.4,
...          'Species': 'setosa'},
...
...         {'Sepal Length': 5.9, 'Sepal Width': 3.0,
...          'Petal Length': 5.1, 'Petal Width': 1.8,
...          'Species': 'virginica'},
...
...         {'Sepal Length': 6.0, 'Sepal Width': 3.4,
...          'Petal Length': 4.5, 'Petal Width': 1.6,
...          'Species': 'versicolor'}]
>>>
>>>
>>> header = DATA[0].keys()
>>>
>>> with open(FILE, mode='w') as file:
...     result = csv.DictWriter(file, fieldnames=header)
...     result.writeheader()
...     result.writerows(DATA)
59
>>>
>>> with open(FILE) as file:
...     print(file.read())
Sepal Length,Sepal Width,Petal Length,Petal Width,Species
5.4,3.9,1.3,0.4,setosa
5.9,3.0,5.1,1.8,virginica
6.0,3.4,4.5,1.6,versicolor
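The output above uses the default quoting mode, in which a field is quoted only when it needs to be. The following sketch compares three of the `csv` quoting constants on a single row. It writes to an in-memory `io.StringIO` buffer instead of a file; the `dump()` helper and the `Note` column are hypothetical and exist only for this illustration.

```python
import csv
import io

# Hypothetical row; 'Note' contains the delimiter on purpose
ROW = {'Sepal Length': 5.4, 'Species': 'setosa', 'Note': 'small, pale'}


def dump(quoting):
    """Write ROW to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(ROW),
                            quoting=quoting, lineterminator='\n')
    writer.writerow(ROW)
    return buffer.getvalue()


minimal = dump(csv.QUOTE_MINIMAL)        # quote only fields that need it
everything = dump(csv.QUOTE_ALL)         # quote every field
nonnumeric = dump(csv.QUOTE_NONNUMERIC)  # quote all non-numeric fields

print(minimal, everything, nonnumeric, sep='')
# 5.4,setosa,"small, pale"
# "5.4","setosa","small, pale"
# 5.4,"setosa","small, pale"
```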
Write data to a CSV file using csv.DictWriter():
>>> import csv
>>>
>>> FILE = r'/tmp/myfile.csv'
>>>
>>> DATA = [{'Sepal Length': 5.4, 'Sepal Width': 3.9,
...          'Petal Length': 1.3, 'Petal Width': 0.4,
...          'Species': 'setosa'},
...
...         {'Sepal Length': 5.9, 'Sepal Width': 3.0,
...          'Petal Length': 5.1, 'Petal Width': 1.8,
...          'Species': 'virginica'},
...
...         {'Sepal Length': 6.0, 'Sepal Width': 3.4,
...          'Petal Length': 4.5, 'Petal Width': 1.6,
...          'Species': 'versicolor'}]
>>>
>>> FIELDNAMES = ['Sepal Length', 'Sepal Width',
...               'Petal Length', 'Petal Width', 'Species']
>>>
>>> with open(FILE, mode='w', encoding='utf-8') as file:
...     result = csv.DictWriter(f=file,
...                             fieldnames=FIELDNAMES,
...                             delimiter=',',
...                             quotechar='"',
...                             quoting=csv.QUOTE_ALL,
...                             lineterminator='\n')
...
...     result.writeheader()
...     result.writerows(DATA)
68
>>>
>>> with open(FILE) as file:
...     print(file.read())
"Sepal Length","Sepal Width","Petal Length","Petal Width","Species"
"5.4","3.9","1.3","0.4","setosa"
"5.9","3.0","5.1","1.8","virginica"
"6.0","3.4","4.5","1.6","versicolor"
8.5.3. Recap
csv.DictWriter() - writes list[dict]
Works with schema (fixed keys) and schemaless (varying keys) data
fieldnames - parameter defining the columns and their order
sorted(...) - sorts data
set(...) - deduplicates data
dict.keys() - get keys from dict
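The recap items can be combined to handle schemaless data, where each dict has a different set of keys. The sketch below collects all keys with a set comprehension, orders them with `sorted()`, and relies on the `restval` parameter of `csv.DictWriter` to fill in missing values. The records and the `role` column are hypothetical, and an in-memory `io.StringIO` buffer stands in for a file.

```python
import csv
import io

# Hypothetical records with differing keys
DATA = [{'firstname': 'Mark', 'lastname': 'Watney'},
        {'firstname': 'Melissa', 'role': 'commander'},
        {'lastname': 'Martinez', 'role': 'pilot'}]

# The set deduplicates keys, sorted() gives a stable column order
fieldnames = sorted({key for row in DATA for key in row.keys()})

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                        restval='',           # value for missing keys
                        lineterminator='\n')
writer.writeheader()
writer.writerows(DATA)

print(buffer.getvalue())
# firstname,lastname,role
# Mark,Watney,
# Melissa,,commander
# ,Martinez,pilot
```

Sorting matters because a set has no guaranteed iteration order, so the column order could otherwise differ between runs.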
8.5.4. Assignments
"""
* Assignment: CSV DictWriter Fixed
* Complexity: easy
* Lines of code: 4 lines
* Time: 5 min
English:
1. Using `csv.DictWriter()` save `DATA` to `FILE`
2. Open file in a spreadsheet program such as:
Microsoft Excel, LibreOffice or Numbers
3. Open file in your IDE and in a simple text editor such as:
Notepad, vim or gedit
4. Non-functional requirements:
a. All fields must be enclosed by double quote `"` character
b. Use `,` to separate columns
c. Use Unix `\n` line terminator
5. Run doctests - all must succeed
Hint:
* For Python before 3.8: `dict(OrderedDict)`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> result = open(FILE).read()
>>> remove(FILE)
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> print(result) # doctest: +NORMALIZE_WHITESPACE
"firstname","lastname"
"Pan","Twardowski"
"Rick","Martinez"
"Mark","Watney"
"Ivan","Ivanovic"
"Melissa","Lewis"
"""
import csv


DATA = [{'firstname': 'Pan', 'lastname': 'Twardowski'},
        {'firstname': 'Rick', 'lastname': 'Martinez'},
        {'firstname': 'Mark', 'lastname': 'Watney'},
        {'firstname': 'Ivan', 'lastname': 'Ivanovic'},
        {'firstname': 'Melissa', 'lastname': 'Lewis'}]

FILE = r'_temporary.csv'

# Write DATA to FILE, generate header from DATA
# type: ContextManager
with open(FILE, mode='w') as file:
    ...
"""
* Assignment: CSV DictWriter Schemaless
* Complexity: medium
* Lines of code: 7 lines
* Time: 5 min
English:
1. Using `csv.DictWriter()` write variable schema data to `FILE`
2. `fieldnames` must be automatically generated from `DATA`
3. Non-functional requirements:
a. All fields must be enclosed by double quote `"` character
b. Use `,` to separate columns
c. Use `utf-8` encoding
d. Use Unix `\n` line terminator
e. Sort `fieldnames` using `sorted()`
4. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> result = open(FILE).read()
>>> remove(FILE)
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> print(result)
"Petal length","Petal width","Sepal length","Sepal width","Species"
"","","5.1","3.5","setosa"
"4.1","1.3","","","versicolor"
"","1.8","6.3","","virginica"
"","0.2","5.0","","setosa"
"4.1","","","2.8","versicolor"
"","1.8","","2.9","virginica"
<BLANKLINE>
"""
import csv


DATA = [{'Sepal length': 5.1, 'Sepal width': 3.5, 'Species': 'setosa'},
        {'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
        {'Sepal length': 6.3, 'Petal width': 1.8, 'Species': 'virginica'},
        {'Sepal length': 5.0, 'Petal width': 0.2, 'Species': 'setosa'},
        {'Sepal width': 2.8, 'Petal length': 4.1, 'Species': 'versicolor'},
        {'Sepal width': 2.9, 'Petal width': 1.8, 'Species': 'virginica'}]

FILE = r'_temporary.csv'

# Write DATA to FILE, generate header from DATA
# type: ContextManager
with open(FILE, mode='w', encoding='utf-8') as file:
    ...
"""
* Assignment: CSV DictWriter Objects
* Complexity: medium
* Lines of code: 6 lines
* Time: 8 min
English:
1. Using `csv.DictWriter()` save data to `FILE`
2. Non-functional requirements:
a. Use `,` to separate columns
b. Use `utf-8` encoding
c. Use Unix `\n` line terminator
3. Run doctests - all must succeed
Hints:
* `vars()`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> result = open(FILE).read()
>>> remove(FILE)
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> print(result)
petal_length,petal_width,sepal_length,sepal_width,species
1.4,0.2,5.1,3.5,setosa
5.1,1.9,5.8,2.7,virginica
1.4,0.2,5.1,3.5,setosa
4.1,1.3,5.7,2.8,versicolor
5.6,1.8,6.3,2.9,virginica
4.5,1.5,6.4,3.2,versicolor
<BLANKLINE>
"""
import csv


class Iris:
    def __init__(self, sepal_length, sepal_width,
                 petal_length, petal_width, species):
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width
        self.species = species


DATA = [Iris(5.1, 3.5, 1.4, 0.2, 'setosa'),
        Iris(5.8, 2.7, 5.1, 1.9, 'virginica'),
        Iris(5.1, 3.5, 1.4, 0.2, 'setosa'),
        Iris(5.7, 2.8, 4.1, 1.3, 'versicolor'),
        Iris(6.3, 2.9, 5.6, 1.8, 'virginica'),
        Iris(6.4, 3.2, 4.5, 1.5, 'versicolor')]

FILE = r'_temporary.txt'

# Write DATA to FILE, generate header from DATA
# type: ContextManager
with open(FILE, mode='w', encoding='utf-8') as file:
    ...
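The `vars()` hint in the last assignment refers to the built-in that returns an instance's attribute dictionary, which gives `csv.DictWriter` the dict it needs for each row. The sketch below illustrates the idea on a hypothetical `Point` class, not on the assignment data, and writes to an in-memory buffer instead of a file.

```python
import csv
import io


class Point:
    # Hypothetical class, used only to illustrate vars()
    def __init__(self, x, y):
        self.x = x
        self.y = y


points = [Point(1, 2), Point(3, 4)]

# vars(obj) returns the instance's attribute dict, e.g. {'x': 1, 'y': 2}
rows = [vars(point) for point in points]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=sorted(rows[0].keys()),
                        lineterminator='\n')
writer.writeheader()
writer.writerows(rows)

print(buffer.getvalue())
# x,y
# 1,2
# 3,4
```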