Sometimes you need to keep a running summary in an application, for example the number of cars, bikes, lorries, buses and vans that cross a bridge in a 24-hour period. You don’t always want to log each data item to a database or other data structure, as you are only interested in the summary.
Python makes creating running summaries easy with its built-in dictionary type.
Here is an example:
# Set up an empty dictionary
summary = dict()

# 1st pass
if 'Dogs' in summary:
    summary['Dogs'] = summary['Dogs'] + 1
else:
    summary['Dogs'] = 1

# 2nd pass
if 'Cats' in summary:
    summary['Cats'] = summary['Cats'] + 1
else:
    summary['Cats'] = 1

# 3rd pass (a second cat, so test for 'Cats' before incrementing it)
if 'Cats' in summary:
    summary['Cats'] = summary['Cats'] + 1
else:
    summary['Cats'] = 1

# 4th pass
if 'Snake' in summary:
    summary['Snake'] = summary['Snake'] + 1
else:
    summary['Snake'] = 1

# Now output the summary of the pets
print('Summary in arrival order')
for k, v in summary.items():
    print(k, v)

# Bonus - now sort the output by value in reverse order.
print('Summary by value')
for k, v in sorted(summary.items(), key=lambda item: item[1], reverse=True):
    print(k, v)
Explanation
It’s worth reminding ourselves that the Python dictionary is a store of KEY, VALUE pairs in which each KEY is unique. Think of it as a table where the first column is the KEY and the second column is the VALUE.
Key / Value stores are an essential data structure in modern computing, and Python implements its dictionary using a hashed data structure. This means it hashes each key to work out the location where the value is stored, so it can go straight there rather than having to search through a list. Python allocates a certain amount of key space when a dictionary is created and enlarges it, roughly doubling it, whenever you add a new key and there is not enough room. The exact growth factor is an implementation detail and varies between Python versions.
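To see the resizing for yourself, here is a small sketch that adds keys one at a time and notes each point where the dictionary’s memory footprint changes. It assumes CPython, where sys.getsizeof reports the size of the dictionary’s internal table; the key names and the count of 100 are made up for illustration, and the figures printed will differ between Python versions.

```python
import sys

# Hypothetical experiment: add keys one at a time and note when the
# dictionary's reported size changes (a sign the table was resized).
summary = dict()
last_size = sys.getsizeof(summary)
resize_points = []  # (number of keys, new size in bytes)

for i in range(100):
    summary[f'key{i}'] = 0
    size = sys.getsizeof(summary)
    if size != last_size:
        resize_points.append((len(summary), size))
        last_size = size

for count, size in resize_points:
    print(f'{count} keys -> {size} bytes')
```

The gaps between resize points get wider as the dictionary grows, which is why most insertions are cheap and only the occasional one pays for a resize.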
In some applications where performance is critical you may want to pre-warm the key store with every possible key at start-up, so that there is no unpredictable delay while the store is resized at some point in the future.
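One way to pre-warm the dictionary, assuming you know every category in advance, is dict.fromkeys. The list of vehicle types below is borrowed from the bridge example in the introduction and is only an illustration.

```python
# Assumed category list, taken from the bridge example in the introduction.
VEHICLE_TYPES = ['cars', 'bikes', 'lorries', 'buses', 'vans']

# Create every key up front with a count of zero.
summary = dict.fromkeys(VEHICLE_TYPES, 0)

# Recording an observation is now a plain increment with no membership test.
for vehicle in ['cars', 'vans', 'cars']:
    summary[vehicle] += 1

print(summary)
```

The trade-off is that an unexpected key, such as 'tractors', would raise a KeyError instead of being added quietly, which may or may not be what you want.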
summary = dict()
if 'Dogs' in summary:
    summary['Dogs'] = summary['Dogs'] + 1
else:
    summary['Dogs'] = 1
The code starts by declaring an empty Python dictionary which will be used to store the summary.
We then record the first pet in the summary by testing whether 'Dogs' is already in the dictionary. If it is, the value is retrieved, incremented by 1 and stored again. If 'Dogs' isn’t in the dictionary yet, a new entry is inserted with its value initialised to 1.
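The same test-and-increment step can be written more compactly with the dictionary’s get method, which returns a default when the key is missing. This is an alternative sketch rather than the approach used in the example; the helper name record is made up for illustration.

```python
def record(summary, key):
    """Add one to the count for key, starting from zero if it is new."""
    summary[key] = summary.get(key, 0) + 1


summary = dict()
for pet in ['Dogs', 'Cats', 'Cats', 'Snake']:
    record(summary, pet)

print(summary)  # {'Dogs': 1, 'Cats': 2, 'Snake': 1}
```

The rest of this explanation continues with the original example.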
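The full example then prints the summary in two ways. The first follows the arrival sequence, which works because dictionaries keep their insertion order in Python 3.7 and later. The second passes a lambda function to sorted() to order the entries by value, highest first, giving the following output: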
Summary in arrival order
Dogs 1
Cats 2
Snake 1
Summary by value
Cats 2
Dogs 1
Snake 1
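If you would rather not write the sort yourself, the standard library’s collections.Counter is a dictionary subclass built for this kind of tally. The sketch below is an alternative to the example, not a replacement for it; it counts the same four pets and uses most_common() to list them by count, highest first.

```python
from collections import Counter

summary = Counter()
for pet in ['Dogs', 'Cats', 'Cats', 'Snake']:
    summary[pet] += 1  # missing keys are treated as zero

# most_common() returns (key, count) pairs sorted by count, highest first.
for k, v in summary.most_common():
    print(k, v)
```

Entries with equal counts keep the order in which they were first seen, so this prints the same "Summary by value" listing as above.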
Saving or serialising the Python dictionary to disk
If you have performed a lot of long-running processing to build up your summary, you will want to save the results somewhere. The simplest option is to serialise the Python dictionary object as a JSON file on disk.
import json

summary = dict()

# Code to add entries to summary

with open('result.json', 'w') as fp:
    json.dump(summary, fp)
The above code couldn’t be much simpler. You just need to import json and then call json.dump to create a serialised JSON version of the dictionary on disk. In my example above the file is called result.json.
Loading a JSON dictionary from disk in Python
import json

with open('result.json') as json_file:
    newSummary = json.load(json_file)
That’s it: open the JSON file, pass the open file object to json.load and you will get the dictionary back.
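As a quick check that saving and loading really do round-trip, here is a sketch that writes the summary to a temporary directory and reads it back. The temporary directory and the by_hour dictionary are assumptions for illustration. It also shows one caveat worth knowing: JSON object keys are always strings, so a dictionary keyed by numbers comes back keyed by strings.

```python
import json
import os
import tempfile

summary = {'Dogs': 1, 'Cats': 2, 'Snake': 1}

# Write to a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'result.json')

    with open(path, 'w') as fp:
        json.dump(summary, fp)

    with open(path) as json_file:
        new_summary = json.load(json_file)

print(new_summary == summary)

# Caveat: JSON object keys are always strings, so non-string keys change type.
by_hour = {9: 12, 10: 7}  # hypothetical counts keyed by hour of day
restored = json.loads(json.dumps(by_hour))
print(restored)
```

If your keys are not strings, you would need to convert them back after loading, for example with a dictionary comprehension that calls int() on each key.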