Time Profiling
In Python, we have three modules that help us find the execution time of a program.
Using Timeit Module
Python timeit module is often used to measure the execution time of small code snippets. We can also use the timeit() function which executes an anonymous function with a number of executions. It temporarily turns off garbage collection (the process of collecting unwanted variables whose use has been over and clears them by marking them as garbage values to free up the memory) during calculating the time of execution. How is the "timeit" command better than "time"? It averages results and ignores warm up.
# importing the module
import timeit
# using the timeit method and lambda
# expression to get the execution time of
# the function.
t = timeit.timeit(lambda: "print('Hello World!')")
# printing the execution time
print(t)
0.08696221099944523
# importing the module
import timeit
# sample function that returns square
# of the value passed
def print_square(x):
return (x**2)
# using the timeit method and lambda
# expression to get the execution time of
# the function, number defines how many
# times we execute this function
t = timeit.timeit(lambda: print_square(3), number=10)
# printing the execution time
print(t)
4.83900021208683e-06
Now, practically to estimate the execution time of a program we generally do not use the value obtained once as the ultimate correct value, because the execution of a program may depend upon os and availability of hardware at a particular instant of time. So generally we take multiple values of execution time and generally the average computed gives us the best possible answer. For this, we would use the timeit.repeat() method instead of timeit.timeit() which takes a repeat parameter and saves you the trouble of creating a loop and storing the values in the array.
# importing the module
import timeit
# sample function that returns square
# of the value passed
def print_square(x):
return (x**2)
# using the repeat method and lambda
# expression to get the execution time of
# the function, number defines how many
# times we execute this function and the
# repeat defines the number of times the
# time calculation needs to be done.
t = timeit.repeat(lambda: print_square(3), number=10, repeat=5)
# printing the execution time
print(t)
[5.1380002332734875e-06, 3.6009996620123275e-06, 3.5189996197004803e-06, 3.484999979264103e-06, 3.50700065609999e-06]
Also, we can use the timeit.default_timer() which basically records the time at the instant when the method is called. So we call the method just before and after the lines of the code for which we want to calculate the execution time and then basically the difference between the two times give us the result. So for finding the time we record the times using the timeit.defaultTimer() method, and then we print the difference between the two times in the last line.
# importing the module
import timeit
# sample function that returns
# square of the value passed
def print_square(x):
return (x**2)
# records the time at this instant
# of the program
start = timeit.default_timer()
# calls the function
print_square(3)
# records the time at this instant
# of the program
end = timeit.default_timer()
# printing the execution time by subtracting
# the time before the function from
# the time after the function
print(end-start)
3.099999958067201e-05
"""Using "timeit"""
from timeit import timeit
items = {
'a': 1,
'b': 2,
}
default = -1
def use_catch(key):
"""Use try/catch to get a key with default"""
try:
return items[key]
except KeyError:
return default
def use_get(key):
"""Use dict.get to get a key with default"""
return items.get(key, default)
if __name__ == '__main__':
# Key is in the dictionary
print('catch', timeit('use_catch("a")', 'from __main__ import use_catch'))
print('get', timeit('use_get("a")', 'from __main__ import use_get'))
# Key is missing from the dictionary
print('catch', timeit('use_catch("x")', 'from __main__ import use_catch'))
print('get', timeit('use_get("x")', 'from __main__ import use_get'))
catch 0.12401290399975551
get 0.16413491899947985
catch 0.32834199399985664
get 0.17250985499958915
Using Time Module
We can use the time.perf_counter() method in the same way as the timeit.default_timer() method discussed above. It can use the highest possible resolution clock and gives you the most accurate result. In fact the timeit.default_timer() also uses time.perf_counter() as it’s base. This also records the time before and after the required lines of code whose execution or elapsed time needs to be calculated. Then we subtract the recorded time before the start of the lines from the recorded time after the lines of the code.
# importing the module
import time
# sample function that returns square
# of the value passed
def print_square(x):
return (x**2)
# records the time at this instant of the
# program
start = time.perf_counter()
# calls the function
print_square(3)
# records the time at this instant of the
# program
end = time.perf_counter()
# printing the execution time by subtracting
# the time before the function from
# the time after the function
print(end-start)
2.8733999897667672e-05
# importing the module
import time
# sample function that returns square
# of the value passed
def print_square(x):
return (x**2)
# records the time in nanoseceonds at
# this instant of the program
start = time.time_ns()
# calls the function
print_square(3)
# records the time in nanoseceonds at this
# instant of the program
end = time.time_ns()
# printing the execution time by subtracting
# the time before the function from
# the time after the function
print(end-start)
50676
"""Measuring time"""
from time import perf_counter
def upto_for(n):
"""Sum 1...n with a for loop"""
total = 0
for i in range(n):
total += i
return total
def upto_sum(n):
"""Sum 1...n with built-in sum and range"""
return sum(range(n))
if __name__ == '__main__':
n = 1_000_000
start = perf_counter()
upto_for(n)
duration = perf_counter() - start
print('upto_for', duration)
start = perf_counter()
upto_sum(n)
duration = perf_counter() - start
print('upto_sum', duration)
upto_for 0.0630132690002938
upto_sum 0.018293597999218036
Using Datetime Module
Datetime module can also be used to find the time elapsed or spent in a code block provided you are not looking for high precision. the datetime.now() also works the same way as the timeit.default_timer() or time.perf_counter() but lacks the same precision as those. It returns the result in HH:MM:SS format. This function is generally used for getting the current time and not a preferred use case for calculating execution time. However, these can be programs that take quite some time to execute like training ML/AI models, web scraping large websites, etc.
# importing the module
from datetime import datetime
# sample function that returns square
# of the value passed
def print_square(x):
return (x**2)
# records the time at this instant of
# the program in HH:MM:SS format
start = datetime.now()
# calls the function
print_square(3)
# records the time at this instant of the
# program in HH:MM:SS format
end = datetime.now()
# printing the execution time by subtracting
# the time before the function from
# the time after the function
print(end-start)
0:00:00.000034
%Timeit Magic inline
import numpy as np
import pandas as pd
data = pd.Series(np.random.randint(50, 60 , 10_000))
#Outliers
data[7]=3
data[1003]=100
def find_outliers(data):
"""Find outliers in data, return indices of outliers"""
out = data[(data - data.mean()).abs() > 2 * data.std()]
return out.index
find_outliers(data)
Int64Index([7, 1003], dtype='int64')
%timeit find_outliers(data)
1000 loops, best of 5: 629 µs per loop
%prun -s cumulative find_outliers(data)
Using cprofile
Python includes a built in module called cProfile which is used to measure the execution time of a program.cProfiler module provides all information about how long the program is executing and how many times the function get called in a program. How does cProfile work? It records every function entry and exit.
# importing cProfile
import cProfile
cProfile.run("10 + 10")
3 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
# importing cProfile
import cProfile
def f():
print("hello")
cProfile.run('f()')
hello
39 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <ipython-input-46-7b34c7345cd2>:4(f)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
3 0.000 0.000 0.000 0.000 iostream.py:195(schedule)
2 0.000 0.000 0.000 0.000 iostream.py:307(_is_master_process)
2 0.000 0.000 0.000 0.000 iostream.py:320(_schedule_flush)
2 0.000 0.000 0.000 0.000 iostream.py:382(write)
3 0.000 0.000 0.000 0.000 iostream.py:93(_event_pipe)
3 0.000 0.000 0.000 0.000 socket.py:480(send)
3 0.000 0.000 0.000 0.000 threading.py:1050(_wait_for_tstate_lock)
3 0.000 0.000 0.000 0.000 threading.py:1092(is_alive)
3 0.000 0.000 0.000 0.000 threading.py:507(is_set)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
2 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
1 0.000 0.000 0.000 0.000 {built-in method builtins.print}
2 0.000 0.000 0.000 0.000 {built-in method posix.getpid}
3 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.lock' objects}
3 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Using line_profiler
Python provides a built-in module to measure execution time and the module name is LineProfiler.It gives detailed report on time consumed by a program.
!pip install line_profiler
# importing line_profiler module
from line_profiler import LineProfiler
def peek(rk):
print(rk)
rk ="risk"
profile = LineProfiler(peek(rk))
profile.print_stats()
risk
Timer unit: 1e-06 s
%load_ext line_profiler
%lprun -f find_outliers