Sunday, June 13, 2021

[ Python Article Collection ] How to Make Python Code Run Incredibly Fast

 Source From Here

Preface
Python is one of the most popular programming languages among developers. It is used everywhere, whether it’s web development or machine learning.

There are many reasons for its popularity, such as its community support, its amazing libraries, its wide usage in Machine Learning and Big Data, and its easy syntax.

Despite these qualities, Python has one drawback: speed. As an interpreted language, Python is usually slower than compiled languages such as C or C++. Still, we can mitigate this problem with a few tips.

In this article, I will share some Python tricks that can make our code run faster than usual. Let’s get started!

1. Proper algorithm & data structure
Each data structure has a significant effect on runtime. Python has many built-in data structures, such as list, tuple, set, and dictionary. Most people reach for a list in every case.

In Python, sets and dictionaries offer O(1) average-case lookup because they are backed by hash tables. Consider using sets and dictionaries instead of lists in the following cases:
* You do not have duplicate items in the collection.
* You need to search items repeatedly in the collection.
* The collection contains a large number of items.
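
As a rough illustration of the cases listed above, the sketch below runs the same membership test against a list and a set. The data and the helper name count_hits are made up for this example; only the in operator, list, and set come from Python itself.

```python
# Hypothetical data: the same 100,000 integer IDs stored as a list and as a set.
ids_list = list(range(100_000))
ids_set = set(ids_list)

def count_hits(collection, queries):
    """Count how many queries are present in the collection."""
    return sum(1 for q in queries if q in collection)

queries = [5, 99_999, 100_000, -1]

# Both containers give the same answer. The list is scanned element by element,
# while the set hashes each query and checks the matching bucket directly.
hits_in_list = count_hits(ids_list, queries)
hits_in_set = count_hits(ids_set, queries)
print(hits_in_list, hits_in_set)  # 2 2
```

The results are identical; the difference only shows up in how the lookup time grows as the collection gets larger.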

You can see the time complexity of different data structures in python here: Time Complexity via Python Wiki
This page documents the time-complexity (aka "Big O" or "Big Oh") of various operations in current CPython...

2. Using built-in functions and libraries
Python’s built-in functions are one of the best ways to speed up your code. Use them whenever they fit the task; they are well tested and optimized.

These built-in functions, such as min, max, all, and map, are fast because CPython implements them in C.

Prefer these built-in functions to hand-written equivalents; doing so usually helps your code run faster. Let's check a simple example:
    newlist = []

    for word in wordlist:
        newlist.append(word.upper())
A better way to write this code is:
    # In Python 3, map() returns a lazy iterator, so wrap it in list()
    newlist = list(map(str.upper, wordlist))
Here we are using the built-in map function, which is written in C. Therefore, it usually runs faster than the explicit loop.
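
Speed claims like this are easy to check on your own machine. Here is a minimal sketch that uses the standard timeit module; the sample wordlist and the function names with_loop and with_map are assumptions made for illustration, and the timings will differ between machines and Python versions.

```python
import timeit

wordlist = ["alpha", "beta", "gamma"] * 1000  # hypothetical sample data

def with_loop(words):
    result = []
    for word in words:
        result.append(word.upper())
    return result

def with_map(words):
    # list() is needed in Python 3 because map() returns a lazy iterator.
    return list(map(str.upper, words))

# Both approaches must produce the same list before timing means anything.
same_result = with_loop(wordlist) == with_map(wordlist)

# Run each version 100 times and report the total elapsed seconds.
loop_seconds = timeit.timeit(lambda: with_loop(wordlist), number=100)
map_seconds = timeit.timeit(lambda: with_map(wordlist), number=100)
print(f"same result: {same_result}")
print(f"loop: {loop_seconds:.4f}s, map: {map_seconds:.4f}s")
```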

3. Use multiple assignments
If you want to assign the values of multiple variables, then do not assign them line by line. Python has an elegant and better way to assign multiple variables.

Example:
    firstName = "John"
    lastName = "Henry"
    city = "Manchester"
A better way to assign these variables is:
    firstName, lastName, city = "John", "Henry", "Manchester"
This assignment is cleaner and more elegant than the one above.

4. Prefer list comprehension over loops
List comprehension is an elegant and better way to create a new list based on the elements of an existing list in just a single line of code.

List comprehension is considered a more Pythonic way to create a new list than defining an empty list and adding elements to that empty list.

Another advantage of list comprehension is that it is faster than using the append method to add elements to a python list.

Example: Using the list append method:
    newlist = []
    for i in range(1, 100):
        if i % 2 == 0:
            newlist.append(i**2)
A better way using list comprehension:
    newlist = [i**2 for i in range(1, 100) if i % 2 == 0]
Code looks cleaner when using list comprehensions.

5. Proper import
Avoid importing modules and libraries you do not need. When you need only one function, you can import that name directly instead of going through the module on every call.

Importing unnecessary libraries slows down program startup. Note that from math import sqrt still loads the whole math module; what it saves is the attribute lookup on math at each call.

Example: Suppose you need to find the square root of a number. Instead of this:
    import math
    value = math.sqrt(50)
Use this:
    from math import sqrt
    value = sqrt(50)
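
To make the difference between the two import styles concrete, here is a small sketch. The function names norms_with_attribute and norms_with_name are hypothetical; the point is that both styles call the very same function object and differ only in how the name is looked up.

```python
import math
from math import sqrt

# The from-import binds the same function object to a module-level name;
# the math module itself is loaded in full either way.
same_function = sqrt is math.sqrt

def norms_with_attribute(values):
    # math.sqrt is looked up on the module at every iteration.
    return [math.sqrt(v) for v in values]

def norms_with_name(values):
    # sqrt is a plain global name, so the attribute lookup is skipped.
    return [sqrt(v) for v in values]

values = [1.0, 4.0, 9.0]
print(same_function)                 # True
print(norms_with_attribute(values))  # [1.0, 2.0, 3.0]
print(norms_with_name(values))       # [1.0, 2.0, 3.0]
```

The saving per call is tiny, so it tends to matter only inside hot loops.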
6. String Concatenation
In Python, we often concatenate strings using the ‘+’ operator. Another way to concatenate strings is the join method.

The join method is a more Pythonic way to concatenate strings, and it is usually faster than the ‘+’ operator when many strings are combined.

The reason the join() method is faster is that the ‘+’ operator creates a new string and copies the old one at each step, whereas join() builds the result in a single pass.

Example:
    output = "Programming" + " " + "is" + " " + "fun"
Using join method:
    output = " ".join(["Programming", "is", "fun"])
The output of both methods will be the same. The difference is that the join() method is usually faster than the ‘+’ operator when many strings are combined.
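
The gap matters most when strings are built up inside a loop. The sketch below contrasts the two approaches; the helper names concat_with_plus and concat_with_join are invented for this example.

```python
words = ["Programming", "is", "fun"]  # sample pieces to combine

def concat_with_plus(parts):
    # Each += may build a brand-new string and copy everything collected so far.
    output = ""
    for index, part in enumerate(parts):
        if index > 0:
            output += " "
        output += part
    return output

def concat_with_join(parts):
    # join() sizes the result once and copies each piece a single time.
    return " ".join(parts)

print(concat_with_plus(words))  # Programming is fun
print(concat_with_join(words))  # Programming is fun
```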

Conclusion
That’s all from this article. In this article, we have discussed some tricks that can be used to make your code run faster. These tips can be used especially with competitive programming where the time limit is everything. I hope you liked this article. Thanks for reading!

Thursday, June 10, 2021

[ FAQ ] List submodules and import them

 Source from here1 and here2

Question
How can I list the submodules of a given package and import them dynamically? Say I have a package A with the modules below:
    [23:54:22 tmp]$ tree A/
    A/
    ├── submodule1.py
    └── submodule2.py
Let's say I have imported A as below. How do I list submodule1 and submodule2 and then import them?
    import A

    # TBD
    # List all submodules under package A and import them
HowTo
We can leverage pkgutil & importlib. Check sample code below:
test.py
    import A
    import importlib
    import pkgutil


    # iter_modules yields one ModuleInfo per module found on the package path
    for sm_info in pkgutil.iter_modules(A.__path__):
        print(f"Found submodule: {sm_info}")
        submodule_name = f"A.{sm_info.name}"
        print(f"Importing {submodule_name}")
        modu = importlib.import_module(submodule_name)
        # Each submodule in this example defines a name() function
        print(f"Done: {modu.name()}")
Execution result:
$ ./test.py
Found submodule: ModuleInfo(module_finder=FileFinder('/tmp/A'), name='submodule1', ispkg=False)
Importing A.submodule1
Done: Hello submodule1
Found submodule: ModuleInfo(module_finder=FileFinder('/tmp/A'), name='submodule2', ispkg=False)
Importing A.submodule2
Done: Hi, submodule2
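
For a version that runs without an existing package A, here is a self-contained sketch. It builds a throwaway package in a temporary directory (the name demo_pkg and the helper import_submodules are hypothetical) and then applies the same pkgutil.iter_modules plus importlib.import_module approach.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

def import_submodules(package_name):
    """Import every direct submodule of a package and return them by short name."""
    package = importlib.import_module(package_name)
    modules = {}
    for info in pkgutil.iter_modules(package.__path__):
        full_name = f"{package_name}.{info.name}"
        modules[info.name] = importlib.import_module(full_name)
    return modules

# Build a throwaway package (hypothetical name demo_pkg) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "submodule1.py").write_text("def name():\n    return 'Hello submodule1'\n")
    (pkg_dir / "submodule2.py").write_text("def name():\n    return 'Hi, submodule2'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make sure the new directory is noticed
    try:
        loaded = import_submodules("demo_pkg")
        greetings = {key: mod.name() for key, mod in sorted(loaded.items())}
    finally:
        sys.path.remove(tmp)

print(greetings)
```

Returning the imported modules in a dictionary keeps them reachable by name, which is handy for plugin-style loading.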


Thursday, June 3, 2021

[ OO Article Collection ] SOLID: The First 5 Principles of Object Oriented Design

 Source From Here

Introduction
SOLID is an acronym for the first five object-oriented design (OOD) principles by Robert C. Martin (also known as Uncle Bob).

These principles establish practices that lend themselves to developing software that stays maintainable and extensible as the project grows. Adopting these practices can also help with avoiding code smells, refactoring code, and Agile or Adaptive software development.

SOLID stands for:
* S - Single-responsibility Principle
* O - Open-closed Principle
* L - Liskov Substitution Principle
* I - Interface Segregation Principle
* D - Dependency Inversion Principle

In this article, you will be introduced to each principle individually to understand how SOLID can help make you a better developer.

Single-Responsibility Principle
Single-responsibility Principle (SRP) states:
A class should have one and only one reason to change, meaning that a class should have only one job.

For example, consider an application that takes a collection of shapes (circles and squares) and calculates the sum of the areas of all the shapes in the collection. First, create the shape classes and have the constructors set up the required parameters.

For squares, you will need to know the length of a side:
    class Square:
        def __init__(self, length):
            self.length = length
For circles, you will need to know the radius:
    class Circle:
        def __init__(self, radius):
            self.radius = radius
Next, create the AreaCalculator class and then write up the logic to sum up the areas of all provided shapes. The area of a square is calculated by length squared. The area of a circle is calculated by pi times radius squared:
    import math

    class AreaCalculator:
        def __init__(self, shapes):
            self.shapes = shapes

        def area_sum(self):
            sum_of_area = 0
            for s in self.shapes:
                if isinstance(s, Square):
                    sum_of_area += pow(s.length, 2)
                elif isinstance(s, Circle):
                    sum_of_area += pow(s.radius, 2) * math.pi

            return sum_of_area

        def __repr__(self):
            # str(self) dispatches to __str__ below
            return str(self)

        def __str__(self):
            return f"Area sum of {len(self.shapes)} shape(s) is {self.area_sum()}"
To use the AreaCalculator class, instantiate it with a list of shapes and print the result. Here is an example with a collection of three shapes:
    shapes = [Circle(2), Square(5), Square(6)]
    ac = AreaCalculator(shapes)
    print(ac)
Output:
Area sum of 3 shape(s) is 73.56637061435917

Now let's consider a scenario where the output should be converted to another format like JSON.

In this design, the AreaCalculator class would handle all of that logic, which would violate the single-responsibility principle. The AreaCalculator class should only be concerned with the sum of the areas of the provided shapes. It should not care whether the user wants JSON or HTML.
The problem is the output method (__str__), which makes AreaCalculator handle the logic to output the data.

To address this, you can create a separate SumCalculatorOutputter class and use that new class to handle the logic you need to output the data to the user:
    import json

    class SumCalculatorOutputter:
        def __init__(self, calculator: AreaCalculator):
            self.calculator = calculator

        def json(self):
            return json.dumps({'sum': round(self.calculator.area_sum(), 1)}, indent=4)

        def text(self):
            return f"Area sum is {self.calculator.area_sum():.1f}"
The SumCalculatorOutputter class would work like this:
    sc_output = SumCalculatorOutputter(ac)
    print(f"JSON output: {sc_output.json()}")
    print(f"TEXT output: {sc_output.text()}")
Output:
    JSON output: {
        "sum": 73.6
    }
    TEXT output: Area sum is 73.6
Now, the logic you need to output the data to the user is handled by the SumCalculatorOutputter class. That satisfies the single-responsibility principle.

Open-Closed Principle
Open-closed Principle (OCP) states:
Objects or entities should be open for extension but closed for modification.

This means that a class should be extendable without modifying the class itself.

Let’s revisit the AreaCalculator class and focus on the area_sum method:
    def area_sum(self):
        sum_of_area = 0
        for s in self.shapes:
            if isinstance(s, Square):
                sum_of_area += pow(s.length, 2)
            elif isinstance(s, Circle):
                sum_of_area += pow(s.radius, 2) * math.pi

        return sum_of_area
Consider a scenario where the user would like the area_sum of additional shapes like triangles, pentagons, hexagons, etc. You would have to constantly edit this file and add more if/else blocks. That would violate the open-closed principle.

A way you can make this area_sum method better is to remove the logic to calculate the area of each shape out of the AreaCalculator class method and attach it to each shape’s class. Here is the area method defined in Square:
    class Square:
        def __init__(self, length):
            self.length = length

        def area(self):
            return pow(self.length, 2)
And here is the area method defined in Circle:
    class Circle:
        def __init__(self, radius):
            self.radius = radius

        def area(self):
            return math.pi * pow(self.radius, 2)
The refined AreaCalculator:
    class AreaCalculator:
        def __init__(self, shapes):
            self.shapes = shapes

        def area_sum(self):
            sum_of_area = 0
            for s in self.shapes:
                sum_of_area += s.area()

            return sum_of_area
Now, you can create another shape class and pass it in when calculating the sum of area without breaking the code.
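
As a quick check of that claim, the sketch below adds a Triangle class, which is a hypothetical shape that does not appear in the original article, and sums its area together with a Square while leaving AreaCalculator untouched.

```python
class Square:
    def __init__(self, length):
        self.length = length

    def area(self):
        return self.length ** 2

class Triangle:
    # Hypothetical new shape, added without touching AreaCalculator.
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height

class AreaCalculator:
    def __init__(self, shapes):
        self.shapes = shapes

    def area_sum(self):
        # No isinstance checks: every shape computes its own area.
        return sum(s.area() for s in self.shapes)

total = AreaCalculator([Square(5), Triangle(4, 3)]).area_sum()
print(total)  # 31.0
```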

However, another problem arises. How do you know that the object passed into the AreaCalculator is actually a shape or if the shape has a method named area?

Coding to an interface is an integral part of SOLID. Create a ShapeInterface that supports area:
    from abc import ABC, abstractmethod

    class ShapeInterface(ABC):
        @abstractmethod
        def area(self):
            raise NotImplementedError()
Next is to modify your shape classes to implement the ShapeInterface:
    class Square(ShapeInterface):
        def __init__(self, length):
            self.length = length

        def area(self):
            return pow(self.length, 2)

    class Circle(ShapeInterface):
        def __init__(self, radius):
            self.radius = radius

        def area(self):
            return math.pi * pow(self.radius, 2)
From AreaCalculator, you can check if the shapes provided are actually instances of the ShapeInterface; otherwise, throw an exception:
    from typing import List
    import math

    class AreaCalculator:
        def __init__(self, shapes: List[ShapeInterface]):
            self.shapes = shapes

        def area_sum(self):
            sum_of_area = 0
            for s in self.shapes:
                if isinstance(s, ShapeInterface):
                    sum_of_area += s.area()
                else:
                    raise Exception(f"Unexpected class={s.__class__}")

            return sum_of_area
That satisfies the open-closed principle.
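
One side effect of building the interface on abc.ABC is that Python refuses to instantiate a subclass that forgets to implement area. The sketch below shows this with a deliberately incomplete class; BrokenShape is a hypothetical name used only for this demonstration.

```python
from abc import ABC, abstractmethod

class ShapeInterface(ABC):
    @abstractmethod
    def area(self):
        raise NotImplementedError()

class BrokenShape(ShapeInterface):
    # Hypothetical class that forgets to implement area().
    pass

try:
    BrokenShape()
    instantiated = True
except TypeError as error:
    # ABCs raise TypeError when abstract methods are left unimplemented.
    instantiated = False
    print(f"Rejected: {error}")
```

This catches the missing method when the object is created, before AreaCalculator ever sees it.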

Liskov Substitution Principle
Liskov Substitution Principle states:
Let q(x) be a property provable about objects x of type T. Then q(y) should be provable for objects y of type S, where S is a subtype of T.

This means that every subclass or derived class should be substitutable for its base or parent class.

Building off the example AreaCalculator class, consider a new VolumeCalculator class that extends the AreaCalculator class:
    class Cuboid(ShapeInterface):
        def __init__(self, width, height, depth):
            self.width, self.height, self.depth = width, height, depth

        def area(self):
            return self.width * self.height

        def volume(self):
            return self.width * self.height * self.depth

    class VolumeCalculator(AreaCalculator):
        def __init__(self, shapes):
            self.shapes = shapes

        def area_sum(self):
            # Returns a list of volumes, although AreaCalculator.area_sum returns a number
            return [s.volume() for s in self.shapes]
Recall that the SumCalculatorOutputter class resembles this:
    class SumCalculatorOutputter:
        def __init__(self, calculator: AreaCalculator):
            self.calculator = calculator

        def json(self):
            return json.dumps({'sum': round(self.calculator.area_sum(), 1)}, indent=4)

        def text(self):
            return f"Area sum is {self.calculator.area_sum():.1f}"
If you tried to run an example like this:
    shapes = [Circle(2), Square(5), Square(6)]
    solid_shapes = [Cuboid(1, 2, 3)]
    ac = AreaCalculator(shapes)
    vc = VolumeCalculator(solid_shapes)

    sc_out1 = SumCalculatorOutputter(ac)
    sc_out2 = SumCalculatorOutputter(vc)

    # Ok
    print(sc_out1.json())
    # Error: TypeError: type list doesn't define __round__ method
    print(sc_out2.json())
You will get a TypeError because round() cannot be applied to a list. To fix this, return a float from the VolumeCalculator class's area_sum method instead of a list:
    class VolumeCalculator(AreaCalculator):
        def __init__(self, shapes):
            self.shapes = shapes

        def area_sum(self):
            sum_of_area = 0.0
            # Add up the volumes and return a single float, like the parent class does
            for s in self.shapes:
                sum_of_area += s.volume()

            return sum_of_area
That satisfies the Liskov substitution principle.
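
Substitutability can be checked directly: code written against the parent class should keep working when handed the subclass. The sketch below is a simplified, self-contained variant of the classes above; the to_json helper is hypothetical and stands in for SumCalculatorOutputter.

```python
import json

class Cuboid:
    def __init__(self, width, height, depth):
        self.width, self.height, self.depth = width, height, depth

    def area(self):
        return self.width * self.height

    def volume(self):
        return self.width * self.height * self.depth

class AreaCalculator:
    def __init__(self, shapes):
        self.shapes = shapes

    def area_sum(self):
        return sum(s.area() for s in self.shapes)

class VolumeCalculator(AreaCalculator):
    def area_sum(self):
        # Same contract as the parent: return one number, not a list.
        return sum(s.volume() for s in self.shapes)

def to_json(calculator):
    # Written against AreaCalculator; it must also work for any subclass.
    return json.dumps({"sum": round(calculator.area_sum(), 1)})

boxes = [Cuboid(1, 2, 3), Cuboid(2, 2, 2)]
print(to_json(AreaCalculator(boxes)))    # {"sum": 6}
print(to_json(VolumeCalculator(boxes)))  # {"sum": 14}
```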

Interface Segregation Principle
Interface segregation principle states:
A client should never be forced to implement an interface that it doesn’t use, or clients shouldn’t be forced to depend on methods they do not use.

Still building from the previous ShapeInterface example, you will need to support the new three-dimensional shapes of Cuboid and Spheroid, and these shapes will need to also calculate volume. Let’s consider what would happen if you were to modify the ShapeInterface to add another contract:
    from abc import ABC, abstractmethod

    class ShapeInterface(ABC):
        @abstractmethod
        def area(self):
            raise NotImplementedError()

        @abstractmethod
        def volume(self):
            raise NotImplementedError()
Now, any shape you create must implement the volume method, but squares are flat shapes and do not have volumes, so this interface would force the Square class to implement a method that it has no use for.

This would violate the interface segregation principle. Instead, you could create another interface called ThreeDimensionalShapeInterface that has the volume contract, and three-dimensional shapes can implement this interface:
    class ShapeInterface(ABC):
        @abstractmethod
        def area(self):
            raise NotImplementedError()

    class ThreeDimensionalShapeInterface(ABC):
        @abstractmethod
        def volume(self):
            raise NotImplementedError()

    class Cuboid(ShapeInterface, ThreeDimensionalShapeInterface):
        def __init__(self, width, height, depth):
            self.width, self.height, self.depth = width, height, depth

        def area(self):
            return self.width * self.height

        def volume(self):
            return self.width * self.height * self.depth
That satisfies the interface segregation principle.
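
With the interfaces split, a client that cares about volume can ask only the shapes that opted into the three-dimensional contract. The sketch below assumes a hypothetical volume_sum helper, which is not part of the original article.

```python
from abc import ABC, abstractmethod

class ShapeInterface(ABC):
    @abstractmethod
    def area(self):
        raise NotImplementedError()

class ThreeDimensionalShapeInterface(ABC):
    @abstractmethod
    def volume(self):
        raise NotImplementedError()

class Square(ShapeInterface):
    def __init__(self, length):
        self.length = length

    def area(self):
        return self.length ** 2

class Cuboid(ShapeInterface, ThreeDimensionalShapeInterface):
    def __init__(self, width, height, depth):
        self.width, self.height, self.depth = width, height, depth

    def area(self):
        return self.width * self.height

    def volume(self):
        return self.width * self.height * self.depth

def volume_sum(shapes):
    # Only shapes that implement the 3D interface are asked for a volume.
    return sum(s.volume() for s in shapes
               if isinstance(s, ThreeDimensionalShapeInterface))

mixed = [Square(5), Cuboid(1, 2, 3)]
print(volume_sum(mixed))  # 6
```

Square never has to define a volume method, yet it can still sit in the same collection as Cuboid.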

Dependency Inversion Principle
Dependency inversion principle states:
Entities must depend on abstractions, not on concretions. It states that the high-level module must not depend on the low-level module, but they should depend on abstractions.

This principle allows for decoupling.

Here is an example of a PasswordReminder that connects to a MySQL database:
    class MySQLConnection:
        def connect(self):
            return 'Database connection'


    class PasswordReminder:
        # Annotated with the concrete class: the high-level class is tied to MySQL
        def __init__(self, db_connection: MySQLConnection):
            self.db_connection = db_connection
First, the MySQLConnection is the low-level module while the PasswordReminder is the high-level module. The D in SOLID says to depend on abstractions, not on concretions, and the snippet above violates this principle because the PasswordReminder class is forced to depend on the MySQLConnection class.

Later, if you were to change the database engine, you would also have to edit the PasswordReminder class, and this would violate the open-closed principle.

The PasswordReminder class should not care what database your application uses. To address these issues, you can code to an interface since high-level and low-level modules should depend on abstraction:
    from abc import ABC, abstractmethod

    class DBConnectionInterface(ABC):
        @abstractmethod
        def connect(self):
            raise NotImplementedError()

    class MySQLConnection(DBConnectionInterface):
        def connect(self):
            return 'Database connection'


    class PasswordReminder:
        def __init__(self, db_connection: DBConnectionInterface):
            self.db_connection = db_connection
This code establishes that both the high-level and low-level modules depend on abstraction.
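
The practical payoff is that a different connection can be swapped in without editing PasswordReminder. In the sketch below, InMemoryConnection and the status method are hypothetical additions made purely for illustration.

```python
from abc import ABC, abstractmethod

class DBConnectionInterface(ABC):
    @abstractmethod
    def connect(self):
        raise NotImplementedError()

class MySQLConnection(DBConnectionInterface):
    def connect(self):
        return 'Database connection'

class InMemoryConnection(DBConnectionInterface):
    # Hypothetical stand-in, for example in tests; not from the original article.
    def connect(self):
        return 'In-memory connection'

class PasswordReminder:
    def __init__(self, db_connection: DBConnectionInterface):
        self.db_connection = db_connection

    def status(self):
        # Hypothetical helper that only relies on the abstract connect() contract.
        return f"Using: {self.db_connection.connect()}"

print(PasswordReminder(MySQLConnection()).status())     # Using: Database connection
print(PasswordReminder(InMemoryConnection()).status())  # Using: In-memory connection
```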

Supplement
Software Design - Introduction to SOLID Principles in 8 Minutes
Design Patterns in Plain English | Mosh Hamedani
