A module is a file containing Python definitions and statements. Modules define functions, classes, and variables that solve particular problems.

A package is a collection of modules organized in directories. Many packages are available for Python, covering different problems. For example, “NumPy”, “matplotlib”, “seaborn”, and “scikit-learn” are well-known data science packages.

- “NumPy” is used for efficiently working with arrays
- “matplotlib” and “seaborn” are popular libraries used for data visualization
- “scikit-learn” is a powerful library for machine learning

Some packages ship with Python by default (the standard library), but many of the packages we need do not. To use such a package, we must either have it installed already or install it with pip (the package installer for Python).
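Before installing anything, it can help to check whether a package is already available. The sketch below is one possible way to do that with the standard library; the helper name `is_installed` and the list of names are made up for illustration. Note that the import name can differ from the name used with pip (for example, scikit-learn is imported as `sklearn`).

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate a top-level package or module with this name."""
    return importlib.util.find_spec(package_name) is not None


# "a_package_that_does_not_exist" is a placeholder for a package we do not have.
for name in ["json", "numpy", "a_package_that_does_not_exist"]:
    status = "installed" if is_installed(name) else "missing, install it with pip"
    print(f"{name}: {status}")
```

The helper is meant for top-level names only, such as `numpy`, not dotted names such as `matplotlib.pyplot`.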

However, there is also something called “Anaconda”.

Anaconda Distribution is a free, easy-to-install package manager, environment manager and Python distribution with a collection of 1,000+ open source packages with free community support.

So, if you don’t want to install many packages one by one, I recommend using Anaconda. This distribution includes many useful packages.

**Import Statements**

Once you have installed the needed packages, you can import them into your Python files. We can import an entire package, a submodule, or specific functions from it. We can also give a package an alias. The examples below show the different forms of the import statement.

Simple Import Statement:

import numpy

numbers = numpy.array([3, 4, 20, 15, 7, 19, 0])

Import Statement With an Alias:

import numpy as np # np is an alias for the numpy package

numbers = np.array([3, 4, 20, 15, 7, 19, 0]) # works fine

numbers = numpy.array([3, 4, 20, 15, 7, 19, 0]) # NameError: name 'numpy' is not defined

Import Submodule From a Package With an Alias:

# import the "pyplot" submodule from the "matplotlib" package with the alias "plt"

import matplotlib.pyplot as plt

Import Only One Function From a Package:

from numpy import array

numbers = array([3, 4, 20, 15, 7, 19, 0]) # works fine

numbers = numpy.array([3, 4, 20, 15, 7, 19, 0]) # NameError: name 'numpy' is not defined

type(numbers) # numpy.ndarray

We can also write `from numpy import *`. The asterisk means to import everything from that module. This import statement creates references in the current namespace to all public objects defined by the `numpy` module. In other words, we can use all available functions from `numpy` by name alone, without a prefix. For example, we can now call NumPy’s absolute function as `absolute()` instead of `numpy.absolute()`.

However, I don’t recommend this form because:

- If we import everything from several modules like that, the current namespace fills up with many names, and someone reading our code can get confused about which package a specific function comes from.
- If two modules have a function with the same name, the second import will override the function from the first.
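The second point is easy to reproduce with two standard-library modules that both define `sqrt`. The sketch below uses explicit `from ... import sqrt` statements so the effect is easy to follow. A star import rebinds names in the same way, only for every public name at once. The variable names are made up for illustration.

```python
import math
import cmath

from math import sqrt            # sqrt now refers to math.sqrt
first_is_math = sqrt is math.sqrt

from cmath import sqrt           # same name: silently replaces the previous binding
second_is_cmath = sqrt is cmath.sqrt

print(first_is_math, second_is_cmath)  # True True
print(sqrt(-1))                        # 1j, because sqrt is now cmath.sqrt
```

No warning is shown when the name is replaced, which is why prefixed calls such as `math.sqrt()` and `cmath.sqrt()` are easier to reason about.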

### NumPy

NumPy is a fundamental package for scientific computing with Python. It’s very fast and easy to use. This package lets us perform calculations element-wise (element by element).

The regular Python list doesn’t know how to do operations element-wise. Of course, we can use Python lists, but they’re slow, and we need more code to achieve the desired result. In most cases, a better choice is to use `NumPy`.
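To make the difference concrete, here is a small sketch that doubles every value, first with a plain list and then with a NumPy array. The `prices` data is made up for illustration.

```python
import numpy as np

prices = [10, 20, 30]

# Multiplying a list repeats it; it does not multiply the elements.
print(prices * 2)                        # [10, 20, 30, 10, 20, 30]

# With a list we need a loop or a comprehension to work element by element.
doubled_list = [p * 2 for p in prices]
print(doubled_list)                      # [20, 40, 60]

# With a NumPy array the operator is applied to every element.
doubled_array = np.array(prices) * 2
print(doubled_array)                     # [20 40 60]
```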

Unlike the regular Python list, a NumPy array always has a single type. If we pass a list with mixed types to `np.array()`, we can choose the desired type using the `dtype` parameter. If this parameter is not given, the type will be determined as the minimum type required to hold the objects.

NumPy Array Type Conversion:

np.array([False, 42, "Data Science"]) # array(['False', '42', 'Data Science'], dtype='<U12'); the string width in the dtype can differ by platform and NumPy version (for example '<U21')

np.array([False, 42], dtype = int) # array([ 0, 42])

np.array([False, 42, 53.99], dtype = float) # array([ 0. , 42. , 53.99])

# Invalid conversion

np.array([False, 42, "Data Science"], dtype = float) # ValueError: could not convert string to float: 'Data Science'
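To see which type NumPy picked when `dtype` is not given, we can inspect the array’s `dtype` attribute. This is a minimal sketch; the variable names are made up for illustration, and the exact integer width depends on the platform, so only the kind of the integer type is checked.

```python
import numpy as np

ints = np.array([1, 2, 3])        # only integers
mixed = np.array([1, 2.5, 3])     # integers and a float
flags = np.array([True, False])   # only booleans

print(ints.dtype.kind)   # i (a signed integer type; the width is platform-dependent)
print(mixed.dtype)       # float64 (the integers were converted to floats)
print(flags.dtype)       # bool
print(mixed)             # [1.  2.5 3. ]
```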

A NumPy array comes with its own attributes and methods. Remember that operators in Python behave differently on different data types? Well, in NumPy the operators behave element-wise.

Operators on NumPy Array:

np.array([37, 48, 50]) + 1 # array([38, 49, 51])

np.array([20, 30, 40]) * 2 # array([40, 60, 80])

np.array([42, 10, 60]) / 2 # array([ 21., 5., 30.])

np.array([1, 2, 3]) * np.array([10, 20, 30]) # array([10, 40, 90])

np.array([1, 2, 3]) - np.array([10, 20, 30]) # array([ -9, -18, -27])

If we check the type of a NumPy array, the result will be `numpy.ndarray`. Ndarray means n-dimensional array. In the examples above we used 1-dimensional arrays, but nothing stops us from making arrays with 2, 3, 4, or more dimensions. We can subset an array regardless of how many dimensions it has. I’ll show you some examples with a 2-dimensional array.

Subsetting 2-dimensional arrays:

numbers = np.array([

[1, 2, 3],

[4, 5, 6],

[7, 8, 9],

[10, 11, 12]

])

numbers[2, 1] # 8

numbers[-1, 0] # 10

numbers[0] # array([1, 2, 3])

numbers[:, 0] # array([ 1, 4, 7, 10])

numbers[0:3, 2] # array([3, 6, 9])

numbers[1:3, 1:3] # array([[5, 6],[8, 9]])
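The same indexing rules carry over to arrays with more dimensions: each extra dimension adds one more position between the square brackets. Here is a small sketch with a 3-dimensional array; the `cube` array is made up for illustration and holds 2 blocks of 2 rows and 3 columns.

```python
import numpy as np

cube = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

single = cube[1, 0, 2]        # block 1, row 0, column 2
first_block = cube[0]         # the whole first block
second_rows = cube[:, 1, :]   # row 1 from every block
first_cols = cube[:, :, 0]    # column 0 from every row of every block

print(single)                 # 9
print(first_block.tolist())   # [[1, 2, 3], [4, 5, 6]]
print(second_rows.tolist())   # [[4, 5, 6], [10, 11, 12]]
print(first_cols.tolist())    # [[1, 4], [7, 10]]
```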

If we want to see how many dimensions our array has and how many elements each dimension has, we can use the `shape` attribute. For 2-dimensional arrays, the first element of the tuple is the number of rows and the second is the number of columns.

NumPy Shape Attribute:

numbers = np.array([

[1, 2, 3],

[4, 5, 6],

[7, 8, 9],

[10, 11, 12],

[13, 14, 15]

])

numbers.shape # (5, 3)
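The `shape` tuple has one entry per dimension, so its length tells us how many dimensions the array has. NumPy also exposes that number directly through the `ndim` attribute. The arrays in this sketch are made up for illustration.

```python
import numpy as np

vector = np.array([3, 4, 20])
table = np.array([[1, 2, 3], [4, 5, 6]])
cube = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])

for name, arr in [("vector", vector), ("table", table), ("cube", cube)]:
    # shape lists the size of each dimension; ndim is the number of dimensions
    print(name, arr.shape, arr.ndim)

# vector (3,) 1
# table (2, 3) 2
# cube (3, 2, 2) 3
```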