In this NumPy normalization tutorial, we will learn how to normalize an array using the Python NumPy library. But before we move on to that, let’s first understand what NumPy and normalization mean.

  1. Normalization
  2. NumPy
  3. NumPy functions
  4. Normalization of a one-dimensional (1D) matrix
  5. Normalization of a two-dimensional (2D) matrix

Normalization

In general, normalization is the process of rescaling numeric attribute values to a common range, typically 0 to 1. Normalization helps to organize data so that it looks consistent across fields and records. Its benefits include reduced redundancy, reduced complexity, greater clarity, and higher-quality data.

Data normalization is commonly used in machine learning, where it makes model training less sensitive to the scale of the features. When data is used to train a model, we need to scale it so that all numeric values are in the same range and large values do not dominate smaller ones. This allows the model to converge to better weights, which in turn leads to a more accurate model. Simply put, normalization helps the model predict outputs more accurately.

The next question is how to perform data normalization. One way is to use Python with the NumPy library. NumPy provides the linalg.norm() function, which computes the norm (the overall magnitude) of an array. Dividing the array by that norm rescales it to unit length, and for an array of non-negative numbers this places every value between 0 and 1. We will look at this in detail soon. But before that, we need to understand the meaning and applications of NumPy.

NumPy

NumPy, as the name suggests, stands for Numerical Python. NumPy is a Python library used to work with arrays. Since Python already lets you create an array-like structure using lists, why do we need NumPy? Well, NumPy offers a faster way to work with arrays than traditional lists.

To use NumPy, you must first install the library with pip. Below is the command used to install NumPy on your system –

pip install numpy

After installation, we need to import the library into our program in order to use its functions. Below is the syntax for importing the numpy library in Python –

import numpy

Now let’s look at an example of creating a one-dimensional array using the numpy library –

import numpy as np                             # import the NumPy library as np
my_array = np.array([10, 30, 50, 70, 90])      # define the input array
print("This is my array -", my_array)          # print the array

The result of the above program is as follows –

This is my array - [10 30 50 70 90]

Let’s look at an example of how to create a two-dimensional array using the NumPy library –

import numpy as np                             # import the NumPy library as np
two_d_array = np.array([[10, 30, 50, 70, 90], [20, 40, 60, 80, 100]])  # define the 2D array
print("This is a two dimensional array -", two_d_array)  # print the array

The result of the above program is as follows –

This is a two dimensional array - [[ 10  30  50  70  90]
 [ 20  40  60  80 100]]
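The two arrays above differ in their number of dimensions. As a small illustration, NumPy exposes this through the ndim and shape attributes of an array; the variable names one_d and two_d below are chosen only for this sketch.

```python
import numpy as np

one_d = np.array([10, 30, 50, 70, 90])
two_d = np.array([[10, 30, 50, 70, 90], [20, 40, 60, 80, 100]])

# ndim is the number of dimensions, shape is the size along each dimension
print(one_d.ndim, one_d.shape)   # 1 (5,)
print(two_d.ndim, two_d.shape)   # 2 (2, 5)
```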

NumPy functions

The NumPy library includes several functions that make it easier to work in the fields of matrices, linear algebra, polynomials, and Fourier transforms. A few of them are listed below:

Add – the numpy.add() function adds two arrays element by element.

Subtract – the numpy.subtract() function subtracts one array from another element by element.

Multiply – the numpy.multiply() function multiplies two arrays element by element.

Divide – the numpy.divide() function divides one array by another element by element.

Min – the numpy.min() function finds the minimum value of an array.

Max – the numpy.max() function finds the maximum value of an array.

Mean – the numpy.mean() function calculates the average of an array.

Var – the numpy.var() function calculates the variance of an array.

Std – the numpy.std() function calculates the standard deviation of an array.

Dot – the numpy.dot() function finds the dot product of two arrays.

Cross – the numpy.cross() function finds the cross product of two vectors.

Inner – the numpy.inner() function computes the inner product of two arrays.

Outer – the numpy.outer() function computes the outer product of two vectors.

Transpose – the numpy.transpose() function returns the transpose of an array.

Concatenate – the numpy.concatenate() function joins two or more arrays along an existing axis.
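To make the list above more concrete, here is a minimal sketch that calls each function on two small example vectors. The values and variable names are chosen only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise arithmetic
total = np.add(a, b)             # 1+4, 2+5, 3+6
difference = np.subtract(b, a)   # 4-1, 5-2, 6-3
product = np.multiply(a, b)      # 1*4, 2*5, 3*6
quotient = np.divide(b, a)       # 4/1, 5/2, 6/3 (floating-point result)

# Summary statistics of a single array
smallest, largest = np.min(a), np.max(a)
average = np.mean(a)
variance, std_dev = np.var(a), np.std(a)

# Vector products
dot = np.dot(a, b)               # 1*4 + 2*5 + 3*6 = 32
cross = np.cross(a, b)           # vector perpendicular to a and b
inner = np.inner(a, b)           # same as the dot product for 1D inputs
outer = np.outer(a, b)           # 3x3 array of all pairwise products

# Reshaping and joining
m = np.array([[1, 2, 3], [4, 5, 6]])
transposed = np.transpose(m)     # shape changes from (2, 3) to (3, 2)
joined = np.concatenate((a, b))  # one 1D array with six elements

print(total, difference, product, quotient)
print(dot, cross, joined)
```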

Besides the functions mentioned above, the NumPy library also contains several functions for linear algebra calculations. These functions are found in the linalg submodule, whose name stands for linear algebra. Let’s look at some of the functions of the linalg submodule –

Det – the numpy.linalg.det() function calculates the determinant of a square matrix.

Inv – the numpy.linalg.inv() function computes the inverse of a matrix.

Eig – the numpy.linalg.eig() function computes the eigenvalues and eigenvectors of a square matrix.

Norm – the numpy.linalg.norm() function finds the norm of a vector or matrix. We are going to use this function for NumPy normalization. It takes a vector or matrix as an argument and returns the norm of that array.
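As a brief illustration of these four linalg functions, the sketch below applies them to a small diagonal matrix whose results are easy to verify by hand. The matrix and variable names are assumptions made for this example.

```python
import numpy as np

square = np.array([[4.0, 0.0],
                   [0.0, 2.0]])

determinant = np.linalg.det(square)                 # 4 * 2 = 8
inverse = np.linalg.inv(square)                     # diagonal entries 1/4 and 1/2
eigenvalues, eigenvectors = np.linalg.eig(square)   # eigenvalues are 4 and 2
matrix_norm = np.linalg.norm(square)                # sqrt(4**2 + 2**2) = sqrt(20)

print(determinant)
print(inverse)
print(eigenvalues)
print(matrix_norm)
```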

Now that we know which function to use to normalize an array, let us first understand the theory behind normalization, and then write complete normalization programs for a one-dimensional and a two-dimensional array.

The norm used in our code is the Euclidean norm for 1D arrays and the Frobenius norm for 2D arrays. Both are the square root of the sum of the squared elements, and both are what numpy.linalg.norm() returns by default. The normalized array is obtained by dividing the input by this norm. The mathematical formula is shown below –

$$\hat{v} = \frac{v}{|v|}$$

Where,

v̂ (v cap) – represents the normalized array or matrix.

v – represents the input array or matrix.

|v| – represents the Euclidean (or Frobenius) norm of v, that is, the square root of the sum of its squared elements.
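To see the definition at work, the sketch below computes the norm by hand as the square root of the sum of squares, compares it with np.linalg.norm, and confirms that the array divided by its norm has a norm of 1. The example values are chosen only because they give round numbers.

```python
import numpy as np

v = np.array([3.0, 4.0])

manual_norm = np.sqrt(np.sum(v ** 2))   # sqrt(9 + 16) = 5
library_norm = np.linalg.norm(v)        # default norm of a 1D array is Euclidean
v_hat = v / library_norm                # 3/5 and 4/5

# The same definition applied to a 2D array gives the Frobenius norm
m = np.array([[1.0, 2.0],
              [3.0, 4.0]])
frobenius = np.linalg.norm(m)           # sqrt(1 + 4 + 9 + 16) = sqrt(30)

print(manual_norm, library_norm)
print(v_hat, np.linalg.norm(v_hat))
print(frobenius)
```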

We now understand all the terms and functions needed to normalize a NumPy array in Python. So let’s put them to use in the examples below –

1. Normalization of a one-dimensional (1D) matrix –

a.) Normalization of a predefined 1D array –

import numpy as np                                 # import the NumPy library as np
pre_one_array = np.array([10, 20, 30, 40, 50])     # define a 1D array
print(pre_one_array)                               # print the array
norm = np.linalg.norm(pre_one_array)               # compute the Euclidean norm of the array
print(norm)                                        # print the value of the norm
normalized_array = pre_one_array / norm            # divide every element by the norm
print(normalized_array)                            # print the normalized array

The result of the above program is as follows –

[10 20 30 40 50]

74.161984871

[ 0.13483997  0.26967994  0.40451992  0.53935989  0.67419986]

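As we can see, all values in the output array are between 0 and 1. This is expected, because every input value is non-negative and no single element can be larger than the norm of the whole array. The predefined 1D input array has therefore been successfully normalized.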
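The same steps can be wrapped in a small reusable function. This is only a sketch: the name normalize is hypothetical, and returning an all-zero array unchanged is an assumption made here to avoid dividing by zero. It also shows that negative inputs stay negative, so the 0 to 1 range only holds for non-negative data.

```python
import numpy as np

def normalize(values):
    """Divide an array by its Euclidean (or Frobenius) norm."""
    values = np.asarray(values, dtype=float)
    norm = np.linalg.norm(values)
    if norm == 0:
        # An all-zero array cannot be scaled to unit length; return it as is.
        return values
    return values / norm

print(normalize([10, 20, 30, 40, 50]))   # same values as the example above
print(normalize([0, 0, 0]))              # stays all zeros
print(normalize([-3, 4]))                # -3/5 and 4/5
```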

b.) Normalization of a random 1D array –

If we want to normalize a 1D array with random values, the following method is used –

import numpy as np                                 # import the NumPy library as np
# np.random.rand(5) returns 5 random values between 0 and 1;
# multiplying by 10 stretches them to the range 0 to 10
ran_one_array = np.random.rand(5) * 10
print(ran_one_array)                               # print the array
norm = np.linalg.norm(ran_one_array)               # compute the Euclidean norm of the array
print(norm)                                        # print the value of the norm
normalized_array = ran_one_array / norm            # divide every element by the norm
print(normalized_array)                            # print the normalized array

The result of the above program is as follows –

[ 2.66782852  6.70146289  5.38289872  0.52054369  9.62171167]

13.1852498544

[ 0.20233432  0.50825452  0.40825155  0.03947924  0.72973298]

As we can see, all the values in the output array are between 0 and 1, so the random 1D input array has been successfully normalized. Because the input is random, the numbers will differ on every run.
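Since the random values change on every run, the output shown above cannot be reproduced exactly. If repeatable results are needed, one option is a seeded generator. Note that this sketch uses np.random.default_rng, a different NumPy API from the np.random.rand call in the tutorial code, and that the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)            # arbitrary seed, chosen for illustration
ran_one_array = rng.random(5) * 10         # five floats in the range 0 to 10
normalized_array = ran_one_array / np.linalg.norm(ran_one_array)

# A second generator with the same seed produces the same values
repeat = np.random.default_rng(42).random(5) * 10

print(normalized_array)
print(np.array_equal(ran_one_array, repeat))   # True
```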

2. Normalization of a two-dimensional (2D) matrix –

a.) Normalization of a predefined 2D array –

import numpy as np                                 # import the NumPy library as np
# define a 2D array with 5 rows and 5 columns
pre_two_array = np.array([[10, 30, 50, 70, 90],
                          [20, 40, 60, 80, 100],
                          [5, 15, 25, 35, 45],
                          [55, 65, 75, 85, 95],
                          [11, 22, 33, 44, 55]])
print(pre_two_array)                               # print the array
norm = np.linalg.norm(pre_two_array)               # compute the Frobenius norm of the array
print(norm)                                        # print the value of the norm
normalized_array = pre_two_array / norm            # divide every element by the norm
print(normalized_array)                            # print the normalized array

The result of the above program is as follows –

[[ 10  30  50  70  90]
 [ 20  40  60  80 100]
 [  5  15  25  35  45]
 [ 55  65  75  85  95]
 [ 11  22  33  44  55]]

280.008928429

[[ 0.03571315  0.10713944  0.17856573  0.24999203  0.32141832]
 [ 0.07142629  0.14285259  0.21427888  0.28570518  0.35713147]
 [ 0.01785657  0.05356972  0.08928287  0.12499601  0.16070916]
 [ 0.19642231  0.23213545  0.2678486   0.30356175  0.3392749 ]
 [ 0.03928446  0.07856892  0.11785338  0.15713785  0.19642231]]

Here, as we can see, all the values in the output array are between 0 and 1. The predefined 2D input array has therefore been successfully normalized.
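The example above divides the whole 2D array by a single norm. When each row is a separate record, it is often more useful to scale every row by its own norm. This is a variation on the tutorial's method, not part of it; it relies on the axis and keepdims parameters of np.linalg.norm, and the sample data is made up for illustration.

```python
import numpy as np

data = np.array([[3.0, 4.0],
                 [6.0, 8.0],
                 [1.0, 0.0]])

# axis=1 computes one norm per row; keepdims=True keeps the shape (3, 1)
# so that the division broadcasts across the columns of each row
row_norms = np.linalg.norm(data, axis=1, keepdims=True)   # 5, 10 and 1
row_normalized = data / row_norms

print(row_norms)
print(row_normalized)
```

After this step every row has a norm of 1, whereas in the whole-array version only the array as a whole does.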

b.) Normalization of a random 2D array –

If we want to normalize a 2D array with random values, the same method is used –

import numpy as np                                 # import the NumPy library as np
# np.random.rand(5, 5) returns a 5x5 array of random values between 0 and 1;
# multiplying by 10 stretches them to the range 0 to 10
ran_two_array = np.random.rand(5, 5) * 10
print(ran_two_array)                               # print the array
norm = np.linalg.norm(ran_two_array)               # compute the Frobenius norm of the array
print(norm)                                        # print the value of the norm
normalized_array = ran_two_array / norm            # divide every element by the norm
print(normalized_array)                            # print the normalized array

The result of the above program is as follows –

[[ 4.57411295  8.65220668  9.63324979  1.9971668   3.23869927]
 [ 0.84966168  5.90483284  0.47779068  3.28578339  2.45708816]
 [ 5.85465399  4.49030481  9.12849734  9.05088372  2.16890579]
 [ 1.24442784  3.31225636  5.72207596  3.9220778   1.45400695]
 [ 5.49354678  3.63828521  3.66439748  3.75588512  4.4547876 ]]

25.1725603225

[[ 0.18171028  0.3437158   0.38268852  0.07933904  0.12865991]
 [ 0.03375349  0.23457419  0.01898062  0.13053036  0.09760978]
 [ 0.23258079  0.17838093  0.36263682  0.35955356  0.08616151]
 [ 0.04943589  0.13158202  0.22731402  0.15580766  0.05776158]
 [ 0.21823552  0.14453378  0.14557111  0.14920553  0.17696998]]

As we can see, all the values in the output array are between 0 and 1. The random 2D input array has therefore been successfully normalized.
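Dividing by the norm keeps non-negative values between 0 and 1, but it does not map the smallest value to exactly 0 or the largest to exactly 1. If that is what is needed, a commonly used alternative is min-max scaling. This is a different technique from the norm-based method in this tutorial; the sketch below assumes it is acceptable to return zeros when all elements are equal, and the name min_max_scale is hypothetical.

```python
import numpy as np

def min_max_scale(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    lowest = values.min()
    span = values.max() - lowest
    if span == 0:
        # All elements are equal; return zeros instead of dividing by zero.
        return np.zeros_like(values)
    return (values - lowest) / span

scaled = min_max_scale([10, 30, 50, 70, 90])
print(scaled)   # 0, 0.25, 0.5, 0.75 and 1
```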

This brings us to the end of this NumPy normalization tutorial. We hope you now understand the concept of NumPy normalization. We covered the definition of normalization, its benefits, and its applications. We also looked at the NumPy library and some of its functions. We then learned the theoretical concept and formula behind the normalization process. Last but not least, we implemented normalization on both one-dimensional and two-dimensional arrays using Python’s NumPy library and checked the corresponding outputs.
