Friday, August 12, 2022

Creating a Python Program Utilizing Inspection Instruments


Last Updated on May 30, 2022

Python is an interpreted language. This means an interpreter runs our program, rather than the code being compiled and run natively. In Python, a REPL (read-eval-print loop) can run commands line by line. Together with some inspection tools provided by Python, it helps you develop code.

In the following, you will see how to make use of the Python interpreter to inspect an object and develop a program.

After finishing this tutorial, you will know:

  • How to work in the Python interpreter
  • How to use the inspection functions in Python
  • How to develop a solution step by step with the help of inspection functions

Let’s get started!

Developing a Python Program Using Inspection Tools.
Photo by Tekton. Some rights reserved.

Tutorial Overview

This tutorial is in four parts; they are:

  • PyTorch and TensorFlow
  • Looking for Clues
  • Learning from the Weights
  • Making a Copier

PyTorch and TensorFlow

PyTorch and TensorFlow are the two biggest neural network libraries in Python. Their APIs are different, but the things they can do are similar.

Consider the classic MNIST handwritten digit recognition problem; you can build a LeNet-5 model to classify the digits as follows:

This is a simplified code that does not need any validation or testing. The counterpart in TensorFlow is the following:

Running this program would give you the file lenet5.pt from the PyTorch code and lenet5.h5 from the TensorFlow code.

Looking for Clues

If you understand what the above neural networks are doing, you should be able to tell that each layer performs nothing but many multiply-and-add calculations. Mathematically, a fully-connected layer multiplies the input by its kernel matrix and then adds the bias to the result. A convolutional layer multiplies the kernel element-wise with a portion of the input matrix, sums the result, and adds the bias to produce one output element of the feature map.
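The two calculations described above can be sketched with NumPy. This is only an illustration on random data; the sizes (4 inputs, 3 outputs) and the variable names are made up for the example and are not taken from the LeNet-5 models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fully-connected layer: matrix multiplication with the kernel, then add the bias
x = rng.standard_normal(4)           # input vector (illustrative size)
W = rng.standard_normal((4, 3))      # kernel laid out as (input, output)
b = rng.standard_normal(3)           # one bias per output
dense_out = x @ W + b                # result has shape (3,)

# Convolutional layer: a single output element of the feature map
image = rng.standard_normal((28, 28))    # single-channel input
kernel = rng.standard_normal((5, 5))     # 5x5 kernel
bias = 0.1
patch = image[0:5, 0:5]                  # portion of the input under the kernel
conv_elem = np.sum(patch * kernel) + bias

print(dense_out.shape, conv_elem)
```

Sliding the patch across the image and repeating the last step gives the whole feature map.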

If you develop the same LeNet-5 model in two different frameworks, it should be possible to make them work identically if their weights are the same. How can you copy the weights from one model to the other, given that their architectures are identical?

You can load the saved models as follows:

This probably does not tell you much. But if you run python on the command line without any arguments, you launch the REPL, in which you can type the above code (you can leave the REPL with quit()):

Nothing is printed by the code above. But you can check the two models that were loaded using the type() built-in function:

So here you know they are neural network models from PyTorch and Keras, respectively. Since they are trained models, the weights must be stored inside. So how can you find the weights in these models? Since they are objects, the easiest way is to use the dir() built-in function to inspect their members:
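If you do not have the saved model files at hand, the same two built-ins can be tried on any object. The sketch below uses an OrderedDict from the standard library purely as a stand-in for a loaded model; the variable names are illustrative.

```python
from collections import OrderedDict

obj = OrderedDict(a=1)      # stand-in for a loaded model object

print(type(obj))            # the class this object belongs to

members = dir(obj)          # names of attributes and methods, as strings
print(len(members))
print(members[-5:])         # a few names from the end of the list
```

dir() only gives names. To get the member behind a name, you can use getattr(obj, name).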

There are a lot of members in each object. Some are attributes, and some are methods of the class. By convention, those that begin with an underscore are internal members that you are not supposed to access in normal circumstances. If you want to see more of each member, you can use the getmembers() function from the inspect module:

The output of the getmembers() function is a list of tuples, in which each tuple holds the name of the member and the member itself. From the above, for example, you know that __call__ is a “bound method,” i.e., a method of the class bound to this particular object.
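A minimal sketch of the same idea on a small, hypothetical class (TinyModel is invented for this illustration and is much simpler than a real model):

```python
import inspect

class TinyModel:
    """Hypothetical stand-in for a model class."""
    def __init__(self):
        self.weights = [1.0, 2.0]

    def __call__(self, x):
        return [w * x for w in self.weights]

model = TinyModel()
members = inspect.getmembers(model)     # list of (name, member) tuples

for name, member in members:
    if name in ("__call__", "weights"):
        print(name, "->", member)

lookup = dict(members)
# __call__ was looked up on the instance, so it is a bound method
print(inspect.ismethod(lookup["__call__"]))   # True
```

getmembers() also accepts a predicate as a second argument, for example inspect.ismethod, to keep only the members of one kind.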

By carefully looking at the members’ names, you can see that in the PyTorch model, the names containing “state” should be of interest, while in the Keras model, some members have “weights” in their names. To shortlist these names, you can do the following in the interpreter:
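One way to do such shortlisting is a list comprehension over dir(). The class below is a hypothetical stand-in that only mimics a few member names mentioned in this tutorial; it is not a real PyTorch or Keras model.

```python
class FakeModel:
    """Hypothetical stand-in; real models expose many more members."""
    weights = []

    def state_dict(self):
        return {}

    def load_state_dict(self, state):
        pass

    def get_weights(self):
        return []

    def set_weights(self, weights):
        pass

model = FakeModel()

# Keep public names (no leading underscore) that contain the keyword
state_names = [n for n in dir(model) if "state" in n and not n.startswith("_")]
weight_names = [n for n in dir(model) if "weight" in n and not n.startswith("_")]

print(state_names)    # ['load_state_dict', 'state_dict']
print(weight_names)   # ['get_weights', 'set_weights', 'weights']
```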

This might take some time in trial and error. But it’s not too difficult, and you may discover that you can see the weight with state_dict in the torch model:

For the TensorFlow/Keras model, you can find the weights with get_weights():

The same weights are also available through the attribute weights:

Here, you can observe the following: in the PyTorch model, the function state_dict() returns an OrderedDict, which is a dictionary that remembers the order in which its keys were inserted. It has keys such as 0.weight, each mapped to a tensor value. In the Keras model, the get_weights() function returns a list in which each element is a NumPy array. The weights attribute also holds a list, but its elements are of type tf.Variable.

You can learn more by checking the shape of each tensor or array:
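The loop for checking shapes looks the same for both containers. The sketch below uses NumPy arrays as stand-ins for the real tensors, so it runs without either framework. The key 0.weight comes from the text above; the key 0.bias and the shapes are assumptions made for illustration.

```python
from collections import OrderedDict
import numpy as np

# Stand-in for state_dict(): an ordered mapping from names to arrays
state = OrderedDict([
    ("0.weight", np.zeros((6, 1, 5, 5), dtype=np.float32)),
    ("0.bias", np.zeros(6, dtype=np.float32)),
])

# Stand-in for get_weights(): a plain list of arrays, without names
weights = [
    np.zeros((5, 5, 1, 6), dtype=np.float32),
    np.zeros(6, dtype=np.float32),
]

for name, tensor in state.items():
    print(name, tensor.shape)

for array in weights:
    print(array.shape)
```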

While you do not see the names of the layers from the Keras model above, you can use similar reasoning to find the layers and get their names:

Learning from the Weights

By comparing the result of state_dict() from the PyTorch model and that of get_weights() from the Keras model, you can see that they both contain 10 elements. From the shapes of the PyTorch tensors and NumPy arrays, you can further notice that they are similar. This is probably because both frameworks store a model’s parameters in order from input to output. You can confirm this by comparing the keys of the state_dict() output with the layer names from the Keras model.

You can check how you can manipulate a PyTorch tensor by extracting one and inspecting:

From the output of dir() on a PyTorch tensor, you find a member named numpy, and calling it appears to convert the tensor into a NumPy array. You can be fairly confident of this because the numbers and the shape both match. You can be more confident still by looking at the documentation:

The help() function will show you the docstring of a function, which usually is its documentation.
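To see where that text comes from, you can try it on a function of your own. In this small sketch, scale is a made-up function, and inspect.getdoc() returns the same docstring that help() would display:

```python
import inspect

def scale(values, factor=2.0):
    """Return a new list with every value multiplied by factor."""
    return [v * factor for v in values]

doc = inspect.getdoc(scale)     # the cleaned-up docstring as a string
print(doc)

# In the REPL, help(scale) shows the signature followed by this docstring
```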

Since this is the kernel of the first convolution layer, by comparing the shape of this kernel to that of the Keras model, you can note their shapes are different:

Recall that the input to the first layer is a 28×28×1 image array while the output is 6 feature maps. It is natural to map the 1 and 6 in the kernel shape to the number of channels in the input and output, respectively. Also, from our understanding of how a convolutional layer works, the kernel should be a 5×5 matrix.

At this point, you probably guessed that in the PyTorch convolutional layer, the kernel is represented as (output × input × height × width), while in Keras, it is represented as (height × width × input × output).

Similarly, you also see in the fully-connected layers that PyTorch presents the kernel as (output × input) while Keras is in (input × output):
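Assuming the guess about the two layouts is right, converting between them is a matter of permuting axes. The sketch below shows this with NumPy arrays. The convolution kernel shape follows the first layer discussed above, while the fully-connected shape (10 × 84) is only an illustrative assumption.

```python
import numpy as np

# Convolution kernel in the PyTorch layout: (output, input, height, width)
torch_conv = np.arange(6 * 1 * 5 * 5, dtype=np.float32).reshape(6, 1, 5, 5)

# Permute the axes into the Keras layout: (height, width, input, output)
keras_conv = torch_conv.transpose(2, 3, 1, 0)
print(keras_conv.shape)     # (5, 5, 1, 6)

# The reverse permutation recovers the PyTorch layout
back = keras_conv.transpose(3, 2, 0, 1)
print(back.shape)           # (6, 1, 5, 5)

# Fully-connected kernel: (output, input) becomes (input, output)
torch_fc = np.arange(10 * 84, dtype=np.float32).reshape(10, 84)
keras_fc = torch_fc.T
print(keras_fc.shape)       # (84, 10)
```

A transpose only changes how the same numbers are indexed, so keras_conv[h, w, i, o] equals torch_conv[o, i, h, w] for every index.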

Matching the weights and tensors and showing their shapes side by side should make these clearer:

And we can also match the name of the Keras weights and PyTorch tensors:

Making a Copier

Since you learned what the weights look like in each model, it doesn’t seem difficult to create a program to copy weights from one to another. The key is to answer:

  1. How to set the weights in each model
  2. What the weights are supposed to look like (shape and data type) in each model

The first question can be answered from the previous inspection using the dir() built-in function. You saw the load_state_dict member in the PyTorch model, and it seems to be the tool. Similarly, in the Keras model, you saw a member named set_weights, which is exactly the counterpart of get_weights. You can further confirm this by checking their documentation online or via the help() function:

You confirmed that these are both functions, and their documentation explained they are what you believed them to be. From the documentation, you further learned that the load_state_dict() function of the PyTorch model expects the argument to be the same format as that returned from the state_dict() function; the set_weights() function of the Keras model expects the same format as returned from the get_weights() function.

Now you have finished your adventure with the Python REPL (you can enter quit() to leave).

By researching a bit on how to reshape the weights and cast from one data type to another, you come up with the following program:
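The core of such a program is the layout conversion. The following is a minimal sketch of that step only, written with NumPy so that it runs without either framework. The function name keras_to_torch_layout, the key names other than 0.weight, and the example shapes are assumptions for illustration. With real models, each converted array would still need to be wrapped as a PyTorch tensor before being passed to load_state_dict().

```python
from collections import OrderedDict
import numpy as np

def keras_to_torch_layout(keras_weights, torch_keys):
    """Rearrange Keras-style arrays into the PyTorch layout (hypothetical helper).

    keras_weights: list of NumPy arrays, in the format of get_weights()
    torch_keys: names in the same order, e.g. the keys of state_dict()
    """
    converted = OrderedDict()
    for key, array in zip(torch_keys, keras_weights):
        if array.ndim == 4:
            # conv kernel: (height, width, input, output) -> (output, input, height, width)
            array = array.transpose(3, 2, 0, 1)
        elif array.ndim == 2:
            # fully-connected kernel: (input, output) -> (output, input)
            array = array.T
        # 1-D biases have the same layout in both frameworks
        converted[key] = np.ascontiguousarray(array, dtype=np.float32)
    return converted

# Illustrative input: one conv layer and one fully-connected layer
rng = np.random.default_rng(0)
keras_weights = [
    rng.standard_normal((5, 5, 1, 6)), rng.standard_normal(6),
    rng.standard_normal((84, 10)), rng.standard_normal(10),
]
torch_keys = ["0.weight", "0.bias", "1.weight", "1.bias"]

state = keras_to_torch_layout(keras_weights, torch_keys)
for key, value in state.items():
    print(key, value.shape, value.dtype)
```

Pairing arrays with keys by position relies on the earlier observation that both frameworks list the parameters in order from input to output.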

And the other way around, copying weights from the PyTorch model to the Keras model can be done similarly.

Then, you can verify that they work the same by passing a random array as input to both models, for which you can expect the outputs to match closely:

In our case, the output is:

The two outputs agree to sufficient precision. Note that your result may not be exactly the same due to the random nature of training. Also, due to the nature of floating-point calculation, the PyTorch and TensorFlow/Keras models would not produce exactly the same output even if the weights were the same.
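Because of this, outputs are better compared with a tolerance than with strict equality. The sketch below illustrates the idea with made-up numbers, not actual model outputs:

```python
import math

# Two made-up output vectors that differ only by rounding
out_a = [0.1 + 0.2, 1.0 / 3.0]
out_b = [0.3, 0.33333334]

exact = out_a == out_b
close = all(math.isclose(a, b, rel_tol=1e-6) for a, b in zip(out_a, out_b))

print(exact)   # False
print(close)   # True
```

For NumPy arrays, np.allclose() performs the same kind of tolerance-based comparison on whole arrays.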

The main objective here is to show you how you can make use of Python’s inspection tools to understand something you didn’t know and develop a solution.


Summary

In this tutorial, you learned how to work in the Python REPL and use the inspection functions to develop a solution. Specifically,

  • You learned how to use the inspection functions in REPL to learn the internal members of an object
  • You learned how to use REPL to experiment with Python code
  • As a result, you developed a program converting between a PyTorch and a Keras model