
Demystifying Tensorflow and Keras: A Beginner’s Guide

By smita prasad, in Code Like A Girl

This article introduces TensorFlow and Keras, explaining their features, recent updates, and their role in constructing neural networks for machine learning applications.

Topics:
- Introduction to TensorFlow
- So what is Keras?
- A simple example using Keras

TensorFlow is a free and open-source machine learning platform, used mostly through its Python API. It was developed primarily by Google and comprises various components developed by Google and third parties.

TensorFlow users use the Keras APIs by default. The TensorFlow website states: “Whether you’re an engineer, a researcher, or an ML practitioner, you should start with Keras.”

Keras is an open-source library, and its current version is built on top of TensorFlow. It serves as a high-level API for the TensorFlow platform. Keras was developed as part of the research effort of the ONEIROS project (Open-ended Neuro-Electronic Intelligent Robot Operating System), with François Chollet as its primary author and maintainer.

Recently, Keras announced its intention to go back to its multi-backend roots. It was originally based on Theano, and later supported both Theano and TensorFlow. The Keras team has now introduced a new library called Keras Core, a preview version of the future of Keras. As early as fall 2023, this library is expected to become Keras 3.0. Keras Core is a complete rewrite of the Keras codebase that rebases it on top of a modular backend architecture. It makes it possible to run Keras workflows on top of arbitrary frameworks, starting with TensorFlow, JAX, and PyTorch. The full details of the announcement can be found here: https://keras.io/keras_core/announcement/

The name Keras, derived from the Greek word “κέρας” meaning horn, has an intriguing backstory. It symbolically references a literary image from ancient Greek and Latin tales such as the Odyssey.
In these stories, dream spirits (Oneiroi) come to Earth either through a gate of ivory, deceiving dreamers with false visions, or through a gate of horn, foretelling a future that will come to pass. The wordplay is between κέρας (horn) and κραίνω (fulfil), and between ἐλέφας (ivory) and ἐλεφαίρομαι (deceive).

Keras is often described as a high-level neural network API written in Python, so understanding neural networks is key to understanding both deep learning and Keras.

A neural network is so called because it is inspired by the workings of neurons, or nerve cells, in the human body. These computational models are used in machine learning to process and learn from data, which makes them capable of complex pattern recognition and decision-making.

The core data structure of a neural network is the layer. Each layer acts like a filter or sieve: it extracts useful information from its input and passes the resulting representation on to the next layer.

All transformations learned or performed by neural networks can ultimately be reduced to tensors and tensor operations. Thus, a layer can be described as something that takes one or more tensors as input and, after applying tensor operations, outputs one or more tensors. Everything in Keras is either a layer or something that closely interacts with a layer.

Neural network layers can have a state (i.e. weights) or be stateless. Most layers have a state, the weights, which represent the “knowledge” of that layer.
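To make the idea of a stateful versus a stateless layer concrete, here is a minimal plain-Python sketch. It is not how Keras implements layers: the class name `DenseLayer`, the initialisation range, and the list-based "tensors" are all assumptions chosen for illustration.

```python
import random


class DenseLayer:
    """Toy fully connected layer: its state is a weight matrix and a bias vector."""

    def __init__(self, n_inputs, n_units, seed=0):
        rng = random.Random(seed)
        # Assumed initialisation: small random values, one row of weights per unit.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                        for _ in range(n_units)]
        self.biases = [0.0] * n_units

    def __call__(self, inputs):
        # Tensor in, tensor out: one weighted sum plus bias per unit.
        return [sum(w * x for w, x in zip(row, inputs)) + b
                for row, b in zip(self.weights, self.biases)]


def relu(values):
    """Stateless transformation: it has no weights to learn."""
    return [max(0.0, v) for v in values]


layer = DenseLayer(n_inputs=4, n_units=3)
output = relu(layer([1.0, 2.0, 3.0, 4.0]))
print(output)
```

The dense layer carries "knowledge" in `weights` and `biases`, while `relu` only transforms whatever it is given.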
The weights are initialised randomly at first and then updated during training. They are typically tensors learned using stochastic gradient descent (SGD). The gradient is the derivative of a tensor operation, generalised to tensor inputs, and “stochastic” refers to the random draw of training samples used at each step. SGD aims to find the combination of weight values that yields the smallest possible value of the loss function.

A loss function measures the gap between the model’s current predictions and the actual data. It is therefore a measure of how well the model is doing: the lower the loss, the better.

The exact rule for turning gradients into weight updates is set by the optimizer; many optimizers are variants of SGD. When each update is computed on a small batch of the data instead of the entire dataset, the method is termed mini-batch SGD.

Training a neural net has two main stages, so to speak: the forward pass and backpropagation. By repeating the forward pass and backpropagation over many iterations with different batches of data samples, and over many epochs (full passes over the training data), the neural network learns to make better predictions by updating its parameters (like the layer weights) to minimize the loss.

Let’s break down neural networks, the forward pass, and backpropagation layer by layer.

1. Input Layer: The input layer is where data is fed into the neural network. Each node (or neuron) in this layer represents a feature of the input data.

2. Hidden Layers: Between the input and output layers, we have one or more hidden layers. Each layer consists of neurons that apply weighted sums and activation functions to their inputs. Deep networks in practice can have tens or even hundreds of hidden layers.

3. Neurons: Each neuron takes the weighted sum of its inputs (from the previous layer) plus a bias term.

4. Activation Function: After the weighted sum, an activation function is applied to introduce non-linearity into the model. Common activation functions include ReLU (Rectified Linear Unit), Sigmoid, Tanh, and Softmax (for the output layer).

Forward Pass: During the forward pass, data flows through the network from the input layer to the output layer.
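The forward pass is one half of the training loop described above. As a rough sketch of the whole loop (forward pass, loss, gradient, weight update, repeated over mini-batches and epochs), the following plain-Python example fits the smallest possible model, a single weight `w` in `y = w * x`. The data, learning rate, batch size, and function names are assumptions made for illustration, and the gradient is written out by hand rather than obtained by backpropagation through layers.

```python
import random


def mse_loss(w, xs, ys):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_sgd(xs, ys, epochs=20, batch_size=4, learning_rate=0.01, seed=0):
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)            # the weight starts at a random value
    indices = list(range(len(xs)))
    for _ in range(epochs):               # one epoch = one full pass over the data
        rng.shuffle(indices)              # the "stochastic" part: random batches
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Forward pass: predictions for this mini-batch.
            preds = [w * xs[i] for i in batch]
            # Gradient of the batch MSE with respect to w.
            grad = sum(2 * (p - ys[i]) * xs[i]
                       for p, i in zip(preds, batch)) / len(batch)
            # Update: step against the gradient to reduce the loss.
            w -= learning_rate * grad
    return w


# Hypothetical toy data generated from y = 2x, so the ideal weight is 2.
xs = [float(x) for x in range(1, 9)]
ys = [2.0 * x for x in xs]
w = train_sgd(xs, ys)
print(round(w, 4), mse_loss(w, xs, ys))
```

In a real network the same loop runs over every weight in every layer, and an optimizer decides the exact update rule.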
Each neuron’s weighted sum is computed, followed by the activation function. The activations become inputs for the next layer, and this process continues layer by layer until the final output is obtained.

5. Output Layer: The output layer provides the predictions or results of the neural network. The number of nodes in this layer depends on the nature of the task (e.g., regression, binary or multi-class classification), and each node often corresponds to a class or a value.

After the forward pass, we compare the predicted output with the actual target values using a loss function (e.g., Mean Squared Error for regression, Cross-Entropy for classification). The goal is to minimize this loss.

Backpropagation: Backpropagation is the algorithm used to work out how the weights and biases of the network should change to minimize the loss. It involves two main steps: first, the gradient of the loss with respect to every weight and bias is computed by applying the chain rule backward from the output layer; second, the optimizer moves each parameter a small step in the direction that reduces the loss.

As stated earlier, repeating the forward pass and backpropagation over many epochs lets the network learn to make better predictions.

We looked at MNIST data in the previous article. The MNIST database is a large database of handwritten digits that is commonly used for training various image processing systems. The training data consists of an array of 60,000 images of shape 28×28 and a corresponding array of 60,000 training labels identifying the digit in each image (between 0 and 9).

Next, we import keras and layers from keras and build a network with two dense, or fully connected, layers. In a fully connected layer, each neuron is connected to every neuron in the previous layer. The first layer is defined with “relu” activation.
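A network of this kind can be sketched in Keras as follows. The sizes are assumptions, since the text does not state them: 512 units is simply a common choice for the first layer on MNIST, and the explicit 784-value input assumes each 28×28 image has been flattened into a vector. The 10 output units correspond to the ten digits.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two fully connected (Dense) layers stacked in a Sequential model.
model = keras.Sequential([
    keras.Input(shape=(28 * 28,)),           # assumed: flattened 28x28 images
    layers.Dense(512, activation="relu"),    # assumed width of 512 units
    layers.Dense(10, activation="softmax"),  # one probability per digit 0-9
])

# Print the layers, their output shapes, and the number of weights.
model.summary()
```

Declaring the input shape up front lets Keras create the weights immediately, so the summary can report parameter counts before any data is seen.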
ReLU, or Rectified Linear Unit, introduces non-linearity and is one of the most commonly used activation functions in neural networks. For more information on ReLU, refer to A Gentle Introduction to the Rectified Linear Unit (ReLU).

The second is a softmax layer, which is used for multi-class classification. Softmax is often used as the last activation function of a neural network, to normalize the output of the network to a probability distribution over the predicted output classes [Wikipedia]. (Also refer to Softmax Activation Function: How It Actually Works.)

A deep learning model is a graph of layers. We used a Sequential model above; however, a wide variety of model architectures is available. For instance, ChatGPT is built on Generative Pre-trained Transformers, commonly known as GPT: a family of neural network models that use the transformer architecture and are a key advancement in artificial intelligence (AI), powering generative AI applications such as ChatGPT [what is GPT?].

The compile(), fit() and predict() methods

After defining the architecture we compile the model, that is, we configure the training process. In this step we choose the optimizer, the loss function, and the metrics to monitor during training.

After compile comes the fit() method. The fit method here specifies the data to train on, the number of epochs (loops over the training data) to train for, and the batch size for mini-batch gradient descent.

After the training is complete we evaluate the accuracy of the model on new data, i.e. data the model hasn’t “seen” earlier. This is either the validation or the test data used to assess model accuracy.

Now the model is ready to be used, and we can test it against new images of our own. This step is “inference”: we use the predict() method to test the model with an image of a handwritten digit. We will have to convert this image into a format suitable for the model.

Now we can check the prediction of the model by simply printing the output.
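As a rough illustration of what such an output is and how it is read, the plain-Python sketch below applies softmax to ten raw scores and then picks the most likely digit. The scores are hypothetical and do not come from the model in this article; the helper names are also assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)                     # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_class(probabilities):
    """Index of the largest probability, i.e. the predicted digit."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Hypothetical raw outputs for the ten digits 0-9; index 7 has the largest score.
scores = [0.1, -1.2, 0.3, 0.0, -0.5, 0.2, -2.0, 6.5, 0.4, 1.1]
probs = softmax(scores)
print(predicted_class(probs))
print([round(p, 4) for p in probs])
```

With a real Keras model, the array returned by predict() plays the role of `probs`, since the softmax is already applied by the last layer.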
The output is an array of 10 probabilities (between 0 and 1) corresponding to the ten digits 0 to 9; the index with the highest probability is the digit the model considers most likely. Printing prediction_test for our own image gives a result that doesn’t seem very promising, but the predictions can be improved by preprocessing the input image.

Comparing with the predictions for the test images, we can see that the model did much better on the test images. For instance, if we check the prediction for the sample image test_images[0], which was the digit 7, then, counting from 0, index 7 has a value of 9.9909711e-01, or about 0.999097, which is close to 1.

Enhancing model accuracy involves a variety of steps like optimizing data quality, feature engineering, selecting appropriate algorithms, tuning hyperparameters, and utilizing advanced techniques like ensembling or transfer learning.

Summary

The intention of this article is to serve as a starting point for understanding Keras as well as TensorFlow, particularly as the usage of Keras will potentially expand with the recent developments and the introduction of Keras 3.0.

I would recommend the Keras website and the book Deep Learning with Python by François Chollet for those interested in exploring Keras further.

- Deep Learning with Python by François Chollet: https://www.manning.com/books/deep-learning-with-python-second-edition
- https://www.tensorflow.org/guide/keras
- Keras for Researchers: https://keras.io/getting_started/intro_to_keras_for_researchers/
- Difference between Keras and TensorFlow: https://www.geeksforgeeks.org/difference-between-tensorflow-and-keras/
- Wikipedia: https://en.wikipedia.org/wiki/TensorFlow and https://en.wikipedia.org/wiki/Keras



This post first appeared on VedVyas Articles, please read the original post: here
