Artificial Intelligence

Machine Learning for Makers - Part 5

Teachable AI Machine using Tensorflow.js

Rob Bell

Issue 35, June 2020


Create your very own Teachable AI Machine using Tensorflow.js.

In Part 4, we used Teachable Machine to distinguish between different microcontrollers, and deployed the result to our own NodeExpress/NodeJS server.

This installment launches directly off the back of what we did last time. You’ll need to have NodeExpress or another web server installed and running, and be familiar with the process. If you haven’t caught Part 4, please go back and read it first (the earlier installments will also be useful if you haven’t been following along).

Teachable Machine is fun to play and experiment with, but it makes customising the outcomes difficult. However, it’s built on Tensorflow.js, and we can deploy that ourselves, creating a far more customisable system than just a pre-trained classifier.

We have taken some of this code from the Google Teachable Machine Codelab, but customised it for our purposes here.

EDGE COMPUTING VS CLOUD COMPUTING

It’s worth touching on Edge computing, which will be a new phrase for some people. What we’re doing here is actually a form of edge computing. By now, I’m sure you understand the concept of cloud computing, which relies on a web server to do most of the heavy lifting.

Of course, the line between Cloud Computing and Edge Computing has been blurring for a while now, as web browsers gain more and more power, and the actions we want them to perform become more and more complex.

Typically for cloud computing, the browser can be fairly “dumb”. It can be merely a vessel for requesting, submitting, and displaying data from a web server. It doesn’t actually undertake much of the computational effort required to display the website. The Web 2.0 revolution, when rounded corners and transparent GIFs began to grace our browsers, also brought much of the asynchronous data transfer (that is, passing data between browser and server without a page reload) that helped set the tone for current development practices.
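As a minimal sketch of that asynchronous pattern (the /api/status endpoint and the 'status' element are purely illustrative, not part of this project), the browser can exchange data with the server without ever reloading the page:

// A minimal sketch of asynchronous data transfer.
// The '/api/status' endpoint and 'status' element are illustrative only.
async function refreshStatus() {
  const response = await fetch('/api/status');
  const data = await response.json();
  // Update part of the page without a reload.
  document.getElementById('status').textContent = data.message;
}
refreshStatus();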

Review the Cloud Computing Data Flow diagram. As you can see, the Client requests the page and sends user or system interactions back to the Server for processing, which returns the results to the Client (while often saving them to a database on the server too).

Edge computing, however, takes this much further. Now we use the host computer (or Edge node) itself to far greater depth, with more capable browser applications than ever before. Once the application is loaded by the edge node, it might never need to talk to the server again.

Web servers can now play a simpler role of serving the raw code, while the computer itself actually performs computational requirements. This is especially true for in-browser 3D work, and now Artificial Intelligence / Machine Learning using something like TensorFlow.js.

When you look at the Edge Computing Data Flow diagram, you can quickly see the difference. The client makes the initial request to load the code, then handles interactions and data inputs itself, and can display the outcomes.

The raw data can be provided back to the Server for centralised storage if required. Many Edge Computing clients can also communicate directly with each other. This reduces load on the main server even further, and can reduce latency in communication.

In reality, most systems will use a mix of these two approaches. It might make sense for the final value of a shopping cart to be calculated on the server during checkout (for example), while updating the total when selecting two of the same item might be better done on the client.
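As a sketch of that split (the function names and endpoint here are illustrative, not from any of our project code), the client keeps the running total responsive while the server stays the source of truth:

// Client-side: instant running total while browsing.
function runningTotal(items) {
  return items.reduce((sum, item) => sum + item.price * item.qty, 0);
}

// Server-side: re-total at checkout so a tampered client
// can't dictate its own prices.
async function checkout(items) {
  const response = await fetch('/api/checkout', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ items })
  });
  return response.json(); // the server returns the authoritative total
}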

When it comes to Tensorflow.js, as we’ll demonstrate, the web server is barely more than a simple way to load the required application into a browser.

DECIDING BETWEEN EDGE COMPUTING AND CLOUD COMPUTING MODELS

When it comes to Edge and Cloud computing, they’re just different approaches; there’s no universal winner. Which method is best for you will depend on the hardware being implemented, and the functionality required of the system.

Cloud Computing:

  • Allows for “dumber” nodes
  • Requires a robust server and connectivity
  • Server does most of the processing
  • Some latency in results
  • Server can bottleneck with requests

Edge Computing:

  • Requires “smarter” nodes
  • Lower (or no post-load) reliance on server and persistent connections
  • Edge nodes and server can both do processing
  • Lower latency results
  • Less prone to server bottlenecking

We’ll cover decision-making for Edge Computing vs Cloud Computing in more detail another time.

BROWSER-BASED MACHINE LEARNING WITH TENSORFLOW.JS

Tensorflow.js is a powerful part of the Tensorflow family. Perhaps the biggest benefit is that you can train and retrain models right in a web browser.

Tensorflow.js uses the same fundamental code as the Python library; however, it’s deployed as a WebGL-accelerated Javascript framework.

What is WebGL? Think of WebGL as the route to your computer’s real processing power, namely the Graphics Processing Unit (GPU). The GPU is far better suited than the CPU to the kind of highly parallel mathematics machine learning requires. What this really means is that you can run software from your web browser with similar processing power at your disposal as when running native applications.

While there are advantages to native applications (background processing is one in particular), the in-browser use of Tensorflow.js has one distinct advantage - no installation! All you have to do is serve the content via Javascript and it’ll run right there in the browser (as long as it’s a modern, WebGL-enabled browser, that is).
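Once the library has loaded, you can confirm the GPU path is active. This small sketch uses Tensorflow.js’s backend API (tf.ready() and tf.getBackend() are real library calls; the log messages are our own):

// Check which TensorFlow.js backend is active.
// 'webgl' means the GPU is doing the heavy lifting;
// 'cpu' is the (much slower) fallback.
async function checkBackend() {
  await tf.ready();
  console.log('Active backend:', tf.getBackend());
}
checkBackend();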

WEB SERVER REQUIREMENTS

It’s important to recognise that this code will run on just about any type of web server. If you only want to focus on the Tensorflow.js / Machine Learning portion of this, then you don’t need to worry about the server code itself too much at all. As long as you have filesystem / FTP access to put the files on it, and can access the served pages in your browser, you’ll be able to make the Machine Learning portion happen.

After we create our own Teachable Machine, however, we’re going to use the results to drive GPIO and essentially switch some LEDs or relays on and off based on the classifications made. That’s where you’ll want to be running your own server. We’re going to use our NodeExpress server from Part 4. If you want to work through the GPIO portion and didn’t follow Part 4, you’re best to go through that now and come back after.

Most of the NodeExpress configuration is the same as in Part 4 of this series.
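If you just need a reminder of its shape, here’s a minimal sketch of that kind of server (assuming the Part 4 layout: index.html in the project root, supporting files in a "files" folder, and serving on port 5000 as we use later; your actual nodeserver.js may differ):

// Minimal NodeExpress static server sketch, assuming the
// Part 4 layout. Not a drop-in replacement for nodeserver.js.
const express = require('express');
const app = express();

// Serve supporting files (scripts, images) from the files folder.
app.use('/files', express.static('files'));

// Serve the main page.
app.get('/', (req, res) => {
  res.sendFile(__dirname + '/index.html');
});

app.listen(5000, () => {
  console.log('Server running at http://localhost:5000');
});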

OFFLOADING COMPUTATION

This is where things get interesting. Rather than training our model on the Teachable Machine website, we’re going to essentially create our own using TensorFlow.js, which is the same technology Teachable Machine itself is built on. The major benefit of TensorFlow.js is that there’s no installation hassle or major system dependencies. All you need is a modern browser in order to train your model and classify your data. Everything happens in-browser from a Machine Learning perspective.

Naturally, you need a web server to serve the page in the first place - that is, the HTML, Javascript, and CSS etc. You can use any old web server to do this, perhaps even a regular Arduino, since the browser is the one doing the heavy lifting, not the server.

This contrasts with regular Tensorflow, which runs server-side and does place the load on the host machine itself (a server, a mobile device, etc). Whether or not this is desirable will be determined by your particular use case and requirements.

THE SIMPLE IN-BROWSER CLASSIFIER USING MOBILENET

We're going to work with a very simple pre-trained TensorFlow model, which is called MobileNet. While it sounds like something you use to make phone calls, it's actually a powerful and popular image classifier which can detect all sorts of real-world images right out of the box.

This demonstration is focused more on "using" the system. While this might seem like a shortcut in some ways, it's really no different to using an Arduino Library to run a display. You don't need to know how to code the library itself, just how to use it.

THE CODE

It takes barely a few dozen lines of code to implement. We'll provide all of these in the digital resources too.

First you need to create index.html (or replace the contents if you have last month's files still there).

<html>
  <head>
    <!-- Load the latest version of TensorFlow.js -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/mobilenet"></script>
  </head>
  <body>
    <div id="console"></div>
    <!-- Add an image that we will use to test -->
    <img id="img" crossorigin src="https://i.imgur.com/JlUvsxa.jpg" width="227" height="227"/>
    <script src="/files/mobilenet.js"></script>
  </body>
</html>

You'll also need to create a javascript file - we'll call it mobilenet.js. It's important that you save this to your files folder, as we've provided the route for this in your NodeExpress configuration already. You can theoretically put it anywhere but you'll need to update the contents of nodeserver.js if you do.

let net;
async function app() {
  console.log('Loading mobilenet..');
  // Load the model.
  net = await mobilenet.load();
  console.log('Successfully loaded model');
  // Make a prediction through the model on our image.
  const imgEl = document.getElementById('img');
  const result = await net.classify(imgEl);
  console.log(result);
}
app();

Now open up your favourite web browser and head to http://localhost:5000

You should see an image of a dog.

Cute? Sure. But we don't need machine learning to display a photo. To see what we've actually just done, you'll need your browser's developer tools. You'll typically find them in the browser's menu (often under More Tools > Developer Tools), or by pressing F12.

That will open up a panel that provides us with all of the output from what we've just done with Mobilenet.

Remember, this is a VERY raw implementation so all we get is console data logging in return. We can improve this aspect and handle results better another time.
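In the meantime, you may have noticed that index.html includes an empty div with the id "console" which this raw example never touches. As one small improvement (our own sketch, not part of the Codelab code), you could write the classifications into that div as well as the console:

// Sketch: render the classify() results into the
// <div id="console"> from index.html.
function showResults(results) {
  const consoleEl = document.getElementById('console');
  consoleEl.innerHTML = results
    .map(r => r.className + ': ' + (r.probability * 100).toFixed(1) + '%')
    .join('<br>');
}

Calling showResults(result) at the end of app() would then display the same classifications right on the page.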

Open up devtools and expand everything you see.

Note: You can ignore the error trying to load the favicon - that's a browser function that we don't yet accommodate with our server.
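If that error bothers you, one extra route in your NodeExpress server will silence it (a sketch, assuming the Express app object from your server code; HTTP 204 simply means "no content"):

// Optional: answer favicon requests with "no content"
// so the browser stops logging a 404.
app.get('/favicon.ico', (req, res) => res.status(204).end());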

As you can see from this output however, we've done some true Machine Learning magic here. We've not only classified this image as a dog, we've attempted to determine which breed of dog it is.

This has all happened with no model training of our own, using the pre-trained model Mobilenet.

You can have a closer look at all of the data in the returned object yourself. The model determines a 64% probability that this image is of a Kelpie. However, you'll notice the second and third entries in the results array.

It has also determined that, with 14% probability, it's an image of a Muzzle, and with 8% probability it's actually a Malinois, not a Kelpie after all. So while it's not totally perfect, this is amazing classification power for something "right out of the box".
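For reference, classify() returns an array of className/probability pairs, so the logged object looks something like this (values approximate, matching the results described above):

[
  { className: "kelpie", probability: 0.64 },
  { className: "muzzle", probability: 0.14 },
  { className: "malinois", probability: 0.08 }
]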

Mobilenet specifically, however, is trained on everyday objects. As makers, we encounter odd objects all the time, and this is why we had to train our own classifier last month - Mobilenet is not much use at distinguishing between a Raspberry Pi and an Arduino.

We can demonstrate this struggle with a new image. We'll provide led.jpeg in the digital resources. Drop that into your "files" folder, and update the code to locate the image.

<img id="img" crossorigin src="/files/led.jpeg" width="227" height="227"/>

Save the code and refresh your browser. You should see an image from our oversized LED project.

Let the script run and you'll get similarly structured results.

Hourglass? Window Shade? Table Lamp? Not a bad result! Perhaps not as granular as we were hoping, but again - this is totally out-of-the-box functionality! It's not "wrong", but when it comes to classification we often have very specific outcomes that we're looking for, and this may not hit the spot. That's where Transfer Learning comes into play, and it's what we'll be doing in the next installment of this series.


Rob Bell

Editor in Chief, Maker and Programmer