Visual Voice Display

Machines don't have mouths, but for some reason lights that move with the voice make robots seem more human. And it looks pretty awesome...

Adding more style to his autonomous robot, David built a fantastic voice-display to visually showcase the synthesised speech from the robot. It looks amazing and sounds really cool, so we caught up with David to find out more.

First of all, you’re adding onto a robot in the first place… so let’s talk about the robot for a second. What is your robot used for?

The marketing division of the Sirius Cybernetics Corporation defines a robot as “your plastic pal who’s fun to be with.” The Hitchhiker’s Guide to the Galaxy defines the marketing division of the Sirius Cybernetics Corporation as “a bunch of mindless jerks who’ll be the first against the wall when the revolution comes.”

I agree with the marketing division of the Sirius Cybernetics Corporation, which may be a problem come the revolution! But seriously though, I developed AVA to investigate behavioural robotics. This AI approach creates emergent behaviours by having a number of prioritised responses to various situations. For example, a robot could have a behaviour that avoids walls and sudden drops, and this would be a higher priority than performing a task like following a line or heading towards the brightest light. By layering behaviours on top of each other you can create a system that responds to a changing environment, without having to specify what to do in every single circumstance.

The other common approach, called “cognitive robotics” or “symbolic modelling”, attempts to create a model of the world in which everything can be represented by symbols. The robot controller can then parse these symbols in a manner similar to the way that a compiler creates machine code from a high-level programming language. This approach isn’t suited to a cheap micro-controller application though, so given my cost constraints, my choice of approach was easy!

Behavioural robotics is also currently preferred by the academics (for what that’s worth). More importantly, a behavioural robot can be built in stages, and each behaviour debugged separately. As an engineer, this makes life easier, and it also reflects how, over time, organisms evolve with increasing complexity.

The ultimate expression of behavioural robotics is to create artificial neurons, which are not intrinsically intelligent, but that have enough connections in place to create a neural network with emergent intelligence. But I won’t be attempting to crack that nut with my network of Arduinos and Raspberry Pi micro-controllers!

To design our robot intelligence I used the behaviour diagram below, which is a simple way to represent the robot control scheme.

Inputs from sensors are shown on the left, and these trigger behaviours that are manifested by the robot doing something (e.g., moving). The correct behaviour will be chosen by the “arbiter”, based on priority.

In addition to the primitive sensor behaviours, the robot can be assigned a specific task using the joystick and debug screen, a mobile app (over BLE) or via voice. Such tasks are things like:

Report status
Remote control (via the Bluetooth app)
Patrol
Follow IR (to follow a person or pet using heat)
Avoid IR
Network penetration test
Find sound; and
Avoid sound

The above tasks have the lowest priority, as I want the robot to prioritise self-survival. For example, I don’t want to accidentally drive the robot down the stairs, so it prioritises behaviours from the cliff sensors before responding to remote control commands. Of course, I do realise that this is not in line with Asimov’s three laws of robotics!
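As a rough sketch of how a priority-based arbiter like this could work (this is not AVA's actual code; the behaviour names, priority numbers, sensor keys and thresholds are all invented for illustration), each behaviour gets a trigger and a priority, and the arbiter picks the highest-priority behaviour whose trigger fires:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Behaviour:
    name: str
    priority: int  # lower number = higher priority (an assumption of this sketch)
    triggered: Callable[[Dict[str, float]], bool]
    action: str


def arbitrate(behaviours: List[Behaviour], sensors: Dict[str, float]) -> Optional[str]:
    """Return the action of the highest-priority behaviour whose trigger fires."""
    for behaviour in sorted(behaviours, key=lambda b: b.priority):
        if behaviour.triggered(sensors):
            return behaviour.action
    return None


# Hypothetical behaviours and thresholds, for illustration only.
BEHAVIOURS = [
    # A large distance reading from a downward-facing sensor suggests a drop.
    Behaviour("avoid_cliff", 0, lambda s: s.get("cliff_cm", 0.0) > 30.0, "reverse"),
    Behaviour("avoid_wall", 1, lambda s: s.get("ping_cm", 999.0) < 20.0, "turn"),
    # Assigned tasks such as remote control sit at the bottom of the stack.
    Behaviour("remote_control", 9, lambda s: s.get("remote_cmd", 0.0) != 0.0, "obey_remote"),
]

if __name__ == "__main__":
    # A cliff is detected while a remote command is pending: survival wins.
    print(arbitrate(BEHAVIOURS, {"cliff_cm": 45.0, "remote_cmd": 1.0}))  # reverse
    # No hazard, so the remote command is obeyed.
    print(arbitrate(BEHAVIOURS, {"cliff_cm": 10.0, "remote_cmd": 1.0}))  # obey_remote
```

New behaviours can be added to the list and debugged one at a time, which is the staged approach described earlier.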

We can see some ultrasonics, analogue joysticks, PIR, and more onboard there… can you explain a little about what’s installed on your robot and why?

The structure of AVA is based on the Arlo Robotic Platform System from Parallax. The Arlo platform can carry up to 27kg and is very modular and expandable.

Importing the platform from the USA turned out to be painful. Due to the value of the order, it got held up in customs and I needed to complete an “Import Declaration (N10) – Post (Form B374).” This was impossible due to the 13-digit identification codes, tariff classifications and statistical codes required. There are 13,000 different customs tariffs and 4,000 concessions, and you have to provide the tariff classification and statistical code for every line item in your order – and I had 16 items in this order! Then there is Section 243T, which states that customs can hit you with a Strict Liability Penalty of $2,550 should you get the declaration wrong (even unintentionally). I ended up paying $77 to get a speciality customs broker to complete the form for me.

Power is provided by two 12V, 8Ah SLA batteries, wired in parallel. These are connected to the Arlo power distribution board, which provides a number of isolated 12V and 5V buses with fuse protection. This helps prevent noise from the motors getting into your control lines, and makes wiring much neater. You can turn off the motors separately from the electronics, which is great for testing. The board also allows you to easily recharge the 12V batteries.

I’m using the Parallax motors and wheels, controlled by their HB-25 motor controller. There wasn’t an existing Arduino library to control the HB-25s so I wrote my own (see download links provided below). I have wheel encoders and current monitoring to determine if the robot is stalled or stuck, and if so the escape behaviour is triggered.

At the moment I have a network of Arduinos controlling the robot. The master is an Arduino Mega 2560. The other Arduinos are focused on specific tasks; for example, one samples the various sensors, another handles communication with the remote control app, and a third looks after the logging display and interface with a real time clock (RTC) board. There is also a Raspberry Pi attached, which is used for camera-based activities and specific tasks like penetration testing of networks. There is serial communication between the micro-controllers, which is okay, but a bit slow. In version two, I will make the Raspberry Pi the master, and use the robot operating system (ROS) to network in the Arduinos as nodes.

With regard to sensors, I have a plethora! They include:

4 x Sharp IR (10-80 cm) Cliff Sensors
1 x Sharp Long Range IR (20-150 cm) Sensor
1 x Parallax PING Ultrasonic Sensor (mounted on a pan/tilt servo)
1 x BMP085 Barometric Pressure Sensor
1 x MPL3115A2 Altitude Sensor
1 x Wide angle Passive Infrared Sensor
2 x Quad (Wheel) Encoders
2 x ACS711EX Current Sensors
1 x DHT11 Temperature and Humidity sensor
1 x DS18B20 Temperature sensor
1 x Light Sensor
1 x Compass/Magnetometer Module
1 x Microphone Sound Sensor Module
1 x Joystick

This may seem a bit over the top, but some of the sensors do complement each other. For example, the IR sensor will pick up things that the ultrasonic sensor will miss (e.g., curtains). The various sensors are used to trigger specific behaviours, as shown in the Behaviour Diagram. While there are Arduino libraries available for the shorter distance Sharp IR sensors, I couldn’t find one for the Sharp GP2Y0A02YK0F IR Distance Sensor and so had to write my own (see download link provided below).

OK so that’s pretty cool, we admit. And when the robot revolution comes, we’ll hail our new mechanical overlords too! What speech functionality does your robot have, and more importantly when does it talk to you?

Text-to-speech is provided by the EMIC 2 Speech Synthesizer. The EMIC 2 Text-to-Speech Module is a multi-language voice synthesizer that converts a stream of digital text into speech. You can select either the Epson (default) or DECtalk parsers. The Arduino communicates with the EMIC 2 via one of its serial ports. To make it easier to use, I wrote an Arduino library, which takes care of the EMIC 2 setup, and a helper method to speak text (see download link provided). The speech is quite “robotic” and not up to the Alexa/Cortana/Siri standard, but I kind of like that.

The robot talks on startup, describing the boot status and what is going on during the self-check procedure. It will also talk if the status report task is activated, or if specific behaviours are triggered. This helps me know what state the robot is currently in and what it should be doing.

Hopefully it’s not complaining of a stomach ache on startup - that would be a challenge to solve! With regards to the voice display itself, we’re impressed that you went with an analogue route. Was this your first choice, or did you try a digital/microcontroller circuit first?

I liked the idea of having some sort of visual indication for when AVA was communicating; I’ve always liked the one used by KITT the robot car in the TV show Knight Rider! Following a quick internet search it became apparent that there were two main approaches to developing a display like this: analogue or digital. Given that the rest of AVA was largely digital, I thought I would go old-school and try the analogue path. You can watch a video demonstrating this voice display at www.youtube.com/watch?v=gpi9uFKAnM8


We’re big fans of KITT too, and let’s face it - the more interactive lights on a robot the more important it is. Are the six bar graph displays working something like a spectrum analyser, or are they all fed the same signal, with creative layout providing the desired effect?

The KITT display is basically a VU (Audio Power) meter reflected horizontally. To make things simple, Texas Instruments produce a chip called the LM3915 Dot/Bar Display Driver, which does most of the hard work and is designed for exactly this application. The LM3915 senses analogue voltage levels and drives 10 LEDs to produce a logarithmic 3dB/step display. The LM3915 even provides a regulated LED current supply, eliminating the need for current limiting resistors.

Testing the LM3915 with a bar graph.

The logarithmic output is suited to signal inputs with wide dynamic range (like audio levels, power, light intensity and vibration).

There is also a linear version of the chip, which could be used for things like battery level, and I may incorporate one of these in a later project.

To ensure that this design works as advertised and to identify any design issues, I built a quick prototype and hooked it up to AVA’s EMIC 2 speech synthesizer and a Piezo, which I used to make some R2D2-like beeps. Based on the Audio Power meter design from the LM3915 Data Sheet (see download link provided below), I built a version on a breadboard. Note that the LED bar graph doesn’t have its polarity marked (that I could see), but a quick check with a multimeter indicated that the anode is the side with the writing. This worked okay, but highlighted a few issues such as:

The PWM digital output on the Arduino, which drives the Piezo to make the robot-sounding beeps, has sufficient residual voltage when making no noise, to light a couple of the LEDs.
The voltage output from the Arduino is low enough that it doesn’t need the voltage divider formed by R1 (18k) and R2 (10k). Consequently the audio signal can plug straight into pin 5 of the LM3915. This pin can handle +/- 35V without damage so we are pretty safe with our 5V Arduino.
The LM3915 can only sink a maximum of 13mA of LED current (refer to the Electrical Characteristics Table on page 3 of the data sheet). To produce the KITT display we will need six of the LED bar graph displays; so at most, we will need to be able to provide enough current to drive six LEDs. Checking out the data sheet for the LED bar graph display, we find that each LED typically draws 20mA (forward current). In practice, as little as 1mA will light the LEDs, but then they are fairly dim.
As can be seen from the breadboard photo, even one 10-LED bar graph needs a heap of interconnecting wire. Six times this would be a mess, which suggests that a Veroboard solution is not the go. We will need to lay out a PCB or two.

Addressing the issues raised by the prototype, I created a design that uses two PCBs mounted on top of each other:

DISPLAY BOARD: This consisted of the six DIL LED bar graph displays and their associated current limiting resistors. The input to the LED driver board is a 10-pin header and +12V (which will come from AVA’s power distribution board).

DRIVER BOARD: The purpose of the driver board is to take an audio input signal and feed it to an LM3915 Dot/Bar Display driver chip, which then sinks the current from the display board LEDs via transistors, when they need to turn on. It achieves this using the 10-pin header that connects the two boards.

Following are the board schematics and the associated double-sided PCB layouts for both boards. All of the hardware for the project is open source.

Thanks for taking us through that, it definitely looks like the right choice. Tell us about the connections on your PCB.

The 10-pin header is labelled A to J. On the display board, pins A, B and C connect to the first three LEDs on the middle bar graphs. Note that the lower bar graph connections are the mirror image to the top bar graphs. Pin D connects to the first LED of the outer bar graphs and LED 4 of the inner ones. Pin E then connects to the second LED of the outer bar graphs and LED 5 of the inner ones. This pattern continues for the rest of the pins. The last three LEDs on each of the outer bar graphs remain unconnected. By connecting the LEDs in this fashion, we create a display that simulates a mouth that moves, based on the audio level of the sound.
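To make the wiring pattern easier to follow, here is a small sketch of the pin-to-LED mapping as described above. It only models the inner and outer numbering; the function names and the idea of returning `None` for an unconnected LED are illustrative choices, not something taken from the board files.

```python
from typing import Optional, Tuple

PINS = "ABCDEFGHIJ"
OUTER_OFFSET = 3  # outer bar graphs start at pin D, three pins after A


def leds_for_pin(pin: str) -> Tuple[int, Optional[int]]:
    """Return (inner_led, outer_led) driven by a header pin.

    outer_led is None for pins A to C, which only drive the inner graphs.
    """
    index = PINS.index(pin) + 1  # A -> 1 ... J -> 10
    inner = index                # inner graphs: LED n sits on the nth pin
    outer = index - OUTER_OFFSET if index > OUTER_OFFSET else None
    return inner, outer


def mouth_shape(active_pins: int) -> Tuple[int, int]:
    """LEDs lit on (inner, outer) graphs in bar mode with `active_pins` pins on."""
    return active_pins, max(0, active_pins - OUTER_OFFSET)


if __name__ == "__main__":
    for pin in PINS:
        print(pin, leds_for_pin(pin))
```

At full level the inner graphs show all 10 LEDs and the outer graphs show 7, which matches the three unconnected LEDs on each outer bar graph and gives the tapered "mouth" outline.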

On the driver board, pin A of the header is driven by the LED 1 input on the LM3915; pin B is controlled by LED 2 and so on. The LM3915 senses analogue voltage levels and drives the 10 LED inputs to produce a logarithmic 3dB/step display. Consequently (from the data sheet), the LEDs will turn on at the following approximate audio power levels:

A - 0.2W
B - 0.4W
C - 0.8W
D - 1.6W
E - 3W
F - 6W
G - 13W
H - 25W
I - 50W
J - 100W
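These figures follow from the 3dB/step spacing: each 3dB step is very nearly a doubling of power. A minimal sketch that reproduces the list, assuming pin J corresponds to 100W full scale as in the list above (the function name and defaults are illustrative):

```python
def power_threshold_w(led: int, full_scale_w: float = 100.0,
                      step_db: float = 3.0, num_leds: int = 10) -> float:
    """Approximate audio power at which LED `led` (1..num_leds) turns on."""
    db_below_full_scale = step_db * (num_leds - led)
    # Power ratios use 10 * log10, so divide the dB figure by 10.
    return full_scale_w * 10 ** (-db_below_full_scale / 10.0)


if __name__ == "__main__":
    for led, pin in enumerate("ABCDEFGHIJ", start=1):
        print(f"{pin} - {power_threshold_w(led):.2f}W")
```

The computed values (0.20W, 0.40W, 0.79W and so on) round to the data sheet figures quoted above.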

Fairly straightforward! Did you have any challenges setting up the LM3915 display driver?

Using the LM3915 is fairly straightforward. Pin 1 and pins 10 to 18 are the LED driver outputs (see earlier board schematic). Pin 2 goes to ground. Pin 3 (V+) goes to the 12V supply (maximum 25V). Pin 4 goes to ground, if that is your bottom voltage range (which it is for us).

Pin 5 is the signal input. We want a maximum of 1.2V, so we use a voltage divider to step this input down. The input at pin 5 can withstand voltages of +/- 35V and an input current of +/- 3mA without damage. The audio output from the Arduino will be at most 5V (but less than this in practice as we are using PWM to drive the Piezo). The formula for a voltage divider is:

Vout = Vin x (R2 / (R1 + R2))

= 5V x (10k / (18k + 10k))

= 1.8V

Note that full scale has been set to 1.25V (see below), so we are overdriving this a tad theoretically, but in practice that turns out not to be the case. We are nowhere near levels that would damage the chip.
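The divider arithmetic can be checked with a few lines of Python. This is just a worked example using the resistor values quoted above; the `overdrive_db` helper is an illustrative addition that expresses the "tad" of overdrive relative to the 1.25V full scale in decibels.

```python
import math


def divider_out(v_in: float, r1_ohms: float, r2_ohms: float) -> float:
    """Output of a resistive divider, taken across R2."""
    return v_in * r2_ohms / (r1_ohms + r2_ohms)


def overdrive_db(v_signal: float, v_full_scale: float = 1.25) -> float:
    """How far a voltage sits above (+) or below (-) full scale, in dB."""
    return 20.0 * math.log10(v_signal / v_full_scale)


if __name__ == "__main__":
    v_out = divider_out(5.0, 18e3, 10e3)
    print(f"Vout = {v_out:.2f}V")                      # Vout = 1.79V
    print(f"Overdrive = {overdrive_db(v_out):.1f}dB")  # Overdrive = 3.1dB
```

So a worst-case 5V input lands about one 3dB step above full scale, which is why it only matters in theory.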

Pin 9 selects the mode of the display driver. Leave it floating for dot mode, connect it to the supply voltage (pin 3) for bar mode. We added a switch so that you can easily select each mode.

Pin 6 is the full-scale voltage level. We have connected this to pin 7, which is the LM3915 internal voltage reference that delivers a nominal 1.25V between pin 7 (REF OUT) and pin 8 (REF ADJ).

To calculate the current that will be sunk at each LED input pin, we refer to the data sheet formulas:

Vref = 1.25V (1 + R2/R1) + R2 x 80uA

= 1.25V (1 + 0) + 0

= 1.25V

I_LED = 12.5V/R1 + Vref/2.2k

= 12.5/1000 + 1.25/2200

= 13mA
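The same two data sheet formulas as a short worked example, using the values from the calculation above (R2 = 0, R1 = 1k). The function names are illustrative.

```python
def ref_out_voltage(r1_ohms: float, r2_ohms: float) -> float:
    """Vref = 1.25V x (1 + R2/R1) + R2 x 80uA."""
    return 1.25 * (1.0 + r2_ohms / r1_ohms) + r2_ohms * 80e-6


def led_current(r1_ohms: float, v_ref: float) -> float:
    """I_LED = 12.5V/R1 + Vref/2.2k, in amps."""
    return 12.5 / r1_ohms + v_ref / 2200.0


if __name__ == "__main__":
    v_ref = ref_out_voltage(1000.0, 0.0)
    i_led = led_current(1000.0, v_ref)
    print(f"Vref = {v_ref:.2f}V, I_LED = {i_led * 1000:.1f}mA")  # Vref = 1.25V, I_LED = 13.1mA
```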

This would be fine if we were connecting an LED directly to the LM3915, but we can’t do that because we need to drive up to six LEDs per input (have a look at pin D on the display board).

We want the current through our LEDs to be 10 to 15mA, so we need to be able to sink up to 90mA. Note that from the data sheet, the maximum LED current is 13mA (from the Electrical Characteristics Table). This explains why we need the transistors.

Capacitor C1 (2.2uF tantalum) is required if leads to the LED supply are 15cm (6 inches) or longer.

That makes sense. It’s certainly not designed to run so many LEDs directly. So you’ve added transistors to provide more current?

In order to sink sufficient current for up to six LEDs we use transistors to switch them on and off. We selected the BC 557, which is a general purpose PNP transistor. This transistor can handle a maximum current (Ic) of 100mA and a maximum voltage (Vce) of 45V. It also has a DC current gain between 125 and 800.

We use a PNP transistor rather than the more common NPN variety, because we need the current to flow out of the base and into the LM3915 LED input, to turn the transistor on. In an NPN transistor the current flows into the base.

PNP transistors can be thought of as being normally off, but a small current out of the base and a negative voltage at the base (B) relative to the emitter (E) will turn one on, allowing a much larger emitter-collector current to flow. PNP transistors conduct when Ve is greater than Vc.

To cause the base current to flow in a PNP transistor, the base needs to be more negative than the emitter (current must leave the base) by approximately 0.7V for a silicon device.

The voltage drop across the current-limiting resistor on the display board is the supply voltage (12V) minus the LED forward drop (2V from the data sheet), which gives us 10V. Therefore:

I_LED = 10V/180R = 55.5mA

However, this current is distributed among seven LEDs, and so typically will be more like 8mA per LED.

The LM3915 LED inputs are at V+ (i.e., 12V) when off and close to ground when on.

We know that for the transistor:

Ie = Ib + Ic; and

DC Current Gain (Hfe or Beta) = Ic/Ib (= 125 to 800 for a BC 557)

Consequently, when the LM3915 input is on, the corresponding transistor is also turned hard on (saturated), and the LEDs associated with that pin are also turned on. The voltage at the base when turned on is 10V - 0.6V = 9.4V. Based on this voltage across the 1k base resistor we can calculate the base current:

Ib = 9.4V/1000 = 9.4 mA

Based on a typical gain of over 300, this may suggest an Ic above the maximum 100mA but this won’t happen because the transistor is saturated and at most, will draw the I_LED of 55.5mA calculated above. It is not a bad idea to overdrive the base current (within the limitations of the transistor), as this makes the design conservative and beta drops off with high speed switching (MHz so not really a consideration for audio frequencies).

When the LM3915 inputs are off, the transistor is also off, there is no base current and hence no collector current and the LEDs are off.
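The saturation argument can be checked numerically. The sketch below takes the figures quoted above as given (10V across the 180R limiting resistor, 9.4V across the 1k base resistor, minimum gain of 125) and simply confirms that the base drive is far more than the load needs. The function names are illustrative.

```python
def load_current(v_across_resistor: float, r_limit_ohms: float) -> float:
    """Collector current set by the LED current-limiting resistor, in amps."""
    return v_across_resistor / r_limit_ohms


def base_current(v_across_base_resistor: float, r_base_ohms: float) -> float:
    """Base current through the base resistor, in amps."""
    return v_across_base_resistor / r_base_ohms


def overdrive_factor(i_base: float, i_load: float, hfe_min: float) -> float:
    """How many times more collector current the base drive could support
    than the load actually draws. Values well above 1 mean saturation."""
    return (i_base * hfe_min) / i_load


if __name__ == "__main__":
    i_c = load_current(10.0, 180.0)    # about 55.6mA
    i_b = base_current(9.4, 1000.0)    # 9.4mA
    factor = overdrive_factor(i_b, i_c, hfe_min=125.0)
    print(f"Ic = {i_c * 1000:.1f}mA, Ib = {i_b * 1000:.1f}mA, overdrive = {factor:.0f}x")
```

Even at the minimum gain of 125 the base drive is roughly 20 times what the load requires, so the transistor stays saturated and the collector current is set by the limiting resistor, not by the gain.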

We haven’t tried Fritzing’s PCB manufacturing process. How did you find the experience?

I used Fritzing to do the schematic and PCB layout. It is free and I highly recommend it for simple to medium level designs. Laying out the PCB tracks is good fun, just don’t be tempted to use the auto layout functionality of Fritzing – it is best described as “experimental” (i.e., it doesn’t work!).

They have an associated PCB fabrication service (in Berlin), which is fairly expensive, but for a double sided board like this with a number of vias, I was willing to try it out. The quality of the boards is very good.

Due to an unexpected challenge, I ended up having to redesign the driver board rather than get it made in Berlin, so I went for a cheaper option the second time around. There is an Australian company called the Breadboard Killer, which I came across at a local mini Maker Faire. They will manufacture boards from $25 and the process is a lot quicker since they come from China. You do need to convert the Fritzing files to the Gerber equivalent, but this isn’t difficult.

Nice one! Did you have any unexpected challenges during the design process you’d like to share?

When I connected the voice display unit to the EMIC 2 Speech Synthesizer, which AVA uses to talk, I discovered a problem. The display was driven hard on, even when AVA was silent. A quick measure with the multimeter showed that the EMIC 2 has a constant 2.4VDC offset. As mentioned above, the LM3915 displays full scale at 1.25V, so that was why all the LEDs were on.
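A small sketch shows why the offset pins the display at full scale. It assumes ideal thresholds spaced 3dB apart below the 1.25V full-scale level; the real chip's thresholds will differ slightly, and the function names are illustrative.

```python
FULL_SCALE_V = 1.25  # full-scale level quoted above
STEP_DB = 3.0        # LM3915 step size
NUM_LEDS = 10


def threshold_volts(led: int) -> float:
    """Idealised input voltage at which LED `led` (1..10) switches on."""
    db_below_full_scale = STEP_DB * (NUM_LEDS - led)
    # Voltage ratios use 20 * log10, so divide the dB figure by 20.
    return FULL_SCALE_V * 10 ** (-db_below_full_scale / 20.0)


def leds_lit(v_in: float) -> int:
    """Number of LEDs lit in bar mode for a given input voltage."""
    return sum(1 for led in range(1, NUM_LEDS + 1) if v_in >= threshold_volts(led))


if __name__ == "__main__":
    print(leds_lit(2.4))  # 10: the DC offset alone lights every LED
    print(leds_lit(0.1))  # 2
    print(leds_lit(0.0))  # 0
```

With 2.4V sitting on the input before any speech arrives, every threshold is already exceeded, so the audio has nothing left to modulate.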

This is not unusual and I should have thought of it (a lot of the Arduino MP3 shields do the same thing). Unfortunately though, when I was testing the prototype, the EMIC 2 hadn’t arrived from the USA so I tested using the Piezo analog output on the Arduino (which doesn’t have a DC offset).

There are a number of ways that you can remove the DC component of a signal so that you are just left with the AC component. You could use an isolation transformer, but I didn’t have one in the spare parts bucket. I did, however, have a heap of resistors and capacitors, so I decided to try a high pass filter.

A high pass filter, as the name suggests, passes signals above a selected cut-off point, eliminating any low frequency signals from the waveform. The circuit for a first order high pass filter looks like this:

It delivers a response curve:

It is generally accepted that the range of frequencies audible to humans is 20Hz to 20kHz. Consequently, I designed the HP filter with an fc of 20Hz. The formula to calculate fc for a first order high pass filter is:

fc = 1 / (2πRC)

You can pick two variables and solve for the third, or you can go to an online filter design calculator (see link provided below), and let it do the hard work for you.

We selected an fc of 20Hz and a capacitor of 0.01uF (because we had one of this value), and the design site spat out a resistor value of 820k. If you plug these values into the formula above you will get an fc of 19.4Hz, which is fine. The filter design site will even plot frequency and transient analysis graphs for you.

Adding the high pass filter between the EMIC 2 Speech Synthesizer and the voice display module did the trick. It removed the DC bias and allowed the speech signal to pass through.
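If you would rather not rely on an online calculator, the numbers are easy to reproduce. This sketch uses the standard first order RC high pass relationships, fc = 1/(2πRC) and gain = (f/fc)/sqrt(1 + (f/fc)^2); the function names are illustrative.

```python
import math


def cutoff_hz(r_ohms: float, c_farads: float) -> float:
    """Cut-off frequency of a first order RC high pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def resistor_for(fc_hz: float, c_farads: float) -> float:
    """Resistor needed for a target cut-off with a given capacitor."""
    return 1.0 / (2.0 * math.pi * fc_hz * c_farads)


def gain(f_hz: float, fc_hz: float) -> float:
    """Magnitude response: 0 at DC, about 0.707 at fc, approaching 1 above it."""
    ratio = f_hz / fc_hz
    return ratio / math.sqrt(1.0 + ratio * ratio)


if __name__ == "__main__":
    c = 0.01e-6  # 0.01uF
    print(f"Ideal R for 20Hz: {resistor_for(20.0, c) / 1000:.0f}k")  # Ideal R for 20Hz: 796k
    fc = cutoff_hz(820e3, c)
    print(f"fc with 820k: {fc:.1f}Hz")                               # fc with 820k: 19.4Hz
    print(f"Gain at DC: {gain(0.0, fc):.1f}, at 1kHz: {gain(1000.0, fc):.3f}")
```

The ideal resistor is about 796k, and 820k is the nearest common value. A gain of zero at DC is what removes the EMIC 2 offset, while speech frequencies pass almost untouched.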

The best kind of challenge is a resolved challenge! This project looks pretty successful. But if you started it again from scratch, what might you change?

I would have included the high pass filter! The other main thing I would change is to have the 10 pin headers on both boards line up like a shield on the Arduino. In fact, making an Arduino shield version of the display is probably not a bad idea – I will add this to my long list of things to do! Using SMT components, you could probably get the design down to one PCB, but wow, they are really hard to solder!

They sure are. A steady hand isn’t the only requirement for good SMT soldering! An Arduino shield version sounds like a great idea too. What are you working on now?

Right now I am working on improving voice recognition for AVA. I am also doing a total redesign of the control software to make a Raspberry Pi the master, as well as using ROS to control the Arduinos. Doing a home-baked network is reinventing the wheel, so I think I’m better off putting all the effort into ROS. Plus, doing this gives me an opportunity to learn something new, which is always fun!

Thanks so much for taking us through your project, David. We look forward to seeing what you come up with next!
