
Episode 156 - Machine Vision with Iain Smith

Today's episode covers the explosion of applications of machine vision in industrial automation with guest Iain Smith.

Sponsor

ActiveState has been making open source easier for developers to use and simpler for enterprises to adopt for more than twenty years. ActiveState's "State of Enterprise CI/CD" survey is aimed at helping improve enterprise CI/CD tooling and practices.

About Iain Smith

Iain Smith is the managing director of Fisher Smith and primarily a machine vision specialist and software engineer. He was educated at the University of Bristol, where he read engineering mathematics. Before becoming MD, Iain developed Fisher Smith’s leading machine vision system, RoboVis®, and its software, GenVis®.

Iain Smith LinkedIn
Fisher Smith Website
Fisher Smith LinkedIn
Fisher Smith Twitter

Related Episodes

Episode 60 with Hilary Mason: Machine Learning Research in Action

Transcript

Max Sklar: You're listening to The Local Maximum Episode 156. 

Time to expand your perspective. Welcome to The Local Maximum. Now here's your host, Max Sklar.

Max Sklar: Welcome, everyone. Welcome, you've reached The Local Maximum. I’m recording today as this huge snowstorm rages outside my window. And wow, this is incredible. No cars on the road. So, I guess I am stuck in here, podcasting all day and working, so that's great.

So, the idea that you can teach a machine to see, to have vision, has always been very fascinating to me, especially since I saw it live and in person in grad school for the first time. This was in 2010, when I took Yann LeCun’s machine learning course at NYU. He’d show us a little camera that he had; you could simply point it at any object, and it would identify that object. And it was really amazing. And it was really clear how far this tech had come, even 10 years ago. And of course, then the question is: well, this is fantastic, how do we use it? How do we change the world? What’s the implication of this, that you could teach a machine to see?

Nowadays, you might not see it directly in your daily life, because we all have a set of eyes. I don't need a camera to identify the stuff on my desk. I don't need it to say, “Hey, this is a microphone, this is a keyboard. This is a plate, this is a phone,” whatever. But we see some hints of it, because hey, everybody could use an extra set of eyes sometimes. We see it in auto-tagging photos; you don't want to have to go through every one manually. We see it in facial recognition. And it's going to become very, very important for self-driving cars, for example, to identify all of the objects in the environment around us, and how those objects behave, and to keep a map of how those objects behave, which will hopefully give us cars that keep us safe and get us to our destination as soon as possible.

But currently, in wide usage, machine vision has a ton of industrial applications. It's another example of AI being used to make the world more efficient and to actually reduce the cost of building the things we need. So this is an example of a win for AI. We have an expert coming on the program in a few minutes, and we're going to learn a little bit about it.

But first, let's talk about our sponsor, ActiveState. If you're like me, you're constantly thwarted and mystified by your build tools and continuous integration tools at work as a software engineer or software architect. And I know there are some of you out there, some really special people whom we're more and more thankful for every day, who are really interested in improving the internal tools at your company. So, we need you; I need you. So, I want to tell you about ActiveState today. ActiveState has been making open source easier for developers to use and simpler for enterprises to adopt for more than 20 years. ActiveState helps enterprises scale securely with open-source languages and gives developers the kind of tools they love to use. More than 2 million developers and 97% of Fortune 1000 enterprises use ActiveState to support mission-critical systems and speed up software development, while enhancing oversight and increasing quality. To learn more, go to activestate.com. It'll also be linked on the show notes page, localmaxradio.com/156.

All right, we're coming up on three years since I started The Local Maximum. And I'm definitely planning on making some big changes this year in 2021. That's my hope. Starting with my discussions on our Locals group at maximum.locals.com, hope you join that. And at some point in the next couple of shows, I'm going to announce a new location that I'm moving to, as well as a new studio setup. So that'll be a ton of fun. And we're going to work a little bit on the format as well. So, lots of big stuff coming this year. So definitely stay tuned for that.

All right. Today's guest is an expert practitioner of machine vision for industrial automation, and the managing director of Fisher Smith. Iain Smith, you've reached The Local Maximum. Welcome to the show.

Iain Smith: A pleasure to be here. Thanks for having me.

Max: So, tell me a little bit about what you do when it comes to machine vision. You started Fisher Smith, I understand. And tell me a little bit about what you guys do.

Iain: Yeah, so we've been doing machine vision systems for the last 15, 16 years. And mainly what we're doing is deploying camera technology into industrial processes. So, this is all fairly hidden away in the depths of factories and manufacturing areas. And generally, what we do falls into a couple of areas; the main one is probably quality control, where we're using camera technology to take pictures of items that are being manufactured.

And then we use the software, within the cameras or on a computer attached to the cameras, to assess those pictures of the items, and verify whether they're correctly assembled, the right shape, the right size, whatever it may be, in order to improve and maintain the quality of what's being produced.

Max: So you said you started this—started doing this 15 years ago?

Iain: Yes.

Max: So, where—how did you get involved with this? And what were the capabilities back in, I don't know, 2006, compared to today? And why were you so interested in this field? What got you into it?

Iain: Well, so yeah, good question. So, I started, really, when I was in university; I did an engineering maths, or mathematics, degree. And while I was doing that, my co-director at Fisher Smith contacted me through a family link and said, “Our business”—his business at the time—“does something where you might be of use to us. Do you want to come and do a little bit of holiday work, a bit of software writing and development, bits and bobs?” while I was on breaks from university. And that sort of got me into the industry, doing customer demonstrations, or preparing customer demonstrations, and playing with the technology.

So back then, in the early 2000s, it was very much a fledgling industry. Every single machine vision system we deployed required a very big rackmount industrial computer, where we were using analog cameras and specific frame-grabber boards: PCI boards that were required to take the data in from the analog cameras and turn it into digital pictures. And then we were writing software almost from scratch to do the image processing: to count pixels, measure distances, things like that.

Max: So, I'm trying to get a sense of how this works: what's possible, what's not? What kind of tasks would you say are easy for a machine vision system to do, in terms of the type you work on? And what kind of tasks, if someone asked you to do them, are really hard?

Iain: So generally, the easier ones fall into manufactured items, where they are made to a defined shape and size. So, imagine you're making a nut or a bolt, or some metal part that's been machined or cast to very exact tolerances. And we're looking to measure that, maybe, or validate that a hole is not blocked. Those are the ones where you've got something that's very defined in the first place to set your parameters up to, and you can say, “Well, I'm expecting to find a hole here. Does it exist?” “Yes, it does. No, it doesn't.” And the big benefit with the camera technology is that it can be doing all of that at whatever the manufacturing speed is. And it can be doing it without touching the parts, a non-contact inspection.
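
For a concrete feel for this kind of presence/absence check, here's a minimal sketch in Python with OpenCV, assuming a fixed camera and a dark through-hole on a bright machined surface. The file name, ROI coordinates, and thresholds are invented for illustration; this is a sketch of the general technique, not Fisher Smith's actual tooling.

```python
# Hypothetical "is the hole there?" check on a fixtured part.
import cv2

def hole_present(image_path, roi=(200, 150, 80, 80), min_area=300):
    """Return True if a dark, hole-sized blob is found inside the ROI."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    x, y, w, h = roi                      # fixed window where the hole belongs
    patch = img[y:y + h, x:x + w]
    # A through-hole images dark against bright metal: inverse-threshold it.
    _, mask = cv2.threshold(patch, 60, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Pass if any blob in the window is big enough to be the hole.
    return any(cv2.contourArea(c) >= min_area for c in contours)

print("PASS" if hole_present("bolt.png") else "FAIL")
```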

So, whereas there are lots of metrology microscopes and measuring devices which will be super, super accurate, they might take a long time to probe one part and measure it in lots of places. The camera systems that we deploy will do that at two parts a second, three parts a second, 20 parts a second. And so there are lots of areas where certain tasks really fall to machine vision. A lot of it is quality control.

So if you think automotive, car manufacturing, you've got loads and loads and loads of small components that go into building up that one car. And that goes down to nuts, bolts, fastenings, fixings, pipes, really mundane items. But if one of those bolts that's holding an engine component in place doesn't have the right thread on it, or isn't the right length, then that's a line stoppage for however long it takes to get that part out, replace it with the correct part, and put it together. Or, even worse, it goes some way through production, or even to the end consumer, before that fault is found. And it turns into a gearbox fault or an engine strip-down, and it becomes very expensive and bad for the reputation of the automaker. So, some of the checks we end up doing will appear very mundane, but they can be very important to the overall process.

The other end of that, the difficult things to do, are where you're talking about natural products. So grading fruit or vegetables—trying to find: is this apple bruised? Is it the right color? Is it red enough?—these are very subjective decisions to make. And that's where we're actually starting to see how the industry has really developed over the last 15, 20 years. We're now starting to be able to do applications like that because we're starting to leverage the specific AI, the machine learning aspects, where we can teach a system: “These are all apples, these are all oranges.” Count them. Tell me which is different. Or, “Here's an apple with a blemish on it; find those blemishes, because we don't want to sell those to the shops.”

Max: Right, right. So well, I mean, I want to get a little bit of a sense of how some of this works. So let's go back to the kind of nut-and-bolt situation for a second. How do you deal with—I'm imagining there's some kind of factory where all these things are coming out one after the other, and they're taking pictures of each one. Is there a problem if they're turned to different angles? Or if the camera—I don't know—something gets a little fuzzier than it should be? Does that ever happen? Is that something you have to deal with?

Iain: Absolutely, yeah. So often, where the deployments are most successful is where we're constraining all of those variables. So, if you want to very accurately measure the length of a bolt, for instance, then you would probably want to make sure that those bolts are presented in a pretty accurate way, every single time. Because if they're moving closer to and further from the camera, they're going to look bigger and smaller in perspective. There are lensing technologies that can get around that, but in general terms, if that object's further away, it's going to be more out of focus, or it's going to look smaller. So, you'd want them all to be fed in a particular manner, to keep them in the same orientation and plane to the camera.
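
A quick pinhole-camera calculation illustrates why that presentation matters: apparent size scales inversely with distance, so unconstrained part position turns directly into measurement error. The focal length and distances below are made-up example numbers.

```python
# Pinhole model: image size = focal_length * object_size / distance.
f_mm = 25.0          # example lens focal length
bolt_len_mm = 40.0   # true bolt length

for dist_mm in (200.0, 210.0):   # 10 mm of presentation variation
    image_len_mm = f_mm * bolt_len_mm / dist_mm
    print(f"at {dist_mm:.0f} mm, the bolt images at {image_len_mm:.3f} mm")
# A ~5% distance change gives a ~5% apparent length change -- hence the
# fixturing, or telecentric optics when distance can't be constrained.
```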

Now, sometimes that's possible and relatively easy to do, because maybe you've got an industrial robot offering these to the camera for inspection, or placing them in a certain position. Sometimes you've got feeding automation that's singulating the parts, putting them in single file down a conveyor or some sort of movement mechanism. So that all helps. The fewer variations you've got to cope with, other than the product variations that you're trying to detect, the better for the vision system.

And we tend to be working with the automation companies that are building the conveyors, the industrial robots, the machines that are putting these things together or manufacturing them, putting the threads on the nuts or the bolts. And we're putting a camera on the outfeed of that machine, or in the middle of that machine, to check it at a certain point.

Max: Right. So yeah, I imagine that those can be pretty well constrained. And then of course, you talked about the fruit and stuff like that, which can be very difficult. But I guess now, with all the machine learning techniques, there might be some lower-hanging fruit, if I could use that analogy.

So, what techniques do you tend to use for that? I mean, I think I've explained convolutional neural nets on the show; is that actually something that you use in industry, or are there other kinds of shortcuts that you take? Just give me a sense of what the techniques are.

Iain: So, to some extent, those algorithms are now getting to the point where they're embedded, usually reasonably deeply, in the software that we're using. So, they're relatively hidden from whoever's deploying the system in the field. Quite a bit of the technology is now getting to quite a mature stage, where it's a configuration operation. You're setting parameters and teaching it at quite a high level, rather than getting into the depths of the algorithms that are actually running underneath. That's really the big thing that's changed over the last 15, 20 years: 20 years ago, you were writing your own CNN, or some sort of Fourier transform, or something like that, in order to do the image processing at a very low level. Whereas now, we've got manufacturers in our industry that are producing these algorithms and raising them up into a much higher-level software tool, so they're easier to access.

So now, when we're doing some of the deep learning products that we've got, we're able to teach them in an almost drag-and-drop manner, where we're saying, “This is an image of a good apple. And in fact, here's a folder full of images, all of good apples.” We'll label them all up as good. “And here's a folder full of images of bad apples.” And we'll label them all up as bad. And we put it all in, set a couple of parameters, and click “Train.” And away it goes.
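
As an illustration of that folder-labelled workflow, here's a minimal sketch in PyTorch. The commercial vision tools wrap this behind a GUI; the folder layout, model choice, and hyperparameters here are hypothetical stand-ins.

```python
# Two folders of labelled images, a pre-trained backbone, "click Train."
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

tf = transforms.Compose([transforms.Resize((224, 224)),
                         transforms.ToTensor()])
# Expects apples/good/*.jpg and apples/bad/*.jpg -- the folder names
# become the class labels, much like the drag-and-drop labelling.
data = datasets.ImageFolder("apples", transform=tf)
loader = DataLoader(data, batch_size=16, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)   # good / bad
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                          # the "Train" button
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```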

Max: How difficult are these things to train? Like, how much data do you tend to have to label for some of these? I mean, I'm sure it varies.

Iain: It does vary. What we're finding is that the deep learning platforms we're generally working with, they're not general-purpose. So it's not like going to one of the big players, spinning up a VM on somebody else's platform, getting an NVIDIA neural network off the shelf, and then starting from scratch to train it up on images, or even getting one of the generic, image-based, pre-trained or preset neural networks.

We’re working with neural networks that have already been selected and pre-trained, to some degree, specifically to work on industrial image sets. And that speeds things up quite a lot. Because rather than working on elephants and apples and cats and dogs and road signs, we've got networks that have maybe been optimized already for looking at printed text. So, if you're checking labels, lot codes on pharmaceutical products, or even date codes on food labels, there are probably 10 different fonts that are used. They're generally black text on a white background; it's not handwritten. So, what you're dealing with in those is trying to get a reliable read of that text.

And over the years, these vision companies, which are generally our suppliers, have been collecting databases of images from industrial text-reading applications. And they're able to set up a deep learning neural network really optimized to read printed text in industrial environments, which then allows us to leverage that and deploy it with maybe a very minimal tens of images that we add to it: “This is our application. We want it to really focus on this particular label.”
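
That "pre-trained for the domain, then adapted with tens of images" pattern is essentially transfer learning. Here's a minimal sketch of the idea, with an ImageNet ResNet standing in for the vendors' proprietary, industrially pre-trained networks:

```python
# Freeze the pre-trained feature extractor; retrain only the small
# classifier head, so a few dozen labelled images can be enough.
import torch
from torch import nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # stand-in base network
for p in model.parameters():
    p.requires_grad = False                # keep the learned features fixed
model.fc = nn.Linear(model.fc.in_features, 2)      # only this layer trains

trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.Adam(trainable, lr=1e-3)
# ...train as in the previous sketch; with only ~1,000 head parameters
# to fit, tens of images can converge.
```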

Max: And you could do that with tens of images?

Iain: Yeah.

Max: Wow. That seems pretty impressive to me.

Iain: Yeah. Now, of course, some of that is because we're working with a network that's been pre-trained on hundreds of thousands of images. And now we're buying that at a much higher level, as a library, from our supplier.

Max: Yeah, gotcha. So what industries make really good use of this right now that we maybe wouldn't expect? And then, I'm also kind of curious to get your take: are there areas in manufacturing, or any industries, that are just not making good use of it and maybe need to, or ought to?

Iain: So I guess historically, the main industries that have made the best use of this have been high-volume, high-value manufacturers. So automotive has been a key player in all of this, because the manufacturing processes are very defined, and they want to increase the quality of the vehicles year in, year out, making them more reliable. And the commercial flow is that they'll set up to build a car line for 5, 6, 7, 8, 10 years, maybe, at the maximum. They know they're going to be building this type of car, down this line, for that period of time, so they can really gear up for it. So, you can put a lot of effort in to get everything right in the first place. That's when purchasing capital equipment like camera systems, and automation more generally, really makes sense.

The other side of that coin is where you've got aerospace: airplane manufacture, and all the alloys, maybe even defense, where you're looking at really, really high-quality components. Or even motorsport, for instance, where the tolerances on a racing car or an aero engine are probably through the roof. But you're only making tens of these a year, at the maximum. So there isn't the commercial rationale to invest in a machine that just inspects something you maybe make 10 or 20 of, because you can get a person to do that. And they're going to do it, and it's going to be very slow and laborious; maybe they need to be cross-checked by somebody else. But because you're only making 10 or 20 a year, and they're super-high-value components, that pays for that person to be very diligent and do that work. Whereas where you're trying to drive the cost right down, in the automotive sector, that's where automating really wins, with the objective decision-making that a camera system can bring.

Another area where this is well-deployed is pharmaceuticals. They're a big driver because, again, quality is a different metric with pharma. You do not want contamination in your drug; you don't want a needle tip to be bent or to have a bit of damage on it that's going to hurt somebody. So, you've got almost a more litigious problem in that industry, where if something bad goes out, it's going to hurt somebody, or kill somebody in the very worst case. So they invest in their processes to really make sure that everything is constrained and the quality is super high.

So you can have camera systems that are checking for needle tips being bent, being the right length, being the right diameter. Checking that they're in the right package; checking that the packaging is the right color, or has the right label on it. Checking that the text on that packaging is fully printed, because if it's not correctly printed, and the doctor opens that package up, and it doesn't contain the drug they're expecting, or it's missing...

Max: That’s bad. That's like when I go to the bagel store: what's on the bagel is never what I asked for.

Iain: Yes, you're probably less likely to suffer some injury or illness from the bagel store.

Max: Yeah, the wrong kind of cream cheese, I know.

Iain: But even so, food allergies are becoming such a big thing that if you get the wrong label on there... And some of that, you think, well, that's obvious: if you're a doctor and you're reading the label, and you know you're expecting one particular drug, and it says it's something different, then it rings alarm bells, and you throw it away.

But whoever's checking that label... maybe that label is printed in three or four different languages; maybe it's in Arabic script, or Chinese, or Japanese, or a script that you're not familiar with looking at—whoever is checking that in the first place might think, “Well, the English looks right.” But maybe there are characters missing from one of the others. Whereas if you can check that at source, with a vision system that really doesn't care what the language is (it's just checking: does this look exactly as it's supposed to?), then you've got that objective, high-quality check to say, “Yes, this is exactly to the specification. It's correct. It's good to go out and be used.”
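
A crude version of that language-agnostic check is golden-template comparison: line up the printed label against a known-good reference image and count the pixels that differ. Real systems align the images and model print noise far more carefully; the file names and thresholds here are invented.

```python
import cv2
import numpy as np

golden = cv2.imread("label_golden.png", cv2.IMREAD_GRAYSCALE)
sample = cv2.imread("label_sample.png", cv2.IMREAD_GRAYSCALE)

# Per-pixel difference between the reference and the printed label.
diff = cv2.absdiff(golden, sample)
# Ignore mild print/illumination noise; count strongly differing pixels.
defects = int(np.count_nonzero(diff > 40))
print("PASS" if defects < 50 else f"FAIL ({defects} differing pixels)")
```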

Max: Yeah, yeah. So, you mentioned automotive, and that brings to mind the idea that cars have more and more cameras on them now, and there's talk of the self-driving cars that are going to have all these cameras. And I know you might not be an expert on that. But you know about the machine vision technology: what are our cars going to be able to see that we can't? And how fast will they be able to see it?

Iain: Well, this is—yeah. I mean, this is where the sort of industrial stuff that I've been involved in starts to diverge from, I guess, what the state of the art is. So, people like Google with their self-driving cars going around, and all of that: it's all similar technology. It's all allied. It's generally the same neural networks working in the background to detect road signs, to detect other cars, to detect whatever it may be in those self-driving cars. And often, what we're seeing in these industries is the feed-down from that: we're now getting technologies that have matured enough, from those people who are breaking new ground and trying to do fancy things at the sort of Google level. What industry wants is something that's stable, and reliable, and proven. There are no risks taken; it's a very conservative industry.

And the other thing that we have in industry, where we differ from the self-driving cars (and to some extent, this is a limitation of the self-driving cars), is that you can't rely on having a super-good internet connection to be doing that deep learning inference on some massive server somewhere in the cloud and getting the data back. Because if you need it in real time, which is what's happening in the car, you can't rely on it. Maybe 5G will fix this in the future, but right now, you can't rely on that car sending an image away and getting a signal back to say, “That's a stoplight, you'd better stop,” within a second, within tenths of a second. Who knows what the latency is on that. Yeah.

And it's the same with the industrial stuff: we need to make decisions. If parts are going past at 10, 20 parts per second, we need to take an image, do the measurement, do the inspection, and give back a pass or fail, in order for the machine to deal with that part, within milliseconds. So, this is where some of the really high-end deep learning AI going on in the world at the moment is really benefiting from the fact that they can access these great big servers on Microsoft's, Amazon's, and Google's clouds; you can do super, super powerful computing now for very little money. But you need to have that access to the cloud all the time to do it. We're often working with a subset of that, where on the factory floor you've got to rely on local computing power to give that result back instantly, every time, and reliably quickly enough. And so we end up doing what's called edge inference: we've got a PC, usually with its own GPU card to accelerate the deep learning, that's doing the work there.
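
Here's a rough sketch of how you might verify that a local (edge) model fits a per-part cycle time; the model, input shape, and budget below are illustrative stand-ins, not the actual deployed stack.

```python
import time
import torch
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18(weights=None).to(device).eval()
x = torch.randn(1, 3, 224, 224, device=device)   # one part image

with torch.no_grad():
    for _ in range(10):                  # warm-up (allocator, kernels)
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    times = []
    for _ in range(100):
        t0 = time.perf_counter()
        model(x)
        if device == "cuda":
            torch.cuda.synchronize()     # wait for the GPU to finish
        times.append(time.perf_counter() - t0)

budget_ms = 50.0                         # e.g. 20 parts per second
worst_ms = max(times) * 1000
print(f"worst case {worst_ms:.1f} ms:",
      "OK" if worst_ms < budget_ms else "TOO SLOW")
```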

And then, yeah, one aspect of that is speed. The other aspect is commercial sensitivity: a lot of these big manufacturing companies don't want an open link to the cloud, to Amazon, to Google, to whoever, streaming all their data up there. That's a big industrial no-no.

Max: It’s a risk. Yeah. So I'm curious: with most of the processes you're asked to handle, is it mostly a matter of just taking a picture, where they want a label or an answer? Or do you ever have to do something more complicated, like deal with video, or try to make a 3D model from something; is that ever required?

Iain: So generally, we're not generating something—I mean, there are, again, other allied products which do that sort of 3D scanning. So, you've got a camera on an arm, and the arm knows all its joint positions; you can move that around and scan an object, and it will create a 3D model.

The 3D stuff that we tend to be doing will be checking the shape of something in 3D, or using it to instruct a robot how to pick it up. So we're seeing instances where you've maybe got a bin of parts. Maybe you've got some shafts or some cogs or something, and they've been manufactured somewhere else and put into a big tub or a bin or a box. And now they're getting loaded in and need to be dealt with, assembled into the next bit of the process. So, can you get a robot to pick them straight out of that box, where they're all jumbled up? You maybe have a 3D camera on there that's able to scan that surface and say, “Yeah, there's the one that's on the top. You can pick that one, but you can't pick that one up yet, because it's blocked by other parts sitting on top of it.”

So that tends to be where we're using the 3D technologies: to find objects, or to validate that objects are correctly shaped and sized.
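
As a toy version of that "which part is on top and unblocked?" logic, here's a sketch over a depth map using NumPy and SciPy; the floor height, clearance, and array sizes are invented for illustration.

```python
import numpy as np
from scipy import ndimage

def choose_part(depth, bin_floor=10.0, clearance_mm=3.0):
    """depth: 2D height map in mm (larger = closer to the camera)."""
    # Segment everything standing above the bin floor into blobs (parts).
    labels, n = ndimage.label(depth > bin_floor)
    best = None
    for part_id in range(1, n + 1):
        part = labels == part_id
        top = float(depth[part].max())
        # Look in a ring just outside the part: if a neighbouring surface
        # reaches near or above this part's top, it may be overlapped, or
        # there's no clearance for the gripper -- skip it for now.
        ring = ndimage.binary_dilation(part, iterations=5) & ~part
        if ring.any() and depth[ring].max() > top - clearance_mm:
            continue
        if best is None or top > best[0]:
            ys, xs = np.nonzero(part)
            i = int(depth[part].argmax())
            best = (top, int(xs[i]), int(ys[i]))
    return best          # (height, x, y) of the pick point, or None

depth = np.random.uniform(0, 50, (480, 640))   # stand-in for a real scan
print(choose_part(depth))
```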

Max: Yeah. Can I...

Iain: So some of it could be food: checking a loaf of bread that's being made. Is it tall enough? Has it got the right height, or has it not been fully formed? Or measuring the volume of something that's being sliced up.

So, we did a system a little while ago where we had a complete wheel of cheese: a great big round, circular block of cheese, sort of 10, 12 kilograms. And we needed to chop that up into individual portions to be sold in the shops. So, we needed to make a volume measurement of that cheese. And we used a 3D camera to scan it and work out how tall it was and how wide it was, and compared that with the weight from a set of scales, to then plot where the system was going to cut all the slices.
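
The portioning arithmetic itself is simple. Here's a worked example, assuming a uniform cylindrical wheel and using invented numbers (the real system's figures weren't given):

```python
import math

height_mm, diameter_mm = 120.0, 360.0   # from the 3D scan (hypothetical)
weight_kg = 11.4                        # from the scales (hypothetical)
portion_kg = 0.25                       # target shop portion

# Cylinder volume, converted from cubic millimetres to litres.
volume_l = math.pi * (diameter_mm / 2) ** 2 * height_mm / 1e6
density = weight_kg / volume_l          # sanity check against spec

n_portions = int(weight_kg // portion_kg)
wedge_deg = 360.0 / n_portions          # equal angles give equal weight,
                                        # if the wheel is uniform
print(f"volume {volume_l:.2f} L, density {density:.2f} kg/L")
print(f"{n_portions} portions -> cut every {wedge_deg:.2f} degrees")
```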

Max: It almost seems like all this technology could come together to build some sort of really great house-assistant robot that can clean your house and cook for you.

Iain: It's always frustratingly close; it feels like that. But when you start digging into it, what we're good at is deploying super, super specific solutions; they're not very good at generalizing. And when you're talking about walking around houses, or interacting with other people and other things that don't behave in a uniform and predictable manner, that's where it becomes very difficult. That's what these self-driving cars, and anything that needs to interact and wander around your house to do something clever like that, really struggle with. And it feels like we're getting there, but it feels like there are still some hurdles...

Max: It’s always a few years away.

Iain: Yeah, yeah.

Max: Great. So, Iain, do you have any last thoughts about our conversation today? And where can people reach you? Where can people reach out?

Iain: Yeah, so yeah, I mean, it's been great to talk to you. Hopefully it's given a little bit of insight into what goes on in the background, away from the majority of people's sight. Every single computer, every banknote that's been printed, every car that's been made, and a lot of the food and produce on the shelves of the supermarkets have all been under a camera system, even with the current pandemic and people moving more and more to online deliveries.

That, again, is putting a lot of emphasis on things like logistics: reading barcodes, checking the labels on parcels, reading addresses correctly, sorting them to go into the right mail van to be delivered. And all of this is camera technology; it's all the sort of stuff that we do, which gets put into a warehouse somewhere to read and check and sort. So, there's lots of this going on in the background, hidden from sight, hopefully improving the quality of the things we're touching day-to-day.

And you can find us at our website; we're UK-based, and our website is fishersmith.co.uk. You can see some examples of systems that we've done in industry. And you can search for me on LinkedIn, or no doubt find it from the pages of your website as well, Max, when this goes live, so...

Max: So all this will be on localmaxradio.com/156.

Iain: Perfect.

Max: Episode 156.

All right, Iain, it's really great to hear about this, and to hear some of the, honestly, good news that's happening in the background. It seems like this is all improving our lives, and we don't even know about it. And I always like to emphasize that. Before the pandemic, it seems like we got a little off track, but it's good to hear about it. Thank you for coming on the show today.

Iain: Pleasure. Thanks for having me.

Max: All right, next week I'm hoping to have Aaron back on for the next couple of weeks. But he doesn't know it yet, so we'll see what happens. I want to dive more into the future of cryptocurrencies, as well as the future of The Local Maximum podcast; we have some big changes coming up, so we'll be covering that. Have a great week, everyone.

Max Sklar: That's the show. To support The Local Maximum, sign up for exclusive content and our online community at maximum.locals.com. The Local Maximum is available wherever podcasts are found. If you want to keep up, remember to subscribe on your podcast app. Also, check out the website with show notes and additional materials at localmaxradio.com. If you want to contact me, the host, or ask a question that I can answer on the show, send an email to localmaxradio@gmail.com. Have a great week.
