It took me a while to figure out how it interfaces with the system (driver? dedicated application? just drop a model and data into a directory that appears when the key is mounted?), so I'll post it here.
To access the device, you need to install an SDK that contains Python scripts for manipulating it (so it seems the driver is bundled with utility programs). Source: https://developer.movidius.com/getting-started
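For the curious, here's roughly what driving the stick from the SDK's Python layer might look like, based on that getting-started page. The module and call names (mvnc, EnumerateDevices, AllocateGraph, etc.) are my assumptions and may not match the shipped API exactly, and graph.bin is just a placeholder for a compiled network:

    # Hedged sketch: module/function names are assumptions about the SDK's
    # Python API; graph.bin stands in for a pre-compiled network binary.
    import numpy as np
    from mvnc import mvncapi as mvnc          # assumed module name

    devices = mvnc.EnumerateDevices()         # find sticks attached over USB
    if not devices:
        raise RuntimeError("No Movidius device found")

    device = mvnc.Device(devices[0])
    device.OpenDevice()

    with open("graph.bin", "rb") as f:        # compiled network binary
        graph_blob = f.read()
    graph = device.AllocateGraph(graph_blob)  # push the network onto the VPU

    image = np.random.rand(224, 224, 3).astype(np.float16)  # dummy input
    graph.LoadTensor(image, "user object")    # queue one inference
    output, _ = graph.GetResult()             # blocking read of the result

    graph.DeallocateGraph()
    device.CloseDevice()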
> Movidius's NCS is powered by their Myriad 2 vision processing unit (VPU), and, according to the company, can reach over 100 GFLOPs of performance within a nominal 1W of power consumption. Under the hood, the Movidius NCS works by translating a standard, trained Caffe-based convolutional neural network (CNN) into an embedded neural network that then runs on the VPU.
This is sure to save me money on my power bill after marathon sessions of "Not Hotdog."
Ignoring the price tag, this is about half the performance per watt of the Jetson TX2, which manages around 1.5 TFLOPS at 7.5 W.
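Back-of-the-envelope, taking both vendors' numbers at face value:

    # Rough perf-per-watt comparison using the figures quoted above.
    ncs_gflops, ncs_watts = 100.0, 1.0     # Movidius NCS: ~100 GFLOPS at ~1 W
    tx2_gflops, tx2_watts = 1500.0, 7.5    # Jetson TX2: ~1.5 TFLOPS at ~7.5 W

    print(ncs_gflops / ncs_watts)          # 100.0 GFLOPS/W
    print(tx2_gflops / tx2_watts)          # 200.0 GFLOPS/W -> the NCS is about half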
Interesting that you could use this to accelerate systems like the Raspberry Pi. The Jetson is a pain in the backside to deploy (at a production level) because you need to make your own breakout board, or buy an overpriced carrier.
EDIT: I use the Pi as an example because it's readily available and cheap. There are lots of other embedded platforms, but the Pi wins on ecosystem.
Keep in mind that supercomputers are a lot less specialized than circuits for running neural nets.
12 years ago you could have gotten a stack of 5-8 7800 GTX cards and had 1.5TFLOPS of single precision. 11 years ago you could have had a stack of 5 cards with unified shaders. It's not fair to compare against the significantly more complicated route of getting 100 CPU cores working together with only 1-4 per chip.
But can't you configure the device to do e.g. fast matrix-vector multiplications instead of inference? I could be wrong, but I suspect that's mostly what people do on supercomputers anyway.
How the Fathom Neural Compute Stick figures into this is that the algorithmic computing power of the learning system can be optimized and output (using the Fathom software framework) into a binary that runs on the Fathom stick itself. In this way, any device that the Fathom is plugged into gets instant access to a complete neural network, because a version of that network is running locally on the Fathom and thus on the device.
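In concrete terms the flow is presumably something like the sketch below. The compiler name and flags here are placeholders I made up for illustration; only the shape of it (trained model in, VPU binary out, binary loaded onto the stick at runtime) comes from the description above:

    # Hypothetical illustration of the compile-then-deploy flow. The tool
    # name "fathom_compile" and its flags are placeholders, not the
    # framework's documented CLI.
    import subprocess

    subprocess.run([
        "fathom_compile",                  # placeholder compiler name
        "--model", "deploy.prototxt",      # trained Caffe network definition
        "--weights", "model.caffemodel",   # trained weights
        "--output", "graph.bin",           # binary the stick can execute
    ], check=True)

    # graph.bin then ships with the host device and is loaded onto the
    # stick at runtime (see the SDK sketch earlier in the thread).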
This reminds me of physics co-processors. Anyone remember AGEIA? They were touting "physics cards" similar to video cards. Had they not been acquired by Nvidia, they would've been steamrolled by consumer GPUs / CPUs anyway, since they were essentially designing their own GPU.
The $79 price point is attractive. I wonder how much power can be packed into such a small form factor? It's surprising that a lot of power isn't necessary for deep learning applications.
> The $79 price point is attractive. I wonder how much power can be packed into such a small form factor? It's surprising that a lot of power isn't necessary for deep learning applications.
It runs pretrained NNs, which is the cheap part. So this is a chip optimized to perform floating-point multiplication, and that's it.
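Which is most of what a forward pass is. A toy fully-connected pass in numpy, just to show where the multiply-accumulate work sits:

    # Toy illustration: inference on a pretrained net is mostly repeated
    # matrix multiplies against fixed weights, plus cheap nonlinearities.
    import numpy as np

    W1, b1 = np.random.randn(256, 784), np.random.randn(256)  # stand-in "pretrained" weights
    W2, b2 = np.random.randn(10, 256), np.random.randn(10)

    def forward(x):
        h = np.maximum(W1.dot(x) + b1, 0.0)   # matrix multiply + ReLU
        return W2.dot(h) + b2                 # matrix multiply -> class scores

    scores = forward(np.random.randn(784))    # one "inference"
    print(scores.argmax())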
I wonder how it stacks up against the Snapdragon 410E. You can buy one on a DragonBoard for roughly the same price, ~$80 [1]. The DragonBoard has four ARM cores, a GPU, plus a DSP. You could run OpenCV/FastCV on any or all three.
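For a rough sense of what the CPU path looks like there, OpenCV's dnn module (in 3.3+) can run a pretrained Caffe model directly; the file names below are placeholders:

    # Sketch of CPU-side inference with OpenCV's dnn module (3.3+).
    # deploy.prototxt / model.caffemodel / input.jpg are placeholders.
    import cv2

    net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "model.caffemodel")

    img = cv2.imread("input.jpg")
    blob = cv2.dnn.blobFromImage(img, 1.0, (224, 224), (104, 117, 123))  # resize + mean subtraction
    net.setInput(blob)
    out = net.forward()           # scores from the final layer

    print(out.argmax())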
Why not both? Plug in the deep-learning USB stick, and use the Snapdragon to do ETL and/or to download models for the Movidius to run inference on. I am waiting for the day we can do some nontrivial training on mobile hardware.
It's surprising how much attention this has had over the last few days, without any discussion of the downside: it's slow.
It's true that it is fast for the power it consumes, but it is way (way!) too slow to use for any form of training, which seems to be what many people think they can use it for.
According to Anandtech[1], it will do 10 GoogLeNet inferences per second. By very rough comparison, Inception in TensorFlow on a Raspberry Pi does about 2 inferences per second[2], and I think I saw AlexNet on an i7 doing about 60/second. Any desktop GPU will do orders of magnitude more.
But all those other solutions will consume orders of magnitude more power, especially the GPU. It's actually impressive what can be achieved on 1W of power.
If so I'm amazed, because I have never seen someone conflate those two before. I've seen plenty of people conflate USB-C and USB 3 in general, but not specifically thinking USB-C implies the 10gbps mode.
I went back and looked, and yes, it does support USB 3.0. Actually, given that the chip itself also apparently supports GigE, it's a shame there isn't an option with that brought out.
As make3 said, bandwidth can be important, but it's also increasingly the case that people have computers that only come with USB-C ports. Obviously not a dealbreaker, since people can just buy an adapter, but it's an issue worth bringing up.
Really? I'm not a fan of USB-C being yet another connector that's easy to snap off. I break a USB cable every few days, and sometimes a socket just gets torqued off a PCB. The other day I had a phone on the edge of my table and I bumped the cable from the top with my elbow. Snap! The big-old A connectors are much, much harder to break, especially if the case has a solid metal rectangle that holds the USB connector in place.
I still wish USB connectors had some kind of rubber padding around them, though. Like the IEC power connectors on computers -- no matter what you do, it's virtually impossible to break the cable or socket.
Even DB9 connectors were better. I never broke one in the many years I used them. Rock solid and you could even screw them in.
> I'm not a fan of USB-C being yet another connector that's easy to snap off.
I initially had the same concern, but after using USB-C heavily for over a year now, I haven't had one instance of a connector failing.
> I break a USB cable every few days
Either you're doing it wrong (being really careless / buying really cheap cables), or you're doing something highly specialized, in which case the feedback should be caveated with "I'm in xyz field, which means I break far more USB cables than most people ever will".
I never had this problem before USB.
I also break a lot of USB cables in the field, e.g. hiking, and that never used to be a problem with barrel connectors. USB connectors just are not designed for people who don't sit in an office all day.
The "connector" has nothing to do with it compared to older cables, IMHO. The wires in something like a DB9/PS2 cable were heavier grade, but the durability of the connector wasn't measurably better (at least not compared with a good USB-A cable). If you're continually breaking them, I recommend upgrading to braided cables. I abuse the heck out of them and still haven't had one fail that I can remember.
It's mostly the connector that breaks, not the cable itself. I've tried all that reinforced stuff to no avail. One broke yesterday when I put my phone in my pocket while charging, and sat down on a chair. Seems like a pretty common use case to me.
A USB-C to Type-A adapter should be fine in the early stages. I'm sure USB-C will be on its way soon. Even with the backing of Intel, they still need to factor in development time, support for different hardware, etc. They would gain little from supporting USB-C at this stage.