From: Eman Copty
Date: Wed, 1 Nov 2017 21:56:35 +0000 (-0700)
Subject: Changes for 1.10.00 release
X-Git-Tag: accepted/tizen/unified/20191125.223343~22
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=f914411d382ca159663a8f34ad2d1089ece4fbab;p=platform%2Fadaptation%2Fnpu%2Fintel-libmvnc.git

Changes for 1.10.00 release
---

diff --git a/README.md b/README.md
index b1d79d7..ac2eca4 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,26 @@
-# Movidius™ Neural Compute Software Development Kit
-This SDK is provided for users of the [Movidius™ Neural Compute Stick (NCS)](https://developer.movidius.com/). It provides software tools, an API, and examples which enable developers to create software that takes advantage of the hardware the accelerated neural network capability provided by the NCS.
+# Intel® Movidius™ Neural Compute SDK
+This Intel® Movidius™ Neural Compute software developer kit (NCSDK) is provided for users of the [Intel® Movidius™ Neural Compute Stick](https://developer.movidius.com/) (Intel® Movidius™ NCS). It includes software tools, an API, and examples, so developers can create software that takes advantage of the accelerated neural network capability provided by the Intel Movidius NCS hardware.
 
 # Installation
-The provided Makefile helps with installation. Clone this repository and then run the following command to install the SDK.
+The provided Makefile helps with installation. Clone this repository and then run the following command to install the NCSDK:
 ```
 make install
 ```
 
 # Examples
-Also included in the SDK are examples. After cloning and running 'make install' run the following command to install examples.
+The Neural Compute SDK also includes examples. After cloning and running 'make install', run the following command to install the examples:
 ```
 make examples
 ```
-For additional examples please see the Neural Compute App Zoo here: [http://www.github.com/movidius/ncappzoo](http://www.github.com/movidius/ncappzoo). The ncappzoo is a valuable resource for NCS users that includes community developed applications and neural networks for the NCS.
+## NCAPPZOO Examples
+For additional examples, please see the Neural Compute App Zoo available at [http://www.github.com/movidius/ncappzoo](http://www.github.com/movidius/ncappzoo). The ncappzoo is a valuable resource for NCS users and includes community-developed applications and neural networks for the NCS.
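A quick way to confirm that the installation worked is to open an attached stick from Python. The following is a minimal sketch, not part of the repository, that assumes the v1 `mvnc` Python API installed by 'make install' and one stick plugged in:

```python
# Minimal installation smoke test (assumes the NCSDK v1 mvnc Python module).
from mvnc import mvncapi as mvnc

devices = mvnc.EnumerateDevices()          # list sticks visible over USB
if not devices:
    print('No Neural Compute Stick found - check the USB connection.')
else:
    device = mvnc.Device(devices[0])
    device.OpenDevice()                    # loads firmware onto the stick
    print('Opened NCS device:', devices[0])
    device.CloseDevice()
```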
 
 # Documentation
-The complete Neural Compute SDK documentation can be viewed at [https://movidius.github.io/ncsdk/](https://movidius.github.io/ncsdk/)
+The complete Intel Movidius Neural Compute SDK documentation can be viewed at [https://movidius.github.io/ncsdk/](https://movidius.github.io/ncsdk/)
+
+# Getting Started Video
+For installation and general instructions to get started with the NCSDK, take a look at this [video](https://www.youtube.com/watch?v=fESFVNcQVVA)
diff --git a/api/src/mvnc_api.c b/api/src/mvnc_api.c
index 8298bc6..b2743ee 100644
--- a/api/src/mvnc_api.c
+++ b/api/src/mvnc_api.c
@@ -137,21 +137,13 @@ static int is_device_opened(const char *name)
 	return -1;
 }
 
-mvncStatus mvncOpenDevice(const char *name, void **deviceHandle)
+static mvncStatus load_fw_file(const char *name)
 {
 	int rc;
 	FILE *fp;
 	char *tx_buf;
 	unsigned file_size;
 	char mv_cmd_file[MAX_PATH_LENGTH], *p;
-	char name2[MVNC_MAX_NAME_SIZE] = "";
-
-	if (!name || !deviceHandle)
-		return MVNC_INVALID_PARAMETERS;
-
-	pthread_mutex_lock(&mm);
-	if (!initialized)
-		initialize();
 
 	// Search the mvnc executable in the same directory of this library, under mvnc
 	Dl_info info;
@@ -202,13 +194,74 @@ mvncStatus mvncOpenDevice(const char *name, void **deviceHandle)
 	}
 	PRINT_DEBUG(stderr, "Boot successful, device address %s\n", name);
+	return MVNC_OK;
+}
+
+static void allocate_device(const char* name, void **deviceHandle, void* f)
+{
+	struct Device *d = calloc(1, sizeof(*d));
+	d->dev_addr = strdup(name);
+	d->usb_link = f;
+	d->next = devices;
+	d->temp_lim_upper = 95;
+	d->temp_lim_lower = 85;
+	d->backoff_time_normal = 0;
+	d->backoff_time_high = 100;
+	d->backoff_time_critical = 10000;
+	d->temperature_debug = 0;
+	pthread_mutex_init(&d->mm, 0);
+	devices = d;
+	*deviceHandle = d;
+
+	PRINT_DEBUG(stderr, "done\n");
+	PRINT_INFO(stderr, "Booted %s -> %s\n",
+		   d->dev_addr,
+		   d->dev_file ? d->dev_file : "VSC");
+}
+
+mvncStatus mvncOpenDevice(const char *name, void **deviceHandle)
+{
+	int rc;
+	char name2[MVNC_MAX_NAME_SIZE] = "";
+	char* device_name;
+	char* saved_name = NULL;
+	char* temp = NULL; // saved copy of the strdup'd name so it can be freed on every path
+	int second_name_available = 0;
+
+	if (!name || !deviceHandle)
+		return MVNC_INVALID_PARAMETERS;
+
+	temp = saved_name = strdup(name);
+
+	device_name = strtok_r(saved_name, ":", &saved_name);
+	if (device_name == NULL) {
+		free(temp);
+		return MVNC_INVALID_PARAMETERS;
+	}
+
+	pthread_mutex_lock(&mm);
+	if (!initialized)
+		initialize();
+
+
+	rc = load_fw_file(device_name);
+	if (rc != MVNC_OK) {
+		free(temp);
+		pthread_mutex_unlock(&mm);	// release the lock taken above before bailing out
+		return rc;
+	}
+	if (strlen(saved_name) > 0) {
+		device_name = strtok_r(NULL, ":", &saved_name);
+		second_name_available = 1;
+	}
 
 	// Now we should have a new /dev/ttyACM, try to open it
 	double waittm = time_in_seconds() + STATUS_WAIT_TIMEOUT;
 	while (time_in_seconds() < waittm) {
-		void *f = usblink_open(name);
+		void *f = usblink_open(device_name);
 
-		if (f == NULL) {	//we might fail in case name changed after boot
+		//we might fail in case name changed after boot and we don't have it
+		if (f == NULL && !second_name_available) {
 			int count = 0;
 			while (1) {
 				name2[0] = '\0';
@@ -232,25 +285,8 @@ mvncStatus mvncOpenDevice(const char *name, void **deviceHandle)
 			myriadStatus_t status;
 
 			if (!usblink_getmyriadstatus(f, &status) && status == MYRIAD_WAITING) {
-				struct Device *d = calloc(1, sizeof(*d));
-				d->dev_addr = strlen(name2) > 0 ? strdup(name2)
-				    : strdup(name);
-				d->usb_link = f;
-				d->next = devices;
-				d->temp_lim_upper = 95;
-				d->temp_lim_lower = 85;
-				d->backoff_time_normal = 0;
-				d->backoff_time_high = 100;
-				d->backoff_time_critical = 10000;
-				d->temperature_debug = 0;
-				pthread_mutex_init(&d->mm, 0);
-				devices = d;
-				*deviceHandle = d;
-
-				PRINT_DEBUG(stderr, "done\n");
-				PRINT_INFO(stderr, "Booted %s -> %s\n",
-					   d->dev_addr,
-					   d->dev_file ? d->dev_file : "VSC");
+				allocate_device(strlen(name2) > 0 ? name2 : device_name, deviceHandle, f);
+				free(temp);
 				pthread_mutex_unlock(&mm);
 				return MVNC_OK;
 			} else {
@@ -262,7 +298,7 @@ mvncStatus mvncOpenDevice(const char *name, void **deviceHandle)
 		// Error opening it, continue searching
 		usleep(10000);
 	}
-
+	free(temp);
 	pthread_mutex_unlock(&mm);
 	return MVNC_ERROR;
 }
diff --git a/docs/Caffe.md b/docs/Caffe.md
index ce3c915..fea53db 100644
--- a/docs/Caffe.md
+++ b/docs/Caffe.md
@@ -1,22 +1,22 @@
 # Caffe Support
 
 ## Introduction
-[Caffe](http://caffe.berkeleyvision.org/) is a deep learning framework developed by Berkeley AI Research ([BAIR](http://bair.berkley.edu)) and by community contributors. The setup script currently downloads BVLC Caffe and installs it in a system location. Other versions of Caffe are not supported at this time. For more information please visit http://caffe.berkeleyvision.org/
+[Caffe](http://caffe.berkeleyvision.org/) is a deep learning framework developed by Berkeley AI Research ([BAIR](http://bair.berkley.edu)) and by community contributors. The setup script currently downloads Berkeley Vision and Learning Center (BVLC) Caffe and installs it in a system location. Other versions of Caffe are not supported at this time. For more information, please visit http://caffe.berkeleyvision.org/.
 
-Default Caffe Installation Location: /opt/movidius/caffe
-Checkout Berkley Vision's Web Image Classification [demo](http://demo.caffe.berkeleyvision.org/)
+* Default Caffe installation location: /opt/movidius/caffe
+* Check out Berkeley Vision's Web Image Classification [demo](http://demo.caffe.berkeleyvision.org/)
 
 ## Caffe Zoo
-Berkley Vision hosts a Caffe Model Zoo for researchers and engineers to contribute Caffe models for various tasks. Please visit the [Berkley Caffe Zoo](http://caffe.berkeleyvision.org/model_zoo.html) page to learn more about the caffe zoo and how to create your own Caffe Zoo model and contribute.
+Berkeley Vision hosts a Caffe Model Zoo for researchers and engineers to contribute Caffe models for various tasks. Please visit the [Berkeley Caffe Zoo](http://caffe.berkeleyvision.org/model_zoo.html) page to learn more about the Caffe Zoo, how to create your own Caffe Zoo model, and how to contribute.
 
-Caffe Zoo has several models contributed including a model network that can classify images for Age and Gender. This network trained by [Gil Levi](https://gist.github.com/GilLevi) and Tal Hassner is at this [Gender Net Caffe Zoo Model on GitHub](https://gist.github.com/GilLevi/c9e99062283c719c03de)
+Caffe Zoo has several models contributed, including a model network that can classify images for age and gender. This network, trained by [Gil Levi](https://gist.github.com/GilLevi) and Tal Hassner, is available at [Gender Net Caffe Zoo Model on GitHub](https://gist.github.com/GilLevi/c9e99062283c719c03de).
 
-Caffe models consists of two files that are used for compiling the caffe model using the [Neural Compute Compiler](tools/compile.md)
-* Caffe Network Description (.prototxt): Text file that describes the topology and layers of the network.
-* Caffe Weights (.caffemodel): Contains the weights for each layer that are obtained after training a model.
+Caffe models consist of two files that are used for compiling the Caffe model using the [Neural Compute Compiler](tools/compile.md).
+* Caffe Network Description (.prototxt): Text file that describes the topology and layers of the network
+* Caffe Weights (.caffemodel): Contains the weights for each layer that are obtained after training a model
 
 ## Neural Compute Caffe Layer Support
-The following layers are supported in Caffe by the Neural Compute SDK. The Neural Compute Stick does not support training, so some layers that are only required for training are not supported.
+The following layers are supported in Caffe by the Intel® Movidius™ Neural Compute SDK. The Intel® Movidius™ Neural Compute Stick does not support training, so some layers that are only required for training are not supported.
 
 ### Activation/Neuron
 * bias
@@ -45,7 +45,7 @@ The following layers are supported in Caffe by the Neural Compute SDK. The Neur
 ### Vision
 * conv
- * Regular Convolution - 1x1s1, 3x3s1, 5x5s1, 7x7s1, 7x7s2, 7x7s4
+ * Regular Convolution - 1x1s1, 3x3s1, 5x5s1, 7x7s1, 7x7s2, 7x7s4
  * Group Convolution - <1024 groups total
 * deconv
 * pooling
@@ -53,7 +53,7 @@ The following layers are supported in Caffe by the Neural Compute SDK. The Neur
 # Known Issues
 
 ### Caffe Input Layer
-Limitation: Batch Size which is the first dimension must always be 1
+Limitation: Batch Size, which is the first dimension, must always be 1
 
 Limitation: The number of inputs must be 1
 
@@ -80,14 +80,14 @@ input: "data"
 
 ### Input Name
 Input name should be always called "data"
 
-This works
+This works:
 ```
 name: "GoogleNet"
 input: "data"
 input_shape
 {
 dim:1 dim:3 dim:224 dim:224
 }
 ```
-This does not
+This does not:
 ```
 name: "GoogleNet"
 input: "data_x"
@@ -96,7 +96,7 @@ input: "data_x"
 ```
 
 ### Non-Square Convolutions
-Limitation: We don't support non-square convolutions such as 1x20
+Limitation: We don't support non-square convolutions such as 1x20.
 ```
 input: "data"
 input_shape
@@ -115,7 +115,7 @@ layer {
 ```
 
 ### Crop Layer
-Limitation: Crop layer cannot take reference size layer from input:"data"
+Limitation: Crop layer cannot take reference size layer from input:"data".
 
 ```
 layer {
@@ -132,24 +132,24 @@ layer {
 ```
 
 ### Size Limitations
-Compiled Movidius™ "graph" file < 320MB
-Intermediate layer buffer size < 100MB
+Compiled Movidius "graph" file < 320 MB
+Intermediate layer buffer size < 100 MB
 
 ```
 [Error 35] Setup Error: Not enough resources on Myriad to process this network
 ```
 
-Scratch Memory size < 112KB
+Scratch Memory size < 112 KB
 
 ```
 [Error 25] Myriad Error: "Matmul scratch memory [112640] lower than required [165392]"
 ```
 
 ## Caffe Networks
-The following networks are validated and known to work on the Movidius™ Neural Compute SDK.
+The following networks are validated and known to work on the Intel Movidius Neural Compute SDK:
 - GoogleNet V1
 - SqueezeNet V1.1
 - LeNet
 - CaffeNet
 - VGG (Sousmith VGG_A)
 - AlexNet
-
+- TinyYolo v1
diff --git a/docs/TensorFlow.md b/docs/TensorFlow.md
index 4b01aec..2ddcd5d 100644
--- a/docs/TensorFlow.md
+++ b/docs/TensorFlow.md
@@ -1,14 +1,14 @@
 # TensorFlow™ Support
 
 # Introduction
-[TensorFlow™](https://www.tensorflow.org/) is a deep learning framework pionered by Google. The NCSDK introduced TensorFlow™ support with the 1.09.xx NCSDK release. Validation has been done with TensorFlow™ r1.3. The TensorFlow™ website describes it as "TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them."
+[TensorFlow™](https://www.tensorflow.org/) is a deep learning framework pioneered by Google. The NCSDK introduced TensorFlow support with the 1.09.xx NCSDK release. Validation has been done with TensorFlow r1.3. As described on the TensorFlow website, "TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them."
 
-Default Installation Location: /opt/movidius/tensorflow
+* Default installation location: /opt/movidius/tensorflow
 
-# TensorFlow™ Model Zoo
-TensorFlow™ has a model GitHub repo at https://github.com/tensorflow/models similar to the Caffe Zoo for Caffe. The TensorFlow™ models GitHub repository contains several models which are maintained by the respective autors unlike Caffe which is not a single GitHub repo.
+# TensorFlow Model Zoo
+TensorFlow has a model GitHub repo at https://github.com/tensorflow/models similar to the Caffe Zoo for Caffe. The TensorFlow models GitHub repository contains several models that are maintained by their respective authors; unlike the Caffe Zoo, the TensorFlow models live in a single GitHub repository.
 
-# Save Session with graph and checkpoint information
+# Save Session with Graph and Checkpoint Information
 
 ```python
 import numpy as np
@@ -33,20 +33,35 @@ def run(name, image_size, num_classes):
 
 run('inception-v1', 224, 1001)
 ```
 
-# Compile for TensorFlow™
+# Compile for TensorFlow
 
 ```
 mvNCCompile output/inception-v1.meta -in=input -on=InceptionV1/Logits/Predictions/Reshape_1 -s12
 ```
 
-# Neural Compute TensorFlow™ Layer Support
-
-# TensorFlow™ Networks Supported
+# TensorFlow Networks Supported
 * Inception V1
+* Inception V2
 * Inception V3
 * Inception V4
 * Inception ResNet V2
-* MobileNet
+* MobileNet_v1_1.0 variants:
+  * MobileNet_v1_1.0_224
+  * MobileNet_v1_1.0_192
+  * MobileNet_v1_1.0_160
+  * MobileNet_v1_1.0_128
+  * MobileNet_v1_0.75_224
+  * MobileNet_v1_0.75_192
+  * MobileNet_v1_0.75_160
+  * MobileNet_v1_0.75_128
+  * MobileNet_v1_0.5_224
+  * MobileNet_v1_0.5_192
+  * MobileNet_v1_0.5_160
+  * MobileNet_v1_0.5_128
+  * MobileNet_v1_0.25_224
+  * MobileNet_v1_0.25_192
+  * MobileNet_v1_0.25_160
+  * MobileNet_v1_0.25_128
diff --git a/docs/configure_network.md b/docs/configure_network.md
index c57f455..745dd06 100644
--- a/docs/configure_network.md
+++ b/docs/configure_network.md
@@ -1,27 +1,24 @@
-# Configuring your network for NCS
-This guide will help you get all of the configuration information correct when creating your network for the Movidius Neural Compute Stick. All of these parameters are critical, if you don't get them right, your network won't give you the accuracy that was achieved by the team that trained the model. The configuration parameters include:
-* mean subtraction
-* scale
-* color channel configuration
-* class prediction
-* input image size
+# Configuring Your Network for Intel® Movidius™ NCS
+This guide will help you get all of the configuration information correct when creating your network for the Intel® Movidius™ Neural Compute Stick (Intel® Movidius™ NCS). All of these parameters are critical. If you don't get them right, your network won't give you the accuracy that was achieved by the team that trained the model. The configuration parameters are as follows:
+* Mean subtraction
+* Scale
+* Color channel configuration
+* Class prediction
+* Input image size
 
-**let's go through these one at a time**
+**Let's go through these one at a time.**
 
 ## Mean Subtraction
-mean substraction on the input data to a CNN is a common technique. The mean is calculated on the data set. For example on Imagenet the mean is calculated on a per channel basis to be:
+Mean subtraction on the input data to a convolutional neural network (CNN) is a common technique. The mean is calculated on the data set. For example, the mean on ImageNet is calculated on a per channel basis to be:
 ```
 104, 117, 123
-these numbers are in BGR orientation
+These numbers are in BGR orientation.
 ```
 
 ### Caffe Specific Examples
-this mean calculation can be calculated with a tool that comes with caffe:
-[compute_image_mean.cpp](https://github.com/BVLC/caffe/blob/master/tools/compute_image_mean.cpp),
-and Caffe provides a script to do it as well:
-[make_imagenet_mean.sh](https://github.com/BVLC/caffe/blob/master/examples/imagenet/make_imagenet_mean.sh)
+This mean can be calculated with a tool that comes with Caffe ([compute_image_mean.cpp](https://github.com/BVLC/caffe/blob/master/tools/compute_image_mean.cpp)).
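In spirit, the tool is just taking a per-channel average over the training images. A minimal numpy sketch of the same idea (the file list is a hypothetical stand-in for a real training set, and equal-sized images are assumed):

```python
import cv2
import numpy as np

# Hypothetical stand-in for the training image list.
paths = ['img0.jpg', 'img1.jpg']

acc = np.zeros(3, dtype=np.float64)
for p in paths:
    img = cv2.imread(p).astype(np.float64)    # cv2 loads images in BGR order
    acc += img.reshape(-1, 3).mean(axis=0)    # per-channel mean of this image
mean_bgr = acc / len(paths)
print(mean_bgr)  # on ImageNet this comes out near [104, 117, 123]
```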
+Caffe provides a script to do it, as well ([make_imagenet_mean.sh](https://github.com/BVLC/caffe/blob/master/examples/imagenet/make_imagenet_mean.sh)).
 ---
-this will create an output file often called mean_binary.proto. You can see an example of this in the training prototxt file for AlexNet
+This will create an output file often called mean_binary.proto. You can see an example of this in the training prototxt file for AlexNet:
 [train_val.prototxt](https://github.com/BVLC/caffe/blob/master/models/bvlc_alexnet/train_val.prototxt)
 
 ```
@@ -32,7 +29,7 @@ this will create an output file often called mean_binary.proto. You can see an
 }
 ```
 ---
-in the GoogLeNet prototxt file they have just put the values directly:
+In the GoogLeNet prototxt file, they have put in the values directly:
 [train_val.prototxt](https://github.com/BVLC/caffe/blob/master/models/bvlc_googlenet/train_val.prototxt)
 
 ```
@@ -46,7 +43,7 @@ in the GoogLeNet prototxt file they have just put the values directly:
 }
 ```
 ---
-some models don't use mean subtraction, see below for LeNet as an example. There is no mean in the transform_param, but there is a scale which we'll get to later
+Some models don't use mean subtraction. See the LeNet example below. There is no mean in the transform_param, but there is a scale that we'll get to later:
 [lenet_train_test.prototxt](https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt)
 
 ```
@@ -54,27 +51,28 @@ some models don't use mean subtraction, see below for LeNet as an example. Ther
 scale: 0.00390625
 }
 ```
-### TensorFlow specific examples
-TensorFlow documentation of mean is not as straight forward as Caffe. The TensorFlow Slim models for image classification are a great place to get high quality pre-trained models:
+### TensorFlow™ Specific Examples
+TensorFlow™ documentation of mean is not as straightforward as Caffe's. The TensorFlow Slim models for image classification are a great place to get high quality pre-trained models:
 [slim models](https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models)
 
-I could find this the following file the mean (and scale) for both Inception V3 and MobileNet V1
+The following file gives the mean (and scale) for both Inception V3 and MobileNet V1:
 [retrain script](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py#L872)
 ```
 input_mean = 128
 ```
-in the case of the InceptionV3 model there is not a per color channel mean, the same mean is used for all channels. This mean should apply to all of the Inception and MobileNet models, but other models there might be different.
-for example, the VGG16 model just had the weights converted from Caffe. If we look at the link to the VGG16 for Caffe page, we see the means are done like the other Caffe models:
+In the case of the Inception V3 model, there is not a per color channel mean. The same mean is used for all channels. This mean should apply to all of the Inception and MobileNet models, but other models might be different.
+
+For example, the VGG16 model had the weights converted from Caffe. If we look at the link to the VGG16 for Caffe page, we see the means are done like the other Caffe models:
 ```
 https://gist.github.com/ksimonyan/211839e770f7b538e2d8#description
-the following BGR values should be subtracted: [103.939, 116.779, 123.68]
+The following BGR values should be subtracted: [103.939, 116.779, 123.68]
 ```
 
 ## Scale
-typical 8 bit per pixel per channel images will have a scale of 0-255. Many CNN networks use the native scale, but some don't. As was seen in a snippet of the Caffe prototxt file, the **transform_param** would show whether there was a scale. In the example of LeNet for Caffe, you can see it has a scale pameter of **0.00390625**
+Typical 8-bit per pixel per channel images will have a scale of 0-255. Many CNN networks use the native scale, but some don't. As was seen in a snippet of the Caffe prototxt file, the **transform_param** would show whether there was a scale. In the example of LeNet for Caffe, you can see it has a scale parameter of **0.00390625**.
 [lenet_train_test.prototxt](https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt)
 
 ```
@@ -82,36 +80,37 @@ typical 8 bit per pixel per channel images will have a scale of 0-255. Many CNN
 scale: 0.00390625
 }
 ```
- this may seem like a strange number, but it is actually just 1/256. So the input 8 bit image is being scaled down to an image from 0-1 instead of 0-255
+This may seem like a strange number, but it is actually just 1/256. The input 8-bit image is being scaled down to an image from 0-1 instead of 0-255.
 ---
-Back to the example of TensorFlow for Inception V3. Below the **input_mean** the **input_std** is also listed. All this is is a scaling factor. You divide 255/128 and it's about 2. So in this case, the scale is two, but the mean subtraction is 128. So in the end the scale is actually -1 to 1
+Regarding the example of TensorFlow for Inception V3, the **input_mean** and the **input_std** are listed below. The **input_std** is a scaling factor. You divide 255/128, and it's about 2. In this case, the scale is two, but the mean subtraction is 128. In the end, the scale is actually -1 to 1.
 
 [retrain script](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py#L872)
 ```
 input_mean = 128
 input_std = 128
 ```
 
-## Color Channel configuration
-different models may be trained with different color channel orientations (either RGB or BGR). Typically Caffe models seem to be trained with BGR whereas the Slim TensorFlow models (at least Inception and MobileNet) are trained in RGB.
+## Color Channel Configuration
+Different models may be trained with different color channel orientations (either RGB or BGR). Typically, Caffe models seem to be trained with BGR, whereas the Slim TensorFlow models (at least Inception and MobileNet) are trained in RGB.
 
-Once you figure out the color channel orientation for your model, you will need to know the way the image is loaded. For example opencv will open images in BGR but skimiage will open the image in RGB.
+Once you figure out the color channel orientation for your model, you will need to know the way the image is loaded. For example, OpenCV will open images in BGR, but skimage will open the image in RGB.
 
 ```
-skimage.io.imread will open the image in RGB
-cv2.imread will open the image in BGR
-Caffe trained models will probably be BGR
-TensorFlow trained models will probably be in RGB
+* skimage.io.imread will open the image in RGB
+* cv2.imread will open the image in BGR
+* Caffe trained models will probably be BGR
+* TensorFlow trained models will probably be in RGB
```
 
 ## Categories
-for models that are trained on the Imagenet database, some have 1000 output classes, and some have 1001 output classes. The extra output class is a background class. The list below has the list of the 1000 classes not including the background.
+For models that are trained on the ImageNet database, some have 1000 output classes and some have 1001 output classes. The extra output class is a background class. The following list has the list of the 1000 classes not including the background:
 
 [synset_words.txt](https://github.com/HoldenCaulfieldRye/caffe/blob/master/data/ilsvrc12/synset_words.txt)
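Before getting to the label offset, it is worth seeing how the mean, scale, and color channel settings above come together. The following is a sketch only (hypothetical file name; the constants are the ones discussed above):

```python
import cv2
import numpy as np

# Caffe-style ImageNet preprocessing: BGR order, per-channel mean, no scaling.
img = cv2.imread('input.jpg')                         # cv2 loads BGR
img = cv2.resize(img, (224, 224)).astype(np.float32)
img -= np.float32([104, 117, 123])                    # BGR mean subtraction

# TensorFlow Slim-style preprocessing: RGB order, mean 128, std 128 (range -1 to 1).
img2 = cv2.imread('input.jpg')
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)          # BGR -> RGB
img2 = cv2.resize(img2, (299, 299)).astype(np.float32)
img2 = (img2 - 128.0) / 128.0                         # scale to -1..1
```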
 
-Most Caffe trained models seem to follow the 1000 class convention, and TensorFlow trained models follow the 1001 class convention. So for the TensorFlow models, an offset needs to be added. You can see this as documented in the TensorFlow github [here](https://github.com/tensorflow/models/tree/master/research/slim#the-resnet-and-vgg-models-have-1000-classes-but-the-imagenet-dataset-has-1001)
-# Putting it all together
-now with all of these factors, let's go through two examples
+Most Caffe trained models seem to follow the 1000 class convention, and TensorFlow trained models follow the 1001 class convention. For the TensorFlow models, an offset needs to be added. You can see this documented in the [TensorFlow GitHub](https://github.com/tensorflow/models/tree/master/research/slim#the-resnet-and-vgg-models-have-1000-classes-but-the-imagenet-dataset-has-1001).
+
+# Putting It All Together
+Now with all of these factors, let's go through two examples.
 
 ## Caffe Example
-let's use the Berkeley Caffe GoogLeNet model as an example. the basic model parameters are:
+Let's use the Berkeley Caffe GoogLeNet model as an example. The basic model parameters are:
 ```
 Scale: 0-255 (before mean subtraction)
 Mean: based on mean_binary.proto file
@@ -120,7 +119,7 @@
 output categories: 1000
 input size: 224x224
 labels_offset=0
 ```
-code snippet:
+Code snippet:
 ```
 #load the label files
 labels_offset=0 # no background class offset
@@ -144,8 +143,8 @@ for i in range(0,5):
 	print ('prediction ' + str(i) + ' is ' + labels[order[i]-labels_offset])
 ```
 
-## TensorFlow example
-let's use the TensorFlow Slim Inception V3
+## TensorFlow Example
+Let's use the TensorFlow Slim Inception V3:
 ```
 Scale: -1 to 1 (after mean subtraction)
 Mean: 128
@@ -154,7 +153,7 @@
 output categories: 1001
 input size: 299x299
 labels_offset=1
 ```
-code snippet:
+Code snippet:
 ```
 #load the label files
 labels_offset=1 # background class offset of 1
@@ -181,4 +180,4 @@ for i in range(0,5):
 	print ('prediction ' + str(i) + ' is ' + labels[order[i]-labels_offset])
 ```
 
-feedback or comments? let me know darren.s.crews@intel.com
+Feedback or comments? Let me know at darren.s.crews@intel.com.
diff --git a/docs/install.md b/docs/install.md
index 8b1bdc7..cfeb4d2 100644
--- a/docs/install.md
+++ b/docs/install.md
@@ -1,56 +1,55 @@
 # Installation and Configuration
-This page provides installation and configuration information needed to use the NCS and the examples provided in this repository. To use the NCS you will need to have the Movidius™ Neural Compute SDK installed on your development computer. The SDK installation provides an option to install the examples in this repostitory. If you've already installed the SDK on your development computer you may have selected the option to also install these examples. If you have not already installed the SDK you should follow the instructions in the Example Installation with SDK section in this page, and when prompted select the option to install the examples.
+This page provides installation and configuration information needed to use the Intel® Movidius™ Neural Compute Stick (Intel® Movidius™ NCS) and the examples provided in this repository. To use the Intel Movidius NCS, you will need to have the Intel Movidius Neural Compute SDK installed on your development computer. The SDK installation provides an option to install the examples in this repository. If you've already installed the SDK on your development computer, you may have selected the option to also install these examples. If you have not already installed the SDK, you should follow the instructions in the Installation of SDK and Examples section on this page. When prompted, select the option to install the examples.
 
 ## Prerequisites
-To build and run the examples in this repository you will need to have the following.
-- Movidius™ Neural Compute Stick (NCS)
-- Movidius™ Neural Compute SDK
-- Development Computer with Supported OS
+To build and run the examples in this repository, you will need to have the following:
+- Intel Movidius Neural Compute Stick
+- Intel Movidius Neural Compute SDK
+- Development computer with supported OS
   - x86-64 with Ubuntu (64 bit) 16.04 Desktop
   - Raspberry Pi 3 with Raspian Stretch (starting with SDK 1.09.xx)
     - See [Upgrade Raspian Jessie to Stretch](https://linuxconfig.org/how-to-upgrade-debian-8-jessie-to-debian-9-stretch)
   - Virtual Machine per the [supported VM configuration](VirtualMachineConfig.md)
-- Internet Connection.
-- USB Camera (optional)
+- Internet connection
+- USB camera (optional)
 
-## Connecting the NCS to a development computer
-The NCS connects to the development computer over a USB 2.0 High Speed interface. Plug the NCS directly to a USB port on your development computer or into a powered USB hub that is plugged into your development computer.
+## Connecting the Intel Movidius NCS to a Development Computer
+The Intel Movidius NCS connects to the development computer over a USB 2.0 High Speed interface. Plug the Intel Movidius NCS directly to a USB port on your development computer or into a powered USB hub that is plugged into your development computer.
 
 ![](images/ncs_plugged.jpg)
 
-## Installation SDK and examples
-To install the SDK along with the examples in this repository use the following command on your development computer. This is the typical installation. If you haven't already installed the SDK on your development computer you should use this command to install.
+## Installation of SDK and Examples
+To install the SDK along with the examples in this repository, use the following command on your development computer. This is the typical installation. If you haven't already installed the SDK on your development computer, you should use this command to install:
 ```
 git clone http://github.com/Movidius/ncsdk && cd ncsdk && make install && make examples
 ```
 
-## Installation of examples without SDK
-To install only the examples and not the SDK on your development computer use the following command to clone the repository and then make appropriate examples for your development computer. If you already have the SDK installed and only need the examples on your machine you should use this command to install.
+## Installation of Examples without SDK
+To install only the examples and not the SDK on your development computer, use the following command to clone the repository and then make appropriate examples for your development computer. If you already have the SDK installed and only need the examples on your machine, you should use this command to install the examples:
 ```
 git clone http://github.com/Movidius/ncsdk && cd ncsdk && make examples
 ```
 
 ## Building Individual Examples
-Whether installing with the SDK or without it, both methods above will install and build the examples that are appropriate for your development system including prerequisite software. Each example comes with its own Makefile that will install only that specific example and any prerequisites that it requires. To install and build any individual example run the 'make' command from within that example's base directory. For example to build the GoogLeNet examples type the following command.
+Whether installing with the SDK or without it, both methods above will install and build the examples that are appropriate for your development system, including prerequisite software. Each example comes with its own Makefile that will install only that specific example and any prerequisites that it requires. To install and build any individual example, run the 'make' command from within that example's base directory. For example, to build the GoogLeNet examples, type the following command:
 ```
 cd examples/Caffe/GoogLeNet && make
 ```
 
-The Makefile for each example also has a 'help' target which will display all possible targets. To see all possible targets for any example use the following command from within the examples top directory.
+The Makefile for each example also has a 'help' target that will display all possible targets. To see all possible targets for any example, use the following command from within the example's top directory:
 ```
 make help
 ```
 
 ## Uninstallation
-To uninstall the SDK type the following command.
+To uninstall the SDK, type the following command:
 ```
 make uninstall
 ```
-
 ## Installation Manifest
-For the list of files that 'make install' will modify on your system (outside of the repository) see the [installation manifest](manifest.md).
+For the list of files that 'make install' will modify on your system (outside of the repository), see the [installation manifest](manifest.md).
diff --git a/docs/ncs1arch.md b/docs/ncs1arch.md
index 60e4339..c91b3eb 100644
--- a/docs/ncs1arch.md
+++ b/docs/ncs1arch.md
@@ -1,18 +1,18 @@
 # Introduction
-The following explains how the Neural Compute SDK works on compiling and executing a given Caffe or TensorFlow™ Neural Network on the Neural Compute Stick.
+The Neural Compute SDK compiles and executes a given Caffe or TensorFlow™ neural network on the Intel® Movidius™ Neural Compute Stick (Intel® Movidius™ NCS). Read on for a detailed explanation.
 
 # Architecture Details
-The following diagram shows the inner workings of the Neural Compute Stick. The Neural Compute Stick primarily contains Movidius™ Myriad 2 VPU (Vision Processing Unit), and some power delivery voltage regulators. The Myriad 2 VPU includes 4 Gbit of LPDDR3 DRAM and its architecture includes specific imaging and vision accelerators and an array of 12 VLIW vector processors called SHAVE processors, used to accelerate neural networks by running parts of the neural networks in parallel for achieving the highest performance. The Neural Compute Stick is connected to an Application Processor (AP) such as a Raspberry Pi or Up Squared board using the USB interface on the Myriad 2 VPU. The USB3 interface can be used both in Super Speed (5Gbps) or High Speed (480Mbps) modes.
+The following diagram shows the inner workings of the Intel Movidius NCS. The Intel Movidius NCS primarily contains the Intel® Movidius™ Myriad™ 2 vision processing unit (VPU) and some power delivery voltage regulators. The Intel Movidius Myriad 2 VPU includes 4 Gbit of LPDDR3 DRAM, and its architecture includes specific imaging and vision accelerators and an array of 12 VLIW vector processors called SHAVE processors. These processors are used to accelerate neural networks by running parts of the neural networks in parallel for achieving the highest performance. The Intel Movidius NCS is connected to an application processor (AP), such as a Raspberry Pi or UP Squared board, using the USB interface on the Intel Movidius Myriad 2 VPU. The USB3 interface can be used in either Super Speed (5 Gbps) or High Speed (480 Mbps) mode.
 
-The CPU in the Myriad 2 VPU is a SPARC microprocessor core that runs custom firmware. When the Neural Compute Stick is first plugged in there is no firmware loaded onto it. The Myriad 2 VPU boots from the internal ROM and connects to the host computer(application processor) as a USB2 device.
+The CPU in the Intel Movidius Myriad 2 VPU is a SPARC microprocessor core that runs custom firmware. When the Intel Movidius Neural Compute Stick is first plugged in, there is no firmware loaded onto it. The Intel Movidius Myriad 2 VPU boots from the internal ROM and connects to the host computer (application processor) as a USB 2.0 device.
 
-Applications executing on the host computer (AP) communicate to the Myriad SOC using the Neural Compute API. When the API initializes and opens a device, the firmware from the Neural Compute SDK is loaded onto the Neural Compute Stick. At this time, the Neural Compute Stick resets and now shows up to the host computer as a USB2 or USB3 device depending on the host type. It is now ready to accept the neural network graph files and commands to execute inferences on the graph files.
+Applications executing on the host computer (AP) communicate with the Intel Movidius Myriad VPU SOC using the Neural Compute API. When the API initializes and opens a device, the firmware from the Neural Compute SDK is loaded onto the Intel Movidius Neural Compute Stick. At this time, the Intel Movidius NCS resets and now shows up to the host computer as a USB 2.0 or USB 3.0 device depending on the host type. It is now ready to accept the neural network graph files and commands to execute inferences on the graph files.
 
 ![](images/NCS1_ArchDiagram.jpg)
 
-A graph file is loaded into the DRAM attached to the Myriad-2 VPU via the API. The Leon processor coordinates receiving the graph file and images for inference via the USB connection. It also parses the graph file and schedules kernels to the SHAVE neural compute accelerator engines. In addition, the Leon processor also takes care of monitoring die temperature and throttling processing on high temperature alerts. Statistics and the output of the neural network are sent back to the host computer via the USB connection and they are received by a host application via the API.
+A graph file is loaded into the DRAM attached to the Intel Movidius Myriad 2 VPU via the API. The LEON processor coordinates receiving the graph file and images for inference via the USB connection. It also parses the graph file and schedules kernels to the SHAVE neural compute accelerator engines. In addition, the LEON processor also takes care of monitoring die temperature and throttling processing on high temperature alerts.
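Seen from the host side, that round trip is what the API wraps. The following is a minimal sketch of the flow with the v1 `mvnc` Python API (the graph file name and input shape are hypothetical; the graph file itself comes from mvNCCompile):

```python
import numpy as np
from mvnc import mvncapi as mvnc

device = mvnc.Device(mvnc.EnumerateDevices()[0])
device.OpenDevice()                        # firmware is loaded onto the stick here

with open('graph', mode='rb') as f:        # hypothetical output of mvNCCompile
    graph_blob = f.read()
graph = device.AllocateGraph(graph_blob)   # graph file goes into the stick's DRAM

img = np.zeros((224, 224, 3), np.float16)  # stand-in for a preprocessed image
graph.LoadTensor(img, 'user object')       # input travels to the stick over USB
output, user_obj = graph.GetResult()       # results and stats come back over USB

graph.DeallocateGraph()
device.CloseDevice()
```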
+Statistics and the output of the neural network are sent back to the host computer via the USB connection, and they are received by a host application via the API.
 
-In addition to the API, the SDK provides the tools mvNCCompile, mvNCCheck, and mvNCProfile that run on the host computer during application and neural network development. The checker and profiler tools run an inference on the Neural Compute Stick to validate against Caffe/TensorFlow™ and generate per layer statistics respectively.
+In addition to the API, the NCSDK provides the tools mvNCCompile, mvNCCheck, and mvNCProfile that run on the host computer during application and neural network development. The checker and profiler tools run an inference on the Intel Movidius Neural Compute Stick to validate against Caffe/TensorFlow and generate per layer statistics, respectively.
diff --git a/docs/readme.md b/docs/readme.md
new file mode 100644
index 0000000..efd60ee
--- /dev/null
+++ b/docs/readme.md
@@ -0,0 +1,97 @@
+
+
+# Introduction
+The Intel® Movidius™ Neural Compute SDK (NCSDK) and Intel® Movidius™ Neural Compute Stick (Intel® Movidius™ NCS) enable rapid prototyping, validation, and deployment of deep neural networks (DNNs).
+
+The NCS is used in two primary scenarios:
+- Profiling, tuning, and compiling a DNN on a development computer (host system) with the tools provided in the Intel Movidius Neural Compute SDK. In this scenario, the host system is typically a desktop or laptop machine running Ubuntu 16.04 desktop (x86, 64 bit), but you can use any supported platform for these steps.
+
+- Prototyping a user application on a development computer (host system), which accesses the hardware of the Intel Movidius NCS to accelerate DNN inferences via the API provided with the Intel Movidius Neural Compute SDK. In this scenario, the host system can be a developer workstation or any developer system that runs an operating system compatible with the API.
+
+The following diagram shows the typical workflow for development with the Intel Movidius NCS:
+![](images/ncs_workflow.jpg)
+
+The training phase does not utilize the Intel Movidius NCS hardware or NCSDK, while the subsequent phases of “profiling, tuning, and compiling” and “prototyping” do require the Intel Movidius NCS hardware and the accompanying Intel Movidius Neural Compute SDK.
+
+The NCSDK contains a set of software tools to compile, profile, and check the validity of your DNN, as well as an API for both the C and Python programming languages. The API is provided to allow users to create software that offloads the neural network computation onto the Intel Movidius Neural Compute Stick.
+
+More information on the [architecture](ncs1arch.md) of the Intel Movidius Neural Compute Stick is available.
+
+
+# Frameworks
+The Neural Compute SDK currently supports two deep learning frameworks:
+1. [Caffe](Caffe.md): Caffe is a deep learning framework from Berkeley Vision Labs.
+2. [TensorFlow™](TensorFlow.md): TensorFlow™ is a deep learning framework from Google.
+
+[See how to use networks from these supported frameworks with Intel Movidius NCS.](configure_network.md)
+
+
+
+# Installation and Examples
+The following commands install the NCSDK and run the examples. Detailed instructions are available in [installation and configuration](install.md):
+
+```
+git clone http://github.com/Movidius/ncsdk && cd ncsdk && make install && make examples
+
+```
+
+# Intel® Movidius™ Neural Compute SDK Tools
+The SDK comes with a set of tools to assist in development and deployment of applications that utilize hardware accelerated Deep Neural Networks via the Intel Movidius Neural Compute Stick. Each tool and its usage is described below:
+
+* [mvNCCompile](tools/compile.md): Converts Caffe/TF network and weights to Intel Movidius technology internal compiled format
+
+* [mvNCProfile](tools/profile.md): Provides layer-by-layer statistics to evaluate the performance of Caffe/TF networks on the NCS
+
+* [mvNCCheck](tools/check.md): Compares the results from an inference by running the network on the NCS and Caffe/TF
+
+
+# Neural Compute API
+Applications for inferencing with the Neural Compute SDK can be developed either in C/C++ or Python. The API provides a software interface to open/close Neural Compute Sticks, load graphs into the Intel Movidius NCS, and run inferences on the stick.
+
+* [C API](c_api/readme.md)
+* [Python API](py_api/readme.md)
+
+
+# Intel® Movidius™ Neural Compute Stick User Forum
+
+There is an active user forum in which users of the Intel Movidius Neural Compute Stick discuss ideas and issues they have with regard to the Intel Movidius NCS. Access the Intel Movidius NCS User Forum with the following link:
+
+[https://ncsforum.movidius.com](https://ncsforum.movidius.com)
+
+The forum is a good place to go if you need help troubleshooting an issue. You may find other people who have figured out the issue, or get ideas for how to fix it. The forum is also monitored by Intel Movidius product engineers who provide solutions, as well.
+
+
+# Examples
+
+There are several examples, including the following at GitHub:
+* Caffe
+  * GoogLeNet
+  * AlexNet
+  * SqueezeNet
+* TensorFlow™
+  * Inception V1
+  * Inception V3
+* Apps
+  * hello_ncs_py
+  * hello_ncs_cpp
+  * multistick_cpp
+
+The examples demonstrate compiling, profiling, and running inferences using the network on the Intel Movidius Neural Compute Stick.
+Each example contains a Makefile. Running 'make help' in the example's base directory will give possible make targets.
+
+```
+
+git clone http://github.com/Movidius/ncsdk   # Already done during installation
+(cd ncsdk/examples && make)                  # run all examples
+(cd ncsdk/examples/caffe/GoogLeNet && make)  # Run just one example
+
+```
+
+
+# Neural Compute App Zoo
+The Neural Compute App Zoo is a GitHub repository at [http://github.com/Movidius/ncappzoo](http://github.com/Movidius/ncappzoo), which is designed for developers to contribute networks and applications written for the Intel Movidius Neural Compute Stick to the Intel Movidius NCS community.
+
+See [The Neural Compute App Zoo README](https://github.com/Movidius/ncappzoo/blob/master/README.md) for more information.
+
+
+[Release Notes](release_notes.md)
diff --git a/docs/release_notes.md b/docs/release_notes.md
index 5cf6bd9..b7d6d97 100644
--- a/docs/release_notes.md
+++ b/docs/release_notes.md
@@ -1,24 +1,24 @@
 ============================================================
 # Movidius Neural Compute SDK Release Notes
-# V1.09.00 2017-10-10
+# V1.10.00 2017-10-31
 ============================================================
 
+### As of V1.09.00, the SDK has been refactored and contains many new features and structural changes. It is recommended that you read the documentation to familiarize yourself with the new features and contents. Please see the v1.09.00 release notes, using the GitHub tag https://github.com/movidius/ncsdk/tree/v1.09.00.06
+
 ## SDK Notes:
-SDK has been refactored and contains many new features and structural changes. It is recommended you read the documentation to familiarize with the new features and contents. A partial list of new features:
+New features:
+Networks:
+ Inception-v2
+ Optimized MobileNet (down to 42 ms from 110 ms on 12 SHAVEs, for the 224x224 version)
+Layers:
+ Dilated convolution
+ Optimized depth convolution: still supporting only 3x3 convolution
+ 1x1 pooling with stride 2, acting as up-sampling
+ Fix for TF-like padding for convolution. May fail for large input channels (more than 200).
 
-1. New, unified, faster installer and uninstaller.
-2. Now supports complete SDK installation on Raspberry Pi.
-3. System installation of tools and API libraries.
-4. API support for Python 2.7.
-5. Source code included for API, for porting to other architectures or Linux distributions.
-6. Tools support for Raspberry Pi.
-7. Tensorflow R1.3 support for tools (only on Ubuntu 16.04 LTS currently).
-8. More network support, see documentation for details!
-9. Support for SDK on Ubuntu 16.04 LTS as guest OS, and Win10, OSX, and Ubuntu 16.04 as host OS. See docs/VirtualMachineConfig.
 
 ## API Notes:
-1. API supported on both python 2.7 and python 3.5.
-2. Some APIs deprecated, will emit the "deprecated" warning if used. Users expected to move to using new APIs for these functions.
+1. No change
 
 ## Network Notes:
 Support for the following networks has been tested.
@@ -34,24 +34,42 @@ Support for the following networks has been tested.
 
 ### Tensorflow r1.3
 1. inception-v1
-2. inception-v3
-3. inception-v4
-4. Inception ResNet v2
-5. Mobilenet_V1_1.0_224 (preview -- see erratum #3.)
+2. inception-v2
+3. inception-v3
+4. inception-v4
+5. Inception ResNet v2
+6. Mobilenet_V1_1.0 variants:
+MobileNet_v1_1.0_224
+MobileNet_v1_1.0_192
+MobileNet_v1_1.0_160
+MobileNet_v1_1.0_128
+MobileNet_v1_0.75_224
+MobileNet_v1_0.75_192
+MobileNet_v1_0.75_160
+MobileNet_v1_0.75_128
+MobileNet_v1_0.5_224
+MobileNet_v1_0.5_192
+MobileNet_v1_0.5_160
+MobileNet_v1_0.5_128
+MobileNet_v1_0.25_224
+MobileNet_v1_0.25_192
+MobileNet_v1_0.25_160
+MobileNet_v1_0.25_128
 
 ## Firmware Features:
 1. Convolutions
  - NxN Convolution with Stride S.
 - The following cases have been extensively tested: 1x1s1,3x3s1,5x5s1,7x7s1, 7x7s2, 7x7s4
 - Group convolution
- - Depth Convolution (limited support -- see erratum #10.)
+ - Depth Convolution
+ - Dilated convolution
 2. Max Pooling Radix NxM with Stride S
-3. Average Pooling Radix NxM with Stride S
+3. Average Pooling: Radix NxM with Stride S, Global average pooling
 4. Local Response Normalization
-5. Relu, Relu-X, Prelu
+5. Relu, Relu-X, Prelu (see erratum #10)
 6. Softmax
 7. Sigmoid
-8. Tanh
+8. Tanh (see erratum #10)
 9. Deconvolution
 10. Slice
 11. Scale
@@ -63,20 +81,22 @@ Support for the following networks has been tested.
 17. Crop
 18. ELU
 19. Batch Normalization
+
 
 ## Bug Fixes:
-1. USB protocol bug fixes, for expanded compatibility with hubs and hosts. In particular, fix for devices with maxpacket of 64.
-2. Fixed -- when a graph execution fails, the result for a previous execution is erroneously returned.
+1. Fixed -- mvNCProfile gives wrong results for MFLOPs for depthwise_convolution layer.
+2. Fixed -- apps report NaN or wrong results, with input dimension not multiple of 8.
+3. Fixed -- API reports timeout error if input dimension < 8.
+4. Fixed -- SDK cannot be installed by root user in Ubuntu.
 
 ## Errata:
 1. Python 2.7 is fully supported for making user applications, but only the helloworld_py example runs as-is in both python 2.7 and 3.5 due to dependencies on modules.
-2. SDK tools for tensorflow on Rasbpian Stretch are not supported for this release, due to lack of an integrated tensorflow installer for Rasbpian in the SDK. TF examples are provided with pre-compiled graph files to allow them to run on Rasperry Pi, however the compile, profile, and check functions will not be available on Raspberry Pi, and 'make examples' will generate failures for the tensorflow examples on Raspberry Pi.
-3. Depth-wise convolution is not optimized, leading to low performance of Mobilenet, and does not support channel multiplier >1.
+2. SDK tools for tensorflow on Raspbian Stretch are not supported for this release, due to lack of an integrated tensorflow installer for Raspbian in the SDK. TF examples are provided with pre-compiled graph files to allow them to run on Raspberry Pi; however, the compile, profile, and check functions will not be available on Raspberry Pi, and 'make examples' will generate failures for the tensorflow examples on Raspberry Pi.
+3. Depth-wise convolution may not be supported if channel multiplier > 1.
 4. If working behind proxy, proper proxy settings must be applied for the installer to succeed.
 5. Although improved, the installer is known to take a long time on Raspberry Pi. Date/time must be correct for SDK installation to succeed on Raspberry Pi.
 6. Default system virtual memory swap file size is too small to compile AlexNet on Raspberry Pi.
-7. Raspberry Pi users will need to upgrade to Raspbian Stretch for this release.
-8. Fully Connected Layers may produce erroneous results if input size is not a multiple of 8.
-9. Convolution may fail to find a solution for very large inputs.
-10. Depth convolution is tested for 3x3 kernels.
-11. TensorFlow-like padding not correctly supported in some convolution cases, such as when stride=2 and even input size for 3x3 convolution.
+7. Raspberry Pi users will need to upgrade to Raspbian Stretch for releases after 1.09.
+8. Convolution may fail to find a solution for very large inputs.
+9. Depth convolution is tested for 3x3 kernels.
+10. A TanH layer’s “top” & “bottom” blobs must have different names. This is different from a ReLU layer, whose “top” & “bottom” should be named the same as its previous layer.
diff --git a/install.sh b/install.sh
index 1547d8b..cbf23ae 100644
--- a/install.sh
+++ b/install.sh
@@ -11,10 +11,10 @@ then
 	cd /tmp
 else
 	cd /tmp
-	wget --no-cache http://ncs-forum-uploads.s3.amazonaws.com/ncsdk/ncsdk_01_09/ncsdk_redirector.txt
+	wget --no-cache http://ncs-forum-uploads.s3.amazonaws.com/ncsdk/ncsdk_01_10/ncsdk_redirector.txt
 fi
 
-download_filename=NCSDK-1.09.tar.gz
+download_filename=NCSDK-1.10.tar.gz
 
 # redirector is the url from redirector text file
 redirector=$(