add convert_imageset option to resize images; use in

author Jeff Donahue <jeff.donahue@gmail.com>

Fri, 23 May 2014 21:57:02 +0000 (14:57 -0700)

committer Jeff Donahue <jeff.donahue@gmail.com>

Fri, 23 May 2014 21:58:45 +0000 (14:58 -0700)
diff --git a/docs/imagenet_training.md b/docs/imagenet_training.md
index 9e0076c..f628f79 100644
--- a/docs/imagenet_training.md
+++ b/docs/imagenet_training.md
@@ -30,7 +30,7 @@ You will first need to prepare some auxiliary data for training. This data can b
  
  The training and validation input are described in `train.txt` and `val.txt` as text listing all the files and their labels. Note that we use a different indexing for labels than the ILSVRC devkit: we sort the synset names in their ASCII order, and then label them from 0 to 999. See `synset_words.txt` for the synset/name mapping.
  
-You will also need to resize the images to 256x256: we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightedweighted [mincepie](https://github.com/Yangqing/mincepie) package to do mapreduce on the Berkeley cluster. If you would things to be rather simple and straightforward, you can also use shell commands, something like:
+You may want to resize the images to 256x256 in advance. By default, we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightedweighted [mincepie](https://github.com/Yangqing/mincepie) package to do mapreduce on the Berkeley cluster. If you would things to be rather simple and straightforward, you can also use shell commands, something like:
  
      for name in /path/to/imagenet/val/*.JPEG; do
          convert -resize 256x256\! $name $name
@@ -38,7 +38,7 @@ You will also need to resize the images to 256x256: we do not explicitly do this
  
  Go to `$CAFFE_ROOT/examples/imagenet/` for the rest of this guide.
  
-Take a look at `create_imagenet.sh`. Set the paths to the train and val dirs as needed. Now simply create the leveldbs with `./create_imagenet.sh`. Note that `imagenet_train_leveldb` and `imagenet_val_leveldb` should not exist before this execution. It will be created by the script. `GLOG_logtostderr=1` simply dumps more information for you to inspect, and you can safely ignore it.
+Take a look at `create_imagenet.sh`. Set the paths to the train and val dirs as needed, and set "RESIZE=true" to resize all images to 256x256 if you haven't resized the images in advance. Now simply create the leveldbs with `./create_imagenet.sh`. Note that `imagenet_train_leveldb` and `imagenet_val_leveldb` should not exist before this execution. It will be created by the script. `GLOG_logtostderr=1` simply dumps more information for you to inspect, and you can safely ignore it.
  
  Compute Image Mean
  ------------------
diff --git a/examples/imagenet/create_imagenet.sh b/examples/imagenet/create_imagenet.sh
index a1bcb7b..8767f12 100755
--- a/examples/imagenet/create_imagenet.sh
+++ b/examples/imagenet/create_imagenet.sh
@@ -5,16 +5,48 @@
  TOOLS=../../build/tools
  DATA=../../data/ilsvrc12
  
-echo "Creating leveldb..."
+TRAIN_DATA_ROOT=/path/to/imagenet/train/
+VAL_DATA_ROOT=/path/to/imagenet/val/
+
+# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
+# already been resized using another tool.
+RESIZE=false
+if $RESIZE; then
+  RESIZE_HEIGHT=256
+  RESIZE_WIDTH=256
+else
+  RESIZE_HEIGHT=0
+  RESIZE_WIDTH=0
+fi
+
+if [ ! -d "$TRAIN_DATA_ROOT" ]; then
+  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
+  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
+       "where the ImageNet training data is stored."
+  exit 1
+fi
+
+if [ ! -d "$VAL_DATA_ROOT" ]; then
+  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
+  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
+       "where the ImageNet validation data is stored."
+  exit 1
+fi
+
+echo "Creating train leveldb..."
  
  GLOG_logtostderr=1 $TOOLS/convert_imageset.bin \
-    /path/to/imagenet/train/ \
+    $TRAIN_DATA_ROOT \
      $DATA/train.txt \
-    imagenet_train_leveldb 1
+    imagenet_train_leveldb 1 \
+    $RESIZE_HEIGHT $RESIZE_WIDTH
+
+echo "Creating val leveldb..."
  
  GLOG_logtostderr=1 $TOOLS/convert_imageset.bin \
-    /path/to/imagenet/val/ \
+    $VAL_DATA_ROOT \
      $DATA/val.txt \
-    imagenet_val_leveldb 1
+    imagenet_val_leveldb 1 \
+    $RESIZE_HEIGHT $RESIZE_WIDTH
  
  echo "Done."
diff --git a/tools/convert_imageset.cpp b/tools/convert_imageset.cpp
index 2d4d4c7..900183a 100644
--- a/tools/convert_imageset.cpp
+++ b/tools/convert_imageset.cpp
@@ -2,7 +2,8 @@
  // This program converts a set of images to a leveldb by storing them as Datum
  // proto buffers.
  // Usage:
-//    convert_imageset ROOTFOLDER/ LISTFILE DB_NAME [0/1]
+//    convert_imageset ROOTFOLDER/ LISTFILE DB_NAME [0/1] \
+//                     [resize_height] [resize_width]
  // where ROOTFOLDER is the root folder that holds all the images, and LISTFILE
  // should be a list of files as well as their labels, in the format as
  //   subfolder1/file1.JPEG 7
@@ -29,12 +30,12 @@ using std::string;
  
  int main(int argc, char** argv) {
    ::google::InitGoogleLogging(argv[0]);
-  if (argc < 4 || argc > 5) {
+  if (argc < 4 || argc > 7) {
      printf("Convert a set of images to the leveldb format used\n"
          "as input for Caffe.\n"
          "Usage:\n"
          "    convert_imageset ROOTFOLDER/ LISTFILE DB_NAME"
-        " RANDOM_SHUFFLE_DATA[0 or 1]\n"
+        " RANDOM_SHUFFLE_DATA[0 or 1] [resize_height] [resize_width]\n"
          "The ImageNet dataset for the training demo is at\n"
          "    http://www.image-net.org/download-images\n");
      return 1;
@@ -46,12 +47,20 @@ int main(int argc, char** argv) {
    while (infile >> filename >> label) {
      lines.push_back(std::make_pair(filename, label));
    }
-  if (argc == 5 && argv[4][0] == '1') {
+  if (argc >= 5 && argv[4][0] == '1') {
      // randomly shuffle data
      LOG(INFO) << "Shuffling data";
      std::random_shuffle(lines.begin(), lines.end());
    }
    LOG(INFO) << "A total of " << lines.size() << " images.";
+  int resize_height = 0;
+  int resize_width = 0;
+  if (argc >= 6) {
+    resize_height = atoi(argv[5]);
+  }
+  if (argc >= 7) {
+    resize_width = atoi(argv[6]);
+  }
  
    leveldb::DB* db;
    leveldb::Options options;
@@ -73,7 +82,7 @@ int main(int argc, char** argv) {
    bool data_size_initialized = false;
    for (int line_id = 0; line_id < lines.size(); ++line_id) {
      if (!ReadImageToDatum(root_folder + lines[line_id].first,
-                          lines[line_id].second, &datum)) {
+        lines[line_id].second, resize_height, resize_width, &datum)) {
        continue;
      }
      if (!data_size_initialized) {
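
Note on the `RESIZE` toggle in `create_imagenet.sh`: the script tests the flag with `if $RESIZE; then`, which runs the variable's value as a command. That works for the literal values `true` and `false`, but any other value (for example `yes` or `1`) would be executed as a command instead of being rejected. The following is a minimal sketch of a stricter alternative, not part of this commit. The helper name `resize_dims` is hypothetical, and the sketch assumes the same 256x256 target and the same `0 0` "no resize" pair that the script passes to `convert_imageset`.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not in the commit): map a RESIZE flag to the
# "height width" pair that create_imagenet.sh appends to convert_imageset.
resize_dims() {
  case "$1" in
    true)
      # Resize every image to 256x256 while building the leveldb.
      echo "256 256"
      ;;
    false)
      # Images were resized in advance; 0 0 is what the script passes
      # when no resizing is requested.
      echo "0 0"
      ;;
    *)
      # Reject anything else instead of executing it as a command.
      echo "resize_dims: expected true or false, got '$1'" >&2
      return 1
      ;;
  esac
}
```

Under these assumptions, a script could capture the pair once with `DIMS=$(resize_dims "$RESIZE") || exit 1` and pass `$DIMS` unquoted as the last two arguments to `convert_imageset.bin`, so that it splits into the height and width.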