The training and validation input are described in `train.txt` and `val.txt` as text listing all the files and their labels. Note that we use a different indexing for labels than the ILSVRC devkit: we sort the synset names in their ASCII order, and then label them from 0 to 999. See `synset_words.txt` for the synset/name mapping.
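The label convention above can be sketched in a few lines of shell. This is only an illustration, not how `train.txt` was actually produced: `label_synsets` is a hypothetical helper, and the three synset IDs are sample input. `LC_ALL=C` is set so that `sort` compares bytes, which is assumed here to match the ASCII ordering described above.

```shell
# label_synsets is a hypothetical helper, not part of Caffe.
# It reads one synset name per line, sorts byte-wise (ASCII order),
# and prints "<synset> <label>" with labels counted from 0.
label_synsets() {
  LC_ALL=C sort | awk '{ print $1, NR - 1 }'
}

# Sample input: three synset IDs in arbitrary order.
printf '%s\n' n01484850 n01440764 n01443537 | label_synsets
```

For this sample input the helper prints `n01440764 0`, `n01443537 1`, and `n01484850 2`, one pair per line.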
-You will also need to resize the images to 256x256: we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightedweighted [mincepie](https://github.com/Yangqing/mincepie) package to do mapreduce on the Berkeley cluster. If you would things to be rather simple and straightforward, you can also use shell commands, something like:
+You may want to resize the images to 256x256 in advance. By default, we do not explicitly do this because in a cluster environment, one may benefit from resizing images in a parallel fashion, using mapreduce. For example, Yangqing used his lightweight [mincepie](https://github.com/Yangqing/mincepie) package to do mapreduce on the Berkeley cluster. If you would like things to be simple and straightforward, you can also use shell commands, something like:
for name in /path/to/imagenet/val/*.JPEG; do
convert -resize 256x256\! $name $name
Go to `$CAFFE_ROOT/examples/imagenet/` for the rest of this guide.
-Take a look at `create_imagenet.sh`. Set the paths to the train and val dirs as needed. Now simply create the leveldbs with `./create_imagenet.sh`. Note that `imagenet_train_leveldb` and `imagenet_val_leveldb` should not exist before this execution. It will be created by the script. `GLOG_logtostderr=1` simply dumps more information for you to inspect, and you can safely ignore it.
+Take a look at `create_imagenet.sh`. Set the paths to the train and val dirs as needed, and set `RESIZE=true` to resize all images to 256x256 if you haven't resized them in advance. Now simply create the leveldbs with `./create_imagenet.sh`. Note that `imagenet_train_leveldb` and `imagenet_val_leveldb` should not exist before this execution; they will be created by the script. `GLOG_logtostderr=1` simply dumps more information for you to inspect, and you can safely ignore it.
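Since the two leveldb directories must not exist beforehand, a small pre-flight check can catch a stale directory before a long conversion starts. The sketch below is an assumption about how one might do this; `check_fresh_dbs` is a hypothetical helper and is not part of `create_imagenet.sh`.

```shell
# check_fresh_dbs is a hypothetical helper, not part of create_imagenet.sh.
# It returns non-zero if any of the given output paths already exists.
check_fresh_dbs() {
  for db in "$@"; do
    if [ -e "$db" ]; then
      echo "Error: $db already exists; remove or rename it first." >&2
      return 1
    fi
  done
  return 0
}

# Possible usage: run the conversion only when both outputs are absent.
# check_fresh_dbs imagenet_train_leveldb imagenet_val_leveldb && ./create_imagenet.sh
```

The helper only reports the problem and leaves deletion to you, so an existing database is never removed by accident.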
Compute Image Mean
------------------
TOOLS=../../build/tools
DATA=../../data/ilsvrc12
-echo "Creating leveldb..."
+TRAIN_DATA_ROOT=/path/to/imagenet/train/
+VAL_DATA_ROOT=/path/to/imagenet/val/
+
+# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
+# already been resized using another tool.
+RESIZE=false
+if $RESIZE; then
+ RESIZE_HEIGHT=256
+ RESIZE_WIDTH=256
+else
+ RESIZE_HEIGHT=0
+ RESIZE_WIDTH=0
+fi
+
+if [ ! -d "$TRAIN_DATA_ROOT" ]; then
+ echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
+ echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
+ "where the ImageNet training data is stored."
+ exit 1
+fi
+
+if [ ! -d "$VAL_DATA_ROOT" ]; then
+ echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
+ echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
+ "where the ImageNet validation data is stored."
+ exit 1
+fi
+
+echo "Creating train leveldb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset.bin \
- /path/to/imagenet/train/ \
+ $TRAIN_DATA_ROOT \
$DATA/train.txt \
- imagenet_train_leveldb 1
+ imagenet_train_leveldb 1 \
+ $RESIZE_HEIGHT $RESIZE_WIDTH
+
+echo "Creating val leveldb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset.bin \
- /path/to/imagenet/val/ \
+ $VAL_DATA_ROOT \
$DATA/val.txt \
- imagenet_val_leveldb 1
+ imagenet_val_leveldb 1 \
+ $RESIZE_HEIGHT $RESIZE_WIDTH
echo "Done."
// This program converts a set of images to a leveldb by storing them as Datum
// proto buffers.
// Usage:
-// convert_imageset ROOTFOLDER/ LISTFILE DB_NAME [0/1]
+// convert_imageset ROOTFOLDER/ LISTFILE DB_NAME [0/1] \
+// [resize_height] [resize_width]
// where ROOTFOLDER is the root folder that holds all the images, and LISTFILE
// should be a list of files as well as their labels, in the format as
// subfolder1/file1.JPEG 7
int main(int argc, char** argv) {
::google::InitGoogleLogging(argv[0]);
- if (argc < 4 || argc > 5) {
+ if (argc < 4 || argc > 7) {
printf("Convert a set of images to the leveldb format used\n"
"as input for Caffe.\n"
"Usage:\n"
" convert_imageset ROOTFOLDER/ LISTFILE DB_NAME"
- " RANDOM_SHUFFLE_DATA[0 or 1]\n"
+ " RANDOM_SHUFFLE_DATA[0 or 1] [resize_height] [resize_width]\n"
"The ImageNet dataset for the training demo is at\n"
" http://www.image-net.org/download-images\n");
return 1;
while (infile >> filename >> label) {
lines.push_back(std::make_pair(filename, label));
}
- if (argc == 5 && argv[4][0] == '1') {
+ if (argc >= 5 && argv[4][0] == '1') {
// randomly shuffle data
LOG(INFO) << "Shuffling data";
std::random_shuffle(lines.begin(), lines.end());
}
LOG(INFO) << "A total of " << lines.size() << " images.";
+ int resize_height = 0;
+ int resize_width = 0;
+ if (argc >= 6) {
+ resize_height = atoi(argv[5]);
+ }
+ if (argc >= 7) {
+ resize_width = atoi(argv[6]);
+ }
leveldb::DB* db;
leveldb::Options options;
bool data_size_initialized = false;
for (int line_id = 0; line_id < lines.size(); ++line_id) {
if (!ReadImageToDatum(root_folder + lines[line_id].first,
- lines[line_id].second, &datum)) {
+ lines[line_id].second, resize_height, resize_width, &datum)) {
continue;
}
if (!data_size_initialized) {