From: Billy Lamberta Date: Thu, 24 May 2018 01:46:20 +0000 (-0700) Subject: Moves estimator getting started docs into programmer's guide. X-Git-Tag: upstream/v1.9.0_rc1~38^2~4^2~136 X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=cd468ceee10646c5e023661537a20915f52677f9;p=platform%2Fupstream%2Ftensorflow.git Moves estimator getting started docs into programmer's guide. Update path references and magic links. Remove getting started with estimators doc. Add redirects. PiperOrigin-RevId: 197826223 --- diff --git a/tensorflow/contrib/estimator/python/estimator/hooks.py b/tensorflow/contrib/estimator/python/estimator/hooks.py index 4808b9e..ddd6aa4 100644 --- a/tensorflow/contrib/estimator/python/estimator/hooks.py +++ b/tensorflow/contrib/estimator/python/estimator/hooks.py @@ -72,7 +72,7 @@ class InMemoryEvaluatorHook(training.SessionRunHook): estimator: A `tf.estimator.Estimator` instance to call evaluate. input_fn: Equivalent to the `input_fn` arg to `estimator.evaluate`. A function that constructs the input data for evaluation. - See @{$get_started/premade_estimators#create_input_functions} for more + See @{$premade_estimators#create_input_functions} for more information. The function should construct and return one of the following: diff --git a/tensorflow/docs_src/get_started/datasets_quickstart.md b/tensorflow/docs_src/get_started/datasets_quickstart.md index c972e5e..020e40d 100644 --- a/tensorflow/docs_src/get_started/datasets_quickstart.md +++ b/tensorflow/docs_src/get_started/datasets_quickstart.md @@ -14,7 +14,7 @@ introduces the API by walking through two simple examples: Taking slices from an array is the simplest way to get started with `tf.data`. 
-The @{$get_started/premade_estimators$Premade Estimators} chapter describes +The @{$premade_estimators$Premade Estimators} chapter describes the following `train_input_fn`, from [`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py), to pipe the data into the Estimator: @@ -377,7 +377,7 @@ Now you have the basic idea of how to efficiently load data into an Estimator. Consider the following documents next: -* @{$get_started/custom_estimators}, which demonstrates how to build your own +* @{$custom_estimators}, which demonstrates how to build your own custom `Estimator` model. * The @{$low_level_intro#datasets$Low Level Introduction}, which demonstrates how to experiment directly with `tf.data.Datasets` using TensorFlow's low diff --git a/tensorflow/docs_src/get_started/get_started_for_beginners.md b/tensorflow/docs_src/get_started/get_started_for_beginners.md deleted file mode 100644 index d5a80e2..0000000 --- a/tensorflow/docs_src/get_started/get_started_for_beginners.md +++ /dev/null @@ -1,751 +0,0 @@ -# Get Started with Graph Execution - -This document explains how to use machine learning to classify (categorize) -Iris flowers by species. This document dives deeply into the TensorFlow -code to do exactly that, explaining ML fundamentals along the way. - -If the following list describes you, then you are in the right place: - -* You know little to nothing about machine learning. -* You want to learn how to write TensorFlow programs. -* You can code (at least a little) in Python. - -If you are already familiar with basic machine learning concepts -but are new to TensorFlow, read -@{$premade_estimators$Getting Started with TensorFlow: for ML Experts}. - -If you'd like to learn a lot about the basics of Machine Learning, -consider taking -[Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/). 
- - -## The Iris classification problem - -Imagine you are a botanist seeking an automated way to classify each -Iris flower you find. Machine learning provides many ways to classify flowers. -For instance, a sophisticated machine learning program could classify flowers -based on photographs. Our ambitions are more modest--we're going to classify -Iris flowers based solely on the length and width of their -[sepals](https://en.wikipedia.org/wiki/Sepal) and -[petals](https://en.wikipedia.org/wiki/Petal). - -The Iris genus entails about 300 species, but our program will classify only -the following three: - -* Iris setosa -* Iris virginica -* Iris versicolor - -
-Petal geometry compared for three iris species: Iris setosa, Iris virginica, and Iris versicolor -
- -**From left to right, -[*Iris setosa*](https://commons.wikimedia.org/w/index.php?curid=170298) (by -[Radomil](https://commons.wikimedia.org/wiki/User:Radomil), CC BY-SA 3.0), -[*Iris versicolor*](https://commons.wikimedia.org/w/index.php?curid=248095) (by -[Dlanglois](https://commons.wikimedia.org/wiki/User:Dlanglois), CC BY-SA 3.0), -and [*Iris virginica*](https://www.flickr.com/photos/33397993@N05/3352169862) -(by [Frank Mayfield](https://www.flickr.com/photos/33397993@N05), CC BY-SA -2.0).** -
- -Fortunately, someone has already created [a data set of 120 Iris -flowers](https://en.wikipedia.org/wiki/Iris_flower_data_set) -with the sepal and petal measurements. This data set has become -one of the canonical introductions to machine learning classification problems. -(The [MNIST database](https://en.wikipedia.org/wiki/MNIST_database), -which contains handwritten digits, is another popular classification -problem.) The first 5 entries of the Iris data set -look as follows: - -| Sepal length | sepal width | petal length | petal width | species -| --- | --- | --- | --- | --- -|6.4 | 2.8 | 5.6 | 2.2 | 2 -|5.0 | 2.3 | 3.3 | 1.0 | 1 -|4.9 | 2.5 | 4.5 | 1.7 | 2 -|4.9 | 3.1 | 1.5 | 0.1 | 0 -|5.7 | 3.8 | 1.7 | 0.3 | 0 - -Let's introduce some terms: - -* The last column (species) is called the - [**label**](https://developers.google.com/machine-learning/glossary/#label); - the first four columns are called - [**features**](https://developers.google.com/machine-learning/glossary/#feature). - Features are characteristics of an example, while the label is - the thing we're trying to predict. - -* An [**example**](https://developers.google.com/machine-learning/glossary/#example) - consists of the set of features and the label for one sample - flower. The preceding table shows 5 examples from a data set of - 120 examples. - -Each label is naturally a string (for example, "setosa"), but machine learning -typically relies on numeric values. Therefore, someone mapped each string to -a number. Here's the representation scheme: - -* 0 represents setosa -* 1 represents versicolor -* 2 represents virginica - -For a look at other examples of labels and examples, see the -[ML Terminology section of Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/framing/ml-terminology). - - -## Models and training - -A **model** is the relationship between features -and the label. 
For the Iris problem, the model defines the relationship -between the sepal and petal measurements and the predicted Iris species. Some -simple models can be described with a few lines of algebra, but complex machine -learning models have a large number of parameters that are difficult to -summarize. - -Could you determine the relationship between the four features and the -Iris species *without* using machine learning? That is, could you use -traditional programming techniques (for example, a lot of conditional -statements) to create a model? Maybe. You could play with the data set -long enough to determine the right relationships of petal and sepal -measurements to particular species. However, a good machine learning -approach *determines the model for you*. That is, if you feed enough -representative examples into the right machine learning model type, the program -will determine the relationship between sepals, petals, and species. - -**Training** is the stage of machine learning in which the model is -gradually optimized (learned). The Iris problem is an example -of [**supervised machine -learning**](https://developers.google.com/machine-learning/glossary/#supervised_machine_learning) -in which a model is trained from examples that contain labels. (In -[**unsupervised machine -learning**](https://developers.google.com/machine-learning/glossary/#unsupervised_machine_learning), -the examples don't contain labels. Instead, the model typically finds -patterns among the features.) - - - - -## Get the sample program - -Prior to playing with the sample code in this document, do the following: - -1. @{$install$Install TensorFlow}. -2. If you installed TensorFlow with virtualenv or Anaconda, activate your - TensorFlow environment. -3. Install or upgrade pandas by issuing the following command: - - `pip install pandas` - - -Take the following steps to get the sample program: - -1. 
Clone the TensorFlow Models repository from github by entering the following - command: - - `git clone https://github.com/tensorflow/models` - -2. Change directory within that branch to the location containing the examples - used in this document: - - `cd models/samples/core/get_started/` - -In that `get_started` directory, you'll find a program -named `premade_estimator.py`. - - -## Run the sample program - -You run TensorFlow programs as you would run any Python program. Therefore, -issue the following command from a command line to -run `premade_estimators.py`: - -``` bash -python premade_estimator.py -``` - -Running the program should output a whole bunch of information ending with -three prediction lines like the following: - -```None -... -Prediction is "Setosa" (99.6%), expected "Setosa" - -Prediction is "Versicolor" (99.8%), expected "Versicolor" - -Prediction is "Virginica" (97.9%), expected "Virginica" -``` - -If the program generates errors instead of predictions, ask yourself the -following questions: - -* Did you install TensorFlow properly? -* Are you using the correct version of TensorFlow? The `premade_estimators.py` - program requires at least TensorFlow v1.4. -* If you installed TensorFlow with virtualenv or Anaconda, did you activate - the environment? - - - -## The TensorFlow programming stack - -As the following illustration shows, TensorFlow -provides a programming stack consisting of multiple API layers: - -
- -
- -**The TensorFlow Programming Environment.** -
- -As you start writing TensorFlow programs, we strongly recommend focusing on -the following two high-level APIs: - -* Estimators -* Datasets - -Although we'll grab an occasional convenience function from other APIs, -this document focuses on the preceding two APIs. - - -## The program itself - -Thanks for your patience; let's dig into the code. -The general outline of `premade_estimator.py`--and many other TensorFlow -programs--is as follows: - -* Import and parse the data sets. -* Create feature columns to describe the data. -* Select the type of model -* Train the model. -* Evaluate the model's effectiveness. -* Let the trained model make predictions. - -The following subsections detail each part. - - -### Import and parse the data sets - -The Iris program requires the data from the following two .csv files: - -* `http://download.tensorflow.org/data/iris_training.csv`, which contains - the training set. -* `http://download.tensorflow.org/data/iris_test.csv`, which contains the - test set. - -The **training set** contains the examples that we'll use to train the model; -the **test set** contains the examples that we'll use to evaluate the trained -model's effectiveness. - -The training set and test set started out as a -single data set. Then, someone split the examples, with the majority going into -the training set and the remainder going into the test set. Adding -examples to the training set usually builds a better model; however, adding -more examples to the test set enables us to better gauge the model's -effectiveness. Regardless of the split, the examples in the test set -must be separate from the examples in the training set. Otherwise, you can't -accurately determine the model's effectiveness. - -The `premade_estimators.py` program relies on the `load_data` function -in the adjacent [`iris_data.py`]( -https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py) -file to read in and parse the training set and test set. 
-Here is a heavily commented version of the function: - -```python -TRAIN_URL = "http://download.tensorflow.org/data/iris_training.csv" -TEST_URL = "http://download.tensorflow.org/data/iris_test.csv" - -CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', - 'PetalLength', 'PetalWidth', 'Species'] - -... - -def load_data(label_name='Species'): - """Parses the csv file in TRAIN_URL and TEST_URL.""" - - # Create a local copy of the training set. - train_path = tf.keras.utils.get_file(fname=TRAIN_URL.split('/')[-1], - origin=TRAIN_URL) - # train_path now holds the pathname: ~/.keras/datasets/iris_training.csv - - # Parse the local CSV file. - train = pd.read_csv(filepath_or_buffer=train_path, - names=CSV_COLUMN_NAMES, # list of column names - header=0 # ignore the first row of the CSV file. - ) - # train now holds a pandas DataFrame, which is data structure - # analogous to a table. - - # 1. Assign the DataFrame's labels (the right-most column) to train_label. - # 2. Delete (pop) the labels from the DataFrame. - # 3. Assign the remainder of the DataFrame to train_features - train_features, train_label = train, train.pop(label_name) - - # Apply the preceding logic to the test set. - test_path = tf.keras.utils.get_file(TEST_URL.split('/')[-1], TEST_URL) - test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0) - test_features, test_label = test, test.pop(label_name) - - # Return four DataFrames. - return (train_features, train_label), (test_features, test_label) -``` - -Keras is an open-sourced machine learning library; `tf.keras` is a TensorFlow -implementation of Keras. The `premade_estimator.py` program only accesses -one `tf.keras` function; namely, the `tf.keras.utils.get_file` convenience -function, which copies a remote CSV file to a local file system. - -The call to `load_data` returns two `(feature,label)` pairs, for the training -and test sets respectively: - -```python - # Call load_data() to parse the CSV file. 
- (train_feature, train_label), (test_feature, test_label) = load_data() -``` - -Pandas is an open-source Python library leveraged by several -TensorFlow functions. A pandas -[**DataFrame**](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) -is a table with named columns headers and numbered rows. -The features returned by `load_data` are packed in `DataFrames`. -For example, the `test_feature` DataFrame looks as follows: - -```none - SepalLength SepalWidth PetalLength PetalWidth -0 5.9 3.0 4.2 1.5 -1 6.9 3.1 5.4 2.1 -2 5.1 3.3 1.7 0.5 -... -27 6.7 3.1 4.7 1.5 -28 6.7 3.3 5.7 2.5 -29 6.4 2.9 4.3 1.3 -``` - - -### Describe the data - -A **feature column** is a data structure that tells your model -how to interpret the data in each feature. In the Iris problem, -we want the model to interpret the data in each -feature as its literal floating-point value; that is, we want the -model to interpret an input value like 5.4 as, well, 5.4. However, -in other machine learning problems, it is often desirable to interpret -data less literally. Using feature columns to -interpret data is such a rich topic that we devote an entire -@{$feature_columns$document} to it. - -From a code perspective, you build a list of `feature_column` objects by calling -functions from the @{tf.feature_column} module. Each object describes an input -to the model. To tell the model to interpret data as a floating-point value, -call @{tf.feature_column.numeric_column}. In `premade_estimator.py`, all -four features should be interpreted as literal floating-point values, so -the code to create a feature column looks as follows: - -```python -# Create feature columns for all features. 
-my_feature_columns = [] -for key in train_x.keys(): - my_feature_columns.append(tf.feature_column.numeric_column(key=key)) -``` - -Here is a less elegant, but possibly clearer, alternative way to -encode the preceding block: - -```python -my_feature_columns = [ - tf.feature_column.numeric_column(key='SepalLength'), - tf.feature_column.numeric_column(key='SepalWidth'), - tf.feature_column.numeric_column(key='PetalLength'), - tf.feature_column.numeric_column(key='PetalWidth') -] -``` - - -### Select the type of model - -We need to select the kind of model that will be trained. -Lots of model types exist; picking the ideal type takes experience. -We've selected a neural network to solve the Iris problem. [**Neural -networks**](https://developers.google.com/machine-learning/glossary/#neural_network) -can find complex relationships between features and the label. -A neural network is a highly-structured graph, organized into one or more -[**hidden layers**](https://developers.google.com/machine-learning/glossary/#hidden_layer). -Each hidden layer consists of one or more -[**neurons**](https://developers.google.com/machine-learning/glossary/#neuron). -There are several categories of neural networks. -We'll be using a [**fully connected neural -network**](https://developers.google.com/machine-learning/glossary/#fully_connected_layer), -which means that the neurons in one layer take inputs from *every* neuron in -the previous layer. For example, the following figure illustrates a -fully connected neural network consisting of three hidden layers: - -* The first hidden layer contains four neurons. -* The second hidden layer contains three neurons. -* The third hidden layer contains two neurons. - -
- -
- -**A neural network with three hidden layers.** -
- -For a more detailed introduction to neural networks, see the -[Introduction to Neural Nets section of Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/introduction-to-neural-networks/anatomy). - -To specify a model type, instantiate an -[**Estimator**](https://developers.google.com/machine-learning/glossary/#Estimators) -class. TensorFlow provides two categories of Estimators: - -* [**pre-made - Estimators**](https://developers.google.com/machine-learning/glossary/#pre-made_Estimator), - which someone else has already written for you. -* [**custom - Estimators**](https://developers.google.com/machine-learning/glossary/#custom_estimator), - which you must code yourself, at least partially. - -To implement a neural network, the `premade_estimators.py` program uses -a pre-made Estimator named @{tf.estimator.DNNClassifier}. This Estimator -builds a neural network that classifies examples. The following call -instantiates `DNNClassifier`: - -```python - classifier = tf.estimator.DNNClassifier( - feature_columns=my_feature_columns, - hidden_units=[10, 10], - n_classes=3) -``` - -Use the `hidden_units` parameter to define the number of neurons -in each hidden layer of the neural network. Assign this parameter -a list. For example: - -```python - hidden_units=[10, 10], -``` - -The length of the list assigned to `hidden_units` identifies the number of -hidden layers (2, in this case). -Each value in the list represents the number of neurons in a particular -hidden layer (10 in the first hidden layer and 10 in the second hidden layer). -To change the number of hidden layers or neurons, simply assign a different -list to the `hidden_units` parameter. - -The ideal number of hidden layers and neurons depends on the problem -and the data set. Like many aspects of machine learning, -picking the ideal shape of the neural network requires some mixture -of knowledge and experimentation. 
-As a rule of thumb, increasing the number of hidden layers and neurons -*typically* creates a more powerful model, which requires more data to -train effectively. - -The `n_classes` parameter specifies the number of possible values that the -neural network can predict. Since the Iris problem classifies 3 Iris species, -we set `n_classes` to 3. - -The constructor for `tf.Estimator.DNNClassifier` takes an optional argument -named `optimizer`, which our sample code chose not to specify. The -[**optimizer**](https://developers.google.com/machine-learning/glossary/#optimizer) -controls how the model will train. As you develop more expertise in machine -learning, optimizers and -[**learning -rate**](https://developers.google.com/machine-learning/glossary/#learning_rate) -will become very important. - - - -### Train the model - -Instantiating a `tf.Estimator.DNNClassifier` creates a framework for learning -the model. Basically, we've wired a network but haven't yet let data flow -through it. To train the neural network, call the Estimator object's `train` -method. For example: - -```python - classifier.train( - input_fn=lambda:train_input_fn(train_feature, train_label, args.batch_size), - steps=args.train_steps) -``` - -The `steps` argument tells `train` to stop training after the specified -number of iterations. Increasing `steps` increases the amount of time -the model will train. Counter-intuitively, training a model longer -does not guarantee a better model. The default value of `args.train_steps` -is 1000. The number of steps to train is a -[**hyperparameter**](https://developers.google.com/machine-learning/glossary/#hyperparameter) -you can tune. Choosing the right number of steps usually -requires both experience and experimentation. - -The `input_fn` parameter identifies the function that supplies the -training data. The call to the `train` method indicates that the -`train_input_fn` function will supply the training data. 
Here's that -method's signature: - -```python -def train_input_fn(features, labels, batch_size): -``` - -We're passing the following arguments to `train_input_fn`: - -* `train_feature` is a Python dictionary in which: - * Each key is the name of a feature. - * Each value is an array containing the values for each example in the - training set. -* `train_label` is an array containing the values of the label for every - example in the training set. -* `args.batch_size` is an integer defining the [**batch - size**](https://developers.google.com/machine-learning/glossary/#batch_size). - -The `train_input_fn` function relies on the **Dataset API**. This is a -high-level TensorFlow API for reading data and transforming it into a form -that the `train` method requires. The following call converts the -input features and labels into a `tf.data.Dataset` object, which is the base -class of the Dataset API: - -```python - dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) -``` - -The `tf.dataset` class provides many useful functions for preparing examples -for training. The following line calls three of those functions: - -```python - dataset = dataset.shuffle(buffer_size=1000).repeat(count=None).batch(batch_size) -``` - -Training works best if the training examples are in -random order. To randomize the examples, call -`tf.data.Dataset.shuffle`. Setting the `buffer_size` to a value -larger than the number of examples (120) ensures that the data will -be well shuffled. - -During training, the `train` method typically processes the -examples multiple times. Calling the -`tf.data.Dataset.repeat` method without any arguments ensures -that the `train` method has an infinite supply of (now shuffled) -training set examples. - -The `train` method processes a -[**batch**](https://developers.google.com/machine-learning/glossary/#batch) -of examples at a time. -The `tf.data.Dataset.batch` method creates a batch by -concatenating multiple examples. 
-This program sets the default [**batch -size**](https://developers.google.com/machine-learning/glossary/#batch_size) -to 100, meaning that the `batch` method will concatenate groups of -100 examples. The ideal batch size depends on the problem. As a rule -of thumb, smaller batch sizes usually enable the `train` method to train -the model faster at the expense (sometimes) of accuracy. - -The following `return` statement passes a batch of examples back to -the caller (the `train` method). - -```python - return dataset.make_one_shot_iterator().get_next() -``` - - -### Evaluate the model - -**Evaluating** means determining how effectively the model makes -predictions. To determine the Iris classification model's effectiveness, -pass some sepal and petal measurements to the model and ask the model -to predict what Iris species they represent. Then compare the model's -prediction against the actual label. For example, a model that picked -the correct species on half the input examples would have an -[accuracy](https://developers.google.com/machine-learning/glossary/#accuracy) -of 0.5. The following suggests a more effective model: - - - - - - - - - - - - - - - - - - - - - -
-| Test set features | Label | Prediction
-| --- | --- | ---
-| 5.9 3.0 4.3 1.5 | 1 | 1
-| 6.9 3.1 5.4 2.1 | 2 | 2
-| 5.1 3.3 1.7 0.5 | 0 | 0
-| 6.0 3.4 4.5 1.6 | 1 | 2
-| 5.5 2.5 4.0 1.3 | 1 | 1
- -**A model that is 80% accurate.** -
- -To evaluate a model's effectiveness, each Estimator provides an `evaluate` -method. The `premade_estimator.py` program calls `evaluate` as follows: - -```python -# Evaluate the model. -eval_result = classifier.evaluate( - input_fn=lambda:eval_input_fn(test_x, test_y, args.batch_size)) - -print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result)) -``` - -The call to `classifier.evaluate` is similar to the call to `classifier.train`. -The biggest difference is that `classifier.evaluate` must get its examples -from the test set rather than the training set. In other words, to -fairly assess a model's effectiveness, the examples used to -*evaluate* a model must be different from the examples used to *train* -the model. The `eval_input_fn` function serves a batch of examples from -the test set. Here's the `eval_input_fn` method: - -```python -def eval_input_fn(features, labels=None, batch_size=None): - """An input function for evaluation or prediction""" - if labels is None: - # No labels, use only features. - inputs = features - else: - inputs = (features, labels) - - # Convert inputs to a tf.dataset object. - dataset = tf.data.Dataset.from_tensor_slices(inputs) - - # Batch the examples - assert batch_size is not None, "batch_size must not be None" - dataset = dataset.batch(batch_size) - - # Return the read end of the pipeline. - return dataset.make_one_shot_iterator().get_next() -``` - -In brief, `eval_input_fn` does the following when called by -`classifier.evaluate`: - -1. Converts the features and labels from the test set to a `tf.dataset` - object. -2. Creates a batch of test set examples. (There's no need to shuffle - or repeat the test set examples.) -3. Returns that batch of test set examples to `classifier.evaluate`. 
- -Running this code yields the following output (or something close to it): - -```none -Test set accuracy: 0.967 -``` - -An accuracy of 0.967 implies that our trained model correctly classified 29 -out of the 30 Iris species in the test set. - -To get a deeper understanding of different metrics for evaluating -models, see the -[Classification section of Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/classification). - - -### Predicting - -We've now trained a model and "proven" that it is good--but not -perfect--at classifying Iris species. Now let's use the trained -model to make some predictions on [**unlabeled -examples**](https://developers.google.com/machine-learning/glossary/#unlabeled_example); -that is, on examples that contain features but not a label. - -In real-life, the unlabeled examples could come from lots of different -sources including apps, CSV files, and data feeds. For now, we're simply -going to manually provide the following three unlabeled examples: - -```python - predict_x = { - 'SepalLength': [5.1, 5.9, 6.9], - 'SepalWidth': [3.3, 3.0, 3.1], - 'PetalLength': [1.7, 4.2, 5.4], - 'PetalWidth': [0.5, 1.5, 2.1], - } -``` - -Every Estimator provides a `predict` method, which `premade_estimator.py` -calls as follows: - -```python -predictions = classifier.predict( - input_fn=lambda:eval_input_fn(predict_x, - labels=None, - batch_size=args.batch_size)) -``` - -As with the `evaluate` method, our `predict` method also gathers examples -from the `eval_input_fn` method. - -When doing predictions, we're *not* passing labels to `eval_input_fn`. -Therefore, `eval_input_fn` does the following: - -1. Converts the features from the 3-element manual set we just created. -2. Creates a batch of 3 examples from that manual set. -3. Returns that batch of examples to `classifier.predict`. - -The `predict` method returns a python iterable, yielding a dictionary of -prediction results for each example. 
This dictionary contains several keys. -The `probabilities` key holds a list of three floating-point values, -each representing the probability that the input example is a particular -Iris species. For example, consider the following `probabilities` list: - -```none -'probabilities': array([ 1.19127117e-08, 3.97069454e-02, 9.60292995e-01]) -``` - -The preceding list indicates: - -* A negligible chance of the Iris being Setosa. -* A 3.97% chance of the Iris being Versicolor. -* A 96.0% chance of the Iris being Virginica. - -The `class_ids` key holds a one-element array that identifies the most -probable species. For example: - -```none -'class_ids': array([2]) -``` - -The number `2` corresponds to Virginica. The following code iterates -through the returned `predictions` to report on each prediction: - -``` python -for pred_dict, expec in zip(predictions, expected): - template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"') - - class_id = pred_dict['class_ids'][0] - probability = pred_dict['probabilities'][class_id] - print(template.format(iris_data.SPECIES[class_id], 100 * probability, expec)) -``` - -Running the program yields the following output: - - -``` None -... -Prediction is "Setosa" (99.6%), expected "Setosa" - -Prediction is "Versicolor" (99.8%), expected "Versicolor" - -Prediction is "Virginica" (97.9%), expected "Virginica" -``` - - -## Summary - -This document provides a short introduction to machine learning. - -Because `premade_estimators.py` relies on high-level APIs, much of the -mathematical complexity in machine learning is hidden. -If you intend to become more proficient in machine learning, we recommend -ultimately learning more about [**gradient -descent**](https://developers.google.com/machine-learning/glossary/#gradient_descent), -batching, and neural networks. - -We recommend reading the @{$feature_columns$Feature Columns} document next, -which explains how to represent different kinds of data in machine learning. 
diff --git a/tensorflow/docs_src/get_started/index.md b/tensorflow/docs_src/get_started/index.md index 746126c..55579d5 100644 --- a/tensorflow/docs_src/get_started/index.md +++ b/tensorflow/docs_src/get_started/index.md @@ -15,26 +15,8 @@ The easiest way to get started with TensorFlow is using Eager Execution. * @{$get_started/eager}, is for anyone new to machine learning or TensorFlow. TensorFlow provides many APIs. The remainder of this section focuses on the -Estimator API which provide scalable, high-performance models. -To get started with Estimators begin by reading one of the following documents: - - * @{$get_started/get_started_for_beginners}, which is aimed at readers - new to machine learning. - * @{$get_started/premade_estimators}, which is aimed at readers who have - experience in machine learning. - -Then, read the following documents, which demonstrate the key features -in the high-level APIs: - - * @{$get_started/checkpoints}, which explains how to save training progress - and resume where you left off. - * @{$get_started/feature_columns}, which shows how an - Estimator can handle a variety of input data types without changes to the - model. - * @{$get_started/datasets_quickstart}, which introduces TensorFlow's - input pipelines. - * @{$get_started/custom_estimators}, which demonstrates how - to build and train models you design yourself. +Estimator API which provide scalable, high-performance models. See the +@{$estimators} guide. 
For more advanced users: diff --git a/tensorflow/docs_src/get_started/leftnav_files b/tensorflow/docs_src/get_started/leftnav_files index 4c12f0d..e6cc8d5 100644 --- a/tensorflow/docs_src/get_started/leftnav_files +++ b/tensorflow/docs_src/get_started/leftnav_files @@ -1,15 +1,4 @@ index.md -### Beginners eager.md -get_started_for_beginners.md -premade_estimators.md - -### Estimators -get_started_for_beginners.md: For Beginners -premade_estimators.md: Premade Estimators ->>> -checkpoints.md -feature_columns.md datasets_quickstart.md -custom_estimators.md diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index 90d9ea0..0906b55 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -403,10 +403,8 @@ writing TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -If you are new to machine learning, we recommend the following: - -* [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course) -* @{$get_started/get_started_for_beginners$Getting Started for ML Beginners} +If you are new to machine learning, we recommend the +[Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course). If you are experienced with machine learning but new to TensorFlow, see @{$get_started/eager}. diff --git a/tensorflow/docs_src/install/install_windows.md b/tensorflow/docs_src/install/install_windows.md index a139a49..6c4f5b8 100644 --- a/tensorflow/docs_src/install/install_windows.md +++ b/tensorflow/docs_src/install/install_windows.md @@ -157,10 +157,8 @@ TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). 
-If you are new to machine learning, we recommend the following: - -* [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course) -* @{$get_started/get_started_for_beginners$Getting Started for ML Beginners} +If you are new to machine learning, we recommend the +[Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course). If you are experienced with machine learning but new to TensorFlow, see @{$get_started/eager}. diff --git a/tensorflow/docs_src/get_started/checkpoints.md b/tensorflow/docs_src/programmers_guide/checkpoints.md similarity index 100% rename from tensorflow/docs_src/get_started/checkpoints.md rename to tensorflow/docs_src/programmers_guide/checkpoints.md diff --git a/tensorflow/docs_src/get_started/custom_estimators.md b/tensorflow/docs_src/programmers_guide/custom_estimators.md similarity index 98% rename from tensorflow/docs_src/get_started/custom_estimators.md rename to tensorflow/docs_src/programmers_guide/custom_estimators.md index 275cda1..fb20b35 100644 --- a/tensorflow/docs_src/get_started/custom_estimators.md +++ b/tensorflow/docs_src/programmers_guide/custom_estimators.md @@ -5,7 +5,7 @@ This document introduces custom Estimators. In particular, this document demonstrates how to create a custom @{tf.estimator.Estimator$Estimator} that mimics the behavior of the pre-made Estimator @{tf.estimator.DNNClassifier$`DNNClassifier`} in solving the Iris problem. See -the @{$get_started/premade_estimators$Pre-Made Estimators chapter} for details +the @{$premade_estimators$Pre-Made Estimators chapter} for details on the Iris problem. To download and access the example code invoke the following two commands: @@ -84,7 +84,7 @@ and a logits output layer. 
## Write an Input function Our custom Estimator implementation uses the same input function as our -@{$get_started/premade_estimators$pre-made Estimator implementation}, from +@{$premade_estimators$pre-made Estimator implementation}, from [`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py). Namely: @@ -106,8 +106,8 @@ This input function builds an input pipeline that yields batches of ## Create feature columns -As detailed in the @{$get_started/premade_estimators$Premade Estimators} and -@{$get_started/feature_columns$Feature Columns} chapters, you must define +As detailed in the @{$premade_estimators$Premade Estimators} and +@{$feature_columns$Feature Columns} chapters, you must define your model's feature columns to specify how the model should use each feature. Whether working with pre-made Estimators or custom Estimators, you define feature columns in the same fashion. @@ -145,7 +145,7 @@ to the constructor are in turn passed on to the `model_fn`. In [`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py) the following lines create the estimator and set the params to configure the model. This configuration step is similar to how we configured the @{tf.estimator.DNNClassifier} in -@{$get_started/premade_estimators}. +@{$premade_estimators}. ```python classifier = tf.estimator.Estimator( @@ -489,7 +489,7 @@ configure your Estimator without modifying the code in the `model_fn`. The rest of the code to train, evaluate, and generate predictions using our Estimator is the same as in the -@{$get_started/premade_estimators$Premade Estimators} chapter. For +@{$premade_estimators$Premade Estimators} chapter. 
For example, the following line will train the model: ```python diff --git a/tensorflow/docs_src/programmers_guide/estimators.md b/tensorflow/docs_src/programmers_guide/estimators.md index ffadf29..c4aae1d 100644 --- a/tensorflow/docs_src/programmers_guide/estimators.md +++ b/tensorflow/docs_src/programmers_guide/estimators.md @@ -134,7 +134,7 @@ The heart of every Estimator--whether pre-made or custom--is its evaluation, and prediction. When you are using a pre-made Estimator, someone else has already implemented the model function. When relying on a custom Estimator, you must write the model function yourself. A -@{$get_started/custom_estimators$companion document} +@{$custom_estimators$companion document} explains how to write the model function. diff --git a/tensorflow/docs_src/get_started/feature_columns.md b/tensorflow/docs_src/programmers_guide/feature_columns.md similarity index 99% rename from tensorflow/docs_src/get_started/feature_columns.md rename to tensorflow/docs_src/programmers_guide/feature_columns.md index 79c2667..845194f 100644 --- a/tensorflow/docs_src/get_started/feature_columns.md +++ b/tensorflow/docs_src/programmers_guide/feature_columns.md @@ -5,7 +5,7 @@ intermediaries between raw data and Estimators. Feature columns are very rich, enabling you to transform a diverse range of raw data into formats that Estimators can use, allowing easy experimentation. -In @{$get_started/premade_estimators$Premade Estimators}, we used the premade +In @{$premade_estimators$Premade Estimators}, we used the premade Estimator, @{tf.estimator.DNNClassifier$`DNNClassifier`} to train a model to predict different types of Iris flowers from four input features. 
That example created only numerical feature columns (of type diff --git a/tensorflow/docs_src/programmers_guide/index.md b/tensorflow/docs_src/programmers_guide/index.md index 648d001..9ebfd39 100644 --- a/tensorflow/docs_src/programmers_guide/index.md +++ b/tensorflow/docs_src/programmers_guide/index.md @@ -11,6 +11,23 @@ works. The units are as follows: * @{$programmers_guide/datasets}, which explains how to set up data pipelines to read data sets into your TensorFlow program. +## Estimators + +* @{$estimators} provides an introduction. +* @{$premade_estimators}, introduces Estimators for machine learning. +* @{$custom_estimators}, which demonstrates how to build and train models you + design yourself. +* @{$feature_columns}, which shows how an Estimator can handle a variety of input + data types without changes to the model. +* @{$checkpoints}, which explains how to save training progress and resume where + you left off. + +## Accelerators + + * @{$using_gpu} explains how TensorFlow assigns operations to + devices and how you can change the arrangement manually. + * @{$using_tpu} explains how to modify `Estimator` programs to run on a TPU. + ## Low Level APIs * @{$programmers_guide/low_level_intro}, which introduces the @@ -32,13 +49,6 @@ works. The units are as follows: * @{$programmers_guide/saved_model}, which explains how to save and restore variables and models. -## Accelerators - - * @{$using_gpu} explains how TensorFlow assigns operations to - devices and how you can change the arrangement manually. - * @{$using_tpu} explains how to modify `Estimator` programs to run on a TPU. 
- - ## ML Concepts * @{$programmers_guide/embedding}, which introduces the concept diff --git a/tensorflow/docs_src/programmers_guide/leftnav_files b/tensorflow/docs_src/programmers_guide/leftnav_files index 7ac63bf..3313174 100644 --- a/tensorflow/docs_src/programmers_guide/leftnav_files +++ b/tensorflow/docs_src/programmers_guide/leftnav_files @@ -3,7 +3,17 @@ index.md ### High Level APIs eager.md datasets.md -estimators.md + +### Estimators +estimators.md: Introduction to Estimators +premade_estimators.md +custom_estimators.md +feature_columns.md +checkpoints.md + +### Accelerators +using_gpu.md +using_tpu.md ### Low Level APIs low_level_intro.md @@ -12,10 +22,6 @@ variables.md graphs.md saved_model.md -### Accelerators -using_gpu.md -using_tpu.md - ### ML Concepts embedding.md diff --git a/tensorflow/docs_src/programmers_guide/low_level_intro.md b/tensorflow/docs_src/programmers_guide/low_level_intro.md index 05709ad..478e2bb 100644 --- a/tensorflow/docs_src/programmers_guide/low_level_intro.md +++ b/tensorflow/docs_src/programmers_guide/low_level_intro.md @@ -9,7 +9,7 @@ This guide gets you started programming in the low-level TensorFlow APIs * Use high level components ([datasets](#datasets), [layers](#layers), and [feature_columns](#feature_columns)) in this low level environment. * Build your own training loop, instead of using the one - @{$get_started/premade_estimators$provided by Estimators}. + @{$premade_estimators$provided by Estimators}. We recommend using the higher level APIs to build models when possible. Knowing TensorFlow Core is valuable for the following reasons: @@ -398,7 +398,7 @@ and layer reuse impossible. The easiest way to experiment with feature columns is using the @{tf.feature_column.input_layer} function. 
This function only accepts -@{$get_started/feature_columns$dense columns} as inputs, so to view the result +@{$feature_columns$dense columns} as inputs, so to view the result of a categorical column you must wrap it in an @{tf.feature_column.indicator_column}. For example: @@ -589,7 +589,7 @@ print(sess.run(y_pred)) To learn more about building models with TensorFlow consider the following: -* @{$get_started/custom_estimators$Custom Estimators}, to learn how to build +* @{$custom_estimators$Custom Estimators}, to learn how to build customized models with TensorFlow. Your knowledge of TensorFlow Core will help you understand and debug your own models. diff --git a/tensorflow/docs_src/get_started/premade_estimators.md b/tensorflow/docs_src/programmers_guide/premade_estimators.md similarity index 98% rename from tensorflow/docs_src/get_started/premade_estimators.md rename to tensorflow/docs_src/programmers_guide/premade_estimators.md index 4be7e50..e5eca44 100644 --- a/tensorflow/docs_src/get_started/premade_estimators.md +++ b/tensorflow/docs_src/programmers_guide/premade_estimators.md @@ -289,7 +289,7 @@ for key in train_x.keys(): ``` Feature columns can be far more sophisticated than those we're showing here. We -detail feature columns @{$get_started/feature_columns$later on} in our Getting +detail feature columns @{$feature_columns$later on} in our Getting Started guide. Now that we have the description of how we want the model to represent the raw @@ -425,11 +425,10 @@ Pre-made Estimators are an effective way to quickly create standard models. Now that you've gotten started writing TensorFlow programs, consider the following material: -* @{$get_started/checkpoints$Checkpoints} to learn how to save and restore - models. +* @{$checkpoints$Checkpoints} to learn how to save and restore models. * @{$get_started/datasets_quickstart$Datasets} to learn more about importing data into your model. 
-* @{$get_started/custom_estimators$Creating Custom Estimators} to learn how to +* @{$custom_estimators$Creating Custom Estimators} to learn how to write your own Estimator, customized for a particular problem. diff --git a/tensorflow/docs_src/programmers_guide/using_tpu.md b/tensorflow/docs_src/programmers_guide/using_tpu.md index 5e3e49d..44aabf0 100644 --- a/tensorflow/docs_src/programmers_guide/using_tpu.md +++ b/tensorflow/docs_src/programmers_guide/using_tpu.md @@ -22,8 +22,8 @@ Standard `Estimators` can drive models on CPU and GPUs. You must use @{tf.contrib.tpu.TPUEstimator} to drive a model on TPUs. Refer to TensorFlow's Getting Started section for an introduction to the basics -of using a @{$get_started/premade_estimators$pre-made `Estimator`}, and -@{$get_started/custom_estimators$custom `Estimator`s}. +of using a @{$premade_estimators$pre-made `Estimator`}, and +@{$custom_estimators$custom `Estimator`s}. The `TPUEstimator` class differs somewhat from the `Estimator` class. diff --git a/tensorflow/docs_src/tutorials/kernel_methods.md b/tensorflow/docs_src/tutorials/kernel_methods.md index 73e5c51..205e2a2 100644 --- a/tensorflow/docs_src/tutorials/kernel_methods.md +++ b/tensorflow/docs_src/tutorials/kernel_methods.md @@ -53,7 +53,7 @@ In order to feed data to a `tf.contrib.learn Estimator`, it is helpful to conver it to Tensors. For this, we will use an `input function` which adds Ops to the TensorFlow graph that, when executed, create mini-batches of Tensors to be used downstream. For more background on input functions, check -@{$get_started/premade_estimators#create_input_functions$this section on input functions}. +@{$premade_estimators#create_input_functions$this section on input functions}. 
In this example, we will use the `tf.train.shuffle_batch` Op which, besides converting numpy arrays to Tensors, allows us to specify the batch_size and whether to randomize the input every time the input_fn Ops are executed diff --git a/tensorflow/docs_src/tutorials/layers.md b/tensorflow/docs_src/tutorials/layers.md index 37cd2bb..ead5a63 100644 --- a/tensorflow/docs_src/tutorials/layers.md +++ b/tensorflow/docs_src/tutorials/layers.md @@ -190,7 +190,7 @@ def cnn_model_fn(features, labels, mode): The following sections (with headings corresponding to each code block above) dive deeper into the `tf.layers` code used to create each layer, as well as how to calculate loss, configure the training op, and generate predictions. If -you're already experienced with CNNs and @{$get_started/custom_estimators$TensorFlow `Estimator`s}, +you're already experienced with CNNs and @{$custom_estimators$TensorFlow `Estimator`s}, and find the above code intuitive, you may want to skim these sections or just skip ahead to ["Training and Evaluating the CNN MNIST Classifier"](#train_eval_mnist). @@ -535,8 +535,8 @@ if mode == tf.estimator.ModeKeys.TRAIN: ``` > Note: For a more in-depth look at configuring training ops for Estimator model -> functions, see @{$get_started/custom_estimators#defining-the-training-op-for-the-model$"Defining the training op for the model"} -> in the @{$get_started/custom_estimators$"Creating Estimations in tf.estimator"} tutorial. +> functions, see @{$custom_estimators#defining-the-training-op-for-the-model$"Defining the training op for the model"} +> in the @{$custom_estimators$"Creating Estimations in tf.estimator"} tutorial. ### Add evaluation metrics @@ -601,7 +601,7 @@ be saved (here, we specify the temp directory `/tmp/mnist_convnet_model`, but feel free to change to another directory of your choice). 
> Note: For an in-depth walkthrough of the TensorFlow `Estimator` API, see the -> tutorial @{$get_started/custom_estimators$"Creating Estimators in tf.estimator."} +> tutorial @{$custom_estimators$"Creating Estimators in tf.estimator."} ### Set Up a Logging Hook {#set_up_a_logging_hook} @@ -720,7 +720,7 @@ Here, we've achieved an accuracy of 97.3% on our test data set. To learn more about TensorFlow Estimators and CNNs in TensorFlow, see the following resources: -* @{$get_started/custom_estimators$Creating Estimators in tf.estimator} +* @{$custom_estimators$Creating Estimators in tf.estimator} provides an introduction to the TensorFlow Estimator API. It walks through configuring an Estimator, writing a model function, calculating loss, and defining a training op. diff --git a/tensorflow/docs_src/tutorials/linear.md b/tensorflow/docs_src/tutorials/linear.md index 265ded8..3f247ad 100644 --- a/tensorflow/docs_src/tutorials/linear.md +++ b/tensorflow/docs_src/tutorials/linear.md @@ -17,7 +17,7 @@ tutorial walks through the code in greater detail. To understand this overview it will help to have some familiarity with basic machine learning concepts, and also with -@{$get_started/premade_estimators$Estimators}. +@{$premade_estimators$Estimators}. [TOC] diff --git a/tensorflow/docs_src/tutorials/recurrent_quickdraw.md b/tensorflow/docs_src/tutorials/recurrent_quickdraw.md index 5d83fbe..1afd861 100644 --- a/tensorflow/docs_src/tutorials/recurrent_quickdraw.md +++ b/tensorflow/docs_src/tutorials/recurrent_quickdraw.md @@ -220,7 +220,7 @@ length 2. ### Defining the model To define the model we create a new `Estimator`. If you want to read more about -estimators, we recommend @{$get_started/custom_estimators$this tutorial}. +estimators, we recommend @{$custom_estimators$this tutorial}. 
To build the model, we: diff --git a/tensorflow/python/estimator/estimator.py b/tensorflow/python/estimator/estimator.py index ecb5659..9b4b866 100644 --- a/tensorflow/python/estimator/estimator.py +++ b/tensorflow/python/estimator/estimator.py @@ -302,7 +302,7 @@ class Estimator(object): Args: input_fn: A function that provides input data for training as minibatches. - See @{$get_started/premade_estimators#create_input_functions} for more + See @{$premade_estimators#create_input_functions} for more information. The function should construct and return one of the following: @@ -398,7 +398,7 @@ class Estimator(object): Args: input_fn: A function that constructs the input data for evaluation. - See @{$get_started/premade_estimators#create_input_functions} for more + See @{$premade_estimators#create_input_functions} for more information. The function should construct and return one of the following: @@ -477,7 +477,7 @@ class Estimator(object): input_fn: A function that constructs the features. Prediction continues until `input_fn` raises an end-of-input exception (`OutOfRangeError` or `StopIteration`). - See @{$get_started/premade_estimators#create_input_functions} for more + See @{$premade_estimators#create_input_functions} for more information. The function should construct and return one of the following: diff --git a/tensorflow/python/estimator/training.py b/tensorflow/python/estimator/training.py index 4f90bcf..08fff3b 100644 --- a/tensorflow/python/estimator/training.py +++ b/tensorflow/python/estimator/training.py @@ -129,7 +129,7 @@ class TrainSpec( Args: input_fn: A function that provides input data for training as minibatches. - See @{$get_started/premade_estimators#create_input_functions} for more + See @{$premade_estimators#create_input_functions} for more information. 
The function should construct and return one of the following: * A 'tf.data.Dataset' object: Outputs of `Dataset` object must be a @@ -193,7 +193,7 @@ class EvalSpec( Args: input_fn: A function that constructs the input data for evaluation. - See @{$get_started/premade_estimators#create_input_functions} for more + See @{$premade_estimators#create_input_functions} for more information. The function should construct and return one of the following: * A 'tf.data.Dataset' object: Outputs of `Dataset` object must be a