platform/upstream/tensorflow.git
7 years agoMake bfloat16 work with complex
A. Unique TensorFlower [Wed, 13 Dec 2017 18:01:47 +0000 (10:01 -0800)]
Make bfloat16 work with complex

PiperOrigin-RevId: 178917043

7 years agoFix 'tags' parameter in predictor_factories.load_from_model.
A. Unique TensorFlower [Wed, 13 Dec 2017 17:54:52 +0000 (09:54 -0800)]
Fix 'tags' parameter in predictor_factories.load_from_model.

tags was incorrectly being mapped to inputs.
Added basic unit tests.

PiperOrigin-RevId: 178916192

7 years agoAutomated g4 rollback of changelist 178759398
Derek Murray [Wed, 13 Dec 2017 16:56:20 +0000 (08:56 -0800)]
Automated g4 rollback of changelist 178759398

PiperOrigin-RevId: 178909147

7 years agoCheck that all the inputs to a Concat op are of the same rank.
Benoit Steiner [Wed, 13 Dec 2017 16:52:40 +0000 (08:52 -0800)]
Check that all the inputs to a Concat op are of the same rank.

PiperOrigin-RevId: 178908773

7 years agoStream::BlockHostUntilDone now returns Status rather than bool.
A. Unique TensorFlower [Wed, 13 Dec 2017 16:43:09 +0000 (08:43 -0800)]
Stream::BlockHostUntilDone now returns Status rather than bool.

The now-deprecated Stream::BlockHostUntilDoneWithStatus remains, to facilitate a
multi-CL renaming transition.  Once all callers have been renamed to
BlockHostUntilDone, *WithStatus will be removed.

The StreamExecutor (private) method has also been renamed to BlockHostUntilDone.
It's only used by Stream.

The StreamExecutorInterface method will be renamed in a separate atomic CL.
It's harder to perform that transition gradually, and we've already performed an
atomic change previously, so we might as well fix it up in one shot.

PiperOrigin-RevId: 178907807

7 years agoStandardize attribute naming for operators specifying a dimension to "axis". This...
A. Unique TensorFlower [Wed, 13 Dec 2017 16:03:03 +0000 (08:03 -0800)]
Standardize attribute naming for operators specifying a dimension to "axis". This mirrors TensorFlow's attribute naming.

PiperOrigin-RevId: 178903728

7 years agoCreate global_step when recording summaries if needed
Igor Ganichev [Wed, 13 Dec 2017 05:12:14 +0000 (21:12 -0800)]
Create global_step when recording summaries if needed

The user might not have created global_step before using
a summary method.

PiperOrigin-RevId: 178857144

7 years agoDisable a test case in params_test for CPU.
A. Unique TensorFlower [Wed, 13 Dec 2017 04:19:12 +0000 (20:19 -0800)]
Disable a test case in params_test for CPU.

This test has thousands of parameters, and the resulting graph takes too long
to compile on the CPU backend.

PiperOrigin-RevId: 178853687

7 years agoSimplify tf.case implementation.
Alexander Gorban [Wed, 13 Dec 2017 04:09:45 +0000 (20:09 -0800)]
Simplify tf.case implementation.

PiperOrigin-RevId: 178853258

7 years agoCorrectly pass name in layers.util.smart_cond
Igor Ganichev [Wed, 13 Dec 2017 03:28:44 +0000 (19:28 -0800)]
Correctly pass name in layers.util.smart_cond

Before this change, arguments were passed positionally and the "name"
argument was wrongly mapped to the "strict" argument of tf.cond instead
of the intended "name". Such a fix could potentially change operation
names and cause an error when restoring a graph, but this
particular change appears safe for the following reasons.

 - smart_cond is not a public API, so users should not be calling it
 directly.

 - smart_cond is used in 38 places internally. None of them, except for
 tf.contrib.summary, uses the "name" parameter, so their names are the
 same before and after this change. The names will change
 for users of tf.contrib.summary. Luckily, this is a very recent
 addition and is useful only in the context of eager execution, which is
 still in the pre-alpha stage.

Because this change reroutes the wrong "name" -> "strict" mapping to
"name" -> "name", the value of "strict" is changing from "None" to
"False". Luckily, this has no effect on the function's behavior.
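
The positional-versus-keyword mix-up described above can be reproduced
in a few lines. This is only a sketch: cond below is a toy stand-in
whose parameter order is assumed for illustration, and the two
smart_cond_* wrappers are hypothetical names, not the real code.

```python
# Toy stand-in for a conditional op. The parameter order is an assumption
# made for this illustration, with "strict" ahead of "name".
def cond(pred, true_fn=None, false_fn=None, strict=False, name=None):
    """Report how the trailing arguments were bound."""
    return {"strict": strict, "name": name}


def smart_cond_buggy(pred, true_fn, false_fn, name=None):
    # Positional call: the fourth positional slot is "strict", so the
    # caller's name ends up there and "name" stays at its default.
    return cond(pred, true_fn, false_fn, name)


def smart_cond_fixed(pred, true_fn, false_fn, name=None):
    # Keyword call: "name" is bound to "name" and "strict" keeps its default.
    return cond(pred, true_fn, false_fn, name=name)
```

With no name supplied, the buggy wrapper passes strict=None while the
fixed one leaves strict=False, which is the None-to-False change
mentioned in the message.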

PiperOrigin-RevId: 178850766

7 years agoFully-qualify function call in TF_CHECK_OK macro implementation, so that it can
A. Unique TensorFlower [Wed, 13 Dec 2017 02:44:25 +0000 (18:44 -0800)]
Fully-qualify function call in TF_CHECK_OK macro implementation, so that it can
be safely used outside of the tensorflow namespace.

Note that the StreamExecutor SE_CHECK_OK simply uses TF_CHECK_OK, so this helps
those cases.

PiperOrigin-RevId: 178847904

7 years agoAllow Tensor::bit_casted_shaped() to take type parameter T with different size
A. Unique TensorFlower [Wed, 13 Dec 2017 02:19:29 +0000 (18:19 -0800)]
Allow Tensor::bit_casted_shaped() to take type parameter T with different size
from the buffer data type size.

PiperOrigin-RevId: 178845870

7 years agoMove more contrib RNN objects to be Layers.
Eugene Brevdo [Wed, 13 Dec 2017 01:45:50 +0000 (17:45 -0800)]
Move more contrib RNN objects to be Layers.

PiperOrigin-RevId: 178842373

7 years ago[TF] Mark DT_STRING and DT_RESOURCE types as always sitting on host memory.
Eugene Brevdo [Wed, 13 Dec 2017 01:01:02 +0000 (17:01 -0800)]
[TF] Mark DT_STRING and DT_RESOURCE types as always sitting on host memory.

This is important when these arguments may appear in op input lists or output lists,
where the signature may not be able to declare them as sitting on host.

For DT_RESOURCE types, just the handles are marked as sitting on host memory;
the actual data may reside on GPU.

PiperOrigin-RevId: 178837213

7 years agoBUGFIX: MVN Full Covariance: Use dtype dependent tolerance to verify symmetric.
Ian Langmore [Wed, 13 Dec 2017 00:30:09 +0000 (16:30 -0800)]
BUGFIX: MVN Full Covariance:  Use dtype dependent tolerance to verify symmetric.

PiperOrigin-RevId: 178833453

7 years agoReturn unimplemented error when trying to use dilated rate > 1 combined with NHWC...
Yangzihao Wang [Wed, 13 Dec 2017 00:21:26 +0000 (16:21 -0800)]
Return unimplemented error when trying to use dilated rate > 1 combined with NHWC format on the CPU.
Add test for unimplemented errors in Conv2D op.

PiperOrigin-RevId: 178832407

7 years ago[XLA] Remove a source of nondeterminism in HLO clustering.
A. Unique TensorFlower [Wed, 13 Dec 2017 00:09:47 +0000 (16:09 -0800)]
[XLA] Remove a source of nondeterminism in HLO clustering.

Record the HLO clusters with std::set instead of std::unordered_set to ensure
that the algorithm to assign each cluster a sequence number during a set
traversal is deterministic.
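
A rough Python analogue of the idea, with the caveat that the real
change is in C++: numbering items while walking an ordered container
gives the same numbers on every run. The function name is hypothetical,
and sorted() stands in for the ordered traversal of std::set.

```python
def assign_sequence_numbers(clusters):
    """Number clusters by walking them in sorted order.

    Iterating a hash-based container directly can yield a different order
    from run to run, so the numbers would not be stable. Sorting first
    fixes the traversal order and therefore the numbering.
    """
    return {cluster: index for index, cluster in enumerate(sorted(clusters))}
```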

PiperOrigin-RevId: 178830794

7 years agoFor many requests, the GCS filesystem client did not provide DNS lookup hints. This...
A. Unique TensorFlower [Wed, 13 Dec 2017 00:06:31 +0000 (16:06 -0800)]
For many requests, the GCS filesystem client did not provide DNS lookup hints.  This change allows all GCS HTTP requests to use the GCS DNS cache.  It also simplifies the code, and eliminates a lot of redundant code.

The GCS DNS cache has been simplified and made more general. It is now easy to add more DNS names, simply by adding an entry to the GcsDnsCache::names_ list.

PiperOrigin-RevId: 178830317

7 years agoOnly require validation that a fetch is requested for tf2xla::Config.
A. Unique TensorFlower [Tue, 12 Dec 2017 23:58:15 +0000 (15:58 -0800)]
Only require validation that a fetch is requested for tf2xla::Config.
It is legitimate to convert a graph with only fetches, e.g. in the case where
the inputs to the graph are supplied by the infeed rather than by a feed node.

PiperOrigin-RevId: 178828952

7 years agoRefactor helper functions a bit for virtual gpu changes later.
Guangda Lai [Tue, 12 Dec 2017 23:39:52 +0000 (15:39 -0800)]
Refactor helper functions a bit for virtual gpu changes later.

PiperOrigin-RevId: 178826426

7 years agoSupport permutation from NCHW to NHWC.
Yao Zhang [Tue, 12 Dec 2017 23:29:16 +0000 (15:29 -0800)]
Support permutation from NCHW to NHWC.

PiperOrigin-RevId: 178824999

7 years agoFix bug in kernel creation with functions marked "stateful".
Derek Murray [Tue, 12 Dec 2017 22:57:55 +0000 (14:57 -0800)]
Fix bug in kernel creation with functions marked "stateful".

The CallOp kernel caches a handle for invoking the function. This
handle is only valid in a single subgraph (it is scoped to the
FunctionLibraryRuntime). Marking a function as stateful causes its
CallOp kernel to be shared between multiple subgraphs. Therefore, this
change overrides the kernel creation logic to ensure that each
subgraph gets its own CallOp.

PiperOrigin-RevId: 178820064

7 years agoAdd test case for record_summaries_every_n_global_steps
Igor Ganichev [Tue, 12 Dec 2017 22:33:55 +0000 (14:33 -0800)]
Add test case for record_summaries_every_n_global_steps

This test case illustrates how to use
record_summaries_every_n_global_steps and tf.all_summaries()
in graph mode. Previously there were no tests using
record_summaries_every_n_global_steps, and none of the existing
graph-based tests use tf.all_summaries(), creating the impression
that summary ops will somehow always run, which is not the case.

PiperOrigin-RevId: 178816316

7 years agoAdded a debug mode to the model analyzer to make it easier to figure out why shapes...
Benoit Steiner [Tue, 12 Dec 2017 22:15:09 +0000 (14:15 -0800)]
Added a debug mode to the model analyzer to make it easier to figure out why shapes are missing.

PiperOrigin-RevId: 178813305

7 years agoAdds XLA support for tf.nn.dynamic_rnn
A. Unique TensorFlower [Tue, 12 Dec 2017 21:03:52 +0000 (13:03 -0800)]
Adds XLA support for tf.nn.dynamic_rnn

Changes tf.nn.dynamic_rnn to specify `maximum_iterations` argument for the while_loop.

When `maximum_iterations` argument is supplied to tf.while_loop, use this to provide an upper bound on the size of Stacks used for gradient computation.
By specifying the stack limit we can generate gradient code for while loops that uses fixed shape TensorArrays and hence can be compiled with XLA.

PiperOrigin-RevId: 178802710

7 years agoAutomated g4 rollback of changelist 177619402
Brennan Saeta [Tue, 12 Dec 2017 20:51:41 +0000 (12:51 -0800)]
Automated g4 rollback of changelist 177619402

PiperOrigin-RevId: 178800980

7 years ago[XLA] Always fold transposes into convs or dots regardless of use count
David Majnemer [Tue, 12 Dec 2017 19:35:20 +0000 (11:35 -0800)]
[XLA] Always fold transposes into convs or dots regardless of use count

PiperOrigin-RevId: 178790193

7 years agoSliced Wasserstein Distance metric for GANs evaluation.
A. Unique TensorFlower [Tue, 12 Dec 2017 19:26:41 +0000 (11:26 -0800)]
Sliced Wasserstein Distance metric for GANs evaluation.

PiperOrigin-RevId: 178788810

7 years agoAdd CompositeNodeManager for Grappler VirtualScheduler.
A. Unique TensorFlower [Tue, 12 Dec 2017 19:18:28 +0000 (11:18 -0800)]
Add CompositeNodeManager for Grappler VirtualScheduler.

CompositeNodeManager has a per-device LIFO manager and FirstReadyManagers for
_Send and _Recv ops, and chooses FirstReady among the ops from the per-device
LIFOManager and the _Send and _Recv FirstReadyManagers.

This can maximize producer-consumer locality within a device (with LIFO),
but avoids the previously reported scheduling inefficiency in multi-device
execution by separately managing _Send and _Recv ops and applying a global
FirstReady policy across devices.

It's implemented, but not enabled; VirtualScheduler still uses
FirstReadyManager.

PiperOrigin-RevId: 178787352

7 years agodisabling flaky test
Olivia Nordquist [Tue, 12 Dec 2017 19:17:00 +0000 (11:17 -0800)]
disabling flaky test

PiperOrigin-RevId: 178787158

7 years agoAssociative operator optimization:
A. Unique TensorFlower [Tue, 12 Dec 2017 19:04:58 +0000 (11:04 -0800)]
Associative operator optimization:

Push constants down add/mul to canonicalize chains and possibly create constant nodes at the bottom. Example:

      +                +             +
     / \              / \           / \
    c1   +     -->   x   +    -->  x c1+c2
        / \             / \
       c2  x           c2 c1
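
The rewrite in the diagram can be sketched on a toy expression tree.
Everything here is an assumption made for illustration: trees are
tuples of the form (op, left, right), constants are Python numbers,
variables are strings, and push_constants_down is a hypothetical name.
The actual optimizer works on graph nodes.

```python
import operator
from functools import reduce

# Only add and mul are handled, matching the operators named in the message.
OPS = {"+": operator.add, "*": operator.mul}


def is_const(node):
    return isinstance(node, (int, float))


def flatten(node, op):
    """Collect the leaves of a chain built from one associative operator."""
    if isinstance(node, tuple) and node[0] == op:
        return flatten(node[1], op) + flatten(node[2], op)
    return [node]


def push_constants_down(node):
    """Rewrite c1 + (c2 + x) into x + (c1 + c2), folding the constants."""
    if not isinstance(node, tuple) or node[0] not in OPS:
        return node
    op = node[0]
    leaves = [push_constants_down(leaf) for leaf in flatten(node, op)]
    consts = [leaf for leaf in leaves if is_const(leaf)]
    others = [leaf for leaf in leaves if not is_const(leaf)]
    if not others:
        # The whole chain is constant, so it folds to a single value.
        return reduce(OPS[op], consts)
    result = others[0]
    for leaf in others[1:]:
        result = (op, result, leaf)
    if consts:
        # All constants end up in one folded node at the bottom of the chain.
        result = (op, result, reduce(OPS[op], consts))
    return result
```

For example, push_constants_down(("+", 1, ("+", 2, "x"))) gives
("+", "x", 3). The sketch relies on the operators being commutative as
well as associative.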

Small cleanup: Consolidate code for manipulating names of nodes added or modified during constant folding.

PiperOrigin-RevId: 178785218

7 years agoRaise exception on missing unused input_map keys with C API enabled.
Skye Wanderman-Milne [Tue, 12 Dec 2017 18:58:31 +0000 (10:58 -0800)]
Raise exception on missing unused input_map keys with C API enabled.

Without this change, the C++ ImportGraphDef API returns unused
input_map keys (which are plumbed through to the C API as
well). However, the Python import_graph_def API requires slightly
different semantics: it throws an error for unused input_map keys that
are missing from the GraphDef.

This change modifies the C and C++ APIs to limit the returned keys to
those missing from the GraphDef, and plumbs this through to the C
API-enabled import_graph_def implementation.
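
The narrowed semantics can be pictured roughly as below. This is a
sketch only: the function names, the plain-string keys, and the
ValueError with its message are assumptions for illustration, not the
actual API.

```python
def missing_unused_keys(input_map, used_keys, graph_names):
    """Return input_map keys that were never used and are absent from the graph."""
    return sorted(
        key for key in input_map
        if key not in used_keys and key not in graph_names
    )


def check_input_map(input_map, used_keys, graph_names):
    """Raise if any unused input_map key is missing from the graph.

    The exception type and message are assumptions made for this sketch.
    """
    missing = missing_unused_keys(input_map, used_keys, graph_names)
    if missing:
        raise ValueError("input_map keys not found in graph: " + ", ".join(missing))
```

An unused key that does exist in the graph is not reported; only keys
missing from the graph are.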

Note that this is a change to the existing C API. Luckily the modified
method hasn't been released yet, so it's ok to change it.

PiperOrigin-RevId: 178783957

7 years agoIntegrate tensor pool feature to `gan_loss` function.
A. Unique TensorFlower [Tue, 12 Dec 2017 17:19:42 +0000 (09:19 -0800)]
Integrate tensor pool feature to `gan_loss` function.

PiperOrigin-RevId: 178769850

7 years agoDisable neutral element and reciprocal optimizations again.
A. Unique TensorFlower [Tue, 12 Dec 2017 17:01:37 +0000 (09:01 -0800)]
Disable neutral element and reciprocal optimizations again.

PiperOrigin-RevId: 178767676

7 years agoThis CL makes two improvements to the `map_and_batch` transformation:
Jiri Simsa [Tue, 12 Dec 2017 16:31:48 +0000 (08:31 -0800)]
This CL makes two improvements to the `map_and_batch` transformation:

1) It fixes a bug that manifested as `OutOfRange` being returned prematurely.

2) It changes the behavior on sequences of elements whose size is not a multiple of `batch_size`. Previously, the implementation would drop the last small batch (similar to `batch_and_drop_remainder`). Now, the implementation returns the last small batch (similar to `batch`).
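
The two behaviors in point 2 can be compared on a plain Python list.
This is a sketch under stated assumptions: the batch helper and its
drop_remainder flag are hypothetical and only mimic the semantics
described above.

```python
def batch(elements, batch_size, drop_remainder=False):
    """Split elements into consecutive batches of batch_size.

    With drop_remainder=True a short final batch is discarded (the old
    behavior described above); otherwise it is returned (the new behavior).
    """
    batches = [
        elements[i:i + batch_size]
        for i in range(0, len(elements), batch_size)
    ]
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches
```

For seven elements and a batch size of three, the new behavior yields
two full batches plus a final batch of one element, while the old
behavior yields only the two full batches.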

PiperOrigin-RevId: 178764508

7 years agoAutomated g4 rollback of changelist 178675527
Derek Murray [Tue, 12 Dec 2017 15:43:24 +0000 (07:43 -0800)]
Automated g4 rollback of changelist 178675527

PiperOrigin-RevId: 178759398

7 years agoAdd get started `Datasets` doc
Mark Daoust [Tue, 12 Dec 2017 14:02:03 +0000 (06:02 -0800)]
Add get started `Datasets` doc

PiperOrigin-RevId: 178751067

7 years agoSimplifying tfe function.py
Alexandre Passos [Tue, 12 Dec 2017 11:24:43 +0000 (03:24 -0800)]
Simplifying tfe function.py

PiperOrigin-RevId: 178740804

7 years agoRemove real-data shape check in GANEstimator. Fixes github issue #14257.
A. Unique TensorFlower [Tue, 12 Dec 2017 10:33:55 +0000 (02:33 -0800)]
Remove real-data shape check in GANEstimator. Fixes github issue #14257.

PiperOrigin-RevId: 178737278

7 years agoDisable flaky //tensorflow/contrib/learn:dnn_linear_combined_test
Yifei Feng [Tue, 12 Dec 2017 10:02:28 +0000 (02:02 -0800)]
Disable flaky //tensorflow/contrib/learn:dnn_linear_combined_test

PiperOrigin-RevId: 178734940

7 years ago[XLA] Properly set layout requirements in Hlo parser.
A. Unique TensorFlower [Tue, 12 Dec 2017 07:47:10 +0000 (23:47 -0800)]
[XLA] Properly set layout requirements in Hlo parser.

PiperOrigin-RevId: 178724659

7 years agoUse BlockHostUntilDoneWithStatus in various places.
A. Unique TensorFlower [Tue, 12 Dec 2017 07:35:29 +0000 (23:35 -0800)]
Use BlockHostUntilDoneWithStatus in various places.

PiperOrigin-RevId: 178723711

7 years agoDisable flaky random ops test.
Gunhan Gulsoy [Tue, 12 Dec 2017 07:26:35 +0000 (23:26 -0800)]
Disable flaky random ops test.

PiperOrigin-RevId: 178723108

7 years ago * HloTestBase now prints out the HLO parser error message when there is one.
Bjarke Hammersholt Roune [Tue, 12 Dec 2017 06:31:27 +0000 (22:31 -0800)]
 * HloTestBase now prints out the HLO parser error message when there is one.
 * TestUtils now supports generating random literals with more than one constraint.
     There is still an error if the constraints conflict.

PiperOrigin-RevId: 178720092

7 years ago[XLA] Add stringification to BatchNormTestParam.
Justin Lebar [Tue, 12 Dec 2017 06:26:00 +0000 (22:26 -0800)]
[XLA] Add stringification to BatchNormTestParam.

This way when a test fails, it prints out useful information about the
failure, instead of

  "<48-byte object with these bytes: de ad be ef ...>"

PiperOrigin-RevId: 178719733

7 years ago[XLA] Don't call timer->Nanoseconds() on a not-ok stream.
Justin Lebar [Tue, 12 Dec 2017 05:38:05 +0000 (21:38 -0800)]
[XLA] Don't call timer->Nanoseconds() on a not-ok stream.

If the stream is not OK, the timer might not have been initialized and
finalized, in which case calling timer->Nanoseconds() is illegal and
will crash.

PiperOrigin-RevId: 178717089

7 years agoUpdate ops-related pbtxt files.
A. Unique TensorFlower [Tue, 12 Dec 2017 05:04:20 +0000 (21:04 -0800)]
Update ops-related pbtxt files.

PiperOrigin-RevId: 178715353

7 years ago[XLA:CPU] Teach the CPU layout assignment about dot dimension numbers
Sanjoy Das [Tue, 12 Dec 2017 04:34:08 +0000 (20:34 -0800)]
[XLA:CPU] Teach the CPU layout assignment about dot dimension numbers

There is no great need for this yet, but I noticed that the test cases were
broken (they were constructing dots with unset dimension numbers), and one thing
led to another.

PiperOrigin-RevId: 178713597

7 years agoprefer_static_* functions added to CORE/distributions/util.py
Ian Langmore [Tue, 12 Dec 2017 03:44:08 +0000 (19:44 -0800)]
prefer_static_* functions added to CORE/distributions/util.py

PiperOrigin-RevId: 178710439

7 years agoFix the handling of unknown rank. Previous code would wrongly treat a tensor of unknown
Yao Zhang [Tue, 12 Dec 2017 03:39:35 +0000 (19:39 -0800)]
Fix the handling of unknown rank. Previous code would wrongly treat a tensor of unknown
rank as a scalar.
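
The distinction matters because "no dimensions known" and "zero
dimensions" are different states. A minimal sketch, assuming a shape is
modeled as None when its rank is unknown and as a tuple otherwise (a
representation chosen here for illustration):

```python
def rank(shape):
    """Return the rank, or None when the rank itself is unknown."""
    return None if shape is None else len(shape)


def is_scalar(shape):
    """A scalar has a known rank of zero; an unknown rank is not a scalar."""
    return shape is not None and len(shape) == 0
```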

PiperOrigin-RevId: 178710185

7 years agoInclude <cstdio> in the TF Lite kernels/op_macros.h to fix compile errors
A. Unique TensorFlower [Tue, 12 Dec 2017 03:19:26 +0000 (19:19 -0800)]
Include <cstdio> in the TF Lite kernels/op_macros.h to fix compile errors
when building externally using either the Makefile or Bazel.  The macros use
stderr and fprintf, which may not be defined depending on the order of
headers included by the .cc files.

PiperOrigin-RevId: 178708839

7 years agoDon't materialize BroadcastGradientArgs by default.
Benoit Steiner [Tue, 12 Dec 2017 02:09:51 +0000 (18:09 -0800)]
Don't materialize BroadcastGradientArgs by default.

PiperOrigin-RevId: 178703180

7 years agocloses #15281
Gunhan Gulsoy [Tue, 12 Dec 2017 01:50:44 +0000 (17:50 -0800)]
closes #15281

PiperOrigin-RevId: 178701096

7 years agoAutomated g4 rollback of changelist 178634559
A. Unique TensorFlower [Tue, 12 Dec 2017 01:03:54 +0000 (17:03 -0800)]
Automated g4 rollback of changelist 178634559

PiperOrigin-RevId: 178695724

7 years agoInitialize local_resources during session initialization.
A. Unique TensorFlower [Tue, 12 Dec 2017 00:57:51 +0000 (16:57 -0800)]
Initialize local_resources during session initialization.

PiperOrigin-RevId: 178694869

7 years ago[XLA] Optimize dot(concat(..), constant)
Sanjoy Das [Tue, 12 Dec 2017 00:31:19 +0000 (16:31 -0800)]
[XLA] Optimize dot(concat(..), constant)

dot(concat(..), constant) and dot(constant, concat(..)) can be rewritten to
avoid the concatenate.  This can itself be a win, but can also help unlock other
optimization opportunities.

PiperOrigin-RevId: 178691585

7 years agoStrength reduce division by a constant to multiplication by the reciprocal constant.
A. Unique TensorFlower [Tue, 12 Dec 2017 00:11:26 +0000 (16:11 -0800)]
Strength reduce division by a constant to multiplication by the reciprocal constant.
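
As a rough illustration of the transformation (not the optimizer's
code), the reciprocal of the constant is computed once and each
division becomes a multiplication. The function names are hypothetical.

```python
def divide_by_constant(values, divisor):
    """Baseline: one division per element."""
    return [value / divisor for value in values]


def multiply_by_reciprocal(values, divisor):
    """Strength-reduced form: one division up front, then multiplications."""
    reciprocal = 1.0 / divisor  # computed once, outside the loop
    return [value * reciprocal for value in values]
```

For divisors that are powers of two the two forms agree exactly. For
other divisors the results may differ in the last bit, because the
reciprocal is itself rounded.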

PiperOrigin-RevId: 178689056

7 years agoFix the remote call graph construction so that the created _Send ops for returning...
A. Unique TensorFlower [Tue, 12 Dec 2017 00:01:17 +0000 (16:01 -0800)]
Fix the remote call graph construction so that the created _Send ops for returning the results point to the correct return value.

List available workers when the remote call target is not available.

PiperOrigin-RevId: 178687525

7 years ago[XLA:CPU] Minor refactor to the CPU layout assignment code
Sanjoy Das [Mon, 11 Dec 2017 23:57:19 +0000 (15:57 -0800)]
[XLA:CPU] Minor refactor to the CPU layout assignment code

Lifts unnecessary lambdas to reduce clutter.  This will make a later change
more readable.

PiperOrigin-RevId: 178686976

7 years ago[tf.data] Use a more efficient dispatch mechanism for functions in datasets.
Derek Murray [Mon, 11 Dec 2017 23:39:00 +0000 (15:39 -0800)]
[tf.data] Use a more efficient dispatch mechanism for functions in datasets.

This change adds an overload of the `FunctionLibraryRuntime::Run()` method
that allows users to pass argument and return value containers in a
`CallFrameInterface` object, rather than using the current (and expensive for
large arities) default `FunctionCallFrame` implementation. It also specializes
`CapturedFunction` to use this interface.

Note that the new overload currently only supports local function execution,
and more restructuring will be required to take advantage of it in the remote
function execution case.

This change should especially benefit datasets where each element has a large
number of components (typically when training data have many features).

PiperOrigin-RevId: 178684431

7 years agoFix for variable naming when executing eagerly
Allen Lavoie [Mon, 11 Dec 2017 22:49:48 +0000 (14:49 -0800)]
Fix for variable naming when executing eagerly

name_scope bypassed the Graph.name_scope slash-stripping logic (strip a trailing
slash if it exists, then add one back unconditionally) when executing eagerly,
leading to extra slashes at the end of some variable names and a failure to break
out of nested name scopes.
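
The slash-stripping rule in the parentheses can be sketched as follows.
normalize_scope_name is a hypothetical helper, and returning the empty
name unchanged is an assumption of this sketch.

```python
def normalize_scope_name(name):
    """Strip one trailing slash if present, then add one back unconditionally."""
    if not name:
        # Assumption for this sketch: the empty (root) scope stays empty.
        return ""
    if name.endswith("/"):
        name = name[:-1]
    return name + "/"
```

Both "layer" and "layer/" normalize to "layer/", so a name that already
ends in a slash does not pick up a second one.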

PiperOrigin-RevId: 178676873

7 years agoFix definition of tflite_smartreply
Justine Tunney [Mon, 11 Dec 2017 22:41:46 +0000 (14:41 -0800)]
Fix definition of tflite_smartreply

PiperOrigin-RevId: 178675580

7 years agoMark a FunctionDef's signature as stateful when it contains a stateful node.
Derek Murray [Mon, 11 Dec 2017 22:41:17 +0000 (14:41 -0800)]
Mark a FunctionDef's signature as stateful when it contains a stateful node.

This fixes a bug where two calls to the same stateful function will erroneously be eliminated as common subexpressions. It is also a step towards pruning nodes from function bodies, which is necessary for a variety of `Dataset` optimizations.

PiperOrigin-RevId: 178675527

7 years ago[XLA] Move BatchDot unrolling from TF2XLA bridge to AlgebraicSimplifier so that unrol...
A. Unique TensorFlower [Mon, 11 Dec 2017 21:48:51 +0000 (13:48 -0800)]
[XLA] Move BatchDot unrolling from TF2XLA bridge to AlgebraicSimplifier so that unrolling can be selectively enabled/disabled per backend (should be no performance change).

PiperOrigin-RevId: 178666990

7 years agoTune SQLite
Justine Tunney [Mon, 11 Dec 2017 21:44:26 +0000 (13:44 -0800)]
Tune SQLite

This change makes sure the b-tree page size isn't 1024 bytes. It also enables
WAL mode. This means TensorBoard can perform reads at the same time as
TensorFlow is performing writes.

We now also fsync() less often. This shouldn't carry any risk of database
corruption in WAL mode. Since WAL mode uses shared memory, writes become
immediately available to other processes, but they won't become durable until
after the OS decides to flush the FS cache.
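
The settings described above correspond to standard SQLite pragmas. The
sketch below uses Python's sqlite3 module; the exact values the change
uses are not stated in this message, so the 4096-byte page size and
synchronous = NORMAL are assumptions, and open_tuned is a hypothetical
helper.

```python
import sqlite3


def open_tuned(path, page_size=4096):
    """Open a SQLite database with an explicit page size, WAL, and fewer fsyncs.

    Returns the connection and the journal mode SQLite reports.
    """
    conn = sqlite3.connect(path)
    # page_size only takes effect while the database is still empty, and it
    # cannot be changed once the database is in WAL mode, so set it first.
    conn.execute("PRAGMA page_size = %d" % int(page_size))
    mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
    # In WAL mode, NORMAL avoids an fsync on every commit.
    conn.execute("PRAGMA synchronous = NORMAL")
    return conn, mode
```

WAL needs a file-backed database; an in-memory database reports a
different journal mode.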

This makes the DB writer faster than the file writer, at least in cases where
the DB is tiny. We can probably make it go faster still, once we find a way to use
transactions.

Name                      Cold µs   Average µs  Flushing µs       Size B
?i.i                        1,920           69            0            0
Scalar 1.0 FS               1,623          337        4,258       11,348
Scalar 1.0 TB FS            3,137          527        4,213       17,023
Scalar 2.0 FS               3,319          681        3,917       11,348
Scalar 2.0 DB               2,601          578          217      118,784
Tensor 1.0 FS 4             6,397          558        4,276       14,215
Tensor 2.0 FS 4             1,678          613        3,971       24,455
Tensor 2.0 DB 4             3,605          278          313      118,784
Tensor 1.0 FS 128           1,857          289        4,397       47,111
Tensor 2.0 FS 128           3,558          721       10,894       57,351
Tensor 2.0 DB 128           3,508          585          203      118,784
Tensor 1.0 FS 8192          2,677          525        4,400    2,119,816
Tensor 2.0 FS 8192          2,248          822        4,006    2,130,056
Tensor 2.0 DB 8192          4,346          370          449      126,976

PiperOrigin-RevId: 178666363

7 years ago[XLA:CPU] Error on unsupported dot instructions
Sanjoy Das [Mon, 11 Dec 2017 21:43:25 +0000 (13:43 -0800)]
[XLA:CPU] Error on unsupported dot instructions

This is a stopgap measure to avoid silently miscompiling dot operations.

PiperOrigin-RevId: 178666218

7 years agoEnable optimizations of operations with neutral/absorbing elements by default. We...
A. Unique TensorFlower [Mon, 11 Dec 2017 21:38:28 +0000 (13:38 -0800)]
Enable optimizations of operations with neutral/absorbing elements by default. We leave removal of addition and subtraction with zero out for now, since it is used as a "hack" to force a copy of a tensor in a few places. Once we have fixed this code, we can enable it.

PiperOrigin-RevId: 178665567

7 years agoSupport different threading modes in GPU device.
Xiaoqiang Zheng [Mon, 11 Dec 2017 21:14:17 +0000 (13:14 -0800)]
Support different threading modes in GPU device.
All modes are experimental for now. The goal is to find the best setting, and
change the default to pick that.

PiperOrigin-RevId: 178662212

7 years agoSupport all unary ops.
Yao Zhang [Mon, 11 Dec 2017 21:14:01 +0000 (13:14 -0800)]
Support all unary ops.

PiperOrigin-RevId: 178662178

7 years agoAlways inline functions when creating an item.
Benoit Steiner [Mon, 11 Dec 2017 21:10:26 +0000 (13:10 -0800)]
Always inline functions when creating an item.

PiperOrigin-RevId: 178661624

7 years agoAdd ReverseDFSFrom variant that works with const Node*.
A. Unique TensorFlower [Mon, 11 Dec 2017 20:51:27 +0000 (12:51 -0800)]
Add ReverseDFSFrom variant that works with const Node*.

PiperOrigin-RevId: 178658907

7 years agoRename ABSL Macros
A. Unique TensorFlower [Mon, 11 Dec 2017 20:24:41 +0000 (12:24 -0800)]
Rename ABSL Macros

This LSC will rename ABSL macros. Most macros will be renamed with an ABSL_ prefix.
Some might have completely new names. Please see the list of the macros
affected. For example, MUST_USE_RESULT will be renamed ABSL_MUST_USE_RESULT.

The purpose of this LSC is to avoid name conflicts for the ABSL release. For
details, see go/absl-macros.

PiperOrigin-RevId: 178655181

7 years agoBe more conservative when optimizing full reductions
Benoit Steiner [Mon, 11 Dec 2017 20:04:39 +0000 (12:04 -0800)]
Be more conservative when optimizing full reductions

PiperOrigin-RevId: 178652323

7 years agoOptimized specializations for 3-channel depthwise with multiplier 2 and 4.
A. Unique TensorFlower [Mon, 11 Dec 2017 19:35:35 +0000 (11:35 -0800)]
Optimized specializations for 3-channel depthwise with multiplier 2 and 4.

PiperOrigin-RevId: 178647824

7 years agoAdds remaining tests to _shared_embedding_column.
A. Unique TensorFlower [Mon, 11 Dec 2017 19:17:20 +0000 (11:17 -0800)]
Adds remaining tests to _shared_embedding_column.

PiperOrigin-RevId: 178644910

7 years agoAdd hvx/nnapi support for the TensorFlow Lite benchmark
Zhixian Yan [Mon, 11 Dec 2017 19:11:33 +0000 (11:11 -0800)]
Add hvx/nnapi support for the TensorFlow Lite benchmark

PiperOrigin-RevId: 178643959

7 years agoFix incorrect parameter order in recall_at_precision.
A. Unique TensorFlower [Mon, 11 Dec 2017 19:02:18 +0000 (11:02 -0800)]
Fix incorrect parameter order in recall_at_precision.

PiperOrigin-RevId: 178642393

7 years agoUse DataFormatVecPermute instead of Gather, which is very slow.
Yao Zhang [Mon, 11 Dec 2017 18:59:54 +0000 (10:59 -0800)]
Use DataFormatVecPermute instead of Gather, which is very slow.

PiperOrigin-RevId: 178641878

7 years agoMake sure we serialize the creation and deletion of clusters from python to avoid...
Benoit Steiner [Mon, 11 Dec 2017 18:43:52 +0000 (10:43 -0800)]
Make sure we serialize the creation and deletion of clusters from python to avoid race conditions.
Added a cluster context manager.
Only check that we have a unique SingleMachine running during the provisioning phase.

PiperOrigin-RevId: 178638954

7 years agoFix-ups for XLA docs.
Eli Bendersky [Mon, 11 Dec 2017 18:40:19 +0000 (10:40 -0800)]
Fix-ups for XLA docs.

- Fix wording/grammar
- Remove obsolete "not implemented" notes on some ops

PiperOrigin-RevId: 178638405

7 years ago[XLA] Add window reversal to the parser.
Blake Hechtman [Mon, 11 Dec 2017 18:29:43 +0000 (10:29 -0800)]
[XLA] Add window reversal to the parser.

PiperOrigin-RevId: 178636821

7 years agoUse the Snapshot kernel to force a copy of global step instead of the ugly "x + 0...
A. Unique TensorFlower [Mon, 11 Dec 2017 18:15:59 +0000 (10:15 -0800)]
Use the Snapshot kernel to force a copy of global step instead of the ugly "x + 0" hack.

PiperOrigin-RevId: 178634559

7 years agoRemove using-directives
A. Unique TensorFlower [Mon, 11 Dec 2017 18:01:55 +0000 (10:01 -0800)]
Remove using-directives

PiperOrigin-RevId: 178632103

7 years agoEnriching some C64 test coverage.
A. Unique TensorFlower [Mon, 11 Dec 2017 16:57:11 +0000 (08:57 -0800)]
Enriching some C64 test coverage.

PiperOrigin-RevId: 178624364

7 years ago[XLA] Improve dot strength reductions to support transposes of the right and
Blake Hechtman [Mon, 11 Dec 2017 16:09:11 +0000 (08:09 -0800)]
[XLA] Improve dot strength reductions to support transposes of the right and
left hand side of a dot.

PiperOrigin-RevId: 178619673

7 years agoAdd `grpc_enabled` optional argument to various Python test rules.
Derek Murray [Mon, 11 Dec 2017 15:58:48 +0000 (07:58 -0800)]
Add `grpc_enabled` optional argument to various Python test rules.

PiperOrigin-RevId: 178618409

7 years agoCleanup: Remove unused declarations and unnecessary conversions
A. Unique TensorFlower [Mon, 11 Dec 2017 15:50:08 +0000 (07:50 -0800)]
Cleanup: Remove unused declarations and unnecessary conversions

PiperOrigin-RevId: 178617606

7 years agoFix mismatched argument comments to match parameter names
A. Unique TensorFlower [Mon, 11 Dec 2017 15:49:41 +0000 (07:49 -0800)]
Fix mismatched argument comments to match parameter names

PiperOrigin-RevId: 178617543

7 years agoMake control_flow_op_py_test "medium" to avoid ASAN timeouts.
Skye Wanderman-Milne [Mon, 11 Dec 2017 01:48:15 +0000 (17:48 -0800)]
Make control_flow_op_py_test "medium" to avoid ASAN timeouts.

It takes longer to run now that it runs with and without the C API
enabled.

PiperOrigin-RevId: 178561206

7 years agoAdd a library for the cwise ops headers and common source.
A. Unique TensorFlower [Sun, 10 Dec 2017 23:11:07 +0000 (15:11 -0800)]
Add a library for the cwise ops headers and common source.

PiperOrigin-RevId: 178554846

7 years agoAdd IsDataTypeComplex helper function. In numerous places, we needlessly depend on...
A. Unique TensorFlower [Sun, 10 Dec 2017 18:38:54 +0000 (10:38 -0800)]
Add IsDataTypeComplex helper function. In numerous places, we needlessly depend on Eigen templates to determine if a datatype is complex. Cleaning up these instances will be done in a separate CL.

PiperOrigin-RevId: 178544917

7 years agoReplace StreamExecutorInterface::BlockHostUntilDone with BlockHostUntilDoneWithStatus
A. Unique TensorFlower [Sat, 9 Dec 2017 18:07:17 +0000 (10:07 -0800)]
Replace StreamExecutorInterface::BlockHostUntilDone with BlockHostUntilDoneWithStatus

All known overrides of StreamExecutorInterface::BlockHostUntilDone are changed
by this CL.

PiperOrigin-RevId: 178492517

7 years agoSet Operation._id_value before adding to control flow context.
Skye Wanderman-Milne [Sat, 9 Dec 2017 03:27:55 +0000 (19:27 -0800)]
Set Operation._id_value before adding to control flow context.

There is a comment indicating that Operation IDs should be in
topological order, and thus the ID should be set after calling
ControlFlowContext.AddOp since it may add input ops. I believe the
comment is stale and this invariant on the IDs isn't necessary
(testing corroborates this, and also while loops cannot maintain this
since there's a cycle).

Changing this will make it easier to refactor control flow processing
in the future, since we don't have to worry about the ID not being
set.

PiperOrigin-RevId: 178457622

7 years ago[XLA] Hlo parser: support reporting error messages with locations pointed out. And...
A. Unique TensorFlower [Sat, 9 Dec 2017 02:53:56 +0000 (18:53 -0800)]
[XLA] Hlo parser: support reporting error messages with locations pointed out. And fix the bug that some errors were reported at the token after the actual errors.

PiperOrigin-RevId: 178455738

7 years agoSupport non-const input sizes for Conv2DBackpropInput.
Yao Zhang [Sat, 9 Dec 2017 02:34:10 +0000 (18:34 -0800)]
Support non-const input sizes for Conv2DBackpropInput.

PiperOrigin-RevId: 178454629

7 years agoClear existing layouts before running the layout assignment.
HyoukJoong Lee [Sat, 9 Dec 2017 01:28:52 +0000 (17:28 -0800)]
Clear existing layouts before running the layout assignment.

PiperOrigin-RevId: 178449701

7 years agoAlways create a Rendezvous in RemoteCallOp.
Derek Murray [Sat, 9 Dec 2017 01:26:49 +0000 (17:26 -0800)]
Always create a Rendezvous in RemoteCallOp.

This change does not affect existing functionality, and enables RemoteCallOp to work in environments where a Rendezvous is not necessarily available (e.g. in a function called from an IteratorContext).

PiperOrigin-RevId: 178449551

7 years agoCheck that Rendezvous is not null.
A. Unique TensorFlower [Sat, 9 Dec 2017 01:23:00 +0000 (17:23 -0800)]
Check that Rendezvous is not null.

PiperOrigin-RevId: 178449247

7 years agoPreserve symbolic shape information as much as possible during shape creation
Benoit Steiner [Sat, 9 Dec 2017 01:10:40 +0000 (17:10 -0800)]
Preserve symbolic shape information as much as possible during shape creation

PiperOrigin-RevId: 178448208

7 years agoIn MovingAverageOptimizer, delegate compute_gradients() to the wrapped optimizer,
A. Unique TensorFlower [Sat, 9 Dec 2017 01:07:39 +0000 (17:07 -0800)]
In MovingAverageOptimizer, delegate compute_gradients() to the wrapped optimizer,
which fixes a bug when the wrapped optimizer (or any other optimizer in the stack)
does something non-standard in its compute_gradients() method.

PiperOrigin-RevId: 178447959