A. Unique TensorFlower [Thu, 24 May 2018 10:48:24 +0000 (03:48 -0700)]
Automated g4 rollback of changelist 197477959.
PiperOrigin-RevId:
197868028
A. Unique TensorFlower [Thu, 24 May 2018 09:54:37 +0000 (02:54 -0700)]
Allow generating fake infeed buffers with shapes derived from the computation.
When replaying a computation from an HloSnapshot, we want to be able to provide fake
infeed data. This was already possible when the infeed shape is known, by providing
it with the --fake_infeed_shape flag. With this CL, we add the option to derive it
from the provided HloSnapshot. Also, the infeed data is now transferred a fixed
number of times (configurable with a flag) instead of indefinitely; otherwise we
would definitely run out of memory at some point.
PiperOrigin-RevId:
197863412
A. Unique TensorFlower [Thu, 24 May 2018 08:03:36 +0000 (01:03 -0700)]
[XLA:GPU] Basic multi-output fusion for GPU.
Take a conservative approach and attempt multi-output fusion in cases where "regular" fusion is not an option.
PiperOrigin-RevId:
197852598
Sanjoy Das [Thu, 24 May 2018 05:59:22 +0000 (22:59 -0700)]
Implement support for reshape in IndexedArrayAnalysis
PiperOrigin-RevId:
197843589
A. Unique TensorFlower [Thu, 24 May 2018 05:33:53 +0000 (22:33 -0700)]
Add unit tests to tflite kernels
PiperOrigin-RevId:
197842122
Karmel Allison [Thu, 24 May 2018 03:53:15 +0000 (20:53 -0700)]
Resolve name collisions with assets in SavedModels by deduplicating names that
point to distinct files.
PiperOrigin-RevId:
197835288
A. Unique TensorFlower [Thu, 24 May 2018 03:39:31 +0000 (20:39 -0700)]
Add support for an optional is_recompute kwarg to functions decorated with recompute_grad.
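A hedged sketch of the intended use (assuming the tf.contrib.layers.recompute_grad decorator of this era; the block function and its body are illustrative):

import tensorflow as tf

@tf.contrib.layers.recompute_grad
def block(x, is_recompute=False):
  # is_recompute is expected to be True on the recomputation pass during the
  # backward step, e.g. to avoid emitting summaries or stats updates twice.
  if not is_recompute:
    tf.summary.histogram("activations", x)
  return tf.layers.dense(x, 128, activation=tf.nn.relu)

y = block(tf.random_normal([8, 32]))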
PiperOrigin-RevId:
197834316
A. Unique TensorFlower [Thu, 24 May 2018 03:03:20 +0000 (20:03 -0700)]
Set the correct shape in transformed distribution.
Also add distribution_util.maybe_get_static_event_ndims to be reused in bijector and transformed distribution classes.
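A hedged illustration of the kind of usage affected (assuming the tf.contrib.distributions API of this era; the concrete distribution, bijector, and shapes are illustrative):

import tensorflow as tf

tfd = tf.contrib.distributions

# A log-normal built as Exp(Normal); the overridden event shape of the
# transformed distribution is what this change is about.
dist = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=tfd.bijectors.Exp(),
    event_shape=[3])
print(dist.event_shape)  # TensorShape([3])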
PiperOrigin-RevId:
197831651
Billy Lamberta [Thu, 24 May 2018 01:46:20 +0000 (18:46 -0700)]
Moves estimator getting started docs into programmer's guide.
Update path references and magic links.
Remove getting started with estimators doc.
Add redirects.
PiperOrigin-RevId:
197826223
Shashi Shekhar [Thu, 24 May 2018 01:45:30 +0000 (18:45 -0700)]
Add back some public interface methods.
PiperOrigin-RevId:
197826136
A. Unique TensorFlower [Thu, 24 May 2018 01:38:34 +0000 (18:38 -0700)]
Add HloSharding parsing from a string, used by the new Sharding HloMatcher for ease of use.
PiperOrigin-RevId:
197825588
A. Unique TensorFlower [Thu, 24 May 2018 01:13:23 +0000 (18:13 -0700)]
Extracts the SimplifyReduction optimization into its own method.
PiperOrigin-RevId:
197823183
Priya Gupta [Thu, 24 May 2018 00:58:42 +0000 (17:58 -0700)]
Aggregating IndexedSlices: Do not require first element to be IndexedSlices.
PiperOrigin-RevId:
197821479
Justin Lebar [Thu, 24 May 2018 00:52:29 +0000 (17:52 -0700)]
[XLA] Speed up SliceTest.
- Use parameters rather than constants, because LLVM and ptxas are slow
with large constants.
- Use iota rather than filling with random values, because the latter is
slow.
PiperOrigin-RevId:
197820897
Sanjoy Das [Thu, 24 May 2018 00:49:42 +0000 (17:49 -0700)]
Cache generated LLVM IR for GEBP
After this change all generated GEBPs with the same shape will share a single
llvm::Function.
This is NFC for any actual workloads because the GEBP emitter isn't exercised by
normal code-paths yet.
PiperOrigin-RevId:
197820606
Nupur Garg [Thu, 24 May 2018 00:44:32 +0000 (17:44 -0700)]
Add import.
PiperOrigin-RevId:
197820050
Derek Murray [Thu, 24 May 2018 00:32:54 +0000 (17:32 -0700)]
[tf.data] Split out the `tf.contrib.data.sample_from_datasets()` tests.
These were previously broken and disabled in CI builds; this change also fixes
them up.
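For reference, a brief usage sketch of the API under test (tf.contrib.data of this era; the toy datasets and weights are illustrative):

import tensorflow as tf

ds_a = tf.data.Dataset.from_tensors(0).repeat()
ds_b = tf.data.Dataset.from_tensors(1).repeat()
# Draw elements from ds_a with probability 0.3 and from ds_b with 0.7.
mixed = tf.contrib.data.sample_from_datasets([ds_a, ds_b], weights=[0.3, 0.7])
next_element = mixed.make_one_shot_iterator().get_next()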
PiperOrigin-RevId:
197818554
A. Unique TensorFlower [Thu, 24 May 2018 00:26:04 +0000 (17:26 -0700)]
Internal cleanup to remove a difference from the code on github.
PiperOrigin-RevId:
197817738
Anna R [Thu, 24 May 2018 00:15:54 +0000 (17:15 -0700)]
Internal change.
PiperOrigin-RevId:
197816560
Shashi Shekhar [Thu, 24 May 2018 00:14:39 +0000 (17:14 -0700)]
Refactor StatSummarizer to extract common functionality without proto dependencies.
PiperOrigin-RevId:
197816405
Benoit Steiner [Wed, 23 May 2018 23:45:26 +0000 (16:45 -0700)]
Simplify the remapper code and add support for non-scalar mean, variance, scale, and offset.
PiperOrigin-RevId:
197812268
A. Unique TensorFlower [Wed, 23 May 2018 23:34:00 +0000 (16:34 -0700)]
Open source rewrite_for_inference().
PiperOrigin-RevId:
197810460
A. Unique TensorFlower [Wed, 23 May 2018 23:33:27 +0000 (16:33 -0700)]
Make depthwiseconv handler handle filter ranges beyond 255
PiperOrigin-RevId:
197810361
Priya Gupta [Wed, 23 May 2018 23:10:30 +0000 (16:10 -0700)]
Add support for IndexedSlices in Distribution Strategy all-reduce. Issue reported in #19069.
PiperOrigin-RevId:
197806955
Justin Lebar [Wed, 23 May 2018 23:06:03 +0000 (16:06 -0700)]
Run only small and medium tests in CI builds.
PiperOrigin-RevId:
197806292
A. Unique TensorFlower [Wed, 23 May 2018 23:02:19 +0000 (16:02 -0700)]
Add support for calling fit on Dataset objects.
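A minimal sketch of what this enables (assuming the tf.keras API of this era; the toy model and data are illustrative):

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='sgd', loss='mse')

dataset = tf.data.Dataset.from_tensor_slices(
    (np.random.rand(32, 4).astype(np.float32),
     np.random.rand(32, 1).astype(np.float32))).batch(8).repeat()

# A Dataset can now be passed directly to fit(); steps_per_epoch tells Keras
# how many batches make up one epoch.
model.fit(dataset, epochs=2, steps_per_epoch=4)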
PiperOrigin-RevId:
197805615
Nick Felt [Wed, 23 May 2018 22:58:28 +0000 (15:58 -0700)]
Fix CurlHttpRequest handling of unexpectedly large responses
This fixes a few issues with CurlHttpRequest (and correspondingly GcsFileSystem):
- Return status FAILED_PRECONDITION (i.e. "your buffer was too small") instead of UNAVAILABLE when CurlHttpRequest has a direct response buffer and the response is too large for it, since if the server resource really is a fixed size, retrying automatically won't help at all. Also, include the message about the too-small buffer size in the returned Status rather than logging it, making it clearer that it describes a hard failure rather than just a warning.
- If the response was actually a 416 Range Not Satisfiable response, fully pretend that the response had no body even if we got one (I'm looking at you, GCS... it returns a 177-byte error message). This means:
  - Ignore a "buffer too small" error produced by the logic described above.
  - Don't report the length of that body in GetResultBufferDirectBytesTransferred(), which looks to the client like data corruption; just report 0 (this makes it match the behavior of the non-direct-buffer response handling).
I also tweaked the error messages, e.g. the message that includes an HTTP response code shouldn't report the CURLcode, since it will always be CURLE_OK at that point.
PiperOrigin-RevId:
197805003
A. Unique TensorFlower [Wed, 23 May 2018 22:46:03 +0000 (15:46 -0700)]
Add support for partitioned variables to SDCA.
PiperOrigin-RevId:
197803127
A. Unique TensorFlower [Wed, 23 May 2018 22:24:47 +0000 (15:24 -0700)]
Adding scatter_nd* ops to Android build.
PiperOrigin-RevId:
197799974
Dimitris Vardoulakis [Wed, 23 May 2018 22:17:03 +0000 (15:17 -0700)]
[TF:XLA] Add tests to show that the List scheduler handles tuples correctly (in and out of fusions).
PiperOrigin-RevId:
197798787
Justin Lebar [Wed, 23 May 2018 21:58:08 +0000 (14:58 -0700)]
[XLA] Fix exhaustive_f32_elementwise_test's size marker.
"enormous" is a size, not a tag.
PiperOrigin-RevId:
197795125
A. Unique TensorFlower [Wed, 23 May 2018 21:57:23 +0000 (14:57 -0700)]
Allow vars_to_warm_start to be a list of strings or Variables, which allows for non-TRAINABLE_VARIABLES to be warm-started.
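A hedged sketch of the newly accepted form (assuming tf.estimator.WarmStartSettings of this era; the checkpoint path and variable names are hypothetical):

import tensorflow as tf

ws = tf.estimator.WarmStartSettings(
    ckpt_to_initialize_from="/tmp/previous_model",
    # Previously a single regex string; a list of variable names (or Variable
    # objects) is now accepted, so non-TRAINABLE_VARIABLES can be warm-started.
    vars_to_warm_start=["dense/kernel", "dense/bias"])

estimator = tf.estimator.DNNClassifier(
    hidden_units=[16],
    feature_columns=[tf.feature_column.numeric_column("x")],
    warm_start_from=ws)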
PiperOrigin-RevId:
197794989
Justin Lebar [Wed, 23 May 2018 21:56:55 +0000 (14:56 -0700)]
[XLA] Draw hollow arrowheads for small arrays in dumped HLO graphs.
The intent is to make it easier to tell what's "big" and what's "small".
PiperOrigin-RevId:
197794902
Anjali Sridhar [Wed, 23 May 2018 21:36:23 +0000 (14:36 -0700)]
Modify model output names to not be unique when in distribution context.
PiperOrigin-RevId:
197791115
A. Unique TensorFlower [Wed, 23 May 2018 21:33:59 +0000 (14:33 -0700)]
Add NNAPI delegation for EMBEDDING_LOOKUP, RNN, SVDF.
PiperOrigin-RevId:
197790679
A. Unique TensorFlower [Wed, 23 May 2018 21:12:40 +0000 (14:12 -0700)]
New quantized log(x) for x > 1. Used for LogSoftmax.
PiperOrigin-RevId:
197786738
Mark Daoust [Wed, 23 May 2018 20:55:32 +0000 (13:55 -0700)]
Clear docstrings for auto-generated module files, and detach github links from generated files.
PiperOrigin-RevId:
197783520
Igor Ganichev [Wed, 23 May 2018 20:45:14 +0000 (13:45 -0700)]
Add a test to reproduce copy-on-read bug for variables
PiperOrigin-RevId:
197781741
A. Unique TensorFlower [Wed, 23 May 2018 20:22:27 +0000 (13:22 -0700)]
Internal Change
PiperOrigin-RevId:
197778159
Jiri Simsa [Wed, 23 May 2018 20:17:39 +0000 (13:17 -0700)]
Adding utility class for manipulating a GraphDef.
PiperOrigin-RevId:
197777416
A. Unique TensorFlower [Wed, 23 May 2018 19:35:05 +0000 (12:35 -0700)]
Extracts the SimplifyReshape optimization into its own method.
PiperOrigin-RevId:
197770994
Anna R [Wed, 23 May 2018 19:26:50 +0000 (12:26 -0700)]
Automated g4 rollback of changelist 197741984.
PiperOrigin-RevId:
197769770
A. Unique TensorFlower [Wed, 23 May 2018 19:18:23 +0000 (12:18 -0700)]
Introduce Encoder and Decoder classes so that platform/*coding* doesn't have to
depend on framework/resource_handler and framework/variant.
PiperOrigin-RevId:
197768387
Peter Hawkins [Wed, 23 May 2018 18:31:36 +0000 (11:31 -0700)]
[TF:XLA] Register a real implementation of ControlTrigger on XLA devices.
PiperOrigin-RevId:
197759239
Allen Lavoie [Wed, 23 May 2018 17:43:28 +0000 (10:43 -0700)]
Add a checkpointable list data structure
Allows tracking of Layers and other checkpointable objects by number.
Fixes #19250.
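A hedged sketch of the idea (the tf.contrib.checkpoint.List symbol and the toy model are assumptions; the point is that tracked layers are saved and restored by their position in the list):

import tensorflow as tf

class Stack(tf.keras.Model):

  def __init__(self):
    super(Stack, self).__init__()
    # Layers held in the tracked list are checkpointed by their index.
    self.blocks = tf.contrib.checkpoint.List(
        [tf.keras.layers.Dense(8), tf.keras.layers.Dense(1)])

  def call(self, inputs):
    for block in self.blocks:
      inputs = block(inputs)
    return inputs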
PiperOrigin-RevId:
197749961
Peter Hawkins [Wed, 23 May 2018 17:29:58 +0000 (10:29 -0700)]
Update build visibility of //third_party/tensorflow/contrib/signal
PiperOrigin-RevId:
197747430
A. Unique TensorFlower [Wed, 23 May 2018 17:05:58 +0000 (10:05 -0700)]
Combine op-profiles collected from individual TPUs.
PiperOrigin-RevId:
197743291
Mark Daoust [Wed, 23 May 2018 17:01:15 +0000 (10:01 -0700)]
Keep column order in make_csv_dataset.
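A short usage sketch (assuming tf.contrib.data.make_csv_dataset of this era; the file path and column names are hypothetical):

import tensorflow as tf

dataset = tf.contrib.data.make_csv_dataset(
    "/tmp/train.csv",
    batch_size=32,
    column_names=["id", "height", "width", "label"],
    label_name="label")
# With this change, the feature dict yielded by the dataset preserves the
# CSV column order instead of reordering the columns.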
PiperOrigin-RevId:
197742412
Mark Daoust [Wed, 23 May 2018 16:59:25 +0000 (09:59 -0700)]
Add a "--no_search_hints" flag to the api-docs generator.
PiperOrigin-RevId:
197742114
A. Unique TensorFlower [Wed, 23 May 2018 16:58:30 +0000 (09:58 -0700)]
PiperOrigin-RevId:
197741984
Patrick Nguyen [Wed, 23 May 2018 16:54:06 +0000 (09:54 -0700)]
Fix typo in error message.
PiperOrigin-RevId:
197741341
Bjarke Hammersholt Roune [Wed, 23 May 2018 16:44:39 +0000 (09:44 -0700)]
Quick fix for Kokoro breakage.
PiperOrigin-RevId:
197739982
A. Unique TensorFlower [Wed, 23 May 2018 16:20:12 +0000 (09:20 -0700)]
Add 'platform_' libraries in core/BUILD.
PiperOrigin-RevId:
197736600
A. Unique TensorFlower [Wed, 23 May 2018 16:16:52 +0000 (09:16 -0700)]
Support batch size > 1 in L2Normalization 8-bit quantized implementations.
PiperOrigin-RevId:
197736184
Peter Hawkins [Wed, 23 May 2018 13:45:12 +0000 (06:45 -0700)]
Add a method XlaTensor::ReleaseShapedBuffer() to relinquish the shaped buffer owned by an XlaTensor.
Add an equality operator for xla::ShapeIndexView.
PiperOrigin-RevId:
197716313
Peter Hawkins [Wed, 23 May 2018 12:19:00 +0000 (05:19 -0700)]
[TF:XLA:GPU] Relax test tolerance due to flakiness.
PiperOrigin-RevId:
197708758
Benoit Steiner [Wed, 23 May 2018 04:57:14 +0000 (21:57 -0700)]
Use the right attributes when creating placeholder nodes.
PiperOrigin-RevId:
197673355
A. Unique TensorFlower [Wed, 23 May 2018 02:07:13 +0000 (19:07 -0700)]
Internal Change
PiperOrigin-RevId:
197661636
Bjarke Hammersholt Roune [Wed, 23 May 2018 01:22:37 +0000 (18:22 -0700)]
Add interfaces to Compiler that are sufficient to implement a backend-independent offline auto-tuner for backend configurations of ops as well as automatic testing across candidate configurations.
Also add a simple Scanner class that is handy for parsing things.
PiperOrigin-RevId:
197657512
A. Unique TensorFlower [Wed, 23 May 2018 00:17:17 +0000 (17:17 -0700)]
Fix an issue when mixing sparse and dense features in the same model.
PiperOrigin-RevId:
197650140
A. Unique TensorFlower [Wed, 23 May 2018 00:16:44 +0000 (17:16 -0700)]
Add convolution with NHWC layout to stream executor.
PiperOrigin-RevId:
197650067
Sanjoy Das [Tue, 22 May 2018 23:36:22 +0000 (16:36 -0700)]
[TF:XLA] Bump open source llvm revision to r333002
PiperOrigin-RevId:
197644290
Yu-Cheng Ling [Tue, 22 May 2018 23:31:32 +0000 (16:31 -0700)]
Fix the LSTM test in TFLite.
PiperOrigin-RevId:
197643581
A. Unique TensorFlower [Tue, 22 May 2018 23:03:16 +0000 (16:03 -0700)]
Expose the new collective reduce and broadcast ops as non-public
Python interface functions. Note that they are not yet fully
implemented; this change is to facilitate further development.
PiperOrigin-RevId:
197639372
Ruoxin Sang [Tue, 22 May 2018 22:51:17 +0000 (15:51 -0700)]
Always append the trailing slash when looking up or inserting a directory path in the stat cache.
PiperOrigin-RevId:
197637482
Justine Tunney [Tue, 22 May 2018 22:30:02 +0000 (15:30 -0700)]
Remove reservoir sampling from SummaryDbWriter
PiperOrigin-RevId:
197634162
A. Unique TensorFlower [Tue, 22 May 2018 22:24:01 +0000 (15:24 -0700)]
Adds a kernel that checks whether a vector is zero.
PiperOrigin-RevId:
197633182
Dimitris Vardoulakis [Tue, 22 May 2018 22:01:15 +0000 (15:01 -0700)]
[TF:XLA] Add clarification to the DFS scheduler.
PiperOrigin-RevId:
197629355
Sanjoy Das [Tue, 22 May 2018 21:55:12 +0000 (14:55 -0700)]
Extract out common code and make things safer; NFC
RowMajorMatrixVectorProductEmitter and ColumnMajorMatrixVectorProductEmitter
both cache* the generated LLVM IR by keying off the dimensions of the operation,
the primitive type etc. Before this CL the code computing the cache key lived
separately from the GEMV emitters. This pattern introduces a risk that the GEMV
emitters will end up with some state not modeled in the cache key, resulting in
a subtle bug.
This CL reduces the risk by encapsulating the cache key generation and the input
configuration to the GEMV emitters in a single class.
* In the sense that two different dot operations with the same M,K,N will share
a single LLVM IR function body.
PiperOrigin-RevId:
197628423
A. Unique TensorFlower [Tue, 22 May 2018 21:52:36 +0000 (14:52 -0700)]
[TF:XLA] Add a helper to update HLO reachability.
This can be used if the user does not care if reachability changed after an
update.
PiperOrigin-RevId:
197628007
Dimitris Vardoulakis [Tue, 22 May 2018 21:39:47 +0000 (14:39 -0700)]
[TF:XLA] Roll back the functionality change of cl/197458260 to unbreak test.
PiperOrigin-RevId:
197625888
Nick Desaulniers [Tue, 22 May 2018 21:08:57 +0000 (14:08 -0700)]
[TF:XLA] make miscomparison error messages more readable
PiperOrigin-RevId:
197620560
Yuanzhong Xu [Tue, 22 May 2018 20:59:48 +0000 (13:59 -0700)]
[XLA] Skip BF16 output conversion folding when CRS is the root.
PiperOrigin-RevId:
197618934
A. Unique TensorFlower [Tue, 22 May 2018 20:49:08 +0000 (13:49 -0700)]
Collective Ops Part 7
Complete just enough of the core implementation to run
multi-device collectives locally within a single process.
Interfaces are still private and not available for general use.
PiperOrigin-RevId:
197617132
Derek Murray [Tue, 22 May 2018 20:14:18 +0000 (13:14 -0700)]
Move executor_test.cc to tensorflow/core/common_runtime/.
PiperOrigin-RevId:
197611583
Akshay Modi [Tue, 22 May 2018 19:46:30 +0000 (12:46 -0700)]
Fix memory leak when going from the fast path to the slow path in eager
Fixes #19385
PiperOrigin-RevId:
197607384
Jianwei Xie [Tue, 22 May 2018 19:36:35 +0000 (12:36 -0700)]
Detect unknown batch size in predictions dict
PiperOrigin-RevId:
197606059
Benjamin Kramer [Tue, 22 May 2018 19:34:51 +0000 (12:34 -0700)]
[XLA:GPU] Emit fused reduces from batchnorm expander
This is an intermediate step until we have working multi-output fusion. Once
we have it, this change should be reverted as it might interfere with fusion.
PiperOrigin-RevId:
197605814
Benjamin Kramer [Tue, 22 May 2018 18:52:51 +0000 (11:52 -0700)]
[XLA:GPU] Add lowering for input fusions with multiple reduce outputs
This is limited to reduces that have the same shapes and reduced dimensions.
Most of the code is making the individual emission code emit multiple reductions
in the same loop. This requires multi-output fusion to provide a speedup.
PiperOrigin-RevId:
197599248
A. Unique TensorFlower [Tue, 22 May 2018 18:44:52 +0000 (11:44 -0700)]
Actually return the value from train_and_evaluate.
PiperOrigin-RevId:
197597953
A. Unique TensorFlower [Tue, 22 May 2018 18:02:30 +0000 (11:02 -0700)]
* Remove the bias centering graph if it is turned off.
* Create consts once. Otherwise each time the constant is passed to an Op, a new Const op is created.
* Speed up graph construction by using functions to build splits.
PiperOrigin-RevId:
197590220
Mustafa Ispir [Tue, 22 May 2018 17:42:31 +0000 (10:42 -0700)]
Adding stop-request capability to CheckpointSaverListener. An example use is stopping training based on evaluation metrics:

my_estimator = tf.estimator.DNNClassifier(...)
stopper = StopTrainingBasedOnEvaluateMetrics(my_estimator)
my_estimator.train(..., saving_listeners=[stopper])

where:

class StopTrainingBasedOnEvaluateMetrics(tf.train.CheckpointSaverListener):
  """A saver listener that runs evaluate with every checkpoint."""

  def __init__(self, estimator):
    self._estimator = estimator

  def after_save(self, session, global_step_value):
    eval_results = self._estimator.evaluate(...)
    if stop_if_started_overfitting(eval_results):
      return True
PiperOrigin-RevId:
197586515
Akshay Agrawal [Tue, 22 May 2018 17:26:00 +0000 (10:26 -0700)]
Make init_scope preserve the inner device stack when lifting into a graph.
Eager execution doesn't implement device stacks and in particular it doesn't support device functions (which determine the device on a per-op basis), so in general it's not possible to do the same when lifting into the eager context.
PiperOrigin-RevId:
197583446
Dan Moldovan [Tue, 22 May 2018 16:43:06 +0000 (09:43 -0700)]
Special case the 'dict' call, which trips other mechanisms for built-ins.
PiperOrigin-RevId:
197576297
Benjamin Kramer [Tue, 22 May 2018 16:08:06 +0000 (09:08 -0700)]
[TF:XLA] Fix xla_interpreter_device build
PiperOrigin-RevId:
197571618
A. Unique TensorFlower [Tue, 22 May 2018 15:18:11 +0000 (08:18 -0700)]
Contributing guidelines, style guide and README updates
PiperOrigin-RevId:
197564905
A. Unique TensorFlower [Tue, 22 May 2018 15:14:49 +0000 (08:14 -0700)]
Update calls to addPassesToEmitFile
PiperOrigin-RevId:
197564506
A. Unique TensorFlower [Tue, 22 May 2018 15:12:41 +0000 (08:12 -0700)]
Fix a couple of broken links in the Swift For TensorFlow page.
PiperOrigin-RevId:
197564254
A. Unique TensorFlower [Tue, 22 May 2018 15:02:39 +0000 (08:02 -0700)]
Automated g4 rollback of changelist 197527651.
PiperOrigin-RevId:
197562826
Benjamin Kramer [Tue, 22 May 2018 14:06:08 +0000 (07:06 -0700)]
[XLA:TF] Run buildifier on llvm.BUILD
Buildifier recently started sorting load args (https://github.com/bazelbuild/buildtools/commit/3ac5f85b22bc44820c041d0cacd3bc2ed54e7742), which causes diffs in the output.
PiperOrigin-RevId:
197556554
A. Unique TensorFlower [Tue, 22 May 2018 12:50:34 +0000 (05:50 -0700)]
[XLA] Optimize ShapeTree<T>
This optimizes ShapeTree quite significantly. In particular this optimizes for the common case of querying/iterating, copying and moving ShapeTrees.
* Allocate all ShapeTreeNodes inside a single, owned, vector. This reduces the number of memory allocations and improves cache performance.
* Instead of storing children nodes as unique_ptrs, store them as indices into the owning container's vector. This allows cheap copy-construction (a std::vector POD copy) and doesn't change the fast path (dereferencing a pointer is just as fast as dereferencing a base + offset).
* Instead of a unique_ptr<Shape>, use a shared_ptr<Shape>. This removes a load of copy-construction overhead at the cost of a shared_ptr over a unique_ptr (one extra allocation).
* Instead of computing ShapeIndexes on-demand in the iterators/ForEach*, precompute them during construction time. This adds a few more bytes per ShapeTree, but now we can...
* ... store a std::pair<ShapeIndex, T> as the ShapeTreeNode's data element. This allows us to provide a std::pair<K,V>&, STL-like interface from iterators without going through any of the previous unique_ptr hacks around storage lifetimes.
* Because we no longer need to iterate from the beginning to build up the ShapeIndex, we can now offer a ::find() function to return an iterator for a ShapeIndex in O(K) time. As the iteration order is guaranteed to be pre-order, this can be used (and will be, later) to speed up the fast-path of mutating a subtree of a ShapeTree from tf2xla::ExtractSubBuffers.
* Similarly because we now have a very standard, cheap STL interface with no performance cliffs, we can hopefully improve ShapedBuffer's copy and move constructors to be cheaper.
PiperOrigin-RevId:
197548717
A. Unique TensorFlower [Tue, 22 May 2018 09:27:45 +0000 (02:27 -0700)]
internal change
PiperOrigin-RevId:
197533162
A. Unique TensorFlower [Tue, 22 May 2018 09:21:30 +0000 (02:21 -0700)]
batch_util.h is generally useful, so move it from kernels/ to util/, where it will be included in the pip package.
PiperOrigin-RevId:
197532524
A. Unique TensorFlower [Tue, 22 May 2018 08:35:36 +0000 (01:35 -0700)]
Convert the Pow op into something more recognizable, so we can apply further
optimizations.
PiperOrigin-RevId:
197527651
A. Unique TensorFlower [Tue, 22 May 2018 08:01:01 +0000 (01:01 -0700)]
Automated g4 rollback of changelist 197487461.
PiperOrigin-RevId:
197523867
A. Unique TensorFlower [Tue, 22 May 2018 07:44:47 +0000 (00:44 -0700)]
Unify the CUDA toolchain definition of gcc/nvcc and cuda-clang.
gcc >= 7 changes how it treats -pie [1]: passing -pie after -shared on the
command line is no longer possible. Because the legacy way of configuring
flags in the gcc/nvcc toolchain does not allow control over where the flags
go, or over providing -pie only when linking binaries, we can prevent this
from breaking in the future by also using the new feature mechanism for
gcc/nvcc.
In addition to moving the gcc-specific workarounds in the toolchain to
cuda_configure.bzl, document them so we don't need to rediscover them in the
future.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77464
PiperOrigin-RevId:
197522719
A. Unique TensorFlower [Tue, 22 May 2018 06:37:12 +0000 (23:37 -0700)]
Enable tpu.rewrite to work on XLA CPU/GPU backends.
PiperOrigin-RevId:
197517946
Justin Lebar [Tue, 22 May 2018 03:41:26 +0000 (20:41 -0700)]
[XLA:GPU] Implement trivial (one-replica) cross-replica-sum on XLA:GPU.
Also fix the CPU implementation to work in the case when there are
multiple operands to the cross-replica-sum op.
PiperOrigin-RevId:
197506311
A. Unique TensorFlower [Tue, 22 May 2018 03:27:53 +0000 (20:27 -0700)]
Update scan benchmarks to have a range of 16K-128K iterations. As of https://github.com/tensorflow/tensorflow/commit/5802096c267c805f6a69798aac10aefef759bb9f, TensorFlow Eager no longer exhibits quadratic behavior. The benchmark is still ~5x slower in eager mode vs. graph mode, and maybe slightly worse than linear:
n Graph Time (s) Eager Time (s) Ratio
-----------------------------------------------
16K 0.35 1.8 5.1
32K 0.64 3.6 5.6
64K 1.1 7.2 6.5
128K 2.4 14.8 6.2
PiperOrigin-RevId:
197505257
Michael Kuperstein [Tue, 22 May 2018 03:06:39 +0000 (20:06 -0700)]
Internal Change
PiperOrigin-RevId:
197503560