zero.stream
Smart Python loops.
Stream
class zero.stream.Stream(loader)
Smart wrapper for iterables.
Stream simplifies managing loops, especially in typical deep learning scenarios (it is usually used to wrap train_dataloader or any other data source). Stream:
- simplifies management of the “epoch” and “iteration” variables
- allows customizing the epoch size
- allows changing the underlying data loader on the fly
- enables useful patterns
- (not implemented: issue) allows dumping and restoring the loop’s state: epoch, iteration, etc.
- Parameters
loader – any kind of iterable (DataLoader, list, iterator, generator, …)
- Raises
AssertionError – if loader is not an iterator and is empty
Examples
    from zero.stream import Stream

    stream = Stream([0, 1, 2, 3])
    stream = Stream(range(10))

    import itertools
    stream = Stream(itertools.repeat(0))

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    dataset = TensorDataset(torch.randn(10, 2))
    stream = Stream(DataLoader(dataset, batch_size=3, shuffle=True))
Tutorial
Let’s review the conventional approach without Stream:

    loader = DataLoader(...)
    iteration = 0
    for epoch in range(n_epoches):
        if need_custom_epoch_size():
            assert False, 'It is possible, but not convenient'

        for x in loader:
            iteration += 1
            print('Epoch:', epoch, 'Iteration:', iteration)
            ...
            if need_new_loader():
                assert False, 'It is possible, but not convenient'
There are several ways to use Stream to enhance this loop. Let’s start by creating a stream:

    stream = Stream(DataLoader(...))
The dataloader is accessible via Stream.loader. Now, let’s reproduce the loop above:

    for epoch in range(n_epoches):
        for x in stream.data():
            print('Epoch:', epoch, 'Iteration:', stream.iteration)

    # or

    while stream.increment_epoch(n_epoches):
        for x in stream.data():
            print('Epoch:', stream.epoch, 'Iteration:', stream.iteration)
First, we see that Stream.iteration is created and incremented automatically. We also see that a while loop can be used instead of the more “conventional” for loop. This brings the following differences:
- restoring the stream’s state via the state_dict mechanism becomes possible
- terminating the loop by adding more conditions to the while statement becomes possible; for example, with zero.training.ProgressTracker, early stopping can look like this:
      while not progress.fail and stream.increment_epoch(n_epoches):
- epoch numbering effectively starts from 1, which is consistent with iteration numbering (it also starts from 1)
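The effect of the while form can be illustrated without the library. The sketch below uses a hypothetical EpochCounter class (it is not part of zero) and assumes that increment_epoch increments the epoch and returns True only while the current epoch is below the given limit. The should_stop flag is likewise a made-up stand-in for a condition such as progress.fail.

```python
class EpochCounter:
    """Hypothetical stand-in that mimics only the epoch bookkeeping."""

    def __init__(self):
        self.epoch = 0  # no epoch has started yet

    def increment_epoch(self, max_epoch):
        # Assumed semantics: advance only while the limit is not reached.
        if self.epoch < max_epoch:
            self.epoch += 1
            return True
        return False


counter = EpochCounter()
should_stop = False  # stand-in for an early-stopping signal
seen_epochs = []

while not should_stop and counter.increment_epoch(5):
    seen_epochs.append(counter.epoch)
    if counter.epoch == 3:
        should_stop = True  # pretend that training stopped improving

print(seen_epochs)    # [1, 2, 3]
print(counter.epoch)  # 3
```

Because `and` short-circuits, the epoch is not incremented once the extra condition fails, so the counter still holds the number of the last epoch that actually ran.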
To customize the epoch size, pass the size to Stream.data:

    while stream.increment_epoch(n_epoches):
        for x in stream.data(custom_epoch_size):
            ...
The underlying loader can be changed at any moment (even in the middle of an epoch) via Stream.set_loader. For example:

    while stream.increment_epoch(n_epoches):
        for x in stream.data(custom_epoch_size):
            ...
            if need_new_loader():
                stream.set_loader(new_loader)
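To make the custom epoch size and the loader swap more tangible, here is a minimal sketch that does not depend on the library. MiniStream is a hypothetical stand-in, not the actual Stream implementation. It assumes that items are drawn from a single iterator that restarts when the loader is exhausted, and that setting a new loader restarts iteration over it. It is only meant for re-iterable loaders such as lists.

```python
class MiniStream:
    """Hypothetical stand-in: one iterator that keeps moving forward."""

    def __init__(self, loader):
        self.loader = loader
        self.iteration = 0
        self._iter = iter(loader)

    def set_loader(self, loader):
        # Assumption: swapping the loader also restarts iteration over it.
        self.loader = loader
        self._iter = iter(loader)

    def next(self):
        try:
            item = next(self._iter)
        except StopIteration:
            # The loader is exhausted: start over and keep going.
            self._iter = iter(self.loader)
            item = next(self._iter)
        self.iteration += 1
        return item

    def data(self, n_items):
        # Yield exactly n_items, regardless of the loader's length.
        for _ in range(n_items):
            yield self.next()


stream = MiniStream([0, 1, 2, 3])
epochs = []
for epoch in range(1, 4):
    epochs.append(list(stream.data(3)))  # custom epoch size: 3 items
    if epoch == 2:
        stream.set_loader(['a', 'b'])  # swap the data source

print(epochs)            # [[0, 1, 2], [3, 0, 1], ['a', 'b', 'a']]
print(stream.iteration)  # 9
```

Note how the second epoch starts where the first one stopped: the epoch size is decoupled from the loader's length, while the iteration counter keeps growing across epochs and loaders.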
Additionally, two new forms of infinite loop become possible:

    for x in stream.data(math.inf):
        ...
        if stream.iteration % frequency == 0:  # every `frequency` iterations
            ...

    while True:
        x = stream.next()
        ...
        if stream.iteration % frequency == 0:  # every `frequency` iterations
            ...
Note
For a better technical understanding, keep in mind that Stream simply encapsulates an “infinite iterator” that is constantly moving forward. The behavior is identical for finite and infinite iterables and can be expressed with the following loop:

    while True:
        for item in loader:  # the loader passed to the constructor
            ...
The documentation for Stream.next and Stream.data provides helpful examples.
See also
ManualStream: like Stream, but for cases when one logical step (e.g. a training step) does not correspond to one iteration.
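The “infinite iterator” described in the note above can be sketched as a plain generator. The name endless is an illustrative assumption, and the snippet shows the idea only, not the library's actual code.

```python
from itertools import islice


def endless(loader):
    """Yield items from `loader` forever, restarting it when it runs out."""
    while True:
        for item in loader:
            yield item


items = endless([0, 1, 2])
first_seven = list(islice(items, 7))  # take a finite slice of the endless flow
next_two = list(islice(items, 2))     # the generator continues where it stopped

print(first_seven)  # [0, 1, 2, 0, 1, 2, 0]
print(next_two)     # [1, 2]
```

An empty loader would make this sketch spin forever without yielding anything, so it should only be used with non-empty iterables.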
Stream.iteration | Current iteration.
Stream.epoch | Current epoch.
Stream.loader | The underlying loader.
Stream.increment_epoch | (Try to) increment epoch.
Stream.data | Iterate over the loader.
Stream.next | Get the next item and increment iteration.
 | Set the underlying iterator to
Stream.set_loader | Set new loader.
ManualStream
class zero.stream.ManualStream(*args, **kwargs)
Like Stream, but with additional fine-grained control.
ManualStream can be useful when one logical step does not correspond to one iteration (for example, when you collect data from several iterations to build one training batch). The class inherits from Stream and adds some features (see documentation for details).
Current manual step.
Increment manual step.
Iterate over the loader.
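To make the case of one logical step spanning several iterations concrete, here is a plain-Python sketch that does not use ManualStream itself. The counter names iteration and manual_step, the sample data, and the group size of 2 are all illustrative assumptions.

```python
loader = [10, 20, 30, 40, 50, 60]  # stand-in for a data loader
items_per_step = 2  # assumed: two iterations feed one training batch

iteration = 0    # counts items pulled from the loader
manual_step = 0  # counts logical (training) steps
batches = []
buffer = []

for item in loader:
    iteration += 1
    buffer.append(item)
    if len(buffer) == items_per_step:
        batches.append(buffer)  # one logical step consumes the whole buffer
        buffer = []
        manual_step += 1

print(iteration, manual_step)  # 6 3
print(batches)                 # [[10, 20], [30, 40], [50, 60]]
```

The two counters advance at different rates, which is why a separate, manually incremented step counter is useful in such loops.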