Porting MXNet's Neural-Style Python example to MXNet.jl
The following was copied from https://github.com/dmlc/MXNet.jl/issues/56 to document what was needed to port Neural Style (which transfers the “style” of one image to another using an image-recognition neural network); it may be modified from the original to add context.
These notes document what I had to do to get a working implementation of A Neural Algorithm of Artistic Style from within Julia. They may or may not be applicable to other examples ported from Python to Julia, but are here for reference.
MXNet.jl
- Official documentation might be a good place to start.
argparse to ArgParse.jl
This library is used to turn the example into a command-line program.
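As a hedged illustration (the option names below are hypothetical, not necessarily those of the actual example), the Python argparse setup translates to ArgParse.jl roughly like this:
using ArgParse

s = ArgParseSettings()
@add_arg_table s begin
    "--content-image"
        help = "path to the content image"
        default = "input.jpg"
    "--max-num-epochs"
        help = "maximum number of optimization iterations"
        arg_type = Int
        default = 1000
end

args = parse_args(s)    # Dict mapping option names to parsed values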
Named tuples to Composite types
This translation is pretty easy, and Julia’s built-in composite types are very straightforward to work with compared to Python’s named tuples, which require from collections import namedtuple. For the ConvExecutor, it means going from
Executor = namedtuple('Executor', ['executor', 'data', 'data_grad'])
to
type SGExecutor
    executor  :: mx.Executor
    data      :: mx.NDArray
    data_grad :: mx.NDArray
end
I also renamed it just to avoid ambiguity. The field type annotations are optional, but I included them for type safety.
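A hedged construction example (the variable names are placeholders, not those of the actual port):
# Python: exec_record = Executor(executor=executor, data=data, data_grad=data_grad)
# Julia uses the default positional constructor of the composite type:
exec_record = SGExecutor(executor, data, data_grad)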
Row-major order to Column-major order arrays
If interested, read the Wikipedia article on row- and column-major order, but summarized: Julia tends to think of a matrix as an array of column vectors, while Python (NumPy) natively stores it as a list of row lists. The major difference when dealing with multi-dimensional arrays is that the ordering of the dimensions in the shape tuple is reversed.
Example:
out.infer_shape(data=(1, 3, input_size[0], input_size[1]))
would become the following in Julia
mx.infer_shape(out, data=(input_size[1], input_size[2], 3, 1))
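A hedged illustration with made-up sizes (not from the example): the same image data has its dimensions listed in opposite orders on the two sides.
# Julia sees the image as (width, height, channels, batch);
# the corresponding NumPy array would report shape (1, 3, 224, 256).
img = rand(Float32, 256, 224, 3, 1)
size(img)    # (256, 224, 3, 1)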
Memory leaks
As of Julia v0.4, some operators cannot be directly overloaded, such as the updating assignment operators (_ ⊗= _ for some binary operation ⊗ in [-, +, /, *]), so lines of the form a[:] ⊗= ... allocate fresh arrays. To avoid unnecessary use of graphics memory, you can take one of several approaches, all of which are described in the NDArray source code:
- b = copy(a::NDArray) to a Julia array, then perform your operations and copy!(a, b) back.
- Some combination of mx.mult_to!, mx.div_from!, etc. for directly modifying the NDArray.
- Use the macro mx.@nd_as_jl to work as if you were just using native arrays. It takes the arguments ro for NDArrays that are only read and rw for those that are written to; the macro copies everything to native Julia arrays and then writes them back into the NDArrays when the block ends, as in the sketch below.
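A hedged sketch of the three approaches applied to something like a[:] *= 0.9 (whether mx.mult_to! accepts a plain scalar here is an assumption):
a = mx.ones(10)        # an NDArray, possibly living in GPU memory

# 1. Round-trip through a Julia array.
b = copy(a)            # NDArray -> Julia Array
b *= 0.9
copy!(a, b)            # write the result back into the NDArray

# 2. In-place NDArray helpers, no Julia-side copy.
mx.mult_to!(a, 0.9)

# 3. The @nd_as_jl macro: inside the block a behaves like a Julia array
#    and is written back into the NDArray when the block ends.
mx.@nd_as_jl rw=a begin
    a[:] *= 0.9
end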
Implement Factor-based Learning Rate
This required modifying MXNet.jl’s src/optimizer.jl to include a previously unimplemented subtype of AbstractLearningRateScheduler
so it can cooperate with the Stochastic Gradient Descent optimizer. There may be a better way to do this, and I’m not entirely comfortable with the robustness of my implementation as of the time of writing.
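For reference, here is a minimal sketch of what such a scheduler might look like, assuming the interface is a get_learning_rate(scheduler, iteration) method like the existing schedulers use; the supertype, method name, and field names below are assumptions and may not match MXNet.jl’s internals exactly.
# Hedged sketch only: a factor-based scheduler in the spirit of the
# Python FactorScheduler; the interface here is assumed, not confirmed.
type Factor <: mx.AbstractLearningRateScheduler
    learning_rate :: Float64   # base learning rate
    factor        :: Float64   # multiplicative decay
    step          :: Int       # apply the decay once every `step` iterations
end

get_learning_rate(self :: Factor, iter :: Int) =
    self.learning_rate * self.factor ^ div(iter, self.step)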