Although Theano is no longer under active development, it is still worth learning, since many prominent papers use it. There is also a lot to learn from its implementation.

So, Theano is basically a library for building **symbolic computation graphs**. Plus, it
can compute gradients automatically.

# Learning Materials

# Installation

```
sudo pip install theano
```

To fix the inconsistency between `float32` and `float64`, set `floatX` in `.theanorc`:

```
[global]
floatX = float32
```

# Getting Started

There are several conventions:

```
import theano.tensor as T
from theano import function
```

`T` stands for tensor; it contains the tensor operations. `function` constructs a callable function from the given inputs, outputs, and other properties.

- `function([input], output)`: the inputs are always given as a list, even when there is only one argument.

## Algebra

### Scalars

```
x = T.dscalar('x')
y = T.dscalar('y')
z = x + y
f = function([x, y], z)
```

To create a scalar variable, use `T.<type>scalar('name_variable')`, where `<type>` is a prefix that stands for the type of the variable. The prefixes `b, i, f, d, c` are used for `byte, integer, float, double, complex` respectively. By the way, there are 7 primitive types in Theano: byte (b), 16-bit integer (w), 32-bit integer (i), 64-bit integer (l), float (f), double (d), and complex (c).

Use `pp` to pretty-print a Theano symbolic variable.

### Function

This damn thing seems very important. Let's break it down and see how to use it. Beyond the basic arguments, namely inputs and outputs, there are two fancier ones:

- `givens`: pairs of `(Var1, Var2)`; the function substitutes `Var1` with `Var2` in the graph.
- `updates`: update rules for shared variables, applied each time the function is called.

To break the function down or debug it, `pydotprint` visualizes the function as a graph. By far, this is the most intuitive way to examine the function.

# Data Management

Must-read material: Understanding Memory Aliasing for Speed and Correctness. Here are some takeaway notes:

- Use `borrow=True` when creating new shared variables.
- Use `borrow=False` when retrieving the value of a shared variable; this is also the default value of `borrow` in `get_value`.

# Tensor Operations

## Nondifferentiable functions

First, let's take a look at the `sgn` function. Its gradient is zero. This means that if we put the sign function into a network, which is especially common in hashing problems, all components before it are unable to update their weights. Therefore, if the loss function is that of an autoencoder, namely:

$$ L = \left\lVert f(sgn(g(X))) - X \right\rVert $$

where $g$ and $f$ are the encoder and decoder, respectively, we cannot learn the encoder at all.
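The vanishing encoder gradient can be checked numerically without Theano. The sketch below is a toy under stated assumptions: a hypothetical one-weight encoder `g(x) = w_enc * x`, a one-weight decoder `f(h) = w_dec * h`, a squared error in place of the norm, and finite differences in place of symbolic gradients.

```python
def sgn(v):
    """Sign function: returns -1, 0, or 1."""
    return (v > 0) - (v < 0)


def loss(w_enc, w_dec, x):
    """Squared reconstruction error of a toy one-weight autoencoder.

    Hypothetical model for illustration only, not from the text above.
    """
    code = sgn(w_enc * x)  # encoder g followed by sgn
    recon = w_dec * code   # decoder f
    return (recon - x) ** 2


def numeric_grad(fn, w, eps=1e-4):
    """Central finite-difference estimate of d fn / d w."""
    return (fn(w + eps) - fn(w - eps)) / (2 * eps)


x, w_enc, w_dec = 2.0, 0.5, 0.3

grad_enc = numeric_grad(lambda w: loss(w, w_dec, x), w_enc)
grad_dec = numeric_grad(lambda w: loss(w_enc, w, x), w_dec)

print(grad_enc)  # 0.0: a small change in w_enc does not change the sign
print(grad_dec)  # nonzero: the decoder still receives a gradient
```

Only the decoder weight gets a usable gradient here, which matches the argument above: everything before `sgn` is cut off from the loss.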