In this example we will use Theano to train logistic regression models on a simple two-dimensional data set, and Optunity to tune the regularization strength and step size (learning rate). This example requires Theano and NumPy.
We start with the necessary imports:
```python
import numpy
from numpy.random import multivariate_normal
rng = numpy.random

import theano
import theano.tensor as T

import optunity
import optunity.metrics
```
The next step is defining our data set. We will generate a random 2-dimensional data set. The generative procedure for the targets is \(1 + 2 x_1 + 3 x_2\) plus a noise term. We assign binary class labels based on whether or not the target value is higher than the median target:
```python
N = 200
feats = 2
noise_level = 1

data = multivariate_normal((0.0, 0.0),
                           numpy.array([[1.0, 0.0], [0.0, 1.0]]), N)
noise = noise_level * numpy.random.randn(N)
targets = 1 + 2 * data[:, 0] + 3 * data[:, 1] + noise
median_target = numpy.median(targets)
# vectorized comparison; note that map() returns an iterator in Python 3,
# so numpy.array(map(...)) would not work there
labels = (targets > median_target).astype(int)
```
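Splitting at the median guarantees (near-)balanced classes. A quick standalone sanity check of this labeling rule, using a hypothetical fixed seed for reproducibility:

```python
import numpy

rng = numpy.random.RandomState(0)   # hypothetical seed, for reproducibility
targets = rng.randn(200)            # stand-in for the generated targets above
labels = (targets > numpy.median(targets)).astype(int)

# with 200 distinct values, exactly half lie strictly above the median
print(labels.sum())   # -> 100
```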
The next thing we need is a training function for LR models, based on Theano’s example:
```python
training_steps = 2000

def train_lr(x_train, y_train, regularization=0.01, step=0.1):
    x = T.matrix("x")
    y = T.vector("y")
    w = theano.shared(rng.randn(feats), name="w")
    b = theano.shared(0., name="b")

    # construct the Theano expression graph
    p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b))             # probability that target = 1
    prediction = p_1
    xent = -y * T.log(p_1) - (1 - y) * T.log(1 - p_1)   # cross-entropy loss
    cost = xent.mean() + regularization * (w ** 2).sum()  # the cost to minimize
    gw, gb = T.grad(cost, [w, b])                       # gradients of the cost

    # compile
    train = theano.function(inputs=[x, y],
                            outputs=[prediction, xent],
                            updates=((w, w - step * gw), (b, b - step * gb)))
    predict = theano.function(inputs=[x], outputs=prediction)

    # train
    for i in range(training_steps):
        train(x_train, y_train)
    return predict, w, b
```
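For reference, the update that Theano derives symbolically with `T.grad` can also be written out by hand. The sketch below is a plain-NumPy equivalent of one gradient-descent step on the same cost (mean cross-entropy plus an L2 penalty); the function name `lr_gradient_step` is ours, not part of the example:

```python
import numpy as np

def lr_gradient_step(x, y, w, b, regularization=0.01, step=0.1):
    """One manual gradient-descent step for the cost defined above."""
    p_1 = 1.0 / (1.0 + np.exp(-x.dot(w) - b))   # sigmoid, as in the Theano graph
    err = p_1 - y                               # d(cross-entropy)/d(logit)
    gw = x.T.dot(err) / len(y) + 2 * regularization * w
    gb = err.mean()
    return w - step * gw, b - step * gb
```

Iterating this step on a linearly separable toy problem drives the training accuracy toward 100%, matching what the compiled Theano `train` function computes.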
Now that we know how to train, we can define a modeling strategy with default and tuned hyperparameters:
```python
def lr_untuned(x_train, y_train, x_test, y_test):
    predict, w, b = train_lr(x_train, y_train)
    yhat = predict(x_test)
    loss = optunity.metrics.logloss(y_test, yhat)
    brier = optunity.metrics.brier(y_test, yhat)
    return loss, brier

def lr_tuned(x_train, y_train, x_test, y_test):
    # tune hyperparameters on the training data via 3-fold cross-validation
    @optunity.cross_validated(x=x_train, y=y_train, num_folds=3)
    def inner_cv(x_train, y_train, x_test, y_test, regularization, step):
        predict, _, _ = train_lr(x_train, y_train,
                                 regularization=regularization, step=step)
        yhat = predict(x_test)
        return optunity.metrics.logloss(y_test, yhat)

    pars, _, _ = optunity.minimize(inner_cv, num_evals=50,
                                   regularization=[0.001, 0.05],
                                   step=[0.01, 0.2])

    predict, w, b = train_lr(x_train, y_train, **pars)
    yhat = predict(x_test)
    loss = optunity.metrics.logloss(y_test, yhat)
    brier = optunity.metrics.brier(y_test, yhat)
    return loss, brier
```
Note that both modeling functions train a model, predict the test fold, and return two score measures (log loss and Brier score). We will evaluate both modeling approaches using cross-validation and report both performance measures (see Cross-validation). The cross-validation decorator:
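Lower is better for both metrics. As a sketch, the standard definitions look as follows; optunity's implementations should agree up to numerical details, and the function names below are our own:

```python
import numpy as np

def logloss(y, p):
    """Mean binary cross-entropy between labels y and predicted probabilities p."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean(-y * np.log(p) - (1 - y) * np.log(1 - p)))

def brier(y, p):
    """Mean squared difference between predicted probability and label."""
    return float(np.mean((np.asarray(p, dtype=float) - np.asarray(y, dtype=float)) ** 2))
```

For example, `logloss([1, 0], [0.9, 0.1])` is about 0.105, while perfectly confident correct predictions would score 0 on both metrics.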
```python
outer_cv = optunity.cross_validated(x=data, y=labels, num_folds=3,
                                    aggregator=optunity.cross_validation.list_mean)
lr_untuned = outer_cv(lr_untuned)
lr_tuned = outer_cv(lr_tuned)
```
At this point, lr_untuned and lr_tuned will return a 3-fold cross-validation estimate of [logloss, Brier] when evaluated.
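The `list_mean` aggregator averages each score element-wise across folds, turning the per-fold `[logloss, brier]` pairs into a single pair. A minimal reimplementation of that behavior (our own sketch, not optunity's code):

```python
def list_mean(fold_scores):
    # element-wise mean: [[logloss, brier], ...] -> [mean logloss, mean brier]
    return [sum(scores) / len(scores) for scores in zip(*fold_scores)]

print(list_mean([[1, 2], [3, 4]]))  # -> [2.0, 3.0]
```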
This example is available in full in <optunity>/bin/examples/python/theano/logistic_regression.py. Typical output of this script looks like this:
```
true model: 1 + 2 * x1 + 3 * x2

evaluating untuned LR model
+ model: -0.18 + 1.679 * x1 + 2.045 * x2
++ log loss in test fold: 0.08921125198
++ Brier loss in test fold: 0.0786225946458
+ model: -0.36 + 1.449 * x1 + 2.247 * x2
++ log loss in test fold: 0.08217097905
++ Brier loss in test fold: 0.070741583014
+ model: -0.48 + 1.443 * x1 + 2.187 * x2
++ log loss in test fold: 0.10545356515
++ Brier loss in test fold: 0.0941325050801

evaluating tuned LR model
+ model: -0.66 + 2.354 * x1 + 3.441 * x2
++ log loss in test fold: 0.07508872472
++ Brier loss in test fold: 0.0718020866519
+ model: -0.44 + 2.648 * x1 + 3.817 * x2
++ log loss in test fold: 0.0718891792875
++ Brier loss in test fold: 0.0638209513581
+ model: -0.45 + 2.689 * x1 + 3.858 * x2
++ log loss in test fold: 0.06380803593
++ Brier loss in test fold: 0.0590374290183

Log loss (lower is better):
untuned: 0.0922785987325000
tuned:   0.070261979980

Brier loss (lower is better):
untuned: 0.0811655609133
tuned:   0.0648868223427
```