New Code Example: Forward-Forward Algorithm for Image Classification #1170
Conversation
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
Hi, @fchollet! Gentle ping for a review, thank you! Oh, and Merry Christmas 🎄 !
Thanks for the PR! I'll take a close look in the next few days.
fchollet left a comment
Thanks for the PR. It's a great example!
> training instead of the traditionally-used method of backpropagation, as proposed by
> [Prof. Geoffrey Hinton](https://www.cs.toronto.edu/~hinton/FFA13.pdf)
> The concept was inspired by the understanding behind [Boltzmann
> Machines](http://www.cs.toronto.edu/~fritz/absps/dbm.pdf). Backpropagation involves
Please keep all markdown links on a single line, otherwise they won't properly render
Will make this change across the file
> [Prof. Geoffrey Hinton](https://www.cs.toronto.edu/~hinton/FFA13.pdf)
> The concept was inspired by the understanding behind [Boltzmann
> Machines](http://www.cs.toronto.edu/~fritz/absps/dbm.pdf). Backpropagation involves
> calculating loss via a cost function and propagating the error across the network. On the
I think you can find a slightly better one-line description of backprop
Sure, will update that.
> The following example explores how to use the Forward-Forward algorithm to perform
> training instead of the traditionally-used method of backpropagation, as proposed by
> [Prof. Geoffrey Hinton](https://www.cs.toronto.edu/~hinton/FFA13.pdf)
"Proposed by Hinton in [The Forward-Forward Algorithm: Some Preliminary Investigations]() (2022)"
Right, will make this change
> other hand, the FF Algorithm suggests the analogy of neurons which get "excited" based on
> looking at a certain recognized combination of an image and its correct corresponding
> label.
> This method takes certain inspiration from the biological learning process that occurs in
Please add line breaks between paragraphs
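As background for the excerpt above: the "excited neurons" analogy corresponds to the paper's layer-local goodness objective. A minimal sketch of that objective (function names are hypothetical; this follows Hinton's paper rather than this PR's exact code):

```python
import tensorflow as tf

def goodness(activations):
    # Per-sample "goodness" = sum of squared activations (Hinton, 2022).
    return tf.reduce_sum(tf.square(activations), axis=-1)

def ff_layer_loss(act_pos, act_neg, threshold=2.0):
    # Positive samples (image overlaid with its correct label) should score
    # above the threshold; negative samples (wrong label) should score below it.
    logits = tf.concat(
        [threshold - goodness(act_pos), goodness(act_neg) - threshold], axis=0
    )
    return tf.reduce_mean(tf.math.softplus(logits))
```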
> Date created: 2022/12/21
> Last modified: 2022/12/23
> Description: Training a Dense-layer based model using the Forward-Forward algorithm.
Add an Accelerator: field (either GPU or None)
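For reference, keras.io examples carry this metadata at the top of the file's docstring; a sketch of the header with the requested field added (title line illustrative):

```python
"""
Title: Using the Forward-Forward Algorithm for Image Classification
Author: Suvaditya Mukherjee
Date created: 2022/12/21
Last modified: 2022/12/23
Description: Training a Dense-layer based model using the Forward-Forward algorithm.
Accelerator: GPU
"""
```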
```python
x = self.flatten_layer(x)
perm_array = tf.range(start=0, limit=x.get_shape()[0], delta=1)
x_pos = self.overlay_y_on_x(x, y)
y_numpy = y.numpy()
```
```python
    plt.show()
else:
    x = layer(x)
return {"FinalLoss": loss}
```
train_step should return average values, e.g. the output of loss_tracker.result() (where loss_tracker is a metrics.Mean instance).
Also, use snake case
Will make the change
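A minimal sketch of the pattern being suggested (the class body here is a stand-in, not the PR's code):

```python
import tensorflow as tf
from tensorflow import keras

class FFNetwork(keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.loss_tracker = keras.metrics.Mean(name="loss")

    @property
    def metrics(self):
        # Lets Keras reset the tracker at epoch boundaries automatically.
        return [self.loss_tracker]

    def train_step(self, data):
        # Layer-wise Forward-Forward updates would happen here; `batch_loss`
        # stands in for the aggregated per-layer loss of this batch.
        batch_loss = tf.constant(0.0)
        self.loss_tracker.update_state(batch_loss)
        # Return the running mean, not the raw per-batch value.
        return {"loss": self.loss_tracker.result()}
```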
```python
h_pos, h_neg = x_pos, x_neg
for idx, layer in enumerate(self.layers):
    if idx == 0:
        print("Input layer : No training")
```
Do not include any print statements or matplotlib plots in train_step. You should write a callback for these.
Right. I'll implement a callback and make use of that
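For instance, a logging callback along these lines keeps `train_step` side-effect-free (class name hypothetical):

```python
from tensorflow import keras

class FFLoggingCallback(keras.callbacks.Callback):
    # Prints progress outside of train_step, as requested in the review.
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        print(f"Epoch {epoch + 1}: loss = {logs.get('loss', float('nan')):.4f}")

# Usage: model.fit(..., callbacks=[FFLoggingCallback()])
```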
```python
model = FFNetwork(dims=[784, 500, 500])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.03), loss="mse", run_eagerly=True
)
```
You should arrive to a model that can be run in graph mode, without requiring run_eagerly=True
Alright, I will try to do so
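The usual path there (a sketch, assuming the eager-only bits are the `.numpy()` calls and Python-level shuffling) is to swap them for TensorFlow ops so `train_step` can be traced in graph mode:

```python
import tensorflow as tf

x = tf.random.uniform((8, 784))  # stand-in batch

# Eager-only patterns such as `y.numpy()` or Python's `random.shuffle`
# force run_eagerly=True. Graph-friendly equivalents stay traceable:
shuffled_idx = tf.random.shuffle(tf.range(tf.shape(x)[0]))
x_shuffled = tf.gather(x, shuffled_idx)  # e.g. to build negative samples
```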
```python
results = accuracy_score(preds, y_test)

print(f"Accuracy score : {results*100}%")
```
The accuracy score currently stands at ~24-27% on different runs (we can set a constant seed to enhance reproducibility). While the paper does get somewhat better results (not anywhere near SOTA, though), I believe that can be achieved with more tuning and perhaps a wider network.
25% accuracy on MNIST tells you that your algorithm does not work, unfortunately. A simple logistic regression does ~92% or so. A simple logistic regression after applying ~90% noise on the input (setting to 0 90% of the pixels, randomly) still does 67%. Even if it were poorly tuned, the example should achieve at least 97% if the algorithm did work as expected.
Right. Let me iterate on this algorithm for a while, see if and where I am making errors, and I will get back to you on this.
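For context, the ~92% baseline cited above is straightforward to check; a sketch with scikit-learn (illustrative, not part of the PR):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784) / 255.0
x_test = x_test.reshape(-1, 784) / 255.0

# A plain logistic regression lands at roughly 92% on MNIST, which is the
# sanity-check floor referenced in the review comment above.
clf = LogisticRegression(max_iter=200).fit(x_train, y_train)
print(f"Baseline accuracy: {accuracy_score(y_test, clf.predict(x_test)):.4f}")
```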
```python
def forwardforward(self, x_pos, x_neg):
    loss_list = []
    for i in trange(self.num_epochs):
        with tf.GradientTape() as tape:
```
Aren't we falling back to backprop here?
The loss is calculated, gradients are computed backwards, and then we're doing an optimizer step
This is happening locally, solely for this layer, so it wouldn't qualify as backward propagation just yet.
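To make the distinction concrete, here is a sketch of a layer-local update (names hypothetical): the tape covers only one layer's forward pass, so no error signal crosses layer boundaries:

```python
import tensorflow as tf

def local_ff_update(layer, optimizer, x_pos, x_neg, threshold=2.0):
    # Gradients are taken only w.r.t. this layer's own weights; nothing
    # is propagated back through earlier layers.
    with tf.GradientTape() as tape:
        g_pos = tf.reduce_sum(tf.square(layer(x_pos)), axis=-1)
        g_neg = tf.reduce_sum(tf.square(layer(x_neg)), axis=-1)
        loss = tf.reduce_mean(
            tf.math.softplus(tf.concat([threshold - g_pos, g_neg - threshold], 0))
        )
    grads = tape.gradient(loss, layer.trainable_weights)
    optimizer.apply_gradients(zip(grads, layer.trainable_weights))
    # Detach the outputs before feeding the next layer, keeping updates local.
    return tf.stop_gradient(layer(x_pos)), tf.stop_gradient(layer(x_neg))
```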
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
Hi @fchollet, I have just added a new commit with all the changes. Apologies for the delay; it took some time to make sure all of your comments are addressed. Also, a very Happy New Year to you!
Also, not sure why the
Congrats on making it work! The current code looks good to me. Please add the generated files. I pushed some copyedits, please pull them first.
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
Hi, have added the generated files and some minor factual edits. Thank you!
fchollet left a comment
Thank you for the great contribution! 👍
Happy to contribute!
Hi! First of all, thanks a lot for the code. It's very instructive to see how you implemented this directly in Keras/TensorFlow. Anyway, I wanted to try it and I just copy-pasted the code, but I got a: KeyError: 'The optimizer cannot recognize variable dense_1/kernel:0. This usually means you are trying to call the optimizer to update different parts of the model separately. Please call `optimizer.build(variables)` with the full list of trainable variables before the training loop.' I've tried several things since yesterday but none worked. Any suggestion on what that could mean, or where to add this optimizer.build?
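A plausible fix (an assumption, since the commenter's exact setup isn't shown): the TF 2.11+ optimizers refuse to update variables they were not built with, so either build the optimizer on the full variable list up front, or give each layer its own optimizer instance. A sketch of both workarounds:

```python
from tensorflow import keras

model = keras.Sequential(
    [keras.layers.Dense(500, input_shape=(784,)), keras.layers.Dense(500)]
)

# Option 1: build the shared optimizer on all trainable variables up front.
optimizer = keras.optimizers.Adam(learning_rate=0.03)
optimizer.build(model.trainable_variables)

# Option 2: one optimizer per layer, so per-layer updates never mix
# variable sets (closer to the layer-local spirit of Forward-Forward).
layer_optimizers = [keras.optimizers.Adam(0.03) for _ in model.layers]
```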
This PR adds a new Code Example implementation for the Forward-Forward algorithm, as introduced by Prof. Hinton in his paper at NeurIPS 2022.
Some things to note:
- Uses `TQDM` and `random` for some cleaner visuals and simpler code.

Tagging @LukeWood @fchollet for a review. Thank you for your time!
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>