Make training iterations 0-indexed to display training loss in 0th iter #800
Conversation
(it also improves #796 by displaying the debug outputs before any updates have been made, which may help clarify whether a problem is with initialization vs. training)
Oh, on second thought, I gave a bad example of how it cleans up the logic: instead of a special test before the training loop, we just have to do a special test afterwards, whoops. Still prefer the 0-indexing for the other reasons mentioned though :)
Always thumbs up for 0-indexing :)
src/caffe/solver.cpp (outdated)
`display && param.debug_info()`
May be faster than `?:`.
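For illustration, a minimal self-contained sketch of the two forms (the `Param` struct and the surrounding names are hypothetical stand-ins, not Caffe's actual types):

```cpp
#include <iostream>

// Hypothetical stand-in for the solver's parameter object; only the
// debug_info() name comes from the reviewed line, the rest is made up.
struct Param {
  bool debug_info() const { return true; }
};

int main() {
  Param param;
  bool display = false;

  // Ternary form originally in the PR:
  bool debug_ternary = display ? param.debug_info() : false;

  // && form suggested in the review: same short-circuit behavior
  // (debug_info() is never called when display is false), but the
  // intent reads more directly.
  bool debug_and = display && param.debug_info();

  std::cout << debug_ternary << " " << debug_and << std::endl;  // 0 0
  return 0;
}
```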
Thanks for the review Yangqing! Not sure why I used the ternary operator instead of `&&`; fixed. Also, there was another small problem with what I'd added: I added a final forward pass of training after optimization so that we display the loss @ max_iter as dictated by the display setting. Finally, in the last commit I moved the loss display to right after ForwardBackward, before ComputeUpdateValue. I think it makes sense to print the loss as soon as we know it, but I mainly did it so that the added loss display at the end doesn't look weird.
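A rough sketch of the resulting ordering inside the solver loop; this is a paraphrase assuming Caffe's `Solver` member names (`net_`, `iter_`, `param_`, `bottom_vec`), not the verbatim diff:

```cpp
// Sketch only: the loss is displayed immediately after ForwardBackward,
// before ComputeUpdateValue, so the value printed for iteration N is
// the loss after exactly N weight updates.
while (iter_ < param_.max_iter()) {
  Dtype loss = net_->ForwardBackward(bottom_vec);
  if (param_.display() && iter_ % param_.display() == 0) {
    LOG(INFO) << "Iteration " << iter_ << ", loss = " << loss;
  }
  ComputeUpdateValue();
  net_->Update();
  ++iter_;
}
// Extra forward pass after the loop so the loss at max_iter is also
// displayed (no update follows it).
if (param_.display() && iter_ % param_.display() == 0) {
  Dtype loss = net_->ForwardBackward(bottom_vec);
  LOG(INFO) << "Iteration " << iter_ << ", loss = " << loss;
}
```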
This PR makes the training iterations 0-indexed so that the loss is displayed in the 0th iter (if display is set), which I think makes sense now that we display the test outputs on the "0th" iter. It also makes the logic a bit cleaner/more natural (IMO), e.g. we can just use the one `TestAll` call inside the training loop rather than having a special one before. I personally like the semantics better -- the iteration number is the number of times the weights have been updated. That is actually how the snapshots behaved before, so snapshots should be exactly the same (assuming you use a fixed random seed and all other training inputs are the same), but now the loss displayed as "Iteration 20" will be the loss that used to display as "Iteration 21".

Any thoughts, @Yangqing or anyone else?
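To make the semantics concrete, here is a small self-contained toy (not Caffe code; the dummy loss curve is purely illustrative) showing that with 0-indexing, "Iteration N" reports the loss after N updates, including the untrained loss at iteration 0:

```cpp
#include <iostream>

int main() {
  const int max_iter = 4;  // stands in for SolverParameter::max_iter
  const int display = 2;   // display every 2 iterations

  int updates = 0;         // number of weight updates applied so far

  // 0-indexed loop: the loss printed at "Iteration i" reflects
  // exactly i prior updates, so iteration 0 shows the initial loss.
  for (int iter = 0; iter < max_iter; ++iter) {
    const double loss = 1.0 / (1.0 + updates);  // dummy loss curve
    if (display > 0 && iter % display == 0) {
      std::cout << "Iteration " << iter << ", loss = " << loss << "\n";
    }
    ++updates;  // ComputeUpdateValue + Update in the real solver
  }

  // Final pass so the loss at max_iter is displayed as well.
  if (display > 0 && max_iter % display == 0) {
    std::cout << "Iteration " << max_iter << ", loss = "
              << 1.0 / (1.0 + updates) << "\n";
  }
  return 0;
}
```

Running this prints the initial loss at iteration 0 and the post-training loss at iteration 4, mirroring the display behavior the PR describes.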