So, you mean the two modes of gradient-descent training. In batch mode, changes to the weight matrix are accumulated over one full presentation of the training data set (one "epoch"); online training updates the weights after the presentation of each vector in the training set.
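To make the difference concrete, here is a minimal sketch of both update schemes for a linear model with squared-error loss (the model, learning rate, and function names are my own illustration, not from any particular library):

    import numpy as np

    def batch_epoch(W, X, Y, lr=0.01):
        """Batch mode: accumulate weight changes over the whole
        training set, then apply a single update per epoch."""
        grad = np.zeros_like(W)
        for x, y in zip(X, Y):
            err = W @ x - y            # prediction error for this vector
            grad += np.outer(err, x)   # gradient of 0.5 * ||W x - y||^2
        return W - lr * grad / len(X)  # one update at the end of the epoch

    def online_epoch(W, X, Y, lr=0.01):
        """Online (stochastic) mode: update the weights after
        the presentation of each training vector."""
        for x, y in zip(X, Y):
            err = W @ x - y
            W = W - lr * np.outer(err, x)  # one update per example
        return W

The mechanical difference is that the online version makes one weight update per training vector (N updates per epoch), while the batch version makes a single averaged update per epoch; that is what the convergence claim below is about.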
I believe the consensus is that online training is superior because it converges much faster (most studies report no apparent difference in accuracy). (See, e.g., Randall Wilson and Tony Martinez, "The General Inefficiency of Batch Training for Gradient Descent Learning," Neural Networks (2003).)