Your question really comes down to how a delta-sigma A/D works.
The A/D internally models the input voltage with a stream of bits. Each bit can only indicate the minimum or maximum voltage. The aim is to make a stream of bits that, when averaged together, accurately represents the input voltage. The count of 1 bits out of the total number of bits is the ultimate digital answer of the delta-sigma A/D converter.
At least that's how a basic delta-sigma works. There are more wrinkles in most real implementations, like higher-order modulators. Let's ignore those for now and focus just on a basic bare-bones delta-sigma so that the fanciness doesn't obscure the basic workings.
Let's use an 8-bit delta-sigma converter as an example. It internally produces a stream of 255 bits. The higher the voltage being converted, the more 1s need to be in this stream. The final 8-bit answer is the count of 1s in the stream.
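As a quick worked example of reading the result back as a voltage, assume the converter's range runs from V_min to V_max (names chosen here for illustration, they are not in your diagram) and that the mapping is ideal and linear:

$$V_{in} \approx V_{min} + \frac{\text{count}}{255}\,\left(V_{max} - V_{min}\right)$$

For instance, with an assumed 0 V to 5 V range, a count of 128 corresponds to roughly 128/255 × 5 V ≈ 2.51 V.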
So how do you generate this stream of bits? By constantly tracking the voltage that stream of bits represents so far, then deciding whether that's currently too high or too low. If too low, you make the next bit a 1. If too high, you make the next bit a 0. Now that new bit gets included in the internal voltage representing all bits so far. Then you check whether that's too high or too low again, make the next bit, etc.
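That simplistic procedure can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the function name, the 0 to 1 V default range, and the choice to treat the empty stream as representing the minimum voltage are all made up for the example, and a real converter does this in analog hardware rather than by dividing counts.

```python
def greedy_bitstream(vin, vmin=0.0, vmax=1.0, n_bits=8):
    """Return (bits, ones) for the simplistic 'track the average' scheme.

    vmin/vmax are the voltages a 0 bit and a 1 bit stand for (assumed values).
    """
    total_bits = 2 ** n_bits - 1          # 255 for the 8-bit example
    bits = []
    ones = 0
    for produced in range(total_bits):
        if produced == 0:
            # No bits yet: assume the stream represents the minimum voltage.
            represented = vmin
        else:
            # Voltage the bits so far represent: their average, scaled to the range.
            represented = vmin + (vmax - vmin) * ones / produced
        # Too low -> next bit is 1, otherwise -> next bit is 0.
        bit = 1 if represented < vin else 0
        bits.append(bit)
        ones += bit
    return bits, ones


bits, ones = greedy_bitstream(0.3)
print(ones, "ones out of", len(bits), "bits")
```

The final count of ones divided by 255 should land close to the input's position within the range, which is the whole point of the scheme.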
That's the simplistic concept. The system in your diagram achieves the same thing by averaging out the error from each bit to the input signal, then comparing that to a fixed threshold, rather than averaging all the bits and comparing that to the input signal. It comes out to the same thing.
So where does the integrator get this "error from each bit to the input signal" from to integrate? That's what the diff amp makes. It literally subtracts the input signal from the last bit value (either full high or full low voltage) to make the error contributed by the last bit. Integrating this error produces the running average of the error from all bits produced in the stream so far. That gets compared to a fixed threshold by the comparator block, which results in the next bit in the stream.
What the block diagram doesn't show is the counter that keeps track of the number of 1 bits produced. You keep running this process 2^N - 1 times, which results in an N-bit value proportional to the input voltage.
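To tie the blocks together, here is a rough discrete-time sketch of the loop: diff amp, integrator, comparator, and the counter. It is a simplified model, not a description of any particular chip. The function name, the 0 to 1 V default levels, the threshold of 0, and starting as if the previous bit had been a 0 are all assumptions made for the example.

```python
def delta_sigma_convert(vin, v_low=0.0, v_high=1.0, n_bits=8):
    """Model a basic first-order delta-sigma conversion; returns the count of 1 bits."""
    integrator = 0.0            # running sum of the per-bit errors
    last_bit_voltage = v_low    # assumption: start as if the previous bit was 0
    threshold = 0.0             # fixed comparator threshold (assumed to be 0)
    count = 0                   # the counter the block diagram leaves out

    for _ in range(2 ** n_bits - 1):       # 255 passes for 8 bits
        # Diff amp: error of the last bit relative to the input signal.
        error = last_bit_voltage - vin
        # Integrator: accumulate the error from all bits so far.
        integrator += error
        # Comparator: bits so far too low (negative sum) -> emit 1, else 0.
        bit = 1 if integrator < threshold else 0
        # The new bit becomes the full-high or full-low feedback voltage.
        last_bit_voltage = v_high if bit else v_low
        count += bit

    return count


for v in (0.0, 0.25, 0.5, 1.0):
    print(v, "->", delta_sigma_convert(v))
```

Note that the loop never divides anything or averages the bits explicitly. Keeping the integrated error near the threshold is what forces the density of 1s to follow the input.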