Your phone transmits each image to the iCloud server together with its security envelope (which was computed on-device and contains the neural hash and the "visual derivative"). At that point, Apple does not know whether any of it is CSAM, so the transfer itself is legal.
Then the server determines whether the number of matches exceeds the threshold. Only if it does can the security envelopes of the flagged images, and only those, be unlocked (by crypto magic) and the "visual derivative" be reviewed.
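To make the "crypto magic" a bit more concrete: the threshold part is usually explained in terms of threshold secret sharing. Below is a toy Shamir-style sketch of that idea only, not Apple's actual protocol. The names `make_shares`, `recover`, and `PRIME`, and the numbers used, are my own assumptions for illustration. Each matching envelope would carry one share of a per-account key, and the key can be reconstructed only once enough shares are available.

```python
import secrets

# Field modulus for the toy example (2**127 - 1 is a Mersenne prime).
# This choice is an assumption for illustration, not taken from any real system.
PRIME = 2**127 - 1


def make_shares(secret, threshold, count):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, evaluated mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical usage: a threshold of 3 matches out of 5 uploaded envelopes.
key = secrets.randbelow(PRIME)
shares = make_shares(key, threshold=3, count=5)
print(recover(shares[:3]) == key)  # three shares are enough
print(recover(shares[:2]) == key)  # two shares reveal nothing useful
```

Below the threshold, the shares are consistent with every possible key, which is why the server learns nothing until enough matches accumulate.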
(Note that if E2EE is enabled for the photos at a later stage, the images themselves would never be accessible to Apple or LE, whether flagged or not, if I understand the design correctly.)
Your understanding seems correct. Once Apple's review confirms a match, a CyberTipline report is filed with NCMEC, which operates as a clearinghouse and notifies law enforcement.
Law enforcement then gets a court order, which compels Apple to release whatever requested information it has available about that account.
If photos are later moved outside the key escrow system, Apple would not be able to release the photos' encryption keys, and would only be able to share 'public' photo share information that the user has opted to share without encryption to make it web-accessible.
Presumably, in this case, the visual derivatives would still be used as evidence, but there would be Fifth Amendment arguments around forcing access to a broader set of photos.