The other problem that bedeviled sites that allowed arbitrary HTML back in the day was crude phishing attempts: convert your user profile into a fake login page with CSS and HTML. Blocking this entirely is probably impossible. I suppose some machine learning could be used to detect pages styled as phishing attempts.
There were also all sorts of ways to sneak JavaScript back in. I remember that embedding a javascript: protocol link inside a Flash applet would do it (Flash eventually blocked that, though).
Pretty sure if there's no JS you could just block iframes and maybe form tags and then people would have no way to submit anything. They could click a malicious link, sure, but they can do that on today's social networks.
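Something like this is what I have in mind, as a rough sketch only. It uses Python's standard html.parser to flag profile HTML that contains tags you'd refuse. The BLOCKED_TAGS set and the find_blocked_tags name are made up for illustration, and a real site would want a proper allowlist-based sanitizer rather than a blocklist like this.

```python
from html.parser import HTMLParser

# Hypothetical blocklist: no script (no JS), no iframes, no forms.
BLOCKED_TAGS = {"script", "iframe", "form"}


class BlockedTagFinder(HTMLParser):
    """Collects the names of blocked tags seen in the input, in order."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <IFRAME> is caught too.
        if tag in BLOCKED_TAGS:
            self.found.append(tag)


def find_blocked_tags(html):
    """Return the blocked tags present in user-supplied HTML."""
    finder = BlockedTagFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    sample = '<p>hi</p><IFRAME src="x"></IFRAME><form><input type="text"></form>'
    print(find_blocked_tags(sample))
```

You'd reject (or strip) the profile whenever the returned list is non-empty. Note that a bare input outside a form passes this check.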
Then you can replace the website "chrome" (the headers, the links back to the rest of the site) with doppelgangers that take you to a phishing page that makes it look like you've been logged out. You'd expect all of those to be internal links, so when they show you a "please log in again" screen you have no reason for suspicion. You can't do that on Facebook today.
Alternatively, you don't need a form tag. Just show a set of login text inputs and an image that looks like a submit button. That button links you to a phishing site that says "oops! try again", and then you put your password in a second time, and this time it's a real form. So you'd have to get rid of text inputs entirely.
If I understand you correctly, those "you are leaving example.com" interstitial pages with a redirect are a solution to this problem, although they are not so pleasant.
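Roughly, the site would rewrite every link in user-supplied HTML that doesn't clearly point back at itself. A minimal sketch, assuming the site lives at example.com and has a hypothetical /leaving interstitial endpoint that takes the destination in a "to" parameter (both the path and the parameter name are invented here). It is not hardened against every URL-parsing trick browsers accept.

```python
from urllib.parse import quote, urlparse

SITE_HOST = "example.com"       # the site's own host
INTERSTITIAL_PATH = "/leaving"  # hypothetical "you are leaving" page


def is_internal(href):
    """True only when the link clearly stays on SITE_HOST."""
    href = href.strip()
    # Browsers may treat backslashes like slashes, so be conservative.
    if "\\" in href:
        return False
    parsed = urlparse(href)
    # Plain relative links such as "/profile/42" have no scheme or host.
    if not parsed.scheme and not parsed.netloc:
        return True
    return parsed.scheme in ("http", "https") and parsed.hostname == SITE_HOST


def rewrite_link(href):
    """Leave internal links alone; route everything else via the interstitial."""
    if is_internal(href):
        return href
    return f"https://{SITE_HOST}{INTERSTITIAL_PATH}?to={quote(href, safe='')}"


if __name__ == "__main__":
    for link in ("/profile/42", "https://example.com/home",
                 "https://evil.test/login", "//evil.test/login"):
        print(link, "->", rewrite_link(link))
```

The fake "chrome" links would then land on the warning page instead of going straight to the phishing site, which is the whole point of the interstitial.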