Posting a very small fragment of the source HTML, trimmed of its surrounding context, would really help in understanding exactly what the underlying issue is.
I've just tested copying superscript HTML characters into Word (Paste Special > Unformatted Text) and it doesn't corrupt anything. Pasted into Notepad++, they just come in as regular unformatted text.
[Edit: my post crossed with your previous reply. The thing is, the "2" is part of the text content in the HTML, so any method to strip it out probably needs to identify it by the HTML formatting tags that surround it.]
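To illustrate what I mean by looking at the surrounding tags, here is a rough Python sketch using only the standard library. It assumes the superscripts are marked up with <sup> tags, which I can't confirm without seeing your source; if they are literal Unicode characters such as "²" instead, this won't catch them. The names SupStripper and strip_superscripts, and the sample string, are made up for the example.

```python
from html.parser import HTMLParser


class SupStripper(HTMLParser):
    """Collect text content, skipping anything nested inside <sup> tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.sup_depth = 0  # greater than 0 while inside a <sup> element

    def handle_starttag(self, tag, attrs):
        if tag == "sup":
            self.sup_depth += 1

    def handle_endtag(self, tag):
        if tag == "sup" and self.sup_depth > 0:
            self.sup_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a superscript.
        if self.sup_depth == 0:
            self.parts.append(data)


def strip_superscripts(html):
    """Return the plain text of `html` with <sup> content removed."""
    parser = SupStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical sample: the superscript "2" goes, the ordinary "2" stays.
sample = "<p>The result was significant<sup>2</sup> in 2 of the trials.</p>"
print(strip_superscripts(sample))
# -> The result was significant in 2 of the trials.
```

The point is that the two "2"s are identical as text, and only the markup tells them apart, so the stripping has to happen before the HTML is flattened to plain text.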
Do you have a sample source doc?