| Zero-width & invisible formatting | ZWSP, ZWNJ, ZWJ, WJ, BOM, soft hyphen | Removed | AI watermarks, copy-paste artifacts, steganography | 12+ codepoints |
| Bidirectional controls | LRM, RLM, RLO, LRO, FSI, PDI | Removed | Phishing attacks, filename spoofing, mixed-script text | 18 codepoints |
| Smart punctuation | “ ” ‘ ’ — – … | Replaced with ASCII | Word / Pages autocorrect, AI-generated prose | 20 codepoints |
| Non-standard whitespace | NBSP, em space, en space, ideographic space | Replaced with space | HTML `&nbsp;`, Word indentation, CJK paste | 18 codepoints |
| Variation selectors & tag characters | U+FE00–FE0F, U+E0000–E01EF | Removed | AI watermarking schemes, emoji variation hints | 512 codepoints |
| Control & format characters | NUL, BEL, formfeed, interlinear annotations | Removed | Binary data leaks, terminal escapes, Word fields | Entire Cc / Cf categories |
| Embedded object placeholders | U+FFFC | Removed | Word, OneNote, Outlook embedded objects | 1 codepoint |
| Line & paragraph separators | U+2028, U+2029 | Normalized to newline | Word "soft return", rich-text line breaks | 2 codepoints |
| Malformed Unicode | Surrogates, noncharacters, U+FFFD | Removed | Encoding errors, corrupted decodes | Entire Cs category + noncharacters |
| Private-use & unassigned | U+E000–F8FF private-use, Cn code points | Removed in balanced / aggressive | Custom fonts, legacy Mac symbols | Entire Co / Cn categories |
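
The table's category-to-action mapping can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the `sanitize` function and the `REMOVE` / `REPLACE` tables are hypothetical names, the code point lists cover only a few examples per row, and the choice of `--` for an em dash and of keeping tab, newline, and carriage return are assumptions. Private-use and unassigned handling is omitted because it depends on the selected mode.

```python
import unicodedata

# Hypothetical, partial tables for illustration; each row of the
# reference table covers more code points than are listed here.
REMOVE = {
    "\u200b", "\u200c", "\u200d", "\u2060", "\ufeff", "\u00ad",  # zero-width & invisible
    "\u200e", "\u200f", "\u202d", "\u202e", "\u2068", "\u2069",  # bidirectional controls
    "\ufffc",                                                    # embedded object placeholder
    "\ufffd",                                                    # replacement character
}
REPLACE = {
    "\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'",  # smart quotes
    "\u2014": "--", "\u2013": "-", "\u2026": "...",              # dashes, ellipsis (assumed ASCII forms)
    "\u00a0": " ", "\u2003": " ", "\u2002": " ", "\u3000": " ",  # non-standard whitespace
    "\u2028": "\n", "\u2029": "\n",                              # line & paragraph separators
}
KEEP_CONTROLS = {"\n", "\t", "\r"}  # assumption: ordinary whitespace controls survive


def sanitize(text: str) -> str:
    """Apply replace/remove rules one character at a time."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in REPLACE:
            out.append(REPLACE[ch])
        elif ch in REMOVE:
            continue
        elif 0xFE00 <= cp <= 0xFE0F or 0xE0000 <= cp <= 0xE01EF:
            continue  # variation selectors & tag characters
        elif unicodedata.category(ch) in ("Cc", "Cf", "Cs") and ch not in KEEP_CONTROLS:
            continue  # remaining control, format, and surrogate code points
        else:
            out.append(ch)
    return "".join(out)
```

Replacements are checked before removals so that a character such as U+2028 (general category Zl) is normalized rather than dropped, and the category check acts as a catch-all for control and format characters not listed explicitly.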