Text Inspection Studio

Proofmark

Paste any text. We surface and clean hidden, malformed, or suspicious characters — without touching your emoji.

  1. Paste your text on the left
  2. Click Inspect (or press Ctrl/Cmd + Enter)
  3. Copy the cleaned output on the right
Learn more about Proofmark
How it works

Inspect text, review replacements, and copy the current output.

Proofmark is built for deliberate cleanup. Inspect every hidden or suspicious character, review the recommended output, and override exact codepoints when you need a different replacement.

1. Paste source text

Drop in the text you want to inspect, keep the original visible, and choose the cleanup mode that fits the job.

2. Inspect and review

Run inspection, review the recommended cleanup, and decide whether to keep, remove, or custom-replace specific codepoints.

3. Copy the current output

Copy or export the current output only after you have reviewed the exact replacements you want.

Comparison

Recommended cleanup when you want speed. Custom replacements when you want precision.

Recommended cleanup

Start with the recommended cleanup to remove hidden controls, normalize suspicious spaces, and keep the readable structure of your text.

Selective by codepoint

Override a specific codepoint when you want to keep an em dash, replace it with a semicolon, or remove it entirely across the current text.
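
A minimal sketch of how a per-codepoint override could sit on top of a recommended cleanup map. The names `RECOMMENDED` and `clean`, the three-entry map, and the convention that `None` means "keep" and an empty string means "remove" are all assumptions made for illustration; this page does not describe Proofmark's actual implementation.

```python
from typing import Dict, Optional

# Hypothetical excerpt of a recommended cleanup map (codepoint -> replacement).
RECOMMENDED: Dict[int, str] = {
    0x2014: "-",  # em dash -> hyphen
    0x200B: "",   # zero-width space -> removed
    0x00A0: " ",  # no-break space -> space
}


def clean(text: str, overrides: Optional[Dict[int, Optional[str]]] = None) -> str:
    """Apply the recommended map, letting per-codepoint overrides win.

    An override value of None keeps the character, "" removes it,
    and any other string replaces it everywhere in the text.
    """
    overrides = overrides or {}
    out = []
    for ch in text:
        cp = ord(ch)
        if cp in overrides:
            replacement = overrides[cp]
            out.append(ch if replacement is None else replacement)
        else:
            out.append(RECOMMENDED.get(cp, ch))
    return "".join(out)


print(clean("wait\u2014what"))                 # wait-what
print(clean("wait\u2014what", {0x2014: ";"}))  # wait;what
print(clean("wait\u2014what", {0x2014: ""}))   # waitwhat
```

Checking overrides before the default map is what lets a single decision about one codepoint apply across the whole text without changing how other findings are handled.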

Selective by category

Apply broader keep-or-remove policies by category when you want quick control without touching every individual finding.

Personal-use policy

Built for personal use and careful review, with no intentional text archive.

Proofmark is meant for personal inspection and cleanup work. It processes text for the current request, returns the result to the studio, and avoids user accounts, saved workspaces, and long-term text history in this build.

No saved workspaces

This build does not include accounts or stored project histories.

No retention by design

The app returns the result and moves on instead of building a text archive.

Restricted use

This build is not licensed for third-party hosting, resale, white-labeling, or monetized clones without written permission.

FAQ

Common questions

What kinds of characters does Proofmark catch?

Proofmark focuses on hidden formatting controls, bidirectional marks, suspicious whitespace, malformed Unicode, orphan combining marks, and common typographic variants that are risky in plain text.

Can I remove only specific characters like em dashes or zero-width spaces?

Yes. Use the repeated issue controls to override the cleanup for a single codepoint, or use the category controls when you want to keep or remove a whole class of characters at once.

Does this build keep my text?

No. This build does not save user history. Proofmark processes text for the request and returns the result without user accounts, saved workspaces, or an intentional archive.

Can another company host or resell this build?

No. The current product policy is personal-use only. Third-party hosting, resale, white-labeling, sublicensing, and monetized clones require written permission from the owner.

Reference & learning

The sections below cover what Proofmark catches and when to use it.

What Proofmark can clean (10 categories)

Every category below is caught by default in balanced mode.

| Category | Examples | Action | Where it comes from | Scope |
| --- | --- | --- | --- | --- |
| Zero-width & invisible formatting | ZWSP, ZWNJ, ZWJ, WJ, BOM, soft hyphen | Removed | AI watermarks, copy-paste artifacts, steganography | 12+ codepoints |
| Bidirectional controls | LRM, RLM, RLO, LRO, FSI, PDI | Removed | Phishing attacks, filename spoofing, mixed-script text | 18 codepoints |
| Smart punctuation | “ ” ‘ ’ — – … | Replaced with ASCII | Word / Pages autocorrect, AI-generated prose | 20 codepoints |
| Non-standard whitespace | NBSP, em space, en space, ideographic space | Replaced with space | HTML &nbsp; entities, Word indentation, CJK paste | 18 codepoints |
| Variation selectors & tag characters | U+FE00–FE0F, U+E0000–E01EF | Removed | AI watermarking schemes, emoji variation hints | 512 codepoints |
| Control & format characters | NUL, BEL, formfeed, interlinear annotations | Removed | Binary data leaks, terminal escapes, Word fields | Entire Cc / Cf categories |
| Embedded object placeholders | U+FFFC | Removed | Word, OneNote, Outlook embedded objects | 1 codepoint |
| Line & paragraph separators | U+2028, U+2029 | Normalized to newline | Word "soft return", rich-text line breaks | 2 codepoints |
| Malformed Unicode | Surrogates, noncharacters, U+FFFD | Removed | Encoding errors, corrupted decodes | Entire Cs category + noncharacters |
| Private-use & unassigned | U+E000–F8FF private-use, Cn code points | Removed in balanced / aggressive | Custom fonts, legacy Mac symbols | Entire Co / Cn categories |
Common use cases (6 scenarios)

Pick the scenario closest to yours and follow the path.

Cleaning AI-generated text

You copied a response from ChatGPT, Claude, or Gemini and want to publish it without telltale em-dashes, curly quotes, or invisible watermark characters.

  1. Paste the AI output into Source Text
  2. Leave the mode on Balanced
  3. Review smart-punctuation and hidden-formatting findings
  4. Copy Current Output with the watermark characters gone

Sanitizing Word / Pages content

You pasted from a Word doc into a Markdown file, code editor, or commit message and ended up with non-breaking spaces, smart quotes, and stray formatting.

  1. Paste the Word content into Source Text
  2. Use Balanced for most cases
  3. Confirm NBSP, em-space, and curly quotes are flagged
  4. Copy the cleaned ASCII-friendly result

Auditing suspicious links or filenames

You received a URL, filename, or username that looks normal but might contain a right-to-left override, BOM, or other spoofing trick.

  1. Paste the suspect string into Source Text
  2. Scan the Issues panel for bidirectional controls
  3. Check the per-character context view
  4. Compare original vs cleaned to see the real content
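
The audit above can be approximated outside the studio. The sketch below, under stated assumptions, swaps each bidirectional control for a visible token so the stored character order becomes readable. The codepoint set follows the 18 bidirectional controls in Proofmark's tables; the function names and the `<U+XXXX>` token format are hypothetical.

```python
# Bidirectional controls as listed in Proofmark's tables (18 codepoints).
BIDI_CONTROLS = frozenset(
    [0x061C, 0x200E, 0x200F]
    + list(range(0x202A, 0x202F))  # embeddings, overrides, pop: U+202A..U+202E
    + list(range(0x2066, 0x2070))  # isolates and shaping controls: U+2066..U+206F
)


def reveal_bidi(text: str) -> str:
    """Replace each bidirectional control with a visible <U+XXXX> token."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ord(ch) in BIDI_CONTROLS else ch for ch in text
    )


def strip_bidi(text: str) -> str:
    """Drop bidirectional controls, leaving characters in stored order."""
    return "".join(ch for ch in text if ord(ch) not in BIDI_CONTROLS)


suspect = "invoice\u202Egpj.exe"  # may display as "invoiceexe.jpg"
print(reveal_bidi(suspect))  # invoice<U+202E>gpj.exe
print(strip_bidi(suspect))   # invoicegpj.exe
```

Comparing the revealed form with the stripped form shows the real extension (`.exe`) that the right-to-left override was hiding.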

Pre-publish QA for blog posts, READMEs, commits

Before you hit publish or push, you want to confirm the text contains zero hidden characters that could break rendering or leak through as noise.

  1. Paste the final draft into Source Text
  2. Switch to Aggressive for strictest cleanup
  3. Confirm the "no findings" state if everything is clean
  4. Otherwise review, override per-category, and copy

Stripping OneNote / Outlook paste residue

Content copied from OneNote, Outlook, or Teams often drags along embedded object placeholders, interlinear annotations, and hidden format controls.

  1. Paste the email or note into Source Text
  2. Leave the mode on Balanced, which handles U+FFFC and format characters
  3. Review embedded-object and format-control findings
  4. Copy the plain-text result

Normalizing text pasted from a PDF

PDFs often render with soft hyphens across line breaks, ligatures, and unusual whitespace that breaks downstream search and diffs.

  1. Paste the extracted PDF text into Source Text
  2. Switch to Aggressive to fold ligatures (fi, fl, ff)
  3. Confirm that soft hyphens are removed and unusual spaces are normalized
  4. Copy the normalized plain text
What characters does Proofmark clean? (full list)

Hidden & control characters (32)

Invisible characters that paste through silently from AI tools, Word, OneNote, and HTML.

| Codepoint | Name | Action | Reason |
| --- | --- | --- | --- |
| U+00AD | soft hyphen | removed | Removed discretionary hyphenation hint. |
| U+034F | combining grapheme joiner | removed | Removed invisible grapheme joiner. |
| U+061C | arabic letter mark | removed | Removed bidirectional control. |
| U+180E | mongolian vowel separator | removed | Removed deprecated invisible separator. |
| U+200B | zero-width space | removed | Removed zero-width separator. |
| U+200C | zero-width non-joiner | removed | Removed zero-width separator. |
| U+200D | zero-width joiner | removed | Removed zero-width joiner. |
| U+200E | left-to-right mark | removed | Removed bidirectional control. |
| U+200F | right-to-left mark | removed | Removed bidirectional control. |
| U+202A | left-to-right embedding | removed | Removed bidirectional control. |
| U+202B | right-to-left embedding | removed | Removed bidirectional control. |
| U+202C | pop directional formatting | removed | Removed bidirectional control. |
| U+202D | left-to-right override | removed | Removed bidirectional control. |
| U+202E | right-to-left override | removed | Removed bidirectional control. |
| U+2060 | word joiner | removed | Removed invisible joiner. |
| U+2061 | function application | removed | Removed invisible operator. |
| U+2062 | invisible times | removed | Removed invisible operator. |
| U+2063 | invisible separator | removed | Removed invisible separator. |
| U+2064 | invisible plus | removed | Removed invisible operator. |
| U+2066 | left-to-right isolate | removed | Removed bidirectional control. |
| U+2067 | right-to-left isolate | removed | Removed bidirectional control. |
| U+2068 | first strong isolate | removed | Removed bidirectional control. |
| U+2069 | pop directional isolate | removed | Removed bidirectional control. |
| U+206A | inhibit symmetric swapping | removed | Removed bidirectional control. |
| U+206B | activate symmetric swapping | removed | Removed bidirectional control. |
| U+206C | inhibit arabic form shaping | removed | Removed bidirectional control. |
| U+206D | activate arabic form shaping | removed | Removed bidirectional control. |
| U+206E | national digit shapes | removed | Removed bidirectional control. |
| U+206F | nominal digit shapes | removed | Removed bidirectional control. |
| U+FEFF | byte order mark | removed | Removed zero-width byte order mark. |
| U+FFFC | object replacement character | removed | Removed placeholder for an embedded object that did not paste as text. |
| U+FFFD | replacement character | removed | Removed malformed decode artifact. |
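
As a rough illustration, the 32 codepoints in this table can be stripped with a plain set lookup. This is a naive sketch, not Proofmark's code: the names are hypothetical, and unlike the studio (which says it leaves emoji alone) this version removes every U+200D, which would split emoji that are built from zero-width-joiner sequences.

```python
# The 32 codepoints from the "Hidden & control characters" table.
HIDDEN_CODEPOINTS = frozenset(
    [0x00AD, 0x034F, 0x061C, 0x180E, 0xFEFF, 0xFFFC, 0xFFFD]
    + list(range(0x200B, 0x2010))  # U+200B..U+200F: ZWSP, ZWNJ, ZWJ, LRM, RLM
    + list(range(0x202A, 0x202F))  # U+202A..U+202E: embeddings and overrides
    + list(range(0x2060, 0x2065))  # U+2060..U+2064: word joiner, invisible operators
    + list(range(0x2066, 0x2070))  # U+2066..U+206F: isolates, shaping controls
)


def strip_hidden(text: str) -> str:
    """Remove every character whose codepoint is in the table above.

    Caveat: this also removes U+200D inside emoji ZWJ sequences.
    """
    return "".join(ch for ch in text if ord(ch) not in HIDDEN_CODEPOINTS)


print(strip_hidden("pass\u200bword\ufeff"))  # password
```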

Suspicious whitespace (18)

Non-standard spaces from HTML &nbsp; entities, Word indentation, and pasted CJK content.

| Codepoint | Name | Action | Result |
| --- | --- | --- | --- |
| U+00A0 | no-break space | replaced | replaced with space |
| U+1680 | ogham space mark | replaced | replaced with space |
| U+2000 | en quad | replaced | replaced with space |
| U+2001 | em quad | replaced | replaced with space |
| U+2002 | en space | replaced | replaced with space |
| U+2003 | em space | replaced | replaced with space |
| U+2004 | three-per-em space | replaced | replaced with space |
| U+2005 | four-per-em space | replaced | replaced with space |
| U+2006 | six-per-em space | replaced | replaced with space |
| U+2007 | figure space | replaced | replaced with space |
| U+2008 | punctuation space | replaced | replaced with space |
| U+2009 | thin space | replaced | replaced with space |
| U+200A | hair space | replaced | replaced with space |
| U+202F | narrow no-break space | replaced | replaced with space |
| U+205F | medium mathematical space | replaced | replaced with space |
| U+2800 | braille pattern blank | replaced | replaced with space |
| U+3000 | ideographic space | replaced | replaced with space |
| U+3164 | hangul filler | replaced | replaced with space |
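
The same replacement can be sketched with Python's built-in `str.translate`, which accepts a mapping from codepoints to replacement strings. The map below is built from the 18 rows of this table; the names `WHITESPACE_MAP` and `normalize_spaces` are hypothetical.

```python
# The 18 codepoints from the "Suspicious whitespace" table, each mapped to U+0020.
WHITESPACE_MAP = {
    cp: " "
    for cp in [
        0x00A0,                   # no-break space
        0x1680,                   # ogham space mark
        *range(0x2000, 0x200B),   # en quad through hair space (U+2000..U+200A)
        0x202F,                   # narrow no-break space
        0x205F,                   # medium mathematical space
        0x2800,                   # braille pattern blank
        0x3000,                   # ideographic space
        0x3164,                   # hangul filler
    ]
}


def normalize_spaces(text: str) -> str:
    """Replace each listed non-standard space with a plain ASCII space."""
    return text.translate(WHITESPACE_MAP)


print(normalize_spaces("10\u00a0km\u3000east"))  # 10 km east
```

Each character is replaced one-for-one, so runs of unusual spaces stay the same length; collapsing repeated spaces would be a separate step that this table does not describe.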

Smart punctuation (20)

Typographic variants from Word/Pages autocorrect and AI-generated text.

| Codepoint | Name | Action | Result |
| --- | --- | --- | --- |
| U+2010 | hyphen | replaced | replaced with '-' |
| U+2011 | non-breaking hyphen | replaced | replaced with '-' |
| U+2012 | figure dash | replaced | replaced with '-' |
| U+2013 | en dash | replaced | replaced with '-' |
| U+2014 | em dash | replaced | replaced with '-' |
| U+2015 | horizontal bar | replaced | replaced with '-' |
| U+2018 | left single quotation mark | replaced | replaced with "'" |
| U+2019 | right single quotation mark | replaced | replaced with "'" |
| U+201A | single low-9 quotation mark | replaced | replaced with "'" |
| U+201B | single high-reversed-9 quotation mark | replaced | replaced with "'" |
| U+201C | left double quotation mark | replaced | replaced with '"' |
| U+201D | right double quotation mark | replaced | replaced with '"' |
| U+201E | double low-9 quotation mark | replaced | replaced with '"' |
| U+201F | double high-reversed-9 quotation mark | replaced | replaced with '"' |
| U+2023 | triangular bullet | replaced | replaced with '•' |
| U+2026 | horizontal ellipsis | replaced | replaced with '...' |
| U+2032 | prime | replaced | replaced with "'" |
| U+2033 | double prime | replaced | replaced with '"' |
| U+2043 | hyphen bullet | replaced | replaced with '-' |
| U+2212 | minus sign | replaced | replaced with '-' |
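
A minimal sketch of these 20 replacements as a translation map. The name `fold_punctuation` is hypothetical, and the map simply mirrors the table, including the triangular bullet, which maps to U+2022 rather than to an ASCII character.

```python
# The 20 codepoints from the "Smart punctuation" table.
PUNCTUATION_MAP = {
    **{cp: "-" for cp in range(0x2010, 0x2016)},  # hyphen through horizontal bar
    **{cp: "'" for cp in range(0x2018, 0x201C)},  # single quotation marks
    **{cp: '"' for cp in range(0x201C, 0x2020)},  # double quotation marks
    0x2023: "\u2022",  # triangular bullet -> bullet (not ASCII, as in the table)
    0x2026: "...",     # horizontal ellipsis -> three full stops
    0x2032: "'",       # prime
    0x2033: '"',       # double prime
    0x2043: "-",       # hyphen bullet
    0x2212: "-",       # minus sign
}


def fold_punctuation(text: str) -> str:
    """Replace typographic punctuation with the plain forms from the table."""
    return text.translate(PUNCTUATION_MAP)


print(fold_punctuation("\u201cHi\u201d \u2014 it\u2019s fine\u2026"))
# "Hi" - it's fine...
```

Note that the ellipsis expands from one character to three, so character offsets computed on the source text no longer line up with the output after this step.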

Line separators (2)

Unicode line and paragraph separators normalized to plain newlines.

| Codepoint | Name | Action | Result |
| --- | --- | --- | --- |
| U+2028 | line separator | replaced | normalized to newline |
| U+2029 | paragraph separator | replaced | normalized to newline |
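
In Python this normalization is a pair of `str.replace` calls. The sketch assumes "newline" means U+000A, which the table does not spell out; the function name is hypothetical.

```python
def normalize_separators(text: str) -> str:
    """Turn U+2028 and U+2029 into plain line feeds (assumed to mean U+000A)."""
    return text.replace("\u2028", "\n").replace("\u2029", "\n")


print(repr(normalize_separators("one\u2028two\u2029three")))  # 'one\ntwo\nthree'
```

Python's `str.splitlines()` already treats both separators as line boundaries, which is one way these characters cause line counts to disagree between tools.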

By Unicode category (9)

Whole categories of characters are caught generically, even if not listed by codepoint above.

| Category | Name | Action | Notes |
| --- | --- | --- | --- |
| Cc | control characters | removed | ASCII and C1 control codes (NUL, BEL, etc.) |
| Cf | format characters | removed | Invisible formatting (interlinear annotations, hidden joiners) |
| Cs | surrogate code points | removed | Malformed UTF-16 fragments |
| Co | private-use characters | removed | Custom glyphs with no standard meaning (kept in conservative mode) |
| Cn | unassigned code points | removed | Suspicious unmapped code points (kept in conservative mode) |
|  | noncharacters | removed | U+FDD0–FDEF and any U+xxFFFE / U+xxFFFF |
|  | variation selectors | removed | U+FE00–FE0F and U+E0100–E01EF (steganography vectors) |
|  | tag characters | removed | U+E0000–E007F (used by some AI watermarking schemes) |
|  | orphan combining marks | removed | Combining marks with no visible base character |
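
Category-level detection of this kind can be sketched with Python's standard `unicodedata` module. Several things here are assumptions rather than documented Proofmark behaviour: the function names, the decision to keep newline, carriage return, and tab even though they belong to Cc, and the working definition of an "orphan" mark as one that follows the start of the text, whitespace, or a control/format character.

```python
import unicodedata
from typing import List, Optional

REMOVED_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn"}
KEPT_CONTROLS = {"\n", "\r", "\t"}  # assumption: structural whitespace survives


def classify(ch: str) -> Optional[str]:
    """Return a label for a character that would be removed, else None."""
    cp = ord(ch)
    if ch in KEPT_CONTROLS:
        return None
    # Noncharacters: U+FDD0..U+FDEF plus the last two codepoints of every plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "noncharacter"
    if 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
        return "variation selector"
    if 0xE0000 <= cp <= 0xE007F:
        return "tag character"
    category = unicodedata.category(ch)
    return category if category in REMOVED_CATEGORIES else None


def orphan_mark_indexes(text: str) -> List[int]:
    """Indexes of combining marks that have no visible base before them."""
    orphans = []
    has_base = False
    for i, ch in enumerate(text):
        category = unicodedata.category(ch)
        if category.startswith("M"):
            if not has_base:
                orphans.append(i)
        else:
            # Whitespace and C* characters cannot carry a combining mark.
            has_base = not ch.isspace() and not category.startswith("C")
    return orphans


print(classify("\u200b"))               # Cf
print(classify("\U000E0041"))           # tag character
print(orphan_mark_indexes("a \u0301"))  # [2]
```

The range checks run before the general-category lookup because variation selectors are category Mn and most tag characters are Cf, so a category test alone would either miss them or label them less specifically.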

Modes: Balanced (default) handles all of the above. Aggressive additionally folds compatibility forms (fullwidth ASCII, ligatures, circled digits). Conservative keeps visible typographic characters, private-use code points, and unassigned code points.
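
The compatibility folding that aggressive mode describes resembles Unicode NFKC normalization, which Python exposes as `unicodedata.normalize`. Treating the two as equivalent is an assumption: this page does not say which method Proofmark uses, and NFKC also rewrites characters beyond the three examples named above (superscript digits, for instance), so it may be broader than the real mode.

```python
import unicodedata


def fold_compatibility(text: str) -> str:
    """Fold compatibility forms using NFKC (an assumed stand-in for aggressive mode)."""
    return unicodedata.normalize("NFKC", text)


# Fullwidth letters, the "fi" ligature (U+FB01), and circled digit one (U+2460).
print(fold_compatibility("\uFF21\uFF22\uFF23 \uFB01le \u2460"))  # ABC file 1
```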