Emit INVALID_TOKEN_IN_PLACEHOLDER on invalid placeholder token #2252

lutejka · 2025-08-28T19:31:37Z

Emit the correct error code when a named interpolation placeholder contains an invalid character.

This fixes #2247.

Summary by CodeRabbit

Bug Fixes
- Stricter validation for named placeholders: detects trailing invalid characters, reports clearer errors, and handles whitespace (including full-width spaces) more accurately during parsing.
Documentation
- Added a formal rule for placeholder names with examples ({msg}, {_userName}, {user-id}, {total$}); minor formatting cleanup.
Tests
- Expanded parameterized tests for invalid placeholders to broaden coverage without behavioral changes.

coderabbitai · 2025-08-28T19:31:44Z

Walkthrough

Adds a formalized placeholder name rule to docs, tightens tokenizer parsing to detect trailing invalid characters inside named placeholders and emit a specific error, and expands tests to cover symbol and accented-character invalid cases. No public API changes.

Changes

Cohort / File(s)	Summary
Documentation: Placeholder syntax `docs/guide/essentials/syntax.md`	Adds a formal rule for named interpolation placeholders (start with letter/_; continue with letters, digits, _, -, $) with examples and minor whitespace formatting.
Compiler: Tokenizer validation `packages/message-compiler/src/tokenizer.ts`	After assembling a named identifier, adds an inline check for trailing invalid characters (non-`}`/non-whitespace) that reads the invalid portion, emits `INVALID_TOKEN_IN_PLACEHOLDER`, and returns the combined invalid token; retains existing EOF/unterminated-brace handling.
Tests: Named placeholder errors `packages/message-compiler/test/tokenizer/named.test.ts`	Refactors imports and replaces looped error assertions with `test.each`, adding cases for `'$'`, `'-'`, `'àaa'`, and `'aàa'`, asserting the same error outcomes with explicit locations.
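
For readers who want the documented rule in executable form, it can be written as one regular expression. This is only an illustrative sketch: `PLACEHOLDER_NAME` and `isValidPlaceholderName` are hypothetical names, not exports of `@intlify/message-compiler`, and the tokenizer itself scans character by character rather than matching a regex.

```typescript
// Hypothetical helper, not part of @intlify/message-compiler.
// Documented rule: start with an ASCII letter or `_`, then any mix of
// ASCII letters, digits, `_`, `-`, or `$`.
const PLACEHOLDER_NAME = /^[A-Za-z_][A-Za-z0-9_$-]*$/

function isValidPlaceholderName(name: string): boolean {
  return PLACEHOLDER_NAME.test(name)
}

// Names taken from the summary examples and the new test cases.
const samples = ['msg', '_userName', 'user-id', 'total$', '$', '-', '\u00E0aa', 'a\u00E0a']
for (const name of samples) {
  console.log(name + ': ' + (isValidPlaceholderName(name) ? 'valid' : 'invalid'))
}
```

Under this sketch the four documented examples pass, while a name that starts with `$` or `-`, or that contains `à` anywhere, is rejected.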

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Src as Source
  participant Tok as Tokenizer
  participant Err as ErrorEmitter

  rect rgb(250,250,255)
    note over Tok: parseNamedPlaceholder()
    Tok->>Src: read initial char
    alt valid start (A-Z, a-z, _)
      Tok->>Src: consume valid continuation (letters/digits/_/-/$)
      Tok->>Src: peek next char
      alt next is "}"
        Tok-->>Src: return name token
      else next is EOF or whitespace
        Tok-->>Src: return name token
      else invalid trailing content
        Tok->>Src: readInvalidIdentifier()
        Tok->>Err: emit INVALID_TOKEN_IN_PLACEHOLDER
        Tok-->>Src: return combined invalid token
      end
    else invalid start or EOF
      Tok->>Err: emit existing UNTERMINATED/INVALID error
    end
  end
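
The flow in the diagram can be sketched over a plain string. Everything here is an assumption made for illustration: `readNamed`, `NamedResult`, and the helper predicates are hypothetical, the real code works on the tokenizer's scanner, and for brevity this sketch reports an invalid first character through the same path as invalid trailing content instead of the separate branch shown in the diagram.

```typescript
// Illustrative sketch only; none of these names come from tokenizer.ts.
type NamedResult = { token: string; error?: 'INVALID_TOKEN_IN_PLACEHOLDER' }

const isNameStart = (ch: string): boolean => /^[A-Za-z_]$/.test(ch)
const isNameContinue = (ch: string): boolean => /^[A-Za-z0-9_$-]$/.test(ch)
const isWhitespace = (ch: string): boolean =>
  ch === ' ' || ch === '\n' || ch === '\u3000'

// Reads a placeholder name starting at `pos`, the index just after `{`.
function readNamed(source: string, pos: number): NamedResult {
  let i = pos
  let name = ''
  if (i < source.length && isNameStart(source.charAt(i))) {
    name += source.charAt(i++)
    while (i < source.length && isNameContinue(source.charAt(i))) {
      name += source.charAt(i++)
    }
  }
  // Stopped at `}`, whitespace, or end of input: nothing invalid trails the name.
  const next = source.charAt(i) // '' at end of input
  if (next === '' || next === '}' || isWhitespace(next)) {
    return { token: name }
  }
  // Otherwise collect the invalid run up to a brace or whitespace.
  let invalid = ''
  while (i < source.length) {
    const ch = source.charAt(i)
    if (ch === '{' || ch === '}' || isWhitespace(ch)) break
    invalid += ch
    i++
  }
  return { token: name + invalid, error: 'INVALID_TOKEN_IN_PLACEHOLDER' }
}

console.log(readNamed('user-id}', 0))
console.log(readNamed('a\u00E0a}', 0))
```

The second call returns the combined token (`a` plus the invalid run) together with the error, which mirrors the "return combined invalid token" step.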

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Assessment against linked issues

Objective	Addressed	Explanation
Allow acute accents within placeholder names [#2247]	❌	Changes explicitly treat accented characters as invalid and tests expect errors for `à` cases.
Prevent tokenizer failure by handling accented names gracefully [#2247]	❌	Tokenizer now emits an INVALID_TOKEN_IN_PLACEHOLDER for non-ASCII sequences instead of accepting them; no handling to accept accents was added.

Assessment against linked issues: Out-of-scope changes

Code Change	Explanation
Formalizing ASCII-focused placeholder syntax (docs/guide/essentials/syntax.md)	The linked issue requests accepting accented characters; documenting an ASCII-restricted rule (allowing `-` and `$`) does not implement acceptance of accents and is unrelated to the requested fix.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b74058b and 703b67b.

📒 Files selected for processing (3)

docs/guide/essentials/syntax.md (2 hunks)
packages/message-compiler/src/tokenizer.ts (1 hunks)
packages/message-compiler/test/tokenizer/named.test.ts (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

packages/message-compiler/test/tokenizer/named.test.ts
packages/message-compiler/src/tokenizer.ts
docs/guide/essentials/syntax.md

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

docs/guide/essentials/syntax.md (1)
274-274: Wording nit

Tighten phrasing.
-The `message.greeting` is linked to the locale message key `message.name`.
+The `message.greeting` links to the `message.name` key.
packages/message-compiler/test/tokenizer/named.test.ts (1)
649-723: Accented placeholders are asserted invalid; this conflicts with Issue #2247’s expectation

The test suite codifies àaa and aàa as invalid, but the linked issue expected placeholders with acute accents to parse without errors. Please confirm product direction (ASCII-only vs. Unicode letters). If Unicode is desired, these cases should move to a “valid” set and tokenizer should allow \p{L}.
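
If the Unicode direction were chosen, the name rule could look roughly like the sketch below. This is an assumption about what "allow `\p{L}`" might mean, not code from the PR; `UNICODE_PLACEHOLDER_NAME` and `isValidUnicodeName` are hypothetical names. The pattern is built with the `RegExp` constructor so it does not depend on the compile target accepting the `u` flag in a literal.

```typescript
// Hypothetical Unicode-aware variant of the placeholder rule (not from the PR).
// \p{L} matches any Unicode letter, \p{N} any Unicode number.
const UNICODE_PLACEHOLDER_NAME = new RegExp('^[\\p{L}_][\\p{L}\\p{N}_$\\-]*$', 'u')

function isValidUnicodeName(name: string): boolean {
  return UNICODE_PLACEHOLDER_NAME.test(name)
}

console.log(isValidUnicodeName('\u00E0aa')) // precomposed "à" followed by "aa"
console.log(isValidUnicodeName('a\u0300aa')) // "a" + combining grave accent
```

One design point to settle in that case: a decomposed accent (base letter plus combining mark) belongs to category M rather than L, so this pattern accepts the precomposed form but still rejects the decomposed one unless input is normalized first or `\p{M}` is added to the continuation set.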

If keeping ASCII-only, consider adding an explicit positive test documenting rejection of non-ASCII:
test('rejects non-ASCII letters in named placeholder', () => {
  const errors: CompileError[] = []
  parse(`{àaa}`, { onError: e => errors.push(e) } as TokenizeOptions)
  expect(errors[0].code).toBe(CompileErrorCodes.INVALID_TOKEN_IN_PLACEHOLDER)
})
Also suggest adding coverage for full‑width space inside braces to lock intended behavior:
test('full-width space inside placeholder', () => {
  const errors: CompileError[] = []
  parse(`{name　}`, { onError: e => errors.push(e) } as TokenizeOptions)
  // Decide and assert: either allowed (no errors) or unterminated/error.
})
packages/message-compiler/src/tokenizer.ts (1)
491-508: Minor correctness/consistency nits in invalid-trailer check

Typo: “stoped” → “stopped”.

Use TokenChars.BraceRight instead of raw '}' for consistency.

currentChar !== EOF is redundant given the truthy guard.
-    // Check if takeNamedIdentifierChar stoped because of invalid characters
+    // Check if takeNamedIdentifierChar stopped because of invalid characters
     const currentChar = scnr.currentChar()
     if (
-      currentChar &&
-      currentChar !== '}' &&
-      currentChar !== EOF &&
-      !isWhitespace(currentChar)
+      currentChar &&
+      currentChar !== TokenChars.BraceRight &&
+      !isWhitespace(currentChar)
     ) {
       const invalidPart = readInvalidIdentifier(scnr)
       emitError(
         CompileErrorCodes.INVALID_TOKEN_IN_PLACEHOLDER,
         currentPosition(),
         0,
         name + invalidPart
       )
       return name + invalidPart
     }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f809bb3 and b74058b.

📒 Files selected for processing (3)

docs/guide/essentials/syntax.md (2 hunks)
packages/message-compiler/src/tokenizer.ts (2 hunks)
packages/message-compiler/test/tokenizer/named.test.ts (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

packages/message-compiler/test/tokenizer/named.test.ts (2)
- packages/message-compiler/src/tokenizer.ts (2): parse (1014-1025), ERROR_DOMAIN (40-40)
- packages/message-compiler/src/errors.ts (4): CompileErrorCodes (25-46), CompileErrorCodes (53-54), errorMessages (57-78), CompileError (14-17)

packages/message-compiler/src/tokenizer.ts (1)
- packages/message-compiler/src/errors.ts (2): CompileErrorCodes (25-46), CompileErrorCodes (53-54)

🔇 Additional comments (3)

docs/guide/essentials/syntax.md (1)
29-31: Clarify ASCII-only placeholder identifiers and fix grammar
Update to:
The variable name inside `{}` must start with an ASCII letter (a–z, A–Z) or underscore (`_`), followed by any combination of ASCII letters, digits, underscores (`_`), hyphens (`-`), or dollar signs (`$`).

Note: Non-ASCII letters (e.g. `é`, `à`) are not supported and will raise `INVALID_TOKEN_IN_PLACEHOLDER`.
Confirm whether accented identifiers should be allowed per Issue #2247; if so, adjust the parser and tests accordingly.
packages/message-compiler/test/tokenizer/named.test.ts (1)

6-7: LGTM – import re-org

No behavioral change; reads clearer.

packages/message-compiler/src/tokenizer.ts (1)

793-799: Follow-up: ensure downstream doesn’t assume Named tokens are always valid

After emitting INVALID_TOKEN_IN_PLACEHOLDER you still return a Named token containing invalid chars. Verify parser/codegen paths won’t mis-handle this (e.g., assuming a following BraceRight or using the invalid name as an object key without guarding).

coderabbitai · 2025-08-28T19:40:38Z

packages/message-compiler/src/tokenizer.ts

+  function isWhitespace(ch: string): boolean {
+    return ch === SPACE || ch === NEW_LINE || ch === '\u3000' // includes full-width space
+  }
+


💡 Verification agent

🧩 Analysis chain

Unify whitespace handling (include U+3000 everywhere spaces are skipped)

You introduced isWhitespace including U+3000, but peekSpaces, skipSpaces, isTextStart, and isInvalidIdentifier still only consider SPACE/NEW_LINE. This can produce inconsistent states (e.g., {name　} may still trip UNTERMINATED_CLOSING_BRACE). Use isWhitespace in those helpers.

Proposed changes (outside this hunk) for consistency:

// Replace implementations to rely on isWhitespace:
function peekSpaces(scnr: Scanner): string {
  let buf = ''
  while (true) {
    const ch = scnr.currentPeek()
    if (!ch || !isWhitespace(ch)) break
    buf += ch
    scnr.peek()
  }
  return buf
}

function isTextStart(scnr: Scanner, reset = true): boolean {
  const fn = (hasSpace = false, prev = ''): boolean => {
    const ch = scnr.currentPeek()
    if (ch === TokenChars.BraceLeft) return hasSpace
    if (ch === TokenChars.LinkedAlias || !ch) return hasSpace
    if (ch === TokenChars.Pipe) return !(isWhitespace(prev))
    if (isWhitespace(ch)) { scnr.peek(); return fn(true, ch) }
    return true
  }
  const ret = fn()
  reset && scnr.resetPeek()
  return ret
}

function isInvalidIdentifier(ch: string): boolean {
  return (
    ch !== TokenChars.BraceLeft &&
    ch !== TokenChars.BraceRight &&
    !isWhitespace(ch)
  )
}

🏁 Script executed:

#!/bin/bash
# Search for whitespace handling in helper functions
rg -n "function peekSpaces" -C3 packages/message-compiler/src/tokenizer.ts
rg -n "function skipSpaces" -C3 packages/message-compiler/src/tokenizer.ts
rg -n "function isTextStart" -C3 packages/message-compiler/src/tokenizer.ts
rg -n "function isInvalidIdentifier" -C3 packages/message-compiler/src/tokenizer.ts

Length of output: 1154

🏁 Script executed:

#!/bin/bash
# Show isWhitespace definition
rg -n "function isWhitespace" -C3 packages/message-compiler/src/tokenizer.ts
# Show any uses of isWhitespace
rg -n "isWhitespace" -C3 packages/message-compiler/src/tokenizer.ts
# Show broader context for isTextStart
rg -n "function isTextStart" -C10 packages/message-compiler/src/tokenizer.ts
# Show context for skipSpaces/peekSpaces
rg -n "function skipSpaces" -C10 packages/message-compiler/src/tokenizer.ts
rg -n "function peekSpaces" -C10 packages/message-compiler/src/tokenizer.ts
# Show context for isInvalidIdentifier
rg -n "function isInvalidIdentifier" -C10 packages/message-compiler/src/tokenizer.ts

Length of output: 3541

Replace SPACE/NEW_LINE checks with isWhitespace in these helpers:
- peekSpaces & skipSpaces
- isTextStart
- isInvalidIdentifier

This ensures full-width space (U+3000) is treated as whitespace everywhere.
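
To make the consistency argument concrete, here is a small sketch of a shared whitespace predicate driving a skip helper. It runs over a plain string index rather than the tokenizer's `Scanner`, and the `skipSpaces` signature and `FULL_WIDTH_SPACE` constant shown here are assumptions for illustration only.

```typescript
// Sketch over a string cursor; the real helpers take the tokenizer's Scanner.
const SPACE = ' '
const NEW_LINE = '\n'
const FULL_WIDTH_SPACE = '\u3000' // hypothetical constant name

function isWhitespace(ch: string): boolean {
  return ch === SPACE || ch === NEW_LINE || ch === FULL_WIDTH_SPACE
}

// Returns the index of the first non-whitespace character at or after `pos`.
function skipSpaces(source: string, pos: number): number {
  let i = pos
  while (i < source.length && isWhitespace(source.charAt(i))) i++
  return i
}

// `{name<U+3000>}`: skipping from just after the name lands on the closing brace.
const src = '{name\u3000}'
const afterName = 1 + 'name'.length
console.log(src.charAt(skipSpaces(src, afterName))) // prints "}"
```

If `skipSpaces` compared against `SPACE` and `NEW_LINE` directly, the same call would stop on U+3000 and the closing brace would not be reached, which is the inconsistency described above.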

🤖 Prompt for AI Agents

In packages/message-compiler/src/tokenizer.ts around lines 168 to 171, several helper functions (peekSpaces, skipSpaces, isTextStart, isInvalidIdentifier) still check for SPACE or NEW_LINE directly; replace those direct comparisons with calls to the isWhitespace(ch) helper so that full-width space (U+3000) is recognized everywhere. Update each occurrence to use isWhitespace(ch) (or the logical equivalent when checking sequences) and run tests to confirm no behavior change other than treating U+3000 as whitespace.

github-actions · 2025-09-05T07:11:07Z

Size Report

Bundles

File	Size	Gzip	Brotli
core.esm-browser.prod.js	38.22 kB (+0.16 kB)	11.67 kB (+0.05 kB)	10.41 kB (+0.03 kB)
core.global.prod.js	31.24 kB (+0.13 kB)	10.82 kB (+0.04 kB)	9.70 kB (-0.01 kB)
core.runtime.esm-browser.prod.js	23.77 kB	7.70 kB	6.90 kB
core.runtime.global.prod.js	18.08 kB	7.03 kB	6.31 kB
message-compiler.esm-browser.prod.js	19.45 kB (+0.16 kB)	5.81 kB (+0.04 kB)	5.18 kB (+0.04 kB)
message-compiler.global.prod.js	17.46 kB (+0.13 kB)	5.59 kB (+0.04 kB)	5.01 kB (+0.03 kB)
petite-vue-i18n-core.esm-browser.prod.js	20.70 kB	6.98 kB	6.27 kB
petite-vue-i18n-core.global.prod.js	15.66 kB	6.08 kB	5.51 kB
petite-vue-i18n.esm-browser.prod.js	37.17 kB (+0.16 kB)	11.41 kB (+0.05 kB)	10.23 kB (+0.04 kB)
petite-vue-i18n.global.prod.js	30.00 kB (+0.13 kB)	10.26 kB (+0.05 kB)	9.24 kB (+0.03 kB)
petite-vue-i18n.runtime.esm-browser.prod.js	22.58 kB	7.35 kB	6.64 kB
petite-vue-i18n.runtime.global.prod.js	16.86 kB	6.42 kB	5.82 kB
vue-i18n.esm-browser.prod.js	50.87 kB (+0.16 kB)	15.22 kB (+0.05 kB)	13.58 kB (+0.03 kB)
vue-i18n.global.prod.js	40.71 kB (+0.13 kB)	13.67 kB (+0.04 kB)	12.29 kB (+0.04 kB)
vue-i18n.runtime.esm-browser.prod.js	36.28 kB	11.17 kB	10.03 kB
vue-i18n.runtime.global.prod.js	27.56 kB	9.88 kB	8.91 kB

Usages

Name	Size	Gzip	Brotli
packages/size-check-core (@intlify/core)	9.22 kB	3.82 kB	3.46 kB
packages/size-check-petite-vue-i18n (petite-vue-i18n)	78.12 kB	30.46 kB	27.44 kB
packages/size-check-vue-i18n (vue-i18n)	83.34 kB	31.94 kB	28.72 kB

pkg-pr-new · 2025-09-05T07:18:51Z

Open in StackBlitz

@intlify/core

npm i https://pkg.pr.new/@intlify/core@2252

@intlify/core-base

npm i https://pkg.pr.new/@intlify/core-base@2252

@intlify/devtools-types

npm i https://pkg.pr.new/@intlify/devtools-types@2252

@intlify/message-compiler

npm i https://pkg.pr.new/@intlify/message-compiler@2252

petite-vue-i18n

npm i https://pkg.pr.new/petite-vue-i18n@2252

@intlify/shared

npm i https://pkg.pr.new/@intlify/shared@2252

vue-i18n

npm i https://pkg.pr.new/vue-i18n@2252

@intlify/vue-i18n-core

npm i https://pkg.pr.new/@intlify/vue-i18n-core@2252

commit: 703b67b

kazupon

Thank you for your contribution!
I've just checked your code.

LGTM!

…G_BRACE when invalid token is in placeholder and update docs (#2255) * Emit INVALID_TOKEN_IN_PLACEHOLDER instead of UNTERMINATED_CLOSING_BRACE when invalid token is in placeholder and update docs (#2252) * chore: drop node v18 --------- Co-authored-by: lutejka <[email protected]>

coderabbitai bot reviewed Aug 28, 2025

Emit INVALID_TOKEN_IN_PLACEHOLDER instead of UNTERMINATED_CLOSING_BRACE when invalid token is in placeholder and update docs

703b67b

lutejka force-pushed the fix-error-on-invalid-token-in-placeholder branch from b74058b to 703b67b on August 29, 2025 12:19

kazupon added the labels "Type: Improvement" (Includes backwards-compatible fixes) and "🍰 p2-nice-to-have" (Priority 2: nothing is broken but it's worth addressing) on Sep 5, 2025

kazupon approved these changes Sep 5, 2025

kazupon merged commit fdc9541 into intlify:master Sep 5, 2025
29 of 30 checks passed

kazupon pushed a commit that referenced this pull request Sep 5, 2025

Emit INVALID_TOKEN_IN_PLACEHOLDER instead of UNTERMINATED_CLOSING_BRACE when invalid token is in placeholder and update docs (#2252)

5a1e4f2

kazupon mentioned this pull request Sep 5, 2025

fix: Emit INVALID_TOKEN_IN_PLACEHOLDER instead of UNTERMINATED_CLOSING_BRACE when invalid token is in placeholder and update docs #2255

Merged
