feat(coze-js-web): Add text-to-speech support #97

jackshen310 · 2025-02-13T11:34:37Z

Summary by CodeRabbit

New Features
- The default API endpoint has been updated for improved connectivity.
- Enhanced chat interactions with improved speech processing, including new transcription controls and a dedicated speech toggle in message footers.
- Integrated real-time audio streaming capabilities for a smoother voice interaction experience.

coderabbitai · 2025-02-13T11:34:46Z

Walkthrough

This pull request revises the ChatX module. The configuration file now returns a different default API URL. In the ChatX component, the import of Flex is added while speech handling is updated: speech methods are replaced with transcription methods and a new footer with a speech toggle is introduced. The use-ws-api hook has been enhanced with new types and functions to manage a dedicated speech WebSocket connection, including initialization, messaging, and closure functions.

Changes

| File(s) | Change Summary |
|---|---|
| `.../chat-x/config.ts` | Updated default base URL in `getBaseUrl` from `'https://api.coze.com'` to `'https://api.coze.cn'`. |
| `.../chat-x/index.tsx` | Modified ChatX component: added Flex import; updated speech handling by replacing previous speech methods with `startTranscriptions`/`stopTranscriptions`, and added a footer button controlled by `getIsSpeech`. |
| `.../chat-x/use-ws-api.ts` | Enhanced WebSocket API for speech: introduced new types (`CreateSpeechWsReq`, `CreateSpeechWsRes`), added speech WebSocket management functions (`initSpeechWs`, `closeSpeechWs`, `startSpeech`, `stopSpeech`), and adjusted transcription methods. |

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant C as ChatX Component
    participant WS as useWsAPI Hook
    participant S as Speech WebSocket

    U->>C: Clicks speech toggle button
    C->>WS: Calls startTranscriptions()
    WS->>S: Initializes speech connection (initSpeechWs)
    S-->>WS: Sends onopen/onmessage events
    WS-->>C: Updates speech status via getIsSpeech
    C->>U: Reflects new speech state in UI

Suggested reviewers

jsongo
Tecvan-fe

Poem

In lines of code I softly hop,
A bunny curious, I never stop.
New URLs and WS streams in sight,
Transcriptions dance in digital light.
With joyful hops, I cherish this change,
A carrot of code for our project’s range!
🐇✨


coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

Actionable comments posted: 5

🧹 Nitpick comments (13)

examples/chat-sdk-web/tsconfig.node.json (2)
3-7: Consider relocating the build info file.

Storing build info in node_modules/.tmp may lead to unnecessary rebuilds as this directory is typically cleaned during package operations. Consider moving it to a more persistent location like .tsbuildinfo in the project root.
-    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
+    "tsBuildInfoFile": "./.tsbuildinfo",
9-9: Consider using JSONC extension for TypeScript config.

The file contains comments which violate JSON standard. Consider renaming the file to tsconfig.node.jsonc to indicate it's using JSONC format that supports comments.

Also applies to: 16-16

🧰 Tools

🪛 Biome (1.9.4)

[error] 9-9: JSON standard does not allow comments.

(parse)
examples/chat-sdk-web/tsconfig.app.json (3)
3-3: Consider relocating the build info file.

Storing the build info file in node_modules/.tmp might cause issues as this directory is typically excluded from version control and cleaned during package installations. Consider moving it to a more stable location like .tsbuildinfo in the project root or a dedicated .build directory.
-    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
+    "tsBuildInfoFile": "./.build/tsconfig.app.tsbuildinfo",
18-23: Consider additional type-checking options.

While the current linting configuration is good, consider adding these options to further enhance type safety:
    /* Linting */
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "noFallthroughCasesInSwitch": true,
-    "noUncheckedSideEffectImports": true
+    "noUncheckedSideEffectImports": true,
+    "noImplicitReturns": true,
+    "noImplicitOverride": true
🧰 Tools

🪛 Biome (1.9.4)

[error] 18-18: JSON standard does not allow comments.

(parse)

[error] 19-19: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 20-20: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 21-21: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 22-22: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 23-23: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

25-25: Consider including test files and type declarations.

The current include configuration only covers the "src" directory. Consider explicitly including test files and type declarations:
-  "include": ["src"]
+  "include": ["src", "src/**/*.ts", "src/**/*.tsx", "tests/**/*.ts", "tests/**/*.tsx"]
🧰 Tools

🪛 Biome (1.9.4)

[error] 25-25: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 25-26: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)
examples/chat-sdk-web/src/App.css (2)
8-19: Consider removing premature will-change optimization.

The will-change property should only be used as a last resort for performance issues, not as a preventive measure. Since we're only animating the filter on hover, this optimization is likely unnecessary.
 .logo {
   height: 6em;
   padding: 1.5em;
-  will-change: filter;
   transition: filter 300ms;
 }
30-34: Consider improving selector specificity and animation control.

Two suggestions for improvement:

Replace the nth-of-type selector with a more specific class for better maintainability

Add a way to pause the infinite animation on hover for better user experience
 @media (prefers-reduced-motion: no-preference) {
-  a:nth-of-type(2) .logo {
+  .logo.animate-spin {
     animation: logo-spin infinite 20s linear;
   }
+  .logo.animate-spin:hover {
+    animation-play-state: paused;
+  }
 }
examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (2)
241-289: Initialize speech WebSocket with minor logging fix.

Overall, the setup logic is sound. However, note that the console log on line 269 says "[transcriptions] ws message" though this is clearly the speech WebSocket code path. Consider updating the log statement to maintain clarity.
- console.log('[transcriptions] ws message', data);
+ console.log('[speech] ws message', data);
293-298: Consider providing a closure code for speech WebSocket.

Currently, the code calls speechRef.current.close(); without a specific code or reason. Using a standard code (e.g., 1000) and a short reason string can ease debugging and maintain consistency:
- speechRef.current.close();
+ speechRef.current.close(1000, 'Normal Closure');
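The close-code suggestion above can be sketched in isolation. This is a minimal sketch that assumes only the standard `close(code, reason)` signature of the browser WebSocket API; the `ClosableSocket` interface and the `closeNormally` helper are hypothetical names for illustration, and whether the SDK's WebSocket wrapper forwards these arguments is not confirmed by the PR.

```typescript
// Hypothetical minimal shape; a browser WebSocket satisfies it structurally.
interface ClosableSocket {
  readyState: number;
  close(code?: number, reason?: string): void;
}

// readyState value of an open socket in the WebSocket API (WebSocket.OPEN).
const OPEN_STATE = 1;

// Closes an open socket with status 1000 (normal closure) and a short reason.
// Returns true if close() was called, false if there was nothing to close.
function closeNormally(
  socket: ClosableSocket | null,
  reason = 'Normal Closure',
): boolean {
  if (!socket || socket.readyState !== OPEN_STATE) {
    return false;
  }
  socket.close(1000, reason);
  return true;
}
```

A caller would then replace the bare `close()` with something like `closeNormally(speechRef.current)`, which keeps the code and reason consistent across every close site.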
examples/coze-js-web/src/pages/chat-x/index.tsx (1)

89-102: Integration of new speech/transcriptions methods.

Destructuring startChat, stopChat, startTranscriptions, stopTranscriptions, ... from the updated hook ensures proper usage. The inline callback for TRANSCRIPTIONS_MESSAGE_UPDATE looks correct to update local state.

Consider also handling potential error events in this block for better resilience.
examples/chat-sdk-web/src/main.tsx (1)
6-10: Consider adding null check for root element.

The non-null assertion operator (!) assumes the root element will always exist. Consider adding a null check for better error handling:
-createRoot(document.getElementById('root')!).render(
+const rootElement = document.getElementById('root');
+if (!rootElement) {
+  throw new Error('Failed to find root element');
+}
+createRoot(rootElement).render(
   <StrictMode>
     <App />
   </StrictMode>,
)
examples/chat-sdk-web/eslint.config.js (1)
7-28: Consider adding accessibility-related ESLint rules.

Since this project involves text-to-speech functionality, consider adding eslint-plugin-jsx-a11y to enforce accessibility best practices:
import js from '@eslint/js'
import globals from 'globals'
import reactHooks from 'eslint-plugin-react-hooks'
import reactRefresh from 'eslint-plugin-react-refresh'
+import jsxA11y from 'eslint-plugin-jsx-a11y'
import tseslint from 'typescript-eslint'

export default tseslint.config(
  { ignores: ['dist'] },
  {
-    extends: [js.configs.recommended, ...tseslint.configs.recommended],
+    extends: [
+      js.configs.recommended,
+      ...tseslint.configs.recommended,
+      // Flat config takes the plugin's config object, not a 'plugin:...' string
+      jsxA11y.flatConfigs.recommended,
+    ],
    files: ['**/*.{ts,tsx}'],
    languageOptions: {
      ecmaVersion: 2020,
      globals: globals.browser,
    },
    plugins: {
      'react-hooks': reactHooks,
      'react-refresh': reactRefresh,
+      'jsx-a11y': jsxA11y,
    },
    // ... rest of the config
  }
)
Don't forget to add the plugin to package.json:
{
  "devDependencies": {
+    "eslint-plugin-jsx-a11y": "^6.7.1"
  }
}
examples/chat-sdk-web/src/index.css (1)
1-14: Consider using CSS custom properties for better maintainability.

The root variables could be organized better for theme customization.
 :root {
+  /* Typography */
   font-family: Inter, system-ui, Avenir, Helvetica, Arial, sans-serif;
   line-height: 1.5;
   font-weight: 400;

+  /* Colors */
+  --text-primary: rgba(255, 255, 255, 0.87);
+  --bg-primary: #242424;
   color-scheme: light dark;
-  color: rgba(255, 255, 255, 0.87);
-  background-color: #242424;
+  color: var(--text-primary);
+  background-color: var(--bg-primary);

   font-synthesis: none;
   text-rendering: optimizeLegibility;
   -webkit-font-smoothing: antialiased;
   -moz-osx-font-smoothing: grayscale;
 }

🛑 Comments failed to post (5)

examples/chat-sdk-web/src/App.css (1)
36-42: 🛠️ Refactor suggestion

Ensure sufficient color contrast for accessibility.

The text color #888 might not provide sufficient contrast with the background for optimal readability. Consider using a darker shade to meet WCAG 2.1 Level AA contrast requirements.
 .read-the-docs {
-  color: #888;
+  color: #666;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
.card {
  padding: 2em;
}

.read-the-docs {
  color: #666;
}
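The contrast claim above can be checked numerically. The sketch below implements the WCAG 2.1 relative-luminance and contrast-ratio formulas; the function names are hypothetical, and the white background is an assumption for the light scheme, while `#242424` is the dark background set in `index.css`.

```typescript
// Parses "#rgb" or "#rrggbb" into [r, g, b] channel values in 0..255.
function parseHex(hex: string): [number, number, number] {
  let h = hex.replace('#', '');
  if (h.length === 3) {
    h = h.split('').map(ch => ch + ch).join('');
  }
  if (!/^[0-9a-fA-F]{6}$/.test(h)) {
    throw new Error(`Unsupported color: ${hex}`);
  }
  return [
    parseInt(h.slice(0, 2), 16),
    parseInt(h.slice(2, 4), 16),
    parseInt(h.slice(4, 6), 16),
  ];
}

// Converts one 8-bit sRGB channel to its linear-light value (WCAG 2.1).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG 2.1 relative luminance of an sRGB color.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Level AA requires at least 4.5:1 for normal-size text.
function meetsAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

Comparing `meetsAA('#888', '#ffffff')` with `meetsAA('#666', '#ffffff')`, and then repeating both against `'#242424'`, shows that a darker shade helps on a light background but lowers contrast on the dark one, so the right adjustment depends on the active color scheme.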
examples/chat-sdk-web/src/App.tsx (3)
6-33: ⚠️ Potential issue

Integrate text-to-speech functionality.

Based on the PR objectives, this component should include text-to-speech support:

Here's a suggested integration. Would you like me to provide a complete implementation?
import { useState, useCallback } from 'react'
import { useSpeech } from '@coze/chat-sdk'  // Assuming this hook exists

function App() {
  const [count, setCount] = useState(0)
  const { speak, speaking, supported } = useSpeech()
  
  const handleSpeak = useCallback((text: string) => {
    if (supported) {
      speak(text)
    }
  }, [speak, supported])

  // ... rest of the component
}
12-17: 🛠️ Refactor suggestion

Add missing accessibility attributes to links and images.

External links should indicate they open in a new window, and images should have meaningful alt text:
-        <a href="https://vite.dev" target="_blank">
+        <a href="https://vite.dev" target="_blank" rel="noopener noreferrer" aria-label="Visit Vite documentation (opens in new window)">
           <img src={viteLogo} className="logo" alt="Vite logo" />
         </a>
-        <a href="https://react.dev" target="_blank">
+        <a href="https://react.dev" target="_blank" rel="noopener noreferrer" aria-label="Visit React documentation (opens in new window)">
           <img src={reactLogo} className="logo react" alt="React logo" />
         </a>
📝 Committable suggestion

        <a href="https://vite.dev" target="_blank" rel="noopener noreferrer" aria-label="Visit Vite documentation (opens in new window)">
          <img src={viteLogo} className="logo" alt="Vite logo" />
        </a>
        <a href="https://react.dev" target="_blank" rel="noopener noreferrer" aria-label="Visit React documentation (opens in new window)">
          <img src={reactLogo} className="logo react" alt="React logo" />
        </a>
21-23: 🛠️ Refactor suggestion

Add ARIA attributes to interactive elements.

The button should have an aria-label for better screen reader support:
-        <button onClick={() => setCount((count) => count + 1)}>
+        <button 
+          onClick={() => setCount((count) => count + 1)}
+          aria-label={`Increment count, current value is ${count}`}
+        >
           count is {count}
         </button>
📝 Committable suggestion

        <button 
          onClick={() => setCount((count) => count + 1)}
          aria-label={`Increment count, current value is ${count}`}
        >
          count is {count}
        </button>
examples/chat-sdk-web/package.json (1)
22-27: 💡 Verification agent
🧩 Analysis chain

Consider using stable versions for critical dependencies.

Only one of the dependencies flagged here is an actual pre-release; the other two are worth a quick version check:

eslint v9.14.0 (a regular 9.x release, not a pre-release)

eslint-plugin-react-hooks (beta version)

typescript v5.5.3 (a stable release, but older than the current stable)

🌐 Web query:
What is the latest stable version of TypeScript?
💡 Result:

The latest stable version of TypeScript is 5.7.3, released in January 2025[8][9]. This version includes performance optimizations, stability improvements, and maintains compatibility with recent ECMAScript standards like ES2024[7].

Key details:

Released via npm with the command npm install -D typescript[1][9]

Current beta version is 5.8, which is available for testing[7]

Recent versions have focused on:

Improved type inference patterns[4]

Enhanced editor reliability[4]

Better ESM module support[3][7]

Memory/performance optimizations[1][4]

For production use, TypeScript recommends staying on stable releases like 5.7.x unless specifically needing experimental features from beta versions[7][9]. The TypeScript team maintains a regular release cycle with approximately quarterly updates[4][7].

Citations:

1: https://devblogs.microsoft.com/typescript/announcing-typescript-5-0/

2: https://github.com/microsoft/typescript/releases

3: https://devblogs.microsoft.com/typescript/announcing-typescript-5-4/

4: https://effectivetypescript.com/2024/07/02/ts-55/

5: https://en.wikipedia.org/wiki/TypeScript

6: https://www.npmjs.com/package/typescript/v/4.8.4

7: https://www.typescriptlang.org

8: https://www.npmjs.com/package/typescript

9: https://www.typescriptlang.org/download/
Update Critical Dependency Versions to Stable Releases

Of the dependencies below, only eslint-plugin-react-hooks is pinned to a pre-release; the other two only need a version check:

eslint: Currently at "9.14.0". This is a regular 9.x release rather than a pre-release; confirm it is the version you intend to pin.

eslint-plugin-react-hooks: This dependency is using a beta release ("5.1.0-beta-26f2496093-20240514"); consider switching to a stable version.

typescript: The dependency is defined as "^5.5.3", while the latest stable version is now 5.7.3. The caret range already permits 5.7.x on a fresh install, so raising the minimum is only needed if you want to guarantee the newer performance optimizations and stability improvements.
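The range "^5.5.3" quoted above is a caret range, which under npm's semver rules accepts any later release with the same major version. The sketch below is a simplified check that covers only plain `x.y.z` versions with a major of 1 or higher and ignores pre-release tags; `parseVersion` and `satisfiesCaret` are hypothetical helpers, not part of any package.

```typescript
type Version = [major: number, minor: number, patch: number];

// Parses a plain "x.y.z" version string; pre-release tags are not supported.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (!match) {
    throw new Error(`Not a plain x.y.z version: ${text}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// Simplified caret semantics for major >= 1: same major, and not lower than the base.
function satisfiesCaret(range: string, version: string): boolean {
  if (!range.startsWith('^')) {
    throw new Error(`Not a caret range: ${range}`);
  }
  const base = parseVersion(range.slice(1));
  if (base[0] < 1) {
    throw new Error('This sketch only handles major versions >= 1');
  }
  const candidate = parseVersion(version);
  return candidate[0] === base[0] && compareVersions(candidate, base) >= 0;
}
```

Under these rules `satisfiesCaret('^5.5.3', '5.7.3')` holds, so a fresh install can already resolve to 5.7.3; an existing lockfile may still pin an older release.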

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (4)

241-289: Consider adding error recovery and reconnection logic.

The WebSocket initialization looks good, but there's no automatic reconnection strategy if the connection fails or drops.

Consider adding a reconnection mechanism:

 const initSpeechWs = useCallback(
   async (onMessage?: (data: CreateSpeechWsRes) => void) => {
     closeSpeechWs();
     if (!clientRef.current) {
       return;
     }
+    const maxRetries = 3;
+    let retryCount = 0;
+    
+    const connect = async (): Promise<WebSocketAPI<CreateSpeechWsReq, CreateSpeechWsRes>> => {
       const ws = await clientRef.current.websockets.audio.speech.create();
       
       return new Promise<WebSocketAPI<CreateSpeechWsReq, CreateSpeechWsRes>>(
         (resolve, reject) => {
           ws.onopen = () => {
             console.log('[speech] ws open');
             resolve(ws);
           };
           
           ws.onerror = async (error, event) => {
             console.error('[speech] WebSocket error', error, event);
             ws.close();
-            reject(error);
+            if (retryCount < maxRetries) {
+              retryCount++;
+              console.log(`[speech] Retrying connection (${retryCount}/${maxRetries})...`);
+              try {
+                const newWs = await connect();
+                resolve(newWs);
+              } catch (retryError) {
+                reject(retryError);
+              }
+            } else {
+              reject(error);
+            }
           };
           // ... rest of the implementation
         }
       );
+    };
+    
+    return connect();
   },
   [],
 );
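As an alternative to the recursive retry inside `onerror` shown in the diff above, the retry policy can live in a small generic helper. This is a minimal sketch under stated assumptions: `connectWithRetry` is a hypothetical name, the linear backoff is an illustrative choice, and it only covers a failed initial connection, not a connection that drops after opening.

```typescript
// Calls connect() and retries up to maxRetries more times if it rejects.
// Waits delayMs * attemptNumber between attempts (linear backoff).
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  maxRetries = 3,
  delayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await connect();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        await new Promise<void>(resolve =>
          setTimeout(resolve, delayMs * (attempt + 1)),
        );
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

The connection function passed in would wrap the existing promise that resolves on `onopen` and rejects on `onerror`, which keeps the socket setup code free of retry bookkeeping.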

300-309: Consider optimizing the audio decoding process.

The current implementation creates a new ArrayBuffer for each message. For better performance with large audio streams:

 const handleAudioMessage = useCallback(async (message: string) => {
-  const decodedContent = atob(message);
-  const arrayBuffer = new ArrayBuffer(decodedContent.length);
-  const view = new Uint8Array(arrayBuffer);
-  for (let i = 0; i < decodedContent.length; i++) {
-    view[i] = decodedContent.charCodeAt(i);
-  }
+  const binaryString = atob(message);
+  // Uint8Array.from allocates and fills the array in one step; no extra copy is needed
+  const bytes = Uint8Array.from(binaryString, c => c.charCodeAt(0));
 
-  await wavStreamPlayer.add16BitPCM(arrayBuffer, trackId.current);
+  await wavStreamPlayer.add16BitPCM(bytes.buffer, trackId.current);
 }, []);
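The base64-to-bytes step in the diff above can be isolated into a pure function that is easy to test apart from the audio player. This is a minimal sketch; `decodeBase64ToBytes` is a hypothetical helper name, and it relies only on the standard `atob` global.

```typescript
// Decodes a base64 string into raw bytes.
function decodeBase64ToBytes(base64: string): Uint8Array {
  // atob returns a "binary string" in which each char code is one byte (0..255).
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

Note that `atob` throws on malformed input, so a caller that receives data from the network may want to wrap the call in a `try`/`catch`.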

474-474: Remove or update the documentation link.

The link to internal documentation should be removed or replaced with public documentation.

-    // See https://bytedance.larkoffice.com/docx/Uv6Wd8GTjoEex3xyq4YcxDnRnkc#HrVcdpW4Oo3A25xCQIDcGBlknOc
+    // For API documentation, see: <add public documentation link>

504-504: Consider caching the speech status.

The getIsSpeech function checks the stream status on every call. Consider caching this value.

+const [isSpeechActive, setIsSpeechActive] = useState(false);
+
+useEffect(() => {
+  const streamStatus = wavStreamPlayer.stream !== null;
+  if (streamStatus !== isSpeechActive) {
+    setIsSpeechActive(streamStatus);
+  }
+}, [wavStreamPlayer.stream]);
+
-const getIsSpeech = useCallback(() => wavStreamPlayer.stream !== null, []);
+const getIsSpeech = useCallback(() => isSpeechActive, [isSpeechActive]);

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cbf239 and a71ee51.

📒 Files selected for processing (3)

examples/coze-js-web/src/pages/chat-x/config.ts (1 hunks)
examples/coze-js-web/src/pages/chat-x/index.tsx (4 hunks)
examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (8 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

examples/coze-js-web/src/pages/chat-x/config.ts
examples/coze-js-web/src/pages/chat-x/index.tsx

🔇 Additional comments (2)

examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (2)

8-9: LGTM! Good modular approach to WebSocket types and audio tools.

The addition of speech-specific types and WavStreamPlayer shows good separation of concerns.

Also applies to: 16-16

33-36: LGTM! Well-structured WebSocket reference.

The speechRef follows the same pattern as other WebSocket references, maintaining consistency.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

examples/coze-js-web/src/pages/chat-x/index.tsx (2)

247-263: Enhance accessibility of the speech toggle button.

While the implementation is clean, the button could benefit from accessibility improvements.

Apply this diff to improve accessibility:

 <Button
   size="small"
   type="text"
   icon={<PhoneOutlined />}
+  aria-label={getIsSpeech() ? "Stop speech" : "Start speech"}
+  title={getIsSpeech() ? "Stop speech" : "Start speech"}
   style={{ marginInlineEnd: 'auto' }}
   onClick={() => {
     if (getIsSpeech()) {
       stopSpeech();
     } else {
       startSpeech(msg);
     }
   }}
 />

392-402: Add error handling for transcription operations.

The speech state handling should include error handling to provide better user feedback.

Apply this diff to add error handling:

 onRecordingChange: async nextSpeech => {
   if (nextSpeech) {
     lastContentRef.current = content;
     message.info('Start Voice to Text');
-    await startTranscriptions();
+    try {
+      await startTranscriptions();
+    } catch (error) {
+      message.error('Failed to start voice to text');
+      setSpeech(false);
+      return;
+    }
   } else {
     message.info('Stop Voice to Text');
-    await stopTranscriptions();
+    try {
+      await stopTranscriptions();
+    } catch (error) {
+      message.error('Failed to stop voice to text');
+    }
   }
   setSpeech(nextSpeech);
 },

examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (2)

300-309: Add input validation and error handling for audio message processing.

The audio message handling should include validation and error handling to prevent potential issues.

Apply this diff to improve error handling:

 const handleAudioMessage = useCallback(async (message: string) => {
+  if (!message) {
+    console.warn('[audio] Empty message received');
+    return;
+  }
+  try {
     const decodedContent = atob(message);
     const arrayBuffer = new ArrayBuffer(decodedContent.length);
     const view = new Uint8Array(arrayBuffer);
     for (let i = 0; i < decodedContent.length; i++) {
       view[i] = decodedContent.charCodeAt(i);
     }
     await wavStreamPlayer.add16BitPCM(arrayBuffer, trackId.current);
+  } catch (error) {
+    console.error('[audio] Failed to process audio message:', error);
+  }
 }, []);

293-298: Add cleanup timeout for WebSocket closure.

To ensure proper cleanup, add a timeout to force close the connection if it takes too long.

Apply this diff to add cleanup timeout:

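 const closeSpeechWs = () => {
+  // Capture the socket so the timer can never touch a newer connection
+  const ws = speechRef.current;
+  speechRef.current = null;
+  if (!ws) {
+    return;
+  }
+  if (ws.readyState === WebSocket.OPEN) {
+    ws.close();
+    return;
+  }
+  // Not open yet (e.g. still connecting): force close after a timeout
+  setTimeout(() => {
+    if (ws.readyState !== WebSocket.CLOSED) {
+      console.warn('[speech] Force closing WebSocket after timeout');
+      ws.close(1000, 'force close');
+    }
+  }, 5000);
-  if (speechRef.current && speechRef.current.readyState === WebSocket.OPEN) {
-    speechRef.current.close();
-  }
-  speechRef.current = null;
 };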

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a71ee51 and 5aa122c.

📒 Files selected for processing (3)

examples/coze-js-web/src/pages/chat-x/config.ts (1 hunks)
examples/coze-js-web/src/pages/chat-x/index.tsx (4 hunks)
examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (9 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

examples/coze-js-web/src/pages/chat-x/config.ts

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Node.js v20 (ubuntu-latest)

🔇 Additional comments (2)

examples/coze-js-web/src/pages/chat-x/index.tsx (1)

3-11: LGTM! Clean imports and hook usage.

The imports are properly organized and the hook destructuring includes all necessary methods for speech functionality.

Also applies to: 89-98

examples/coze-js-web/src/pages/chat-x/use-ws-api.ts (1)

461-498: Add rate limiting for speech requests.

The startSpeech function should implement rate limiting to prevent overwhelming the speech service.
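One simple form of the rate limiting suggested above is a minimum-interval gate in front of `startSpeech`. This is a minimal sketch, not the project's implementation: the `MinIntervalGate` class, its interval, and the injectable clock are all hypothetical.

```typescript
// Allows an action at most once per minIntervalMs.
// The clock is injectable so the behavior can be tested without real delays.
class MinIntervalGate {
  private readonly minIntervalMs: number;
  private readonly now: () => number;
  private lastAllowedAt = Number.NEGATIVE_INFINITY;

  constructor(minIntervalMs: number, now: () => number = () => Date.now()) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Returns true and records the time if enough time has passed, otherwise false.
  tryAcquire(): boolean {
    const current = this.now();
    if (current - this.lastAllowedAt < this.minIntervalMs) {
      return false;
    }
    this.lastAllowedAt = current;
    return true;
  }
}
```

A click handler could then call `startSpeech(msg)` only when `gate.tryAcquire()` returns true, silently dropping rapid repeat clicks. A token bucket would be a better fit if short bursts should be allowed.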

codecov · 2025-02-14T08:40:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##             main      #97      +/-   ##
==========================================
+ Coverage   91.82%   92.71%   +0.89%     
==========================================
  Files         117       52      -65     
  Lines        5469     1923    -3546     
  Branches     1106      344     -762     
==========================================
- Hits         5022     1783    -3239     
+ Misses        447      140     -307

| Components | Coverage Δ | |
|---|---|---|
| coze-js | `92.71% <91.22%> (-0.24%)` | ⬇️ |
| realtime-api | `∅ <ø> (∅)` | |

see 72 files with indirect coverage changes

jackshen310 force-pushed the feat/coze-js-web branch from 3cbf239 to a71ee51 Compare February 13, 2025 11:37

coderabbitai bot reviewed Feb 13, 2025

feat(coze-js-web): Add text-to-speech functionality and improve audio…

5aa122c

… handling

jackshen310 force-pushed the feat/coze-js-web branch from a71ee51 to 5aa122c Compare February 14, 2025 08:36

coderabbitai bot reviewed Feb 14, 2025

jsongo approved these changes Feb 14, 2025

jsongo added this pull request to the merge queue Feb 14, 2025

Merged via the queue into coze-dev:main with commit 7476fbd Feb 14, 2025
8 checks passed

This was referenced Feb 20, 2025

feat(coze-js): add WebSocket speech and transcription clients #99

Merged

Feat/realtime console #109

Closed

jackshen310 added a commit that referenced this pull request Mar 6, 2025

feat(coze-js-web): Add text-to-speech functionality and improve audio…

747a497

… handling (#97) Co-authored-by: shenxiaojie.316 <[email protected]>

This was referenced Mar 7, 2025

feat(chat-x): feat: add configurable WebSocket URL #135

Merged

fix(coze/api): fix speech bug #141

Merged

coderabbitai bot mentioned this pull request Apr 9, 2025

feat: Add transcription SDK and update documentation #190

Merged

coderabbitai bot mentioned this pull request Nov 26, 2025

feat(speech/transcription): 音播报提供释放播放器功能并支持语音识别自定义配置 #324

Merged
