AI chat UI patterns beyond ChatGPT

Threads, suggestions, tool calls, streaming. The patterns that make AI products feel modern, and the ones that still look like a fork of the same demo.

Every AI product shipped in the last 18 months has the same chat interface: a centered scroll of messages, an input pinned to the bottom, a paperclip icon, a send button. It's the ChatGPT shape, ported. It's also why most AI apps feel interchangeable: the chrome is identical and the personality is invisible.

[Image: A dark AI chat interface with a streaming assistant message, three suggested follow-up chips below it, and an inline tool-call card showing a web search in progress.]
Streaming text, suggestion chips, tool-call cards. The three details that read as 2026.

The patterns that distinguish the good ones aren't subtle. They're a small list of conventions Vercel's AI SDK, Linear's AI features, Cursor, Raycast, and Claude itself have all converged on. Here's the field guide.

The five surfaces of a modern AI chat

1. The thread

The conversation itself, but with two affordances most teams skip: every user message is editable in place (Cursor's "edit and resend" flow), and every assistant message can be branched from. The thread is a tree, not a list, even if the UI hides the tree most of the time.
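One way to picture "a tree, not a list" is a flat map of messages where each one points at its parent. The sketch below is an assumption about shape, not how Cursor or any other product stores threads; `MessageNode`, `Thread`, `editMessage`, and `visiblePath` are hypothetical names.

```typescript
// Hypothetical message shape: every node points at its parent, so edits
// become sibling branches instead of overwriting history.
type Role = "user" | "assistant";

interface MessageNode {
  id: string;
  parentId: string | null; // null for the first message in a conversation
  role: Role;
  text: string;
}

interface Thread {
  nodes: { [id: string]: MessageNode };
  activeLeafId: string | null; // the end of the branch the UI is showing
}

let nextId = 0;

// Append a message under `parentId` and make it the active leaf.
function addMessage(thread: Thread, parentId: string | null, role: Role, text: string): MessageNode {
  const node: MessageNode = { id: "m" + nextId++, parentId, role, text };
  thread.nodes[node.id] = node;
  thread.activeLeafId = node.id;
  return node;
}

// "Edit and resend": the edited text becomes a sibling of the original
// message (same parent), which starts a new branch.
function editMessage(thread: Thread, messageId: string, newText: string): MessageNode {
  const original = thread.nodes[messageId];
  if (!original) throw new Error("Unknown message: " + messageId);
  return addMessage(thread, original.parentId, original.role, newText);
}

// The linear list the UI renders: walk from the active leaf up to the root.
function visiblePath(thread: Thread): MessageNode[] {
  const path: MessageNode[] = [];
  let cursor = thread.activeLeafId;
  while (cursor !== null) {
    const node = thread.nodes[cursor];
    path.unshift(node);
    cursor = node.parentId;
  }
  return path;
}
```

The UI only ever renders `visiblePath`, so the tree stays hidden until the user switches branches by pointing `activeLeafId` at a different leaf.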

2. Streaming responses, with visible token-by-token render

Don't wait for the full response. Stream tokens as they arrive, with a soft fade-in or a subtle blinking cursor. Users perceive streaming output as 2–3x faster than the same response delivered in a single block, even when the total time is identical. Skip the spinner; the streaming text is the loading state.
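As a rough sketch of "the streaming text is the loading state", the message can be modeled as a small state machine that starts empty, grows token by token, and shows a cursor until the stream ends. The event shape and the names `StreamEvent`, `applyStreamEvent`, and `displayText` are assumptions for illustration, not the AI SDK's API.

```typescript
type StreamStatus = "waiting" | "streaming" | "done";

interface StreamingMessage {
  text: string;
  status: StreamStatus;
}

// Hypothetical events emitted by whatever transport delivers the tokens.
type StreamEvent = { type: "token"; value: string } | { type: "done" };

// Pure reducer: each token extends the text, "done" freezes it.
function applyStreamEvent(msg: StreamingMessage, event: StreamEvent): StreamingMessage {
  if (event.type === "token") {
    return { text: msg.text + event.value, status: "streaming" };
  }
  return { text: msg.text, status: "done" };
}

// What to draw: the text so far, plus a cursor until the stream finishes.
// Before the first token arrives this is just the cursor, so no spinner is needed.
function displayText(msg: StreamingMessage): string {
  return msg.status === "done" ? msg.text : msg.text + "▍";
}
```

Rendering `displayText` on every event gives the token-by-token effect; the fade-in or blink is then a styling concern on the cursor character.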

3. Suggested follow-ups

After every assistant response, render 2–3 small follow-up suggestions below it. Claude, Perplexity, and ChatGPT all do this now. Suggestions are not chrome; they're a UI for the model's own confidence about what the user might ask next, and they materially lift engagement.

4. Tool calls, surfaced as inline cards

When the model calls a tool (searching the web, running code, querying a database), render the call as a small inline card, not a wall of JSON. Show "Searching the web for X" with a spinner, then a compact result list. The user should see what the model is doing and trust the output more for it.

[Image: An inline tool-call card inside a chat message, with a small spinner, a search query, and a collapsed result list with three sources.]
Tool calls as compact cards. Not a wall of JSON.
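To make the card idea concrete, here is a minimal sketch that maps a tool call's state to what the card shows. The `ToolCall` union, the `web_search` tool name, and the three-result cap are all assumptions for illustration; a real SDK will have its own tool-call shape.

```typescript
// Hypothetical tool-call states; not any SDK's actual message format.
type ToolCall =
  | { status: "running"; tool: string; input: string }
  | { status: "done"; tool: string; input: string; results: string[] }
  | { status: "failed"; tool: string; input: string; error: string };

interface ToolCard {
  title: string;
  showSpinner: boolean;
  items: string[]; // compact result list, capped
  overflow: number; // results hidden behind a "show more" control
}

const MAX_VISIBLE_RESULTS = 3; // assumption: three sources fit in a collapsed card

// Human-readable label instead of raw JSON.
function label(call: ToolCall): string {
  const verb =
    call.status === "running" ? "Searching" : call.status === "done" ? "Searched" : "Couldn't search";
  if (call.tool === "web_search") return verb + " the web for " + call.input;
  return call.tool + " (" + call.status + ")";
}

function toCard(call: ToolCall): ToolCard {
  switch (call.status) {
    case "running":
      return { title: label(call), showSpinner: true, items: [], overflow: 0 };
    case "done":
      return {
        title: label(call),
        showSpinner: false,
        items: call.results.slice(0, MAX_VISIBLE_RESULTS),
        overflow: Math.max(0, call.results.length - MAX_VISIBLE_RESULTS),
      };
    case "failed":
      return { title: label(call), showSpinner: false, items: [call.error], overflow: 0 };
  }
}
```

Keeping the mapping in one pure function means the card component never has to look at the raw payload.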

5. The input

Multiline. Auto-growing. Cmd-Enter to send, which keeps plain Enter free for newlines. Attachment slot in the corner. A small model selector if you ship more than one model. Don't overload the input with seven icons; attachments + model + send is the maximum that doesn't feel cluttered.
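The send and auto-grow rules are small enough to sketch as two pure functions. `KeyPress` is a hand-rolled subset of a keyboard event so the example runs without a DOM; treating Ctrl-Enter the same as Cmd-Enter and refusing to send a blank draft are assumptions, not something the guidance above specifies.

```typescript
// Minimal subset of a keyboard event, so the sketch runs without a DOM.
interface KeyPress {
  key: string;
  metaKey: boolean; // Cmd on macOS
  ctrlKey: boolean; // assumption: Ctrl stands in for Cmd on other platforms
}

type InputAction = "send" | "newline" | "none";

// Plain Enter inserts a newline; Enter with the modifier sends a non-empty draft.
function actionFor(press: KeyPress, draft: string): InputAction {
  if (press.key !== "Enter") return "none";
  const modifier = press.metaKey || press.ctrlKey;
  if (!modifier) return "newline";
  return draft.trim().length > 0 ? "send" : "none";
}

// Auto-grow: one row per line of text, clamped between 1 and `maxRows`.
function rowsFor(draft: string, maxRows: number): number {
  const lines = draft.split("\n").length;
  return Math.min(Math.max(lines, 1), maxRows);
}
```

In a real component, `actionFor` would be called from the textarea's keydown handler and `rowsFor` would drive its `rows` attribute; past `maxRows` the textarea scrolls instead of growing.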

The interaction patterns that matter most

  • Stop generation. A visible "Stop" button while streaming, replaced by "Regenerate" when complete. Critical for long responses and the single most common AI-app UX bug when missing.
  • Copy and edit on every message. Hover any assistant message and small icons appear: copy, regenerate, share. Hover any user message and an edit icon appears.
  • Per-message timestamp on hover. Don't show timestamps inline by default; they make the thread feel like a chat log instead of a conversation. Reveal on hover for the rare time someone needs them.
  • Markdown rendering done right. Code blocks with syntax highlighting and a copy button. Headings with hierarchy. Lists indented properly. Most AI products ship with broken markdown for the first six months; don't be one of them.
  • Auto-scroll, but pause on user scroll. When new content streams in, scroll to follow it. The moment the user scrolls up to read something, stop auto-scrolling and add a "new message" pill that floats at the bottom.
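The auto-scroll rule in the last bullet is easy to get subtly wrong, so here is a minimal sketch of it as a state machine. The metric names mirror the DOM's `scrollTop`, `clientHeight`, and `scrollHeight`; the 24px threshold and the function names are assumptions.

```typescript
// Mirrors the DOM scroll properties of the thread container.
interface ScrollMetrics {
  scrollTop: number;
  clientHeight: number;
  scrollHeight: number;
}

interface FollowState {
  following: boolean; // auto-scroll is active
  showNewMessagePill: boolean;
}

const BOTTOM_THRESHOLD_PX = 24; // assumption: "close enough" to count as at the bottom

function isNearBottom(m: ScrollMetrics): boolean {
  return m.scrollHeight - (m.scrollTop + m.clientHeight) <= BOTTOM_THRESHOLD_PX;
}

// The user scrolled: follow only while they stay at the bottom.
function onUserScroll(state: FollowState, m: ScrollMetrics): FollowState {
  const atBottom = isNearBottom(m);
  return {
    following: atBottom,
    showNewMessagePill: atBottom ? false : state.showNewMessagePill,
  };
}

// New content streamed in: keep following, or raise the pill if paused.
function onNewContent(state: FollowState): FollowState {
  return state.following ? state : { following: false, showNewMessagePill: true };
}

// Clicking the pill jumps back to the bottom and resumes following.
function onPillClick(): FollowState {
  return { following: true, showNewMessagePill: false };
}
```

The caller scrolls the container to the bottom after each update only while `following` is true.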

The chrome decisions that matter

Sidebar for chat history

A left sidebar with grouped conversations (Today / Yesterday / Previous 7 days), search at the top, pinned items above the groups, settings near the bottom. The sidebar is collapsible but not hidden; users return to past conversations more than they expect to.
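The grouping itself can be sketched as a pure function over the conversation list. The `Conversation` shape and the extra "Older" bucket for anything past seven days are assumptions; the other group names come straight from the layout described above.

```typescript
interface Conversation {
  id: string;
  title: string;
  updatedAt: Date;
  pinned: boolean;
}

// "Older" is an assumed catch-all for conversations past the seven-day window.
type Group = "Pinned" | "Today" | "Yesterday" | "Previous 7 days" | "Older";

const GROUP_ORDER: Group[] = ["Pinned", "Today", "Yesterday", "Previous 7 days", "Older"];
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Local midnight for the given date, as a timestamp.
function startOfDay(d: Date): number {
  return new Date(d.getFullYear(), d.getMonth(), d.getDate()).getTime();
}

function groupFor(c: Conversation, now: Date): Group {
  if (c.pinned) return "Pinned";
  // Compare calendar days, not elapsed hours; rounding absorbs DST shifts.
  const days = Math.round((startOfDay(now) - startOfDay(c.updatedAt)) / MS_PER_DAY);
  if (days <= 0) return "Today";
  if (days === 1) return "Yesterday";
  if (days <= 7) return "Previous 7 days";
  return "Older";
}

// Non-empty sections in display order, newest conversation first in each.
function groupConversations(list: Conversation[], now: Date): { group: Group; items: Conversation[] }[] {
  return GROUP_ORDER.map((group) => ({
    group,
    items: list
      .filter((c) => groupFor(c, now) === group)
      .sort((a, b) => b.updatedAt.getTime() - a.updatedAt.getTime()),
  })).filter((section) => section.items.length > 0);
}
```

Comparing calendar days rather than elapsed hours matters: a chat from 11pm last night should land under "Yesterday" even if it is only ten hours old.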

Empty state as a launchpad

When a user opens a new chat, the empty thread should suggest 4–6 specific things to try. Not "How can I help you today?" but "Summarize my last meeting," "Draft a follow-up email," "Review this PR." The empty state is the moment that decides whether a user understands what the product is for.

Density that matches the audience

Consumer AI products run wide and airy (ChatGPT, Claude). Developer-facing AI products run tighter (Cursor, Cody). Match the density to your audience; if your users are devs, your line-height and font-size should look more like an editor than a marketing page.

The mistakes that make AI apps feel cheap

  1. Avatars on every assistant message. A single "Claude" or model label per response is enough. Repeating an avatar 40 times in a thread looks like a Discord channel.
  2. A loading state that isn't streaming. A spinner where token-by-token text could be is the single largest reason an AI product feels slow.
  3. No way to start a new chat. "New chat" should be a visible button in the sidebar and a ⌘N shortcut. Burying it in a menu is the kind of mistake every team makes once.
  4. Saved conversations with auto-generated titles that never update. Title the conversation after the first exchange, not after the first user message. The user's first message is rarely descriptive enough.
  5. Personality copy that's louder than the response. "Sure, I'd be happy to help with that!" before every reply is the model's voice, not your product's. Trim it in the system prompt.

If you're building on the AI SDK

Vercel's AI SDK ships the streaming, tool calls, and message shape primitives. Pair with shadcn/ui for the components and Lucide for the icon set. Most production AI apps in 2026 are running some version of that stack, and the parts that look custom are the system prompts and the suggestions logic, not the UI primitives.

Frequently asked

Should every AI feature ship as a chat?

No, and the trend is moving away from it. Chat is the right surface for open-ended exploration. For focused tasks (improve this paragraph, generate a commit message), an inline action button outperforms a chat sidebar. Ship chat where the user genuinely has a conversation; ship inline AI where they have a one-shot task.

How do I handle streaming errors mid-response?

Show the partial response with a small inline error banner below it ("Connection lost, regenerate?"). Don't delete what was already streamed; users get upset when half-written text disappears. Offer a retry that picks up from the partial.
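A minimal sketch of that behavior, assuming the message is held as plain state: a failure changes the status and adds a banner but never touches the text, and a retry appends to what is already there. The `RetryRequest` shape is hypothetical; whether your backend can actually continue from a partial response depends on the model API you use.

```typescript
type Status = "streaming" | "done" | "error";

interface AssistantMessage {
  text: string; // whatever has streamed so far
  status: Status;
  errorBanner: string | null;
}

// On failure, keep the partial text and attach the inline banner.
function onStreamFailure(msg: AssistantMessage): AssistantMessage {
  return { text: msg.text, status: "error", errorBanner: "Connection lost, regenerate?" };
}

// Hypothetical retry payload: ask the backend to continue from the partial.
interface RetryRequest {
  continueFrom: string;
}

function buildRetry(msg: AssistantMessage): RetryRequest {
  return { continueFrom: msg.text };
}

// Tokens from the retry extend the existing text and clear the banner.
function onRetryToken(msg: AssistantMessage, token: string): AssistantMessage {
  return { text: msg.text + token, status: "streaming", errorBanner: null };
}
```

If the backend cannot continue from a prefix, the same state shape still works: the retry regenerates from scratch and the partial is replaced only once the first new token arrives.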

Where should I put the model selector?

Inside the input, on the left, as a small button that opens a popover. Not above the thread, not in the header. The model choice is per-conversation, and putting it next to the input means users associate the model with what they're about to send.

Is there a standard for citation and source rendering?

Inline numeric citations linking to a source list at the end of the message. Perplexity sets the bar; Bing Chat (Copilot) follows it. Don't ship hovercards for citations on first pass; the standard inline link is enough.
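The numbering logic behind that pattern can be sketched in a few lines: each distinct source gets a number the first time it is cited, and repeat citations reuse it. The `Segment` shape is an assumption about how a message might arrive already split into text and citations; it is not how Perplexity or any SDK represents them.

```typescript
// Hypothetical pre-parsed message: runs of text interleaved with citations.
type Segment =
  | { kind: "text"; value: string }
  | { kind: "cite"; url: string; title: string };

interface Rendered {
  body: string; // message text with inline [n] markers
  sources: string[]; // numbered source list for the end of the message
}

function renderWithCitations(segments: Segment[]): Rendered {
  const urls: string[] = [];
  const sources: string[] = [];
  let body = "";
  for (let i = 0; i < segments.length; i++) {
    const seg = segments[i];
    if (seg.kind === "text") {
      body += seg.value;
      continue;
    }
    // Reuse the number if this URL was already cited earlier in the message.
    let index = urls.indexOf(seg.url);
    if (index === -1) {
      urls.push(seg.url);
      index = urls.length - 1;
      sources.push(index + 1 + ". " + seg.title + " (" + seg.url + ")");
    }
    body += "[" + (index + 1) + "]";
  }
  return { body, sources };
}
```

In a real renderer each `[n]` would be a link to the matching entry in the source list rather than plain text.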

Ship one

The interaction entry in the directory has tuned prompts for streaming threads, suggestion chips, and tool-call cards in the AI SDK / shadcn shape. Pair with the forms entry for the multiline input and the features entry if you're showcasing AI features on a marketing page. The chat UI you ship in the product should look like the screenshot on the landing page.
