Understanding Usage & Costs on Roundhouse

Roundhouse gives you access to a suite of leading AI models from OpenAI, Anthropic, and Google. With the introduction of usage-based charges, this guide will help you understand how costs are calculated, which model to choose for different tasks, and how to get the most out of the platform without unnecessary spend.

What Are Tokens?

Think of tokens as the unit of measurement AI models use to process text — similar to how a phone plan measures data in megabytes. Every word you type and every word the AI responds with is broken down into tokens.

As a rough guide:

  • 1 token ≈ 4 characters of text
  • 100 tokens ≈ 75 words
  • A short question and answer ≈ a few hundred tokens
  • A long document analysis ≈ tens of thousands of tokens
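
If you want to sanity-check these figures yourself, the rules of thumb above can be sketched in a few lines of Python. This is only an approximation: the function names are illustrative, and real token counts vary by model and by the kind of text.

```python
import math


def estimate_tokens_from_chars(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb."""
    return math.ceil(len(text) / 4)


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough estimate using the ~100 tokens per 75 words rule of thumb."""
    return math.ceil(word_count * 100 / 75)
```

Under this rule, a 750-word document works out to roughly 1,000 tokens.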

Your usage charges are based on the number of tokens consumed — both what you send to the model (input) and what it sends back (output).
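
As a rough sketch of how such a charge adds up, the snippet below multiplies input and output tokens by a rate for each. The rates in the example are made-up placeholders, not Roundhouse prices, and the idea that input and output are priced separately per million tokens is an assumption made for illustration.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float,   # hypothetical rate, not a real price
    output_rate_per_million: float,  # hypothetical rate, not a real price
) -> float:
    """Estimated charge for one request: input and output tokens are both counted."""
    total = (
        input_tokens * input_rate_per_million
        + output_tokens * output_rate_per_million
    )
    return total / 1_000_000


# Example with placeholder rates: 2,000 tokens sent, 500 tokens received.
example_cost = estimate_cost(2000, 500, 1.0, 4.0)
```

Check your usage dashboard for actual figures; this sketch only shows that both directions of the conversation contribute to the total.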

Tip: You don’t need to count tokens manually — Roundhouse tracks your usage automatically. Just be mindful that longer conversations and large document uploads will consume more tokens.

Available Models & Pricing Tiers

Not all models cost the same. More powerful models — designed for complex, nuanced tasks — are priced higher than lighter models built for speed and efficiency. Choosing the right model for the job is the single biggest lever you have on your usage costs.

| Model | Provider | Best For | Cost |
|---|---|---|---|
| Auto (Recommended) | | Everyday tasks; Roundhouse selects the best model automatically | Variable |
| GPT-5.5 | OpenAI | Complex reasoning, large documents, nuanced analysis | High |
| GPT-5.4 | OpenAI | Advanced drafting, research, and multi-step tasks | Medium |
| GPT-5.4 Mini | OpenAI | Shorter tasks, quick answers, cost-efficient | Low |
| Claude Opus 4.7 | Anthropic | In-depth analysis, long-form writing, complex problem solving | High |
| Claude Sonnet 4.6 | Anthropic | Balanced capability; great for most professional service tasks | Medium |
| Claude Haiku 4.5 | Anthropic | Fast, lightweight tasks; highly cost-efficient | Low |
| Gemini 3.1 Pro | Google | Large context tasks, document analysis | High |
| Gemini 3.1 Flash-Lite | Google | Quick lookups, simple queries; most cost-efficient | Low |

General rule of thumb: the more capable the model, the higher the cost per token. Lightweight models like Claude Haiku 4.5 and Gemini 3.1 Flash-Lite are significantly cheaper than frontier models like GPT-5.5 or Claude Opus 4.7.

Which Model Should I Use?

Use AUTO or a Lighter Model for Everyday Tasks

For most day-to-day tasks, we recommend selecting AUTO or a lower-tier model. AUTO is Roundhouse’s smart default — it analyses your request and selects an appropriate model automatically, balancing quality and cost.

Good candidates for AUTO or lighter models (e.g. GPT-5.4 Mini, Claude Haiku 4.5, Gemini 3.1 Flash-Lite) include:

  • Quick questions or lookups
  • Summarising short documents
  • Drafting simple emails or messages
  • Basic data formatting or extraction
  • Checking figures or quick calculations

Tip: When in doubt, start with AUTO. It’s designed to give you a great result without over-spending on model capacity you don’t need.

Save the Powerful Models for Complex Tasks

Reserve the larger, more capable models (e.g. GPT-5.5, Claude Opus 4.7, Gemini 3.1 Pro) for tasks that genuinely benefit from their additional horsepower:

  • In-depth tax research or technical analysis
  • Reviewing and drafting complex documents or workpapers
  • Processing very large files or lengthy reports
  • Multi-step reasoning across a lot of data
  • Situations where accuracy is critical and you need the best possible output

Top Tips for Keeping Usage Costs Down

  1. Use AUTO as your default. It intelligently routes your request to a suitable model, so you’re not overpaying for simple tasks.
  2. Be concise in your prompts. Clear, specific instructions reduce back-and-forth and unnecessary token consumption. You don’t need to write an essay — get to the point.
  3. Only upload what’s needed. Large file uploads consume a lot of tokens. Where possible, paste or share only the relevant sections of a document rather than the entire file.
  4. Start a fresh conversation for new tasks. Every message in a conversation adds to the token count, as the model re-reads the full history each time. Starting fresh for unrelated tasks avoids carrying over unnecessary context.
  5. Match the model to the task. Don’t use a premium model when a lighter one will do the job just as well. Save the big models for the tasks that truly need them.
  6. Avoid regenerating responses unnecessarily. Each time you regenerate a response, you’re consuming additional tokens. If the output isn’t quite right, try refining your prompt by editing it rather than regenerating.
  7. Use the Fork Conversation feature. If you have useful context built up at the start of a conversation that you want to reuse, fork it instead of starting from scratch or repeating yourself. This lets you branch off into a new conversation with that context already in place — saving both time and tokens.
  8. Use the Edit Message feature to course-correct. If you aren’t happy with a response, you don’t need to keep prompting forward. Use the Edit Message feature to go back and modify an earlier message. This effectively lets you “time travel” and steer the conversation in a different direction, without burning tokens on a long chain of follow-up messages.
  9. Review your usage regularly. Keep an eye on your usage dashboard to identify patterns and adjust your habits accordingly.
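
To see why a long conversation costs more than several short ones (tip 4), here is a minimal sketch of how input tokens accumulate when the full history is re-read on every turn. It is a simplified model with illustrative names and numbers, and it ignores any caching or trimming the platform may apply.

```python
def total_input_tokens(turns: list[tuple[int, int]]) -> int:
    """Sum the input tokens across a conversation.

    Each turn is (user_message_tokens, reply_tokens). On every turn the
    model is assumed to re-read the whole history plus the new message.
    """
    history = 0
    total = 0
    for user_tokens, reply_tokens in turns:
        total += history + user_tokens          # history is sent again each turn
        history += user_tokens + reply_tokens   # the reply joins the history
    return total


# Four turns of 100 tokens in and 200 tokens out, in one long conversation...
one_long = total_input_tokens([(100, 200)] * 4)        # 2200 input tokens
# ...versus the same four turns split across two fresh conversations.
two_fresh = 2 * total_input_tokens([(100, 200)] * 2)   # 1000 input tokens
```

In this toy example, splitting unrelated work into two conversations cuts the input tokens by more than half.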

Frequently Asked Questions

Will I be charged for uploads or file attachments?

Yes — uploaded files are converted to text and processed as tokens. Larger files will consume more tokens, so upload only the sections relevant to your task where possible.

Does switching models mid-conversation affect cost?

Yes. If you switch to a more powerful model partway through a conversation, subsequent messages — including the full conversation history that is re-read each turn — will be billed at the new model’s rate.
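
A small sketch of that effect, using made-up rates rather than actual Roundhouse prices: the same conversation history costs more to re-read once a pricier model is selected. The function name and the per-million-token rates are assumptions for illustration only.

```python
def turn_input_cost(
    history_tokens: int,
    new_message_tokens: int,
    input_rate_per_million: float,  # hypothetical rate for the selected model
) -> float:
    """Input cost of one turn: the full history plus the new message,
    billed at the rate of whichever model handles this turn."""
    return (history_tokens + new_message_tokens) * input_rate_per_million / 1_000_000


# A 10,000-token history with a 200-token follow-up message.
on_light_model = turn_input_cost(10_000, 200, 1.0)     # placeholder "low" rate
on_premium_model = turn_input_cost(10_000, 200, 10.0)  # placeholder "high" rate
```

With these placeholder rates, the same follow-up message costs ten times as much after switching, because the whole history is billed at the new rate.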

How does AUTO mode choose a model?

AUTO analyses your prompt and context to pick an appropriate model from the available options. It’s designed to balance output quality with cost efficiency — so it won’t spin up a flagship model for a simple question.

Who can I contact about billing queries?

For any billing questions or concerns, contact the AI Dojo support team via the Help beacon, or speak to your platform admin.

Need help? Contact us at support@ai-dojo.com.au