By Kashif Ullah · Published May 22, 2026 · 12 min read

#streamlit
#python
#production
#deployment

From Streamlit Prototype to Production: A Checklist

Streamlit is great for prototypes and bad for production. Here's how to know when to keep it, when to wrap it, and when to rewrite.

Streamlit is the best tool I know for proving a data-driven idea quickly. You can go from “what if we had a dashboard for this?” to a working app in an afternoon. It’s also the cause of more “we can’t scale this” conversations than anything else in Python land. The problem isn’t Streamlit — it’s that people mistake a prototype tool for a production framework and then spend months fighting its limitations instead of making a clean decision.

After migrating a dozen Streamlit apps to production architectures for clients, I’ve developed a clear decision framework. Here’s how I decide what to do with a Streamlit app once it’s working.

This is the actual decision tree I walk through — and I’ll be honest, most apps stop at the first or second box:

            Streamlit app works ✅
                     │
        ┌────────────┴────────────┐
        │ ≤10 internal users,     │ yes ──▶ KEEP AS-IS
        │ read-mostly, no auth?   │        (just containerize)
        └────────────┬────────────┘
                     │ no
        ┌────────────┴────────────┐
        │ jobs > 30s blocking     │ yes ──▶ WRAP WITH QUEUE
        │ each other?             │        (RQ / Celery + Redis)
        └────────────┬────────────┘
                     │ no
        ┌────────────┴────────────┐
        │ non-human consumers,    │ yes ──▶ SPLIT BACKEND
        │ >30 users, real auth?   │        (FastAPI + thin client)
        └────────────┬────────────┘
                     │ customer-facing / brand / rich UX
                     ▼
                REWRITE UI  (Next.js + FastAPI)

Understanding Why Streamlit Breaks at Scale

Before deciding what to do, it helps to understand why Streamlit struggles in production. The architecture explains everything.

Streamlit runs your entire Python script top-to-bottom on every user interaction. Click a button? The whole script re-executes. Change a slider? Top-to-bottom again. This “reactive re-run” model is brilliant for prototyping because it means zero boilerplate — you just write sequential Python and Streamlit figures out the UI. But it creates three problems at scale:

Compute waste. Every interaction re-runs expensive operations (database queries, API calls, model inference) unless you explicitly cache them. Caching helps, but @st.cache_data has gotchas: it’s per-process (shared across sessions inside one process, but not across separate processes or replicas), it doesn’t invalidate on external data changes, and cache misses cause noticeable freezes.
Concurrency limits. Each user session runs in its own thread inside a single Python process, so CPU-bound work in one session competes with every other session for the GIL. Streamlit Community Cloud gives you one process. Self-hosted, you can run multiple instances, but each instance is independent: there’s no shared state, no connection pooling across instances, and no horizontal scaling without a load balancer configured for sticky sessions and Streamlit’s WebSocket connections.
No real authentication. Streamlit exposes basic identity through st.user (st.experimental_user in older releases), and newer releases add OpenID Connect login via st.login, but there’s still no role-based access, no audit logging, and no session management you’d trust with sensitive data. Most “auth solutions” for Streamlit are hacks layered on top of a framework that wasn’t designed for them.

These aren’t bugs — they’re design choices that make Streamlit excellent for its intended purpose. The mistake is expecting it to be something it’s not.

Decision 1: Keep It As-Is

Keep the Streamlit app unchanged if all of these are true:

≤10 concurrent users, all internal (colleagues, not customers).
Read-mostly operations — viewing dashboards, running queries, exploring data. No critical writes.
1–2 second response times are acceptable for interactive elements.
Single-process execution is fine — you’re okay with one user’s heavy query slowing down everyone else temporarily.
No authentication requirements beyond “it’s behind our VPN.”

For internal dashboards and analyst tools, this covers the majority of cases. I’ve seen teams spend months “productionizing” a Streamlit app that serves five analysts who are perfectly happy with it as-is. Don’t overengineer.

What to Still Do

Even for a “keep it” decision, do the bare minimum:

Containerize it. Don’t run streamlit run in a tmux on someone’s laptop. Write a Dockerfile, push to your container registry, deploy to a VM or Cloud Run.
Move secrets out of code. Use environment variables or a secret manager. Hardcoded API keys in app.py will eventually leak.
Add an uptime check. A simple HTTP ping that alerts you when the app is down. Otherwise your users will notice before you do.
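
For the secrets item, here is a minimal sketch of reading configuration from environment variables and failing fast at startup when something is missing. The variable names DATABASE_URL and API_KEY and the helpers require_env and load_settings are illustrative assumptions, not part of any particular app.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail loudly at startup."""
    value = os.environ.get(name)
    if not value:
        # Failing here beats discovering a missing key on the first user click.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def load_settings() -> dict:
    # Hypothetical names; use whatever your app actually needs.
    return {
        "database_url": require_env("DATABASE_URL"),
        "api_key": require_env("API_KEY"),
    }
```

Calling load_settings() once at the top of app.py means a misconfigured deploy crashes immediately instead of halfway through a user session.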

Decision 2: Wrap It with a Queue

Add a job queue if:

The app runs jobs longer than 30 seconds (model training, large data exports, batch processing).
Multiple users would block each other because one user’s heavy operation ties up the process.
You need progress visibility for long-running tasks (progress bars, status updates, estimated time remaining).

The pattern: keep the Streamlit UI as-is, but submit heavy work to a background queue and poll for results. The user clicks “Run Analysis,” the app submits the job to Redis Queue (RQ) or Celery, shows a spinner with a progress bar, and displays results when the job completes.

Implementation Sketch

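import time

import streamlit as st
from redis import Redis
from rq import Queue

# Hypothetical module path: the RQ worker must be able to import the job function too.
from core.analysis import run_heavy_analysis

q = Queue(connection=Redis())

# Placeholder input; replace with however your app selects a dataset.
selected_dataset = st.text_input("Dataset ID")

if st.button("Run Analysis"):
    job = q.enqueue(run_heavy_analysis, dataset_id=selected_dataset)
    st.session_state.job_id = job.id

if "job_id" in st.session_state:
    job = q.fetch_job(st.session_state.job_id)
    if job is None:
        # The job expired or was removed from Redis.
        st.warning("Job not found. Please run it again.")
        del st.session_state.job_id
    elif job.is_finished:
        st.success("Done!")
        st.dataframe(job.result)
        del st.session_state.job_id
    elif job.is_failed:
        st.error(f"Failed: {job.exc_info}")
        del st.session_state.job_id
    else:
        # st.spinner is a context manager. Sleeping inside it polls roughly
        # once per second instead of re-running the script in a tight loop.
        with st.spinner("Processing..."):
            time.sleep(1)
        st.rerun()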

Users don’t notice the queue. The app stays responsive. Heavy work runs in a separate process (or even on a different machine). This pattern extends the useful life of a Streamlit app by months, sometimes years.

The key constraint: you need a Redis instance. On AWS, ElastiCache gives you a managed Redis. On GCP, Memorystore. For small workloads, a Redis container on the same VM works fine.

One trade-off I’ll defend: I reach for RQ (Redis Queue) over Celery for these Streamlit-wrapping jobs, even though Celery is the more “serious” answer. My reasoning is that the whole point of staying on Streamlit is that the app is small and the team is small. Celery’s broker/result-backend/worker/beat topology is power you pay for in operational surface area — more to configure, more to monitor, more to break at 2 AM. RQ is one Redis connection and a worker process; a junior dev on the client’s team can read the entire RQ codebase in an afternoon. The day the workload genuinely needs Celery’s routing, retries, and scheduling, I switch — but I refuse to pay that complexity tax before the workload demands it.

Decision 3: Split the Backend

Separate the backend into a proper API service if:

You need to expose the same functionality to non-humans — other services, CLI tools, mobile apps, external partners.
More than ~30 active users are using the app concurrently.
You need real authentication — SSO, role-based access control, audit logs.
You need horizontal scaling — adding more instances behind a load balancer.
Multiple frontends need to access the same data and logic.

The migration path: extract your business logic into a FastAPI service. Keep Streamlit (or replace it with a React/Next.js frontend) as a thin client that calls the API. The Streamlit app becomes a UI layer with no business logic of its own.

How Long Does This Take?

It depends entirely on how your Streamlit app is structured:

If the app already separates UI from logic (functions that compute results are in separate modules from st. calls): 1–2 weeks. You’re essentially wrapping existing functions in FastAPI endpoints.
If everything is in one file with logic mixed into Streamlit widgets: 4–6 weeks, because the real work isn’t building the API; it’s untangling the logic from the UI. Every st.session_state access needs to become a proper parameter. Every st.cache_data call needs to become a real caching strategy.

This is why I always recommend keeping logic and UI separate from day one, even in prototypes. The 30 minutes you spend putting your data processing functions in a core/ directory saves you weeks when the migration inevitably comes.

The difference shows up immediately when you run a profiler against the two layouts. Here’s a py-spy capture of a mixed-logic Streamlit app during a single button click — the kind of trace that tells me a migration will be painful:

$ py-spy top --pid 4821
Total Samples 2400
GIL: 71.00%, Active: 94.00%, Threads: 1

  %Own   %Total  Function (file)
 41.00%  41.00%  run_query        (app.py:88)   ← business logic inside app.py
 22.00%  63.00%  st._main_run     (streamlit/…)  ← re-runs the whole script
 18.00%  18.00%  load_model       (app.py:142)  ← reloaded every interaction
  9.00%  72.00%  render_dashboard (app.py:201)

When run_query and load_model live inside app.py like this, every slider drag re-runs them. When they live in a core/ module behind a cache, the same trace is dominated by st._main_run doing almost nothing — and the FastAPI migration becomes “import core, wrap in endpoints” instead of a rewrite.
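
As a rough illustration of that separation, a logic module with no st. calls might look like the following. The module path core/metrics.py, the function summarize_orders, and the OrderSummary fields are hypothetical names chosen for the example.

```python
# core/metrics.py (hypothetical module; note there is no Streamlit import)
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class OrderSummary:
    count: int
    total: float
    average: float


def summarize_orders(amounts: list) -> OrderSummary:
    """Pure function: plain values in, plain values out."""
    if not amounts:
        return OrderSummary(count=0, total=0.0, average=0.0)
    return OrderSummary(
        count=len(amounts),
        total=sum(amounts),
        average=mean(amounts),
    )
```

A Streamlit page would call summarize_orders and hand the fields to st.metric. A FastAPI handler could later call the same function unchanged, which is what makes the 1–2 week estimate realistic.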

The FastAPI Migration Pattern

Identify all operations the Streamlit app performs (queries, computations, writes, file operations).
Create a FastAPI endpoint for each operation. Use Pydantic models for request and response schemas.
Move the business logic into the endpoint handlers. The logic shouldn’t change — only the interface.
Replace direct function calls in Streamlit with httpx or requests calls to the API.
Add authentication to the API (JWT tokens, API keys, or OAuth depending on your needs).
Deploy the API independently: Lambda for sporadic traffic, ECS (on Fargate or EC2) for steady-state loads.

Decision 4: Rewrite the UI

Replace the Streamlit frontend entirely if:

The app is customer-facing. Streamlit’s generic component library and limited styling options don’t meet brand standards for external products.
You need granular interactivity that Streamlit can’t deliver — drag-and-drop interfaces, complex multi-step form wizards, real-time collaboration, rich text editing.
You’re fighting the framework more than using it. Custom CSS hacks, JavaScript injection via st.components, elaborate session state management to simulate multi-page workflows — these are signs you’ve outgrown Streamlit.
Performance matters. Streamlit’s full-script re-execution model adds latency that’s noticeable to end users. A React frontend with targeted state updates is fundamentally faster for interactive applications.

For customer-facing tools, my default stack is Next.js with a FastAPI backend. For data-heavy applications that need visualizations, I add D3.js or Recharts. For anything that needs 3D visualization, React Three Fiber.

The Mistake to Avoid

Don’t try to “scale Streamlit.” I’ve seen teams add nginx reverse proxies for load balancing, Redis for shared session state, custom authentication middleware, JavaScript injection for UI customization, and WebSocket hacks for real-time updates — all layered on top of Streamlit. Each patch solves one problem and creates two more.

Every patch on top of Streamlit to make it production-grade is a step away from the framework’s actual strengths. Either keep it small and friendly, or migrate cleanly. The middle ground — a Streamlit app with 15 layers of production infrastructure hacked around it — is where projects go to die slowly.

The Production Readiness Checklist

Before deploying any Streamlit app beyond a personal prototype, verify these items:

Authentication

A password prompt backed by st.secrets or st.session_state is not auth. Put an authenticating reverse proxy (Cloudflare Access, oauth2-proxy, AWS ALB with Cognito) in front.
If using Cloudflare Access, configure it to handle SSO and pass user identity via headers.
Test that unauthenticated requests are actually blocked, not just hidden behind UI elements.
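
For the identity-header item, here is a hedged sketch of turning proxy headers into a user. It assumes Cloudflare Access sits in front and sets the Cf-Access-Authenticated-User-Email header; the allowlist and the function name are hypothetical. Trusting this header is only safe when the app cannot be reached except through the proxy.

```python
from collections.abc import Mapping
from typing import Optional

# Header set by Cloudflare Access; other proxies use different names.
IDENTITY_HEADER = "Cf-Access-Authenticated-User-Email"

# Hypothetical allowlist; a real app would load this from config or a database.
ALLOWED_USERS = {"analyst@example.com", "lead@example.com"}


def authenticated_user(headers: Mapping) -> Optional[str]:
    """Return the lowercased email of an allowed user, or None otherwise."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    email = normalized.get(IDENTITY_HEADER.lower(), "").strip().lower()
    return email if email in ALLOWED_USERS else None
```

In recent Streamlit releases the request headers should be available as st.context.headers, so a page could call authenticated_user(st.context.headers) and st.stop() when it returns None.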

Secrets Management

No API keys, database passwords, or tokens in code — not even in a .env file committed to git.
Use environment variables injected at deploy time, or a secret store (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault).
Verify that st.secrets (if used) is loaded from a file not in version control.

State Management

Don’t rely on st.session_state for anything important. It’s per-tab, per-session, and lost on browser refresh.
For persistent state (user preferences, saved queries, draft reports), use a database.
Understand that st.session_state doesn’t survive server restarts or worker recycling.
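
A minimal sketch of the “use a database” advice, with SQLite standing in for whatever store you actually run. The table name, the file name app_state.db, and the helper names are assumptions for the example; with more than one app instance you would want a shared database rather than a local file.

```python
import json
import sqlite3


def open_store(path: str = "app_state.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS preferences ("
        "user_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
    )
    return conn


def save_preferences(conn: sqlite3.Connection, user_id: str, prefs: dict) -> None:
    # INSERT OR REPLACE overwrites any previous row for this user.
    conn.execute(
        "INSERT OR REPLACE INTO preferences (user_id, data) VALUES (?, ?)",
        (user_id, json.dumps(prefs)),
    )
    conn.commit()


def load_preferences(conn: sqlite3.Connection, user_id: str) -> dict:
    row = conn.execute(
        "SELECT data FROM preferences WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}
```

The app can still copy the loaded preferences into st.session_state for convenience during a session; the database is what survives refreshes and restarts.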

Caching

@st.cache_data is for serializable data (DataFrames, dicts, lists). Use it for expensive queries and computations.
@st.cache_resource is for connections and clients (database connections, API clients, ML models). Don’t use it for data.
Set appropriate ttl values. Default is infinite, which means stale data forever.
Remember: caches are per-process. Multiple workers each have their own cache.

Deployment

Containerize with Docker. Pin your Python version and all dependencies.
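Run it under a supervisor (systemd, supervisord, or your orchestrator’s restart policy) so a crashed streamlit run is restarted automatically. Streamlit is a Tornado app, not a WSGI app, so gunicorn can’t serve it.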
Configure health check endpoints for your orchestration platform.
Set resource limits (memory, CPU) appropriate to your workload.
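
For the health-check item: current Streamlit releases serve a lightweight health endpoint at /_stcore/health, which an orchestrator or a small script can probe. A minimal sketch using only the standard library; the function name and the default timeout are illustrative.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the app's health endpoint answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/_stcore/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

The same URL can be used as the container health check path on Cloud Run, ECS, or Kubernetes, so the platform restarts the app when it stops answering.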

Monitoring

An uptime check that alerts you when the app is unreachable.
A logging pipeline you actually read — centralized logs via CloudWatch, Stackdriver, or ELK.
Error tracking (Sentry) to catch unhandled exceptions before users report them.
Basic usage metrics: how many users, how often, which features.

Real-World Migration Timeline

Here’s what a typical Streamlit-to-production migration looks like, based on projects I’ve completed for clients:

Phase	Duration	What happens
Audit	2–3 days	Map all features, identify logic vs. UI, document data flows
API design	3–5 days	Define FastAPI endpoints, Pydantic schemas, auth strategy
Backend build	1–2 weeks	Implement API, migrate logic, add tests
Frontend (if replacing UI)	2–3 weeks	Build React/Next.js frontend against the API
Integration testing	3–5 days	End-to-end testing, load testing, security review
Deployment	2–3 days	CI/CD pipeline, monitoring, DNS cutover

Total: 4–7 weeks for a full migration with UI replacement, or 2–3 weeks if keeping the Streamlit frontend with an API backend.

Frequently Asked Questions

Can I use Streamlit for a customer-facing product?

You can, but I wouldn’t recommend it for anything beyond an MVP or early beta. Streamlit’s limited styling options, generic component library, and performance characteristics make it hard to deliver a polished customer experience. Use Streamlit to validate the product concept, then migrate to a proper frontend once you’ve confirmed product-market fit.

How well does Streamlit handle multi-page apps?

Streamlit’s native multi-page feature (placing files in a pages/ directory) works for simple navigation. For more complex routing with URL parameters, authentication per page, and shared state across pages, you’ll quickly hit limitations. This is one of the signals that it’s time to consider a dedicated frontend framework.

Is Streamlit Cloud good enough for production?

For internal tools with fewer than 10 users, Streamlit Cloud’s free tier is surprisingly capable. For anything beyond that — especially customer-facing apps or apps handling sensitive data — self-hosting gives you more control over performance, security, and reliability. Container deployment on Cloud Run, Fargate, or a simple VM is my recommendation.

Should I use Gradio instead of Streamlit for ML demos?

Gradio is excellent for ML model demos — it’s purpose-built for showcasing inputs and outputs of ML models. If your app is primarily “upload data, get prediction,” Gradio is faster to build and looks better out of the box. If your app involves dashboards, data exploration, multi-step workflows, or business logic beyond model inference, Streamlit is the better choice.

How do I handle file uploads larger than 200MB in Streamlit?

Streamlit’s default upload limit is 200MB (configurable via server.maxUploadSize). For larger files, I recommend direct-to-cloud uploads: generate a presigned S3 URL, have the user upload directly to S3 from the browser, and process the file in a background worker. This keeps the Streamlit server responsive and avoids memory issues from loading large files into the Python process.

Need help moving a Streamlit app to real production? That’s a service I offer.