This example shows a realistic way to profile memory use from native Python extensions on Render:
- A FastAPI service exposes an authenticated endpoint that captures a Memray profile for a single request.
- The profiled workload uses NumPy and pandas, which exercise native code paths and temporary allocations that are hard to spot with Python-only tracers.
- Render stores the `.bin` capture and generated flamegraph on a persistent disk so the artifacts survive restarts and redeploys.
- Dependencies are managed with `uv`, using `pyproject.toml` and a checked-in `uv.lock`.
- `app.py`: FastAPI app with the authenticated profiling endpoints.
- `workload.py`: A customer analytics workload that allocates heavily in NumPy and pandas.
- `offline_profile.py`: A CLI entrypoint for one-off `memray run --native` captures.
- `pyproject.toml` / `uv.lock`: Project metadata and locked dependencies for `uv`.
- `render.yaml`: Render Blueprint for deploying the example with a persistent disk.
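Based on that description, the shape of `workload.py` is roughly the following. This is a sketch: the function name, column names, and parameters below are illustrative, and the actual file in the repository may differ.

```python
import numpy as np
import pandas as pd

def run_analytics(rows: int, customers: int, top_n: int) -> pd.DataFrame:
    """Simulated customer analytics: build a large event table, then aggregate."""
    rng = np.random.default_rng(42)
    # Large temporary arrays here exercise NumPy's native allocators.
    events = pd.DataFrame({
        "customer_id": rng.integers(0, customers, size=rows),
        "amount": rng.exponential(25.0, size=rows),
    })
    # groupby/agg allocates intermediate buffers inside pandas' C extensions.
    totals = events.groupby("customer_id")["amount"].agg(["sum", "count", "mean"])
    return totals.sort_values("sum", ascending=False).head(top_n)
```

The groupby/aggregate path allocates sizable intermediate buffers inside pandas' and NumPy's compiled extensions, which is exactly the kind of allocation that `--native` traces surface and Python-only tracers miss.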
Using `memray.Tracker(..., native_traces=True)` around a single request lets you capture one expensive path without running the entire service under a profiler all day. That is a better fit for a production-like Render service than starting the server itself with `memray run`.
The service also writes captures to `/var/data/memray`, which is mounted as a persistent disk in `render.yaml`. That matters because Memray captures can be large and you usually want them available after the request completes.
Create a virtual environment, install dependencies, and start the service:
```bash
uv sync
export MEMRAY_PROFILE_TOKEN=local-dev-token
uv run uvicorn app:app --reload
```

Trigger a profile:
```bash
curl -X POST http://127.0.0.1:8000/profile/native \
  -H 'Content-Type: application/json' \
  -H 'X-Profile-Token: local-dev-token' \
  -d '{"rows": 400000, "customers": 35000, "top_n": 12}'
```

List available captures:
```bash
curl http://127.0.0.1:8000/profiles \
  -H 'X-Profile-Token: local-dev-token'
```

Open the generated flamegraph from the output path returned by the API, or download it directly:
```bash
curl -OJ http://127.0.0.1:8000/profiles/<run-id>/flamegraph \
  -H 'X-Profile-Token: local-dev-token'
```

If you want the simplest possible Memray command in a Render shell, run the shared workload directly:
```bash
uv run memray run --native -o /var/data/memray/manual-run.bin \
  python offline_profile.py --rows 400000 --customers 35000 --top-n 12
```

Then generate the HTML report on the same machine:
```bash
uv run memray flamegraph -o /var/data/memray/manual-run.html /var/data/memray/manual-run.bin
```

- Create a new Blueprint service from this repository.
- Render provisions the `MEMRAY_PROFILE_TOKEN` environment variable automatically.
- After the service is live, copy the token value from the dashboard.
- Trigger the profiling endpoint against your Render URL:
```bash
curl -X POST https://your-service.onrender.com/profile/native \
  -H 'Content-Type: application/json' \
  -H 'X-Profile-Token: <token-from-render>' \
  -d '{"rows": 400000, "customers": 35000, "top_n": 12}'
```

- Download the capture or flamegraph:
```bash
curl -OJ https://your-service.onrender.com/profiles/<run-id>/capture \
  -H 'X-Profile-Token: <token-from-render>'
curl -OJ https://your-service.onrender.com/profiles/<run-id>/flamegraph \
  -H 'X-Profile-Token: <token-from-render>'
```

- `MEMRAY_MAX_ROWS` limits how large a single capture request can be.
- Only one profile runs at a time. The service returns `409` if another capture is already in progress.
- Native symbol resolution works best when you generate the report on the same Render instance that created the capture.
- Render automatically makes `uv` available for Python services when `uv.lock` is present in the repo root.