--- 7.6.2025 16:30 ---
Hi there,

Thanks for reaching out to us.

We implement caching optimizations to deliver the best possible performance. However, in periods of high demand – especially when multiple customers are making heavy use of the system – our caching layers may need to be cleared more frequently. This can unfortunately result in occasional increases in latency. We're actively monitoring usage patterns to balance performance and fairness across all workloads.

Would you be able to move to a different type of GPU? I can see that you have 4090s enabled on one vLLM endpoint and A6000s and A40s on another, but I suspect you may have a better experience with L40s or other GPU types – could you try them out? A6000s are currently on low availability, and while A40s are on high availability, they may not be cached to that good an extent at the moment. Playing around with the GPU types here should give you a better experience.

Regards,
River 🌊
From RunPod support, RunPod is always here for anything you need :)

--- 10.6.2025 17:02 ---
Hi David,

Were you able to resolve ticket #18647? We haven't heard back from you on our earlier email asking for more details. If we still haven't heard back in the next 48 hours, the ticket will be closed.

We know our customers are busy, so if your issue is not resolved and you continue to need support, please reach out to us directly by replying to this email or emailing us at help@runpod.io.

We also provide documentation and tutorials. Our community on Discord is very active and willing to help with questions on open-source projects and other software running on our service. You can also check the repositories for open-source projects.

Join our Discord channel and community discussions here - https://discord.gg/DaFV9Fg34X
Here's our self-help center with documentation and tutorials - https://docs.runpod.io/

Best,
RunPod Support Team

--- 11.6.2025 04:19 ---
Hello. No, I wasn't able to resolve it. The reply didn't ask me for further details, just to use different GPUs. Because @River also replied to the Discord thread, I continued the conversation there, but didn't get any additional response after sharing more details and asking questions.

I also see that multiple users share the same problem, and they're leaving the platform because of it. So perhaps it deserves a bit of attention.

From my latest observations: I've created a tiny testing container with just a minimal working handler, to rule out the size and software of my custom images. The issue is still happening: on all data centres, on multiple GPU types (I even tested the L40s recommended in the email), on all request types (sync, async, OpenAI), and on a fresh new endpoint.

On a cold start it takes 6-8s+ just to start the container. Even when you spam the warm worker with requests, the best-case delay is about 10x what it used to be. This is nowhere near the marketed <250ms (which was true in the past; I remember the same endpoint with the same container performing like that), and it is unfortunately unusable for low-latency tasks.

image.png
image.png

--- 16.6.2025 12:37 ---
Hi David,

I think I may have missed the further responses in the Discord thread 😅 my apologies for that!
I am sorry you're experiencing this issue. Any chance you could share your Dockerfile with us? I suspect this may be due to Docker caching and optimizations in this regard. Let me know if this helps you out 🙂

Regards,
River 🌊
From RunPod support, RunPod is always here for anything you need :)

--- 16.6.2025 20:14 ---
This issue is happening on all templates so far: official ones, mine, and even those of other users who could reproduce it in the thread, like the Discord mod Jason. For testing, it helps to use templates that don't do anything outside the handler function (so nothing else is counted as delay), so you can see the delay time purely as job queue delay.

Nevertheless, here are some Dockerfiles I've used:
https://github.com/davefojtik/RunPod-vLLM/blob/main/Dockerfile
https://github.com/davefojtik/RunPod-Fooocus-API/blob/Standalone/Dockerfile

I've also created a tiny testing container with just a minimal working handler, to rule out the size and software of my custom images. It was simply:

```Dockerfile
FROM python:3.10.18-slim

ENV DEBIAN_FRONTEND=noninteractive \
    PIP_PREFER_BINARY=1

SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# Update and install system packages
RUN apt-get update && apt-get upgrade -y && \
    apt-get clean -y && rm -rf /var/lib/apt/lists/*

RUN pip install runpod==1.7.10

RUN mkdir /src
WORKDIR /src
ADD src .

CMD ["python", "handler.py"]
```

And the handler:

```python
import runpod

async def handler(job):
    print(job)
    return {"status": "ok"}

if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
```

As documented in the Discord thread, this minimal container experienced the same queue delay times on first worker requests as any other template. I've only been able to achieve good delay times by spamming active workers. But even in that case, the first request to the worker is still bad, which doesn't make much sense - the container should already be running and ready.

Delay with active workers: image.png
Delay with idle, non-active workers: image.png

The issue should be reproducible by anyone with the following steps:
- Go to the serverless endpoint and send a request to a worker for the first time. You should see these long delays.
- Emulate worker shifting by terminating the worker that received that request and is now warm.
- Send another request; it will have a long delay again.
- Repeat the terminations and send additional requests. All of them should have long queue delays.

I hope these observations will help pinpoint the cause better.
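For completeness, here is a minimal sketch of how the delay can be measured from the client side. It assumes the standard `https://api.runpod.ai/v2/<endpoint_id>/runsync` route and the `delayTime`/`executionTime` fields visible in the screenshots; the endpoint ID and API key are placeholders read from the environment:

```python
import os
import time

import requests  # third-party HTTP client, assumed to be installed

# Placeholders - set these to your own endpoint ID and API key.
ENDPOINT_ID = os.environ["RUNPOD_ENDPOINT_ID"]
API_KEY = os.environ["RUNPOD_API_KEY"]

URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}


def measure_once() -> None:
    """Send one synchronous request and print client-side and reported delays."""
    start = time.time()
    response = requests.post(
        URL, headers=HEADERS, json={"input": {"ping": "pong"}}, timeout=120
    )
    wall_clock = time.time() - start
    body = response.json()
    # delayTime and executionTime are assumed to be reported in milliseconds.
    print(
        f"wall clock: {wall_clock:.2f}s | "
        f"delayTime: {body.get('delayTime')} ms | "
        f"executionTime: {body.get('executionTime')} ms"
    )


if __name__ == "__main__":
    measure_once()
```

Running this once against a cold endpoint, and again after terminating the warm worker, shows the queue delay described in the steps above.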
--- 17.6.2025 12:58 ---
Hi David,

Thank you for all the detailed information and testing. Since you are still seeing the same queue delays even with minimal containers, could you please try updating to the latest version of the runpod SDK? There have been some recent changes and optimizations that may improve queue and worker initialization times.

If you continue to experience delays after updating, please let us know and we will escalate your ticket for deeper investigation.

Best regards,
Roman

--- 18.6.2025 11:14 ---
Updating the minimal container to RunPod SDK 1.7.12 actually made it worse. It now seems much more inconsistent and worse on average, and I hit the longest delays I have ever seen: 20+ seconds! All requests below were made to cold workers:

runpod_1_7_12_01.png
runpod_1_7_12_02.png

--- 18.6.2025 14:31 ---
Hi David,

Thank you for testing with the latest SDK. Since you are seeing even longer delays with version 1.7.12, could you please try reverting to runpod SDK version 1.7.10? Based on our recent observations, 1.7.10 may provide more consistent results for your use case. Let us know how it performs and whether you continue to experience similar delays.

We appreciate all your detailed feedback and patience as we investigate.

Best regards,
Roman

--- 18.6.2025 15:04 ---
Yes, I was using 1.7.10 the whole time; all previous tests were on that version. I also tried downgrading to 1.6.2 and even 1.5.2, and it's still happening there. This is all mentioned in the two-week-long conversation.

It can also easily be tested by the staff. I've sent the steps to reproduce this problem, and so far it seems only Jason (the Discord mod) has tested and confirmed it. I would appreciate such confirmation from the official team as well, along with information on whether it will be fixed or not.

--- 21.6.2025 13:18 ---
Hi David,

Thanks for getting back to us on this.

Would you be able to come on a call with me (if you are free any time between 6AM and 4PM CEST) so I can investigate this issue further? Just let me know which time you are free, and I'll send you a meeting invite 🙂 This would help me dive into the issue and fix it for you asap.

Regards,
River 🌊
From RunPod support, RunPod is always here for anything you need :)

--- 21.6.2025 16:40 ---
Hi River,

Thanks for your reply. Unfortunately, I won't be taking more calls or meetings about this matter. I've already spent significant time documenting and sharing all the necessary details for you to work on the issue. The delay is reproducible and confirmed by multiple users, so this shouldn't require further input from my side.

At this point, I'm only waiting on two things:
1. Internal confirmation that the developer team has reproduced the issue.
2. A clear answer on whether it will be fixed or not.

As context: I encountered a very similar delay issue with another provider. Their support understood the problem after three concise questions and pushed a code hotfix the same day, all resolved in ~20 minutes via live chat. That's the level of support customers expect from a platform they rely on to run production workloads, not weeks of vague replies followed by an invitation to more discussion.

I'd genuinely prefer to see this resolved rather than be forced to move my projects and open-source efforts elsewhere. I've spent over a year and a half maintaining community templates, promoting RunPod, and helping your customers troubleshoot issues, all without expecting anything in return. That's also why such treatment is especially disappointing.

Let me know once there's an update.

Best,
David

--- 23.6.2025 11:01 ---
Hi David,

Thanks for reaching out.

I wanted to get on a call with you to be able to reproduce this issue with ease on my end. As an alternative (so that I can recreate the issue for you), would you be able to convert your account to a team account (via https://docs.runpod.io/get-started/manage-accounts#convert-personal-account-to-a-team-account) and add me to it, please? This would help me reproduce the issue via your account and document it internally on our end, so I can fix it as soon as possible.

Let me know if this helps you out.

Regards,
River 🌊
From RunPod support, RunPod is always here for anything you need :)

--- 23.6.2025 21:52 ---
Hi River,

Thanks for the follow-up. Just to clarify: this issue has already been acknowledged, documented, and reproduced by a lead engineer, as confirmed yesterday by @Dj on Discord. Perhaps you didn't get notified about it.
I also posted this update in the public Discord thread connected with this issue, which you're part of. I've been told to share "reference E-2905" with you.

I'm happy to help solve this issue. Let me know if I can provide any information beyond what's already been shared.

Best,
David

--- 28.6.2025 09:30 ---
Hi David,

That sounds awesome. In that case, I am going to connect the reference with this ticket and mark it as escalated to engineering. I'll update you when we have this issue resolved 🙂

Regards,
River 🌊
From RunPod support, RunPod is always here for anything you need :)

--- 3.7.2025 14:28 ---
Hi David,

Thanks for following up. To proceed, we need a screenshot showing that auto-pay is turned off on your account. Once you send that over, we can make sure your ticket stays open for further updates from our team.

Let me know if you need help finding this setting.

Best regards,
Roman

--- 6.7.2025 17:54 ---
Weird requirements... But here you have it. Keep the ticket open, please.

{1864A8C7-59A9-4B0E-B950-0FE4BD477332}.png

--- 7.7.2025 16:42 ---
Hi David,

Thanks so much for this, David. I have gone ahead and escalated this for a refund; we will get back to you here asap once it has been refunded.

Let me know if you need anything else from my side 🙂

Regards,
River 🌊
From RunPod support, RunPod is always here for anything you need :)

--- 10.7.2025 16:56 ---
Hey there,

I just want to reach out today to confirm that we are still working on your issue. We are coordinating efforts to ensure we have a response to you soon. Please note that some of our support is based on EU times, and for this reason some of the replies may be delayed.

If you have any additional details you would like to provide, please feel free to reply to this ticket; we are more than happy to help.

Thank you,
Geovany
Technical Support Analyst

--- 14.7.2025 15:11 ---
Hi David,

I'm sincerely sorry for your experience here. I can see our engineering team has made some changes, which are under review. Our team will keep tabs on it and follow up on the ticket for you. Also, I have added $10 back to your RunPod wallet so that you can use it later once the issue is fixed.

Once again, I apologize for your experience with support this time. My team was trying to gather information over email, which would have been quicker over chat, i.e. Discord. Thank you for writing back to us and bringing this issue to our attention.

Your engineering ticket has been tagged in this support ticket now, so we can easily track the progress and update you on it. We're going to keep this support ticket open for your visibility and tracking until it's solved.

Thanks & Regards,
Namrata Raut
RunPod Support Team

--- 15.7.2025 07:16 ---
Thank you, even though I don't have a way to distribute such a refund to all the users who have lost, or are currently losing, money because of this bug.

Based on that, your management could also reconsider the decision to bill users from the moment the worker gets a wake-up signal rather than from when it actually starts. That way, even if the prepared fix still couldn't push queue delays down to the marketed values, it would at least not be unfair.

I am looking forward to seeing RunPod as a customer- and developer-friendly platform again.

Regards,
David

--- 15.7.2025 17:54 ---
Hi David,

I understand your frustration here. Please allow us some time to get you the final updates. Rest assured, our engineering team is working to fix the issue soon.
I have also shared your feedback with our product team and engineering leadership. Your experience matters to us.

Thanks & Regards,
Namrata Raut
RunPod Support Team

--- 30.7.2025 00:55 ---
Hi David,

I'm following up to confirm that our engineering team has fixed this issue. We apologise for the delay, which was due to the complexity of the issue; it took us some time as testing was still pending after the changes.

At this point, I have added $20 back to your account so that you can test the changes and cover your losses due to the issue you encountered.

Let us know if you have any questions. We're here to help!

Thanks & Regards,
Namrata Raut
RunPod Support Team

--- 31.7.2025 00:39 ---
Hello. The queue delay issue remains unresolved. This is also confirmed after chatting about it with @Dj on Discord.

Me: image.png

Dj: That would not be the problem this solves, we've stopped specifically the 2 minute delay from applying when it shouldn't for endpoints with no running workers ... I think support was using your ZenDesk ticket as a tracker for the edge case bug, but this is still an important bug fix with respect to the job pickup process.

It seems like you fixed something else, but not the problem this issue has reported the whole time. The workers still experience unusable first-hit delays. See the following current delays with a minimal example container:

image.png

Since you informed me about a month ago that a solution for this reported major issue was being prepared (which itself came after another month of conversations and deep documentation of the problem by multiple people), is there any ETA for its release too? And have the teams come to any decision about whether it's okay to charge for this queue delay time the same as when the container is actually running?

Thank you.

Regards,
David

--- 1.8.2025 21:54 ---
Hi David,

Thank you for your response. Our engineering team has received the necessary updates. We will write back to you soon with further progress on this ticket.

We're sorry for the inconvenience and sincerely appreciate your patience while we review your concern.

Thanks & Regards,
Namrata Raut
RunPod Support Team

--- 1.8.2025 23:45 ---
Hi David,

Thank you for your input. I completely understand your frustration, and I sincerely apologize for the experience you've had with our support process.

Could you please confirm whether the most recent endpoint ID 0m3elsicja3vai is the one where you're experiencing the queue delay issue? We'd like to follow up with our engineering team to investigate this further.

We appreciate your cooperation and look forward to resolving this with you.

Regards,
Anamika Nayak
Runpod Support Team

--- 2.8.2025 00:41 ---
Hi David,

Please ignore our previous email; we found the details internally. We analysed the logs further and can see that there was no delay that should be concerning. The fix we posted here has resolved the delays you reported earlier. As the fix has been successful, we will close this ticket.

If you encounter any issues, please raise a ticket by emailing us at help@runpod.io with the endpoint, worker logs, and the time of the incident.

I appreciate your patience and understanding in this matter, which helped us resolve this issue as expected. I wish you a great weekend!

Thanks & Regards,
Namrata Raut
RunPod Support Team

--- 2.8.2025 02:59 ---
It's not fixed. However, over the last two months, I've seen enough to move on and understand that this is going nowhere.
It was interesting to see how a company can look successful from the outside and in the numbers, yet function like this internally.

I am archiving my templates and informing their users about the experience and outcome of this issue, and I'll have to recommend that they choose a different provider. I am still grateful for the years spent building projects for your platform, the contacts made, and the things learned along the way.

I wish you luck in the future.

Regards,
David