How to Avoid Microsoft Graph API Throttling?

The Microsoft Graph API is a powerful tool that gives you unified access to a wide range of Microsoft 365 services. Whether you’re building applications, automating administrative tasks, or pulling reports, it puts Exchange Online, SharePoint, Microsoft Teams, Entra ID, and much more behind a single endpoint.

However, as the scale of your usage increases, the likelihood of hitting throttling limits also rises. These throttling measures are in place to protect the overall health of the services and ensure fair access for all users. While this is great for the platform’s stability, it can also be a challenge if you’re trying to build applications that rely on seamless, high-frequency calls to Microsoft 365 services. 

In this post, I’ll demystify Graph API throttling, explore the practical steps you can take to avoid hitting those limits, and discuss how to handle throttling gracefully when it inevitably happens.

What is Microsoft Graph API Throttling? 

Throttling is essentially Microsoft’s way of managing traffic to the Graph API to ensure the service stays stable and responsive. When an application exceeds certain usage thresholds, the API will respond with a 429 Too Many Requests or 503 Service Unavailable error. This helps prevent overloading the system and ensures that all users get fair access to the services.  

It’s important to understand that throttling isn’t a bug—it’s a designed feature. Some common scenarios where throttling might happen include: 

  • High-frequency polling (e.g., constantly checking mailbox messages every few seconds) 
  • Large data migrations or syncing huge volumes of data
  • Burst traffic during scheduled jobs 
  • Poorly designed retry logic that floods the API 

Without throttling, a single app that’s sending too many requests could potentially slow down or even break the experience for an entire tenant—or worse, for other tenants sharing the same infrastructure. 

Throttling can happen at various levels, including:  

  • Per user 
  • Per app 
  • Per tenant 
  • Per resource type (e.g., mail, calendar, directory) 

Additionally, each Microsoft 365 service behind Graph (like SharePoint or Exchange) has its own specific throttling policies to further fine-tune how the system manages traffic. 

Types of Throttling Limits in Microsoft Graph API 

Microsoft Graph API uses various throttling limits to ensure that resources are used fairly and the service remains stable for everyone. These limits can be applied in different ways—whether per user, per app, or even across the entire tenant. Knowing the different types of throttling will help you fine-tune your requests and avoid hitting those limits. 

Global Throttling Limits 

Before we get into service-specific limits, let’s start with the global throttling limit. This is a universal cap that applies across all Microsoft Graph services. 

Any request type: 130,000 requests per 10 seconds per app across all tenants. 

This global limit applies no matter which service you’re accessing. When you make a request, it’s evaluated against several factors, like the scope of the limit (per app across all tenants, per tenant for all apps, per app per tenant, etc.), the type of request (GET, POST, PATCH, etc.), and more. The throttling kicks in once any of these limits are reached. 

Key Service-Specific Throttling Limits 

Now, let’s look at the service-specific throttling limits. Each service, like SharePoint, Outlook, and Teams, has its own set of thresholds. For example, Excel allows up to 5,000 requests per 10 seconds per app across all tenants, and 1,500 requests per 10 seconds per app within a single tenant.

I’ve broken down these service limits into an easy-to-reference table below. You can use it as your own quick cheat sheet, especially when the official documentation feels a bit overwhelming. That said, for a full, detailed breakdown, make sure to check out the official Microsoft Graph Throttling Limits documentation. 

| Service | Operation/Scope | Limit | Context |
|---|---|---|---|
| Microsoft Teams | GET team | 30 requests/sec | Per app per tenant |
| Microsoft Teams | POST channel | 30 requests/sec | Per app per tenant |
| Microsoft Teams | DELETE channel | 15 requests/sec | Per app per tenant |
| Microsoft Teams | GET channel message | 20 requests/sec | Per app per tenant |
| Microsoft Teams | POST channel message | 50 requests/sec | Per app per tenant |
| Microsoft Teams | Any request on a given team | 4 requests/sec | Per app per team (hard cap) |
| Outlook | API requests | 10,000 requests/10 minutes | Per app ID + mailbox combination |
| Outlook | Concurrent requests | 4 concurrent requests | Per app ID + mailbox |
| Outlook | Upload (PATCH/POST/PUT) | 150 MB/5 minutes | Per app ID + mailbox |
| Excel | Any request (global) | 5,000 requests/10 seconds | Per app across all tenants |
| Excel | Any request (tenant-specific) | 1,500 requests/10 seconds | Per app per tenant |
| Identity & Access | Small tenants (<50 users) | 3,500 Resource Units/10 seconds | Per app + tenant |
| Identity & Access | Medium tenants (50–500 users) | 5,000 Resource Units/10 seconds | Per app + tenant |
| Identity & Access | Large tenants (>500 users) | 8,000 Resource Units/10 seconds | Per app + tenant |
| Identity & Access | Write operations | 3,000 requests/2.5 minutes | Per app + tenant |
| OneNote | Delegated context | 120 requests/minute, 400/hour | Per app per user |
| OneNote | Concurrent requests (delegated) | 5 concurrent requests | Per app per user |
| OneNote | App-only context | 240 requests/minute, 800/hour | Per app |
| OneNote | Concurrent requests (app-only) | 20 concurrent requests | Per app |
| Subscriptions | POST, PUT, DELETE, PATCH | 2,000 requests/20 seconds | Per app across all tenants |
| Subscriptions | POST, PUT, DELETE, PATCH | 500 requests/20 seconds | Per app per tenant |
| Subscriptions | POST /reauthorize by ID | 4,000 requests/20 seconds | Per app across all tenants |
| Subscriptions | POST /reauthorize by ID | 1,000 requests/20 seconds | Per app per tenant |
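One way to put the limits above to work is to enforce them on the client side before a request ever leaves your app. The following is a minimal sketch of a sliding-window limiter, not an official Graph SDK feature; the class name, its methods, and the injectable clock are all assumptions made for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the limiter can be tested
        self.sent = deque()     # timestamps of recently sent requests

    def delay_needed(self):
        """Return how many seconds to wait before the next request is allowed."""
        now = self.clock()
        # Forget requests that have fallen out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())


# Example: stay under the 4 requests/sec cap on a single team.
team_limiter = SlidingWindowLimiter(max_requests=4, window_seconds=1.0)
```

Before each call, sleep for `delay_needed()` seconds and then call `record()`. Leaving some headroom below the published number is a sensible choice, since other scopes (per tenant, across tenants) may apply to the same request.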

How Throttling is Communicated to Developers/Admins

When your application gets throttled by Microsoft Graph, the API doesn’t just cut you off—it tells you what’s happening and what to do next. Specifically, you’ll get an HTTP response like 429 Too Many Requests or 503 Service Unavailable, often with headers that guide your retry strategy. Here are the key headers to watch for: 

  • Retry-After: This is the most important one—it tells your app exactly how many seconds to wait before making another request. You should always honor this. 
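  • x-ms-throttle-limit-percentage (occasionally included): Some services return this header once an app has used more than 80% of a limit, which gives you visibility into how close you are to the threshold. It isn’t present on every response or for every service, so don’t depend on it.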

Example response: 

HTTP/1.1 429 Too Many Requests 
Retry-After: 30 

If you see a Retry-After header, back off for at least the specified duration before retrying. Requests sent during that window are likely to be throttled again and might even lead to longer delays.
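Reading the header is simple, but a few edge cases are worth handling. Here is a small sketch that extracts the wait time from a plain dict of response headers. The function name and the `default` fallback value are assumptions; the HTTP-date branch is included only because the HTTP specification allows that form, even though the example above uses seconds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, default=5.0, now=None):
    """Return the wait in seconds suggested by a Retry-After header.

    `headers` is a plain dict of response headers. `default` is an assumed
    fallback for responses that carry no usable header.
    """
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, e.g. "30"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

For the example response above, `retry_after_seconds({"Retry-After": "30"})` returns `30.0`.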

Best Practices to Avoid Throttling 

While you can’t eliminate throttling entirely—especially at scale—you can dramatically reduce how often it happens with smart API usage. Here are best practices every Microsoft 365 developer or admin should keep in mind: 

1. Optimize Your Data Requests: 

  • Use Query Parameters: Utilize $select to retrieve only necessary fields, and $filter to narrow down results. Always request only the data you need. If you don’t need the entire user object, don’t ask for it. This reduces payload size and processing time. 
  • Implement Pagination: Large result sets should be paged using @odata.nextLink. Trying to pull too much data in one shot is a fast track to throttling. 
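To make both points concrete, here is a rough sketch that builds a trimmed-down query and then walks the pages one at a time. `build_url`, `iter_items`, and the `get_json` callable are hypothetical names; `get_json` stands in for whatever authenticated HTTP client you use and is assumed to return the parsed JSON body for a URL.

```python
from urllib.parse import quote, urlencode

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_url(path, select=None, filter_expr=None, top=None):
    """Build a Graph URL that asks only for the fields and rows needed."""
    params = {}
    if select:
        params["$select"] = ",".join(select)
    if filter_expr:
        params["$filter"] = filter_expr
    if top:
        params["$top"] = str(top)
    # Keep "$" and "," readable; everything else is percent-encoded.
    query = urlencode(params, safe="$,", quote_via=quote)
    return f"{GRAPH_BASE}{path}" + (f"?{query}" if query else "")


def iter_items(first_url, get_json):
    """Yield items page by page, following @odata.nextLink until it is absent."""
    url = first_url
    while url:
        page = get_json(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")
```

A call such as `build_url("/users", select=["id", "displayName"], filter_expr="accountEnabled eq true")` produces a URL that returns two fields per user instead of the full object. Because `iter_items` is a generator, you can stop early without fetching the remaining pages.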

2. Use Batching Wisely: 

  • Batch requests: Microsoft Graph supports batching up to 20 requests in a single HTTP call. This reduces network overhead, but remember: each request within the batch is still counted against throttling limits independently.  
  • Group Strategically: Don’t throw 20 expensive operations into one batch. Mix lightweight and heavy requests to avoid burst spikes. 
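As an illustration, the sketch below splits a list of requests into JSON batch payloads of at most 20 items each, ready to be POSTed to the `$batch` endpoint. The helper names are assumptions, and the second helper reflects the point above: a batch can succeed overall while individual responses inside it come back with status 429.

```python
def build_batches(requests, batch_size=20):
    """Split (method, relative_url) pairs into JSON batch payloads.

    Each payload is meant to be sent as the body of a POST to /$batch.
    """
    if not 1 <= batch_size <= 20:
        raise ValueError("Graph JSON batching accepts at most 20 requests")
    batches = []
    for start in range(0, len(requests), batch_size):
        chunk = requests[start:start + batch_size]
        batches.append({
            "requests": [
                # The id only needs to be unique within a single batch.
                {"id": str(index + 1), "method": method, "url": url}
                for index, (method, url) in enumerate(chunk)
            ]
        })
    return batches


def throttled_ids(batch_response):
    """Return the ids of individual requests in a batch that were throttled."""
    return [
        item["id"]
        for item in batch_response.get("responses", [])
        if item.get("status") == 429
    ]
```

Only the requests returned by `throttled_ids` need to be retried, so there is no reason to resend the whole batch.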

3. Use Change Tracking to Minimize Polling: 

  • Delta Queries: Instead of re-fetching the entire dataset on every sync, use delta queries to retrieve only the data that has changed since the last sync.
  • Change Notifications: Subscribing to change notifications (webhooks) lets your app respond to events as they happen, rather than polling the API unnecessarily. It’s much more efficient and throttle-friendly. 
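A delta sync round can be sketched as a short loop: follow `@odata.nextLink` until the service hands back an `@odata.deltaLink`, then store that link and start from it next time. The function name and the `get_json` callable (a stand-in for your authenticated HTTP client) are assumptions here.

```python
def run_delta_sync(start_url, get_json):
    """Collect one round of changes and return them with the next delta link.

    `start_url` is either the initial delta URL for a resource or the
    deltaLink saved from the previous round.
    """
    changes = []
    url = start_url
    while True:
        page = get_json(url)
        changes.extend(page.get("value", []))
        if "@odata.nextLink" in page:
            # More pages remain in this round.
            url = page["@odata.nextLink"]
        else:
            # Round finished; persist the deltaLink for the next sync.
            return changes, page.get("@odata.deltaLink")
```

The first round returns the full current state; later rounds started from the saved link return only what changed, which is usually a small fraction of the requests a full re-fetch would need.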

4. Implement Intelligent Retry Logic: 

When a 429 response is received: 

  • Respect Retry-After: Always wait at least the amount of time specified before retrying. Microsoft is telling you how long the system needs, so listen to it.
  • Use Exponential Backoff: If the header isn’t present, implement exponential backoff to increase wait times between retries. This reduces the pressure on the API and increases your chances of a successful retry. 
  • Set Retry Limits: Always include a retry cap to avoid infinite loops and unnecessary load. 

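Example (Python sketch; Throttled and make_api_call are placeholders for your own client code):

import time

class Throttled(Exception):  # placeholder: raised by make_api_call() on 429/503
    retry_after = None       # seconds from the Retry-After header, if present

def call_with_retry(make_api_call, max_retries=5):
    wait = 1  # initial backoff in seconds
    for attempt in range(max_retries + 1):
        try:
            return make_api_call()
        except Throttled as err:
            if attempt == max_retries:
                raise  # retry cap reached, give up
            time.sleep(err.retry_after or wait)  # honor Retry-After when given
            wait = min(wait * 2, 60)  # cap wait time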

5. Schedule Workloads During Off-Peak Hours: 

  • Many organizations generate peak load during business hours. If your app runs nightly reports, sync jobs, or batch processes, schedule them for early morning, evenings, or weekends when overall traffic is lower. 
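If a job can be triggered at any time, a simple guard can defer it until a quiet period. This is only a sketch: the function name is made up, and the 08:00 to 18:00 weekday window is an assumed definition of business hours that you would adjust to your tenant’s time zone and usage pattern.

```python
from datetime import datetime, time as clock_time


def is_off_peak(moment, start=clock_time(8, 0), end=clock_time(18, 0)):
    """Return True when `moment` falls outside the assumed business hours."""
    if moment.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return True
    return not (start <= moment.time() < end)
```

A scheduler could call `is_off_peak(datetime.now())` before starting a heavy sync and re-queue the job when it returns `False`.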

6. Distribute Load Intelligently: 

  • Spread Requests Across Users: If your app acts on behalf of many users, distribute the workload across them rather than funneling everything through a single user context. 
  • Use Queues or Throttling Middleware: Introduce slight delays, queues, or background workers to space out requests, especially when handling large batches. 
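One lightweight way to get queue-like behavior is a bounded worker pool, which caps how many requests are in flight at once. The sketch below uses Python’s standard `ThreadPoolExecutor`; the `run_bounded` name and the default of 4 workers are assumptions, with 4 chosen to echo the Outlook concurrency limit mentioned earlier.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, max_concurrent=4):
    """Run zero-argument callables with at most `max_concurrent` in flight.

    Results are returned in the same order as `tasks`. An exception raised
    by a task is re-raised when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Each task would wrap a single Graph call. Extra tasks wait in the executor’s internal queue until a worker is free, which smooths out bursts without any manual sleep calls.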

7. Monitor and Log Throttling Events: 

  • Always log 429 and 503 responses, including Retry-After durations and affected endpoints. This helps you identify which patterns or endpoints are more prone to throttling—and optimize accordingly. 
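A minimal sketch of such logging, using only the standard library, might look like this. The logger name, the function names, and the in-memory counter are assumptions; a production setup would more likely ship these events to your existing telemetry system.

```python
import logging
from collections import Counter
from urllib.parse import urlsplit

logger = logging.getLogger("graph.throttling")  # hypothetical logger name
throttle_counts = Counter()  # (method, path) -> number of throttled responses


def record_response(method, url, status, headers):
    """Log and count a response if it was throttled; return True when it was."""
    if status not in (429, 503):
        return False
    path = urlsplit(url).path  # drop the query string to group by endpoint
    retry_after = headers.get("Retry-After", "absent")
    throttle_counts[(method, path)] += 1
    logger.warning(
        "Graph throttled: status=%s method=%s path=%s retry_after=%s",
        status, method, path, retry_after,
    )
    return True


def most_throttled(n=5):
    """Return the n endpoints that were throttled most often."""
    return throttle_counts.most_common(n)
```

Calling `record_response` after every request keeps the bookkeeping in one place, and `most_throttled()` gives a quick view of which endpoints deserve optimization first.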

Implementing these best practices not only improves the stability of your application but also ensures it remains respectful of shared cloud resources. A well-behaved app that handles throttling gracefully scales reliably and generates fewer support tickets.

Building Resilient Apps with Microsoft Graph 

Microsoft Graph API throttling is an important mechanism to ensure performance and stability for all tenants. As your app or script grows in scale and complexity, understanding how throttling works becomes crucial to keeping things smooth and reliable.  

By proactively applying the strategies we’ve covered, like efficient querying, smart batching, exponential backoff, and change tracking, you can design systems that hit limits less often and recover gracefully when they do get throttled.

If you’re working with Microsoft Graph at scale, it’s worth investing in solid telemetry. Monitor your API usage patterns, log throttling responses, and track retry behavior. These insights will help you identify bottlenecks early and fine-tune your app over time. 

Stay within the limits, build with resilience in mind, and your app will be better for it. 

Happy coding! And may your 429s be few and far between!  
