ROAR: A New Commander Rating Scale

Tom Anderson

There are a number of reliable hot-button issues among the Commander community, but judging by the metric of online discussion, the objective rating of deck power levels might be the biggest. The goal is not just to pin down a tier list of optimal decks for the competitive side of the format, known as cEDH. That would be too easy! In fact, the main drive to establish some sort of objective measure of power comes from the desire to match decks evenly at tables up and down that spectrum; to spare us all the awkward second-guessing of what to bring to an MTGO table marked “CASUAL DECKS ONLY.”

Defining general power level in a game as varied as Magic is always tricky — doubly so when you’re trying to come up with some abstract definition rather than just counting up deck win rates. As with another famously vague definition, most experienced players simply “know power when they see it.” And even among the endless possibilities of the Commander format, there are several strong indicators that a certain deck might be formidable opposition. The presence of specific highly competitive commanders, quality of mana base, number of tutors, choice of wincons, overall curve and budget all reliably speak to the power level of a deck.

But none of these are close to completely reliable, especially when we can’t assume all deck-builders have maximum power as their end goal. Even notorious commanders like Zur the Enchanter could turn out to be leading someone’s “Pete Venters artwork tribal” brew. For this reason, every proposed scale based on one or more of the above metrics draws furious interjections from players pointing out how powerful their $100 deck is, or sundry other exceptions to the chosen rule.

Another complicating factor is the lack of a universal Commander metagame; while certain cards and decks are widely known, every playgroup ends up supporting a different sort of environment, each with its own vulnerabilities. It’s hard to accept someone telling you that the Hermit Druid deck that just killed you turn two “isn’t that strong” because it “folds to graveyard hate,” when your playgroup doesn’t usually play narrow hate cards like Rest in Peace. Are those answers widespread enough among all Commander players to consider when deciding the Dredge deck’s power level?

Even a successful numeric power scale is not going to be a totally reliable guide, nor can it account for player skill. Nobody is saying you can’t jam mismatched decks, if that’s what you want! But having an official baseline for measuring power is very important for GPs and other public games, which are underrepresented in Commander discussion and should be the main focus of all official rulings/banlists. So, we’re going to try to make one!

THE MEASURE OF A MAGE

To create the kind of instant, X/10 rating system we need for public game organization, we have to establish the net impact of the related factors we mentioned above. Holistically, power level is how effective the deck is at winning games, which we can measure in speed and reliability. A number of the more serious attempts at Commander power scales ask “what turn does your deck win on,” but this is too nebulous. While knowing how fast your deck goldfishes is a useful component of any power assessment, an assumed four-player table of equal power should necessarily change that timeline…

As such, I base my assessment around a metric I’ll call Resolutions Or Attacks Required, or ROAR. Starting from an empty board state (except maybe lands), how many unhindered spells/attack steps do you need to resolve in a row to kill opponents?

This is very similar to “what turn do you win on,” in some ways, but it inherently acknowledges the ease of presenting a two-card combo instead of a three-card combo, even if the latter is “faster.” It also accounts for the impact of interaction, which is more likely to be a factor as you cast more spells of different types and effects.

ROAR doesn’t consider what turn you win on, because most actual Commander games aren’t a drag race. One of the most tangible and oppressive effects of playing against a much better deck is the fear that they could steal the win at any time, which restricts your ability to play your own game. So it’s more important to measure a deck’s explosiveness (how fast it can win from nothing) than its speed.

Aside from just balancing tables, I feel that evaluating cards and decks in this way is important to helping people understand what they want out of Commander. I’ve heard vague misgivings from many players online and in my own groups about how their deck doesn’t “feel strong”; understanding what makes a deck feel strong helps you tune for any desired level of power, and identify what potential exists in your favorite Commander or archetype. 

CALCULATING YOUR ROAR POWER

Your ROAR number is presumed to be an approximate average across all winning games or based on a random draw of cards, rather than a theoretical optimal kill (unless you can consistently access that kill, perhaps from the Command Zone). ROAR understands a “kill” as creating a game state in which you would expect a knowledgeable opponent to concede — demolishing their hand and mana, locking them with some prison setup, or making eight Chandra, Torch of Defiance emblems at once are all probably good enough to count. 

All ROAR calculations are for multiplayer Commander with at least three opponents. This is a highly important distinction, as the presence of other players affects how opponents will deploy their interaction. In multiplayer, the act of winning is much more clear cut and binary, since it almost always involves presenting an infinite combo or other completely overwhelming board. Despite this, if your wincon kills people one at a time instead of all at once — say, a Voltron deck without reliable extra combat steps or the like — I would still calculate the ROAR based on that individual kill. A lot of table politics revolve around that threat, and even if you only blow up a single person on turn two, they aren’t having any more fun than if you blew up the table — perhaps less, if they have to wait for the whole game to finish.

Because this is a calculation of mid-game explosiveness, you can just assume you have all the mana you need to do your thing. I wouldn’t normally count ramp effects and mana rocks as part of your ROAR number, since in most decks, they’re just “win more” or “win sooner” cards. The two exceptions would be a deck like cEDH Godo, Bandit Warlord, where assembling a certain amount of mana (nine) on one turn is the key barrier to comboing off; or a deck where accelerating to go under your opponents is equally critical to your chance of victory: perhaps something like Maralen of the Mornsong, where you need a resource lead to break the symmetry, or a Stax deck which needs to immediately cast things to stymie opposing development.

The “in a row” part of the definition is a nod to the play patterns of Commander and the stop-start nature of “going for it” in most games. If your most common winning line involves casting your commander, a tutor and a combo piece, we’re happy to assume your ROAR is three. Perhaps you already tried similar lines twice and got interrupted by discard and removal, but we don’t count the cards from your earlier attempts unless they stuck around and had a role in the winning combo.

This is why it’s important to base your ROAR on your own winning games. A deck trying to win with a specific A+B combo obviously ends most winning scenarios with both A and B in play, so we don’t need to try factoring in extra resolutions to Recover piece B and re-cast it after it gets Vindicated the first time. Conversely, a creature beatdown deck is probably used to winning through several sweepers and other removal effects. If your average scenario involves casting three creatures and then attacking three times to count to 40, replacing one along the way, then a more accurate ROAR is seven, not six.
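To make the counting concrete, here is a minimal Python sketch of the tally described above. The function names and the way each line is written out as a list of steps are my own illustration, not part of any official tool; the only rule it encodes is that every spell resolution or attack step in a winning line counts as one, and that the final number is an approximate average across winning games.

```python
from statistics import mean


def roar_of_line(steps):
    """ROAR of one winning line: each spell resolution or attack step counts as one."""
    return len(steps)


def average_roar(winning_lines):
    """Approximate a deck's ROAR as the average step count across its winning games."""
    return mean(roar_of_line(line) for line in winning_lines)


# Commander + tutor + combo piece; earlier interrupted attempts are not counted.
combo_line = ["commander", "tutor", "combo piece"]

# Three creatures and three attacks, plus one creature replaced along the way.
beatdown_line = [
    "creature", "creature", "creature",
    "attack",
    "replacement creature",
    "attack", "attack",
]

print(roar_of_line(combo_line))     # 3
print(roar_of_line(beatdown_line))  # 7
```

Logging a handful of real winning games this way and passing them to `average_roar` gives a number grounded in how the deck actually wins, rather than in its best-case draw.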

It can be a little harder to guess at from only hypotheticals, but ROAR should include any spells you cast to try and advance your game plan — not just those that were ultimately part of your winning combo. 

This is vital for measuring the impact of redundant effects, draw spells, tutors and general focused deck-building. Perhaps the best abstract measure of “competitiveness” in Commander is how completely your card choices focus on always moving toward a common win condition; a less powerful deck will draw (and cast) more intermediate spells before eventually presenting a winning combo, leading to a higher ROAR.

Attack steps are included in the calculation as roughly equal in added risk to casting another spell. While they can’t be countered, each attack does require your creatures to survive until your next turn without being wrathed or otherwise shut down. If your main wincon involves a permanent with a tap ability, once-per-turn trigger, or other effect that demands you untap with it — say, Heartless Hidetsugu decks — I would add one to the ROAR for each full turn cycle you expect it to survive. 

However, I might not apply the same consideration to lines where I need to untap with a specific land, like Dark Depths, since they are a little harder to interact with. Use your own judgment and experience when considering noncreature permanents.
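As a rough sketch of that adjustment, the snippet below adds one to a base count for each full turn cycle a key permanent has to survive, and skips the surcharge when you judge the permanent hard to interact with (as with a land). The function, its parameter names and the sample numbers are hypothetical, chosen only to illustrate the rule of thumb.

```python
def adjusted_roar(base_steps, turn_cycles_to_survive=0, hard_to_interact=False):
    """Add one ROAR per full turn cycle a key permanent must survive.

    hard_to_interact is a judgment call (for example, a specific land):
    when True, the surcharge is skipped.
    """
    if hard_to_interact:
        return base_steps
    return base_steps + turn_cycles_to_survive


# Hypothetical line: three resolutions, with a creature that must untap once.
print(adjusted_roar(3, turn_cycles_to_survive=1))                         # 4

# Same shape of line, but the permanent that must untap is a land.
print(adjusted_roar(3, turn_cycles_to_survive=1, hard_to_interact=True))  # 3
```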

PHILOSOPHY, CAVEATS & EXAMPLES

While I do believe ROAR to be the most holistic, practical and useful single number rating to consider when matching up Commander decks for balanced games, we should all agree that it is a starting point and guideline for guessing at power, not an infallible rule. There is no single number or measure which should justify excluding a deck from your table without more nuanced consideration.

I’ll try to quickly cover some of the other things that are important in these assessments, and give examples of how I would envision ROAR being useful.

First of all, what definition of “power” is ROAR measuring? You’ve probably noticed that it doesn’t directly consider speed or mana efficiency. A Yargle, Glutton of Urborg deck that wins by casting its commander and attacking three times could technically be claimed as ROAR 4. A deck that tutors for Thassa’s Oracle, tutors for Demonic Consultation, casts them both and wins… is also ROAR 4. 

Clearly, these decks are not equally powerful, since the more expensive, slower Yargle plan is much easier to interact with or simply go under with a faster deck. But ROAR power only reflects the competitiveness of a deck in free-for-all multiplayer, at a relatively balanced table. Assuming the Oracle-Consultation deck is facing prepared opponents, simply being able to go off fastest is rarely a big advantage. It’s highly likely that there will be enough interaction between three opponents to stymie even the fastest all-in draw!

This is why even-powered formats like cEDH, Legacy, and Vintage routinely feature very long games. Waiting for the right moment and exploding into it is the real path to victory.

Improving your ability to seize those moments is as much about consistently finding your wincon and packing disruption (and counter-disruption) as it is about raw speed.

No card or deck evaluation EVER exists without context. While the Oracle-Consultation deck will have an edge against slower, less interactive casual decks, and has more powerful nut draws that could kill the whole table in a one-turn window, ROAR is focused on how consistently the deck can present lethal threats, which tends to be the biggest bottleneck in a 99-card singleton format with three opponents ready to shoot down your first few attempts.

As is traditional for these sorts of rating scales, I’ll throw in some quick examples of where various decks’ ROARs might end up. I doubt that this will exactly line up with current cEDH tier lists, but I don’t think that’s a realistic expectation for such a highly simplified scale. Remember that small differences between ROAR numbers represent a very minor power gap, and that other factors like mana efficiency sometimes outweigh them in practice.

ROAR 0-2: All but requires your deck to be built around an A+B combo with your commander (call it an A+C?), where the A effect is so widely available that a significant percentage of your draw steps see it.

I don’t think this is currently possible, though several cEDH commanders are close!

ROAR 3-4: The best achievable ratings — either an A+C combo or similarly abrupt wincon involving your Commander. Occasionally, you’ll tutor for two non-Commander combo pieces and cast them instead. Either way, you’ll need upwards of ten cards that can fill in for or tutor up the combo piece to be reliable. I’m only counting steps needed to assemble and initiate the combo — if you get to the deterministic part, I assume you’re home free, since it’s clear you’re winning and they either have interaction or they don’t.

Example 1: Kaalia of the Vast + attack (putting a large flyer on board) + Armageddon (or similar)

Example 2: Niv-Mizzet, Parun + Gamble (finding Curiosity) + Curiosity on Niv-Mizzet + tap to activate

Example 3: Scion of the Ur-Dragon + attack + activate (finding Skithiryx, the Blight Dragon) + activate (finding Moltensteel Dragon)

ROAR 5-6: This is where the most tuned casual lists will fall, along with more midrange cEDH decks. The fastest fair combat and voltron lists end up here, too.

Example 1: Ritual + ritual + ritual + Godo, Bandit Warlord (finding Helm of the Host) + equip Helm to Godo

Example 2: Heliod, Sun-Crowned + tutor for Walking Ballista + Walking Ballista + activate Heliod giving Ballista lifelink + activate Ballista (a lot)

Example 3: Saskia the Unyielding + creature + creature + attack + attack

ROAR 7-10: This is the more casual end of focused deck-building. Every additional ROAR stacks more potential complications and chances to interact on top of an already precarious line of play. Even if you can’t think of any specific strong lines for your deck, try turning over the top 20 or so cards: I think a significant majority of us would be able to find some sort of winning line even through 10 laborious steps!

Example: Derevi, Empyrial Tactician + token-making spell + token-making spell + attack (untapping a lot of mana) + draw some cards + another token-making spell + Synthetic Destiny + attack/combo activation
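For anyone who wants to sanity-check the brackets, here is a small Python sketch that counts the steps in three of the example lines above and looks up the matching band. The `roar_band` helper and the shortened band labels are my own paraphrase for illustration; the step lists are copied from the examples, one entry per "+"-separated step.

```python
# (low, high, short label) for each band; labels paraphrase the descriptions above.
ROAR_BANDS = [
    (0, 2, "probably not currently possible"),
    (3, 4, "best achievable ratings"),
    (5, 6, "most tuned casual lists and midrange cEDH"),
    (7, 10, "more casual end of focused deck-building"),
]


def roar_band(roar):
    """Return the band a ROAR number falls into, or a note if it is off the scale."""
    for low, high, label in ROAR_BANDS:
        if low <= roar <= high:
            return f"ROAR {low}-{high}: {label}"
    return "outside the 0-10 range covered here"


examples = {
    "Kaalia": ["Kaalia of the Vast", "attack", "Armageddon"],
    "Godo": ["ritual", "ritual", "ritual", "Godo, Bandit Warlord", "equip Helm"],
    "Derevi": [
        "Derevi, Empyrial Tactician", "token-making spell", "token-making spell",
        "attack", "draw some cards", "token-making spell",
        "Synthetic Destiny", "attack/combo activation",
    ],
}

for name, steps in examples.items():
    print(name, len(steps), roar_band(len(steps)))
```

By this count Kaalia lands at 3, Godo at 5 and Derevi at 8, matching the brackets they are listed under.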

HOW DO I KNOW IF ROAR IS RIGHT FOR ME?

Some of you may own decks for which the strict numerical scale of ROAR is harder to apply.  These decks aren’t built to focus on assembling specific wincons, so the deckbuilding and playing philosophy more closely resembles “Goodstuff” or “Pile” decks from Constructed — you just play cards that are strong until you build enough advantage to win. This makes it very difficult to accurately calculate ROAR, since your winning game states rely more on the wider context of the game and may not even feature a distinct tipping point which you can count ROAR back from.

This is not entirely by accident. 

ROAR measures a combination of explosiveness and consistency — which, along with sufficient interaction, is key to staying relevant at the most competitive and optimized tables.

Decks that aren’t built with this goal in mind will likely lack one or the other, meaning their ROAR number will be significantly higher. If you can’t think of a particularly common path to winning to evaluate, try just flipping cards off your deck until you see a critical mass of effects emerge, and then you have a starting point to figure out the ROAR.

However, below a certain level of optimization, the objective power level of a deck starts to matter much less than the philosophy of the pilot and table politics in general. If your deck isn’t built to overpower three other players with a turn 3 infinite combo, then you should feel safe dropping it onto most tables without needing the granularity of a numbered power scale to feel things out. The less applicable ROAR is to your deck, the less you probably need it. 

Finally, I want to make absolutely clear that this approach doesn’t come with any additional assumptions about such decks or their pilots. I exclusively play and build such decks myself, and always have!

What I would advocate for very casual decks are some less quantified labels that call out specific fundamental edges a deck might exert over its opponents. These can (and should) be combined with ROAR power to give a more nuanced description of tuned lists. A few starting examples of things you might want to flag, since they broadly change the experience of battling your deck: 

Also, if your main plan is to swiftly lock opponents out of playing spells (Grand Arbiter Augustin IV/Hokori, Dust Drinker) or distort the game so much that people can’t play “normally” (typical UR “chaos” decks with Possibility Storm, Hive Mind, Grip of Chaos and Eye of the Storm, combos which give every player infinite draws and mana) — and you aren’t looking to simply have people concede to it — you should probably discuss that, too.

Of course, there’s a certain economy which must be respected when it comes to that sort of pre-game negotiation. That’s why we’re looking for a universal numerical power scale in the first place. Still, without going overboard, I believe building these labels into a one-line “elevator pitch” for your deck will help us all eventually reach Commander nirvana… or at least avoid depressing, uncompetitive experiences at public tables. 

I’m very happy to hear critiques or questions about ROAR and my overall Commander philosophy, so get those posting fingers ready – and I’ll catch you next week!