About the Method

Archetypes first. Then synergy, uncertainty, and draft pressure.

The method presented on this website aims to build directed graph networks from MTG Limited data. It tries to learn archetypes from decklists, then measures single-card strength, pair synergy, triplet motifs, and draft urgency inside each learned shell.

The formulas below describe the current implementation directly.

1. Archetypes from full decklists

Let \(X\) be the deck-by-card matrix built from the deck_* columns of the game export. Each row is one deck build and each column is one card. Cards that appear in more than 55% of decks are removed before clustering because they are too common to separate one archetype from another. A card that appears in almost every deck contributes very little information about which shell a deck belongs to.

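The weighting used before clustering is a classic TF-IDF (term frequency, inverse document frequency) transform. Here the “term” is a card and the “document” is a deck. The first factor rewards cards that appear in a deck, counting extra copies on a logarithmic scale; the second factor downweights cards that are common across many decks. In the formula, \(x(d,c)\) is the number of copies of card \(c\) in deck \(d\), \(D\) is the total number of decks, and \(df(c)\) is the number of decks that contain card \(c\). The weight applies only when \(x(d,c) > 0\); absent cards get weight zero.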

\[ w(d,c) = (1 + \log x(d,c)) \cdot \log\!\left(\frac{1 + D}{1 + df(c)}\right) \]

The weighted matrix is reduced with truncated SVD and then clustered with MiniBatch K-means.
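The prevalence filter and the weighting formula can be sketched in plain Python. This is a minimal illustration, not the site's code: the function name `tfidf_weights`, the dict-per-deck input layout, and the `max_share` parameter are assumptions made for the example, and the truncated SVD and MiniBatch K-means steps are not shown.

```python
import math

def tfidf_weights(decks, max_share=0.55):
    """decks: list of dicts mapping card name -> number of copies (hypothetical layout)."""
    D = len(decks)
    # Document frequency: number of decks that contain each card.
    df = {}
    for deck in decks:
        for card, copies in deck.items():
            if copies > 0:
                df[card] = df.get(card, 0) + 1
    # Drop cards that appear in more than max_share of all decks.
    kept = {card for card, k in df.items() if k / D <= max_share}
    weights = []
    for deck in decks:
        row = {}
        for card, copies in deck.items():
            if copies > 0 and card in kept:
                tf = 1 + math.log(copies)
                idf = math.log((1 + D) / (1 + df[card]))
                row[card] = tf * idf
        weights.append(row)
    return weights

decks = [{"A": 1, "B": 2}, {"A": 1, "C": 1}, {"A": 1, "B": 1}, {"A": 1, "D": 1}]
w = tfidf_weights(decks)
```

In this toy input, card "A" appears in every deck and is filtered out, while "B" (50% of decks) stays and receives a larger weight in the deck that plays two copies.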

2. Signposts

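For archetype \(k\), the signpost statistic asks whether a card appears much more often inside that archetype than outside it. Here \(a_c\) and \(b_c\) count the decks containing card \(c\) inside and outside the archetype, and \(N_k\) and \(N_{\neg k}\) are the corresponding deck totals.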

\[ \ell(c) = \log\!\left(\frac{a_c + \alpha_c}{N_k - a_c + \beta_c}\right) - \log\!\left(\frac{b_c + \alpha_c}{N_{\neg k} - b_c + \beta_c}\right) \]

A positive value of \(\ell(c)\) means the card is concentrated inside the archetype. A larger value means that concentration is stronger.

The prior is centered on the card's global prevalence. In plain terms, the method starts from the card's average frequency in the whole format and then asks how much the archetype deviates from that average. This keeps the statistic stable when sample sizes are modest.

\[ p_{0,c} = \frac{a_c + b_c}{D}, \qquad \alpha_c = 1 + 28\,p_{0,c}\,w_c, \qquad \beta_c = 1 + 28\,(1-p_{0,c}) \]

The rarity weight \(w_c\) slightly decreases the influence of rare and mythic cards. The practical goal is to reduce the chance that a rare card rises to the top of the signpost list mainly because it is scarce and powerful.

\[ z(c) = \frac{\ell(c)}{\sqrt{\mathrm{Var}(\ell(c))}}, \qquad signpost\_score(c) = z(c)\sqrt{a_c} \]
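The smoothed log-odds contrast and the final score can be sketched as follows. The function names and argument layout are illustrative assumptions. The text does not spell out how \(\mathrm{Var}(\ell(c))\) is estimated, so the sketch takes that variance as an input rather than guessing a formula.

```python
import math

def signpost_log_odds(a, b, n_in, n_out, rarity_weight=1.0, strength=28.0):
    """Smoothed log-odds contrast l(c) for one card.

    a, b: decks containing the card inside / outside the archetype.
    n_in, n_out: total decks inside / outside the archetype.
    rarity_weight: stands in for w_c; 1.0 means no rarity adjustment (assumed default).
    """
    D = n_in + n_out
    p0 = (a + b) / D                      # global prevalence of the card
    alpha = 1 + strength * p0 * rarity_weight
    beta = 1 + strength * (1 - p0)
    inside = math.log((a + alpha) / (n_in - a + beta))
    outside = math.log((b + alpha) / (n_out - b + beta))
    return inside - outside

def signpost_score(l, var_l, a):
    """z-score times sqrt(a); var_l must be supplied by the caller."""
    z = l / math.sqrt(var_l)
    return z * math.sqrt(a)

concentrated = signpost_log_odds(a=60, b=40, n_in=100, n_out=900)
neutral = signpost_log_odds(a=10, b=10, n_in=100, n_out=100)
```

A card played at the same rate inside and outside the archetype gets \(\ell(c) = 0\), while a card played in 60% of archetype decks but about 4% of other decks gets a clearly positive value.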

3. Deck-level pairs and triplets

Once the archetypes are learned, the method asks which cards co-occur in completed decklists more often than independence would predict.

\[ E_{\mathrm{deck}}(a,b) = N_k\,\pi(a)\,\pi(b), \qquad L^{\mathrm{deck}}_{ab} = \log_2\!\left(\frac{O_{\mathrm{deck}}(a,b)+1}{E_{\mathrm{deck}}(a,b)+1}\right) \]
\[ E_{\mathrm{deck}}(a,b,c) = N_k\,\pi(a)\,\pi(b)\,\pi(c), \qquad L^{\mathrm{deck}}_{abc} = \log_2\!\left(\frac{O_{\mathrm{deck}}(a,b,c)+1}{E_{\mathrm{deck}}(a,b,c)+1}\right) \]

A positive deck-level lift means the cards appear together in finished decklists more often than independent deck inclusion would predict.
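A small sketch of the deck-level lift, valid for pairs and triplets alike. It assumes that \(\pi(a)\) is the fraction of archetype decks containing card \(a\), which the text does not state explicitly; the function name and the set-per-deck input are also illustrative.

```python
import math

def deck_lift(decks, cards):
    """log2 lift of joint inclusion over independent inclusion.

    decks: list of sets of card names, all from one archetype.
    cards: tuple of 2 or 3 card names.
    """
    n = len(decks)
    expected = float(n)
    for card in cards:
        # Assumed definition of pi(card): share of archetype decks with the card.
        expected *= sum(1 for deck in decks if card in deck) / n
    observed = sum(1 for deck in decks if all(card in deck for card in cards))
    # The +1 terms match the add-one smoothing in the formula above.
    return math.log2((observed + 1) / (expected + 1))

decks = [{"A", "B"}, {"A", "B"}, {"C"}, {"C"}]
together = deck_lift(decks, ("A", "B"))   # always played together
apart = deck_lift(decks, ("A", "C"))      # never played together
```

Here "A" and "B" each appear in half the decks, so independence predicts one shared deck; they share two, giving a positive lift, while "A" and "C" never meet and get a negative one.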

When the code finds a strong deck triplet, it pushes part of that information back into the pair graph. For each retained triplet, it computes a triplet quality score and then gives each of the three pairs inside that triplet a bonus equal to 12% of that quality. If several triplets contain the same pair, the code keeps the largest bonus for that pair.
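The propagation rule described above can be sketched in a few lines. The dictionary layout and the function name are assumptions for illustration; only the 12% share and the keep-the-largest rule come from the text.

```python
from itertools import combinations

def triplet_pair_bonus(triplet_quality, share=0.12):
    """Map each pair to the largest bonus it receives from any retained triplet.

    triplet_quality: dict mapping a 3-tuple of card names -> triplet quality score.
    """
    bonus = {}
    for triplet, quality in triplet_quality.items():
        for pair in combinations(sorted(triplet), 2):
            # Keep the maximum bonus across triplets; starts from 0.0,
            # so non-positive qualities never produce a bonus in this sketch.
            bonus[pair] = max(bonus.get(pair, 0.0), share * quality)
    return bonus

bonus = triplet_pair_bonus({("A", "B", "C"): 10.0, ("A", "B", "D"): 5.0})
```

The pair ("A", "B") sits in both triplets and keeps the larger bonus, 12% of 10.0, instead of the sum of the two.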

4. Game-level pair synergy

The game-side statistics use the chosen game zone from the 17Lands game export. In the current implementation the chosen zone is the drawn zone, taken from the drawn_* columns. So the card-level rate here is a drawn-zone win rate, which is very close in spirit to a game-in-hand style statistic.

\[ standalone\_delta(a) = p_a - p_k \]

The expected pair win rate is computed in logit space. If two cards contribute independently, their effects multiply on the odds scale, so they add on the log-odds (logit) scale. Subtracting the archetype baseline once avoids counting it twice.

\[ \operatorname{logit}(\hat p_{ab}) = \operatorname{logit}(p_a) + \operatorname{logit}(p_b) - \operatorname{logit}(p_k) \]
\[ synergy\_delta\_logit(a,b) = \operatorname{logit}(p_{ab}) - \operatorname{logit}(\hat p_{ab}) \]
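The two formulas above translate directly into code. The function names are illustrative; the inputs are win rates strictly between 0 and 1.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def expected_pair_win_rate(p_a, p_b, p_k):
    """Win rate predicted if the two cards act independently on the log-odds scale."""
    return inv_logit(logit(p_a) + logit(p_b) - logit(p_k))

def synergy_delta_logit(p_ab, p_a, p_b, p_k):
    """Observed pair logit minus the additive expectation."""
    return logit(p_ab) - (logit(p_a) + logit(p_b) - logit(p_k))

# Baseline 50%, each card alone at 60%: odds of 1.5 each, so expected odds are 2.25.
p_hat = expected_pair_win_rate(0.60, 0.60, 0.50)
delta = synergy_delta_logit(0.72, 0.60, 0.60, 0.50)
```

With these numbers the additive expectation is about 69.2%, so an observed pair win rate of 72% yields a positive synergy delta.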

5. Standard deviation and confidence

The uncertainty model starts from the usual binomial variance and propagates it to the synergy statistic.

\[ \mathrm{Var}(p) = \frac{p(1-p)}{n}, \qquad \mathrm{Var}(\operatorname{logit}(p)) = \frac{1}{n\,p(1-p)} \]
\[ SE_{ab} = \sqrt{\mathrm{Var}(\operatorname{logit}(p_{ab})) + \mathrm{Var}(\operatorname{logit}(\hat p_{ab}))} \]
\[ \sigma_{ab} = \frac{|synergy\_delta\_logit(a,b)|}{SE_{ab}}, \qquad confidence(a,b) = \operatorname{erf}\!\left(\frac{\sigma_{ab}}{\sqrt{2}}\right) \]

The percentage shown on the site is the Gaussian central probability corresponding to the estimated signal-to-noise ratio \(\sigma_{ab}\). Higher confidence means the estimated effect is large compared with its estimated standard error. For example, \(\sigma_{ab} = 2\) corresponds to a confidence of about 95.4%.
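The uncertainty chain can be sketched with the standard library's `math.erf`. The text does not say how \(\mathrm{Var}(\operatorname{logit}(\hat p_{ab}))\) is assembled from its components, so this sketch takes it as an input; the function names are illustrative.

```python
import math

def logit_variance(p, n):
    """Delta-method variance of logit(p) for a binomial proportion over n games."""
    return 1.0 / (n * p * (1 - p))

def pair_confidence(delta, var_observed, var_expected):
    """Return (sigma, confidence) for a synergy delta in logit space.

    var_expected is supplied by the caller; how it is built is not
    specified in the text above.
    """
    se = math.sqrt(var_observed + var_expected)
    sigma = abs(delta) / se
    confidence = math.erf(sigma / math.sqrt(2))
    return sigma, confidence

var_obs = logit_variance(0.5, 100)            # 1 / (100 * 0.25) = 0.04
sigma, conf = pair_confidence(0.4, var_obs, 0.0)
```

With a standard error of 0.2 and a delta of 0.4, the ratio is 2 and the confidence is roughly 0.954.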

6. Pair quality, triplet quality, and draft corroboration

The graph needs one ranking score to decide which links deserve visual emphasis. The score is a product of several terms. Each term measures a different ingredient: game-level lift, support, deck-level corroboration, triplet structure, draft corroboration, and confidence. A pair receives a large score when several of these signals agree.

\[ pair\_quality_{raw}(a,b) = \max(0, synergy\_delta\_logit(a,b)) \sqrt{n_{ab}} (0.02 + \max(0, p_{ab} - p_k)) (1 + 0.25\max(0, L^{\mathrm{deck}}_{ab})) (1 + 0.18\max(0, triplet\_bonus(a,b))) (1 + 0.05\log(1 + draft\_pair\_events(a,b))) \]
\[ pair\_quality(a,b) = pair\_quality_{raw}(a,b) \cdot \min(1, \sigma_{ab}) \]
\[ triplet\_quality_{raw}(a,b,c) = synergy\_delta\_logit(a,b,c) \sqrt{n_{abc}} (1 + 0.20\max(0, L^{\mathrm{deck}}_{abc})) (1 + 0.06\log(1 + draft\_triplet\_events(a,b,c))) \]
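The pair-quality product can be written out term by term. This is a sketch: the argument names are illustrative, and \(\log\) is assumed to be the natural logarithm.

```python
import math

def pair_quality(delta, n_ab, p_ab, p_k, deck_lift, triplet_bonus,
                 draft_pair_events, sigma):
    """Ranking score for one pair, following the two formulas above."""
    raw = (
        max(0.0, delta)                                   # game-level lift
        * math.sqrt(n_ab)                                 # support
        * (0.02 + max(0.0, p_ab - p_k))                   # win rate above baseline
        * (1 + 0.25 * max(0.0, deck_lift))                # deck-level corroboration
        * (1 + 0.18 * max(0.0, triplet_bonus))            # triplet structure
        * (1 + 0.05 * math.log(1 + draft_pair_events))    # draft corroboration
    )
    # Confidence damping: scores with sigma below 1 are scaled down.
    return raw * min(1.0, sigma)

q_full = pair_quality(0.1, 100, 0.58, 0.55, 0.0, 0.0, 0, sigma=2.0)
q_damped = pair_quality(0.1, 100, 0.58, 0.55, 0.0, 0.0, 0, sigma=0.5)
q_negative = pair_quality(-0.1, 100, 0.58, 0.55, 1.0, 1.0, 50, sigma=2.0)
```

Because the first factor is clamped at zero, a pair with negative synergy scores exactly zero no matter how strong its other signals are.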

7. Draft pressure, ALSA, and wheeling

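ALSA (average last seen at) is measured globally across the whole draft table. In the formulas, \(last\_seen_i(c)\) is the last pick at which card \(c\) was seen in observation \(i\), and \(N_c\) is the number of such observations.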

\[ ALSA(c) = \frac{1}{N_c}\sum_i last\_seen_i(c) \]
\[ ALSA_{sd}(c) = \max\!\left(0.65, \sqrt{\frac{1}{N_c}\sum_i last\_seen_i(c)^2 - ALSA(c)^2}\right) \]
\[ \tau(c) = \max\!\left(1.1, 0.60\,ALSA(c) + 0.80\,ALSA_{sd}(c)\right) \]
\[ wheel\_model(c) = \frac{1}{4}\sum_{s=1}^{4} \exp\!\left(-\frac{\max(0, (s+8)-ALSA(c))}{\tau(c)}\right) \]
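The four formulas above chain together as follows. The function name and the list input are illustrative assumptions; the constants are taken from the formulas.

```python
import math

def wheel_model(last_seen):
    """Return (alsa, alsa_sd, tau, wheel) from a list of last-seen pick numbers."""
    n = len(last_seen)
    alsa = sum(last_seen) / n
    variance = sum(x * x for x in last_seen) / n - alsa ** 2
    # Guard against tiny negative values from rounding, then apply the 0.65 floor.
    alsa_sd = max(0.65, math.sqrt(max(0.0, variance)))
    tau = max(1.1, 0.60 * alsa + 0.80 * alsa_sd)
    # Average the decay over picks 9 to 12 (s + 8 for s = 1..4).
    wheel = sum(
        math.exp(-max(0.0, (s + 8) - alsa) / tau) for s in range(1, 5)
    ) / 4
    return alsa, alsa_sd, tau, wheel

early = wheel_model([2, 2, 2, 2])     # card disappears around pick 2
late = wheel_model([12, 12])          # card is still around at pick 12
```

A card that is last seen around pick 2 gets a wheel value close to zero, while a card that is routinely seen at pick 12 gets a value of 1.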

8. Exported scores

Dependency score

Weighted mean of positive incident pair synergy and triplet bonus. A high value means the card usually gains value from neighboring cards in the same shell.

Analysis score
\[ analysis_{raw}(c) = signpost\_score(c) + 18\,dependency(c) + 140\max(0, standalone\_delta(c)) \]
  • \(signpost\_score(c)\): signpost strength of card \(c\).
  • \(dependency(c)\): dependency score of card \(c\).
  • \(standalone\_delta(c)\): win-rate lift of card \(c\) above the archetype baseline.

The raw score is transformed with \(\operatorname{sign}(x)\log(1+|x|)\), clipped between the 5th and 95th percentiles inside the archetype, and rescaled to 0-10.
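The transform, clip, and rescale steps can be sketched as below. The percentile rule (linear interpolation between ranks) and the handling of a degenerate range are assumptions; the text only names the 5th and 95th percentiles.

```python
import math

def percentile(sorted_values, q):
    """Linear-interpolation percentile, q in [0, 1] (assumed convention)."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def rescale_0_10(raw_scores):
    """sign(x) * log(1 + |x|), clip to the 5th-95th percentile band, map to 0-10."""
    transformed = [math.copysign(math.log1p(abs(x)), x) for x in raw_scores]
    ordered = sorted(transformed)
    low, high = percentile(ordered, 0.05), percentile(ordered, 0.95)
    if high <= low:
        # Degenerate band: all scores equal. Returning 0.0 is an assumption.
        return [0.0 for _ in transformed]
    return [10 * (min(max(v, low), high) - low) / (high - low) for v in transformed]

scores = rescale_0_10(list(range(21)))
```

Values below the 5th percentile all map to 0 and values above the 95th all map to 10, so a single outlier cannot compress the rest of the scale.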

Pick priority score
\[ pick\_priority_{raw}(c) = 100\big(\max(0,draft\_win\_lift(c)) + 0.55\max(0,standalone\_delta(c))\big) \sqrt{\max(1,draft\_drafts(c))} \big(1 + 0.10\max(0, signpost\_lift(c)-1)\big) \]
  • \(draft\_win\_lift(c)\): smoothed draft match win-rate lift of card \(c\) inside projected drafts of the archetype.
  • \(draft\_drafts(c)\): number of projected drafts in the archetype that contain card \(c\).
  • \(signpost\_lift(c)\): ratio between the card's prevalence inside the archetype and its global deck prevalence.

The raw value is transformed with \(\log(1+x)\), clipped between the 5th and 95th percentiles inside the archetype, and rescaled to 0-10.

Pick urgency
\[ pick\_urgency(c) = 10\left(\frac{0.90\,market(c) + 0.10\,pick\_priority(c) - 0.58}{0.42}\right)_{[0,1]}^{1.15} \]
  • \(market(c)\): normalized market-pressure score derived from ALSA, average taken-at, and wheel probability.
  • \(pick\_priority(c)\): normalized archetype performance score from the previous formula.
  • \((\cdot)_{[0,1]}\): clamp to the interval from 0 to 1.

This score is a recommendation scale. Higher values mean the table values the card and the archetype also gains from it.
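A short sketch of the urgency formula. It assumes that both inputs are already normalized to the interval from 0 to 1, which is consistent with the constants (the bracket reaches 1 exactly when both inputs are 1) but is not stated outright.

```python
def pick_urgency(market, pick_priority):
    """Urgency on a 0-10 scale; inputs assumed to lie in [0, 1]."""
    x = (0.90 * market + 0.10 * pick_priority - 0.58) / 0.42
    x = min(1.0, max(0.0, x))      # clamp to [0, 1]
    return 10 * x ** 1.15

top = pick_urgency(1.0, 1.0)
none = pick_urgency(0.5, 0.5)
mid = pick_urgency(0.8, 0.8)
```

The 0.58 offset means a card needs a blended score above 0.58 before it registers any urgency at all, and market pressure carries 90% of the blend.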

Shell dependence
\[ shell\_dependence_{raw}(c) = \frac{\max(0,support\_delta(c))\sqrt{\max(1,supported\_games(c))}(1 + 4\,dependency(c))(1 + 0.18\max(0,signpost\_lift(c)-1))}{1 + 16\max(0,standalone\_delta(c))} \]
  • \(support\_delta(c)\): supported win rate minus solo-drawn win rate for card \(c\).
  • \(supported\_games(c)\): games in the archetype where card \(c\) appears with at least one other mapped card in the drawn zone.
  • \(dependency(c)\), \(signpost\_lift(c)\), and \(standalone\_delta(c)\): same quantities defined above.

The raw value is transformed with \(\log(1+x)\), clipped, and rescaled to 0-10.

Build-around detector

A card is flagged as a build-around when all four conditions hold: supported games at least 80, support delta at least 1.8 percentage points, dependency at least 0.075, and standalone delta at most 1.4 percentage points.
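The four conditions combine into a single predicate. In this sketch the deltas are expressed as fractions, so 1.8 percentage points becomes 0.018; the function and argument names are illustrative.

```python
def is_build_around(supported_games, support_delta, dependency, standalone_delta):
    """True when all four build-around conditions hold (deltas as fractions)."""
    return (
        supported_games >= 80
        and support_delta >= 0.018      # at least 1.8 percentage points
        and dependency >= 0.075
        and standalone_delta <= 0.014   # at most 1.4 percentage points
    )

flagged = is_build_around(120, 0.025, 0.10, 0.005)
too_few_games = is_build_around(79, 0.025, 0.10, 0.005)
too_strong_alone = is_build_around(120, 0.025, 0.10, 0.020)
```

The last condition is an upper bound: a card that already wins well on its own is not treated as a build-around even when it also benefits from support.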

Core score
\[ core_{raw}(c) = signpost\_score(c) + 52\,deck\_prevalence(c) + 85\max(0,standalone\_delta(c)) + 14\,dependency(c) \]
  • \(deck\_prevalence(c)\): fraction of decks in the archetype that contain card \(c\).

The raw value is transformed with \(\log(1+x)\), clipped, and rescaled to 0-10.

9. Network construction rules

Object: current rule
  • Candidate card pool: Up to 18 cards with the highest signpost score, then cards with the largest standalone lift, then cards that appear frequently in games of the archetype, then anchor neighbors: cards that co-occur strongly in deck motifs with the leading signposts. Lands are removed and the final list is truncated to 28 cards.
  • Node support floor: At least 110 appearances in the chosen game zone inside the specific cluster archetype.
  • Pair support floor: At least 90 games inside the specific cluster archetype where both cards appear in the chosen game zone.
  • Triplet support floor: At least 30 games inside the specific cluster archetype where all three cards appear in the chosen game zone.
  • Edge threshold: synergy_delta_logit >= 0.05.
  • Base confidence threshold: Seed edges use a confidence level around 90%.
  • Knot threshold: Retained knots use a confidence level around 80%.
  • Knot promotion: Once a knot is established, nearby edges can enter at a lower confidence threshold when they reinforce that knot and stay within the same established structure.
  • Displayed solo/support stats: At least 30 supporting games are required in the relevant bucket.
  • Arrow direction: The directed graph points from the higher-win-rate card to the lower-win-rate card. The symmetric pair statistic stays unchanged.
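Three of these rules (pair support floor, edge threshold, and arrow direction) can be sketched as a simple filter. The data layout and names are assumptions for illustration, ties in win rate are broken arbitrarily, and the confidence thresholds and knot promotion are left out.

```python
def directed_edges(pairs, win_rate, min_delta=0.05, min_games=90):
    """Build directed edges from symmetric pair statistics.

    pairs: dict mapping (card_a, card_b) -> {"synergy_delta_logit": float, "games": int}.
    win_rate: dict mapping card -> win rate in the chosen game zone.
    """
    edges = []
    for (a, b), stats in pairs.items():
        if stats["games"] < min_games:
            continue                     # below the pair support floor
        if stats["synergy_delta_logit"] < min_delta:
            continue                     # below the edge threshold
        # Arrow points from the higher-win-rate card to the lower one.
        source, target = (a, b) if win_rate[a] >= win_rate[b] else (b, a)
        edges.append((source, target, stats["synergy_delta_logit"]))
    return edges

pairs = {
    ("A", "B"): {"synergy_delta_logit": 0.12, "games": 150},
    ("A", "C"): {"synergy_delta_logit": 0.03, "games": 200},
    ("B", "C"): {"synergy_delta_logit": 0.20, "games": 40},
}
edges = directed_edges(pairs, {"A": 0.56, "B": 0.61, "C": 0.58})
```

Only the first pair survives both filters, and its arrow runs from "B" to "A" because "B" has the higher win rate; the synergy value attached to the edge is unchanged by the direction.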