AI Deck Comparison

All created AI or player decks. All discussion related to Wagic AI deck competition and results.
Psyringe
Posts: 1163
Joined: Mon Aug 31, 2009 10:53 am

AI Deck Comparison

Post by Psyringe »

Some days ago I implemented a "Gauntlet" mode in my personal version of Wagic. This mode lets me pit one AI deck against all others, or every AI deck against every other, and logs the results. Immediately afterwards, I started playing all AI decks against each other. The first round is now complete; here are the results.

59 decks participated in the survey. 49 of these are part of official Wagic, 3 more were created by a user and will become official shortly, and the remaining 7 are pre-release decks created by me. Each deck played two games against each other deck, one as the first player and one as the second player. Wins and losses were tallied, and the percentage of games won was calculated as a final score for each deck. 3,481 games were played in total.
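
(For the technically curious: the gauntlet is essentially a round-robin loop over all ordered deck pairs. Here is a minimal sketch of that loop and the scoring, assuming a hypothetical playGame() helper - it's an illustration, not the actual Wagic code.)

Code: Select all

#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in: in the real gauntlet this runs a full AI-vs-AI game
// and returns true if the deck passed as the first player won. Here it's
// only a dummy so the sketch compiles.
bool playGame(const std::string &firstDeck, const std::string &secondDeck) {
    return firstDeck < secondDeck;
}

int main() {
    std::vector<std::string> decks = {"deck48.txt", "deck30.txt", "deck24.txt" /* ... */};
    std::vector<int> wins(decks.size(), 0), matches(decks.size(), 0);

    // Every ordered pair plays once, so each pair of decks meets twice:
    // once with deck i going first and once with deck j going first.
    for (size_t i = 0; i < decks.size(); ++i) {
        for (size_t j = 0; j < decks.size(); ++j) {
            if (i == j) continue;
            bool firstWon = playGame(decks[i], decks[j]);
            ++matches[i]; ++matches[j];
            if (firstWon) ++wins[i]; else ++wins[j];
        }
    }

    // Final score: percentage of matches won.
    for (size_t i = 0; i < decks.size(); ++i)
        printf("%-12s %3d/%3d (%3.0f%%)\n", decks[i].c_str(), wins[i], matches[i],
               100.0 * wins[i] / matches[i]);
}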

The following table lists each deck:
- its rank
- its name (e.g. "Treefolk")
- its deck number (for example, "Treefolk" is deck no. 48, and its filename is deck48.txt)
- the number of matches (M) played (116 for each deck)
- the numbers of won (W) and lost (L) games
- the percentage of matches won (M%)
- the percentages of matches won as the first player and second player, respectively (1.P/2.P)

Code: Select all

Rank Deck             #    M   W   L   M% (1.P/2.P)
 1   Treefolk         48  116  92  24  79% (79%/79%)
 2   Elfball          30  116  91  25  78% (78%/79%)
 3   Soldiers         24  116  87  29  75% (83%/67%)
 4   Jihad            32  116  86  30  74% (76%/72%)
 5   Kinsbaile Cavali 56  116  85  31  73% (71%/76%)
 6   Master of Ether  36  116  82  34  71% (66%/76%)
 7   Justice          18  116  81  35  70% (72%/67%)
 8   Bad Moon         20  116  81  35  70% (72%/67%)
 9   Fairies          17  116  80  36  69% (69%/69%)
 10  Elves            19  116  80  36  69% (67%/71%)
 11  Kithkin          26  116  80  36  69% (69%/69%)
 12  Undead Lord      40  116  80  36  69% (72%/66%)
 13  Spectral Rack    33  116  79  37  68% (66%/71%)
 14  Soldiers2        29  116  78  38  67% (72%/62%)
 15  Selesnya         23  116  75  41  65% (67%/62%)
 16  Burning          21  116  72  44  62% (62%/62%)
 17  Angelism         31  116  72  44  62% (66%/59%)
 18  Wrath             7  116  71  45  61% (55%/67%)
 19  Bloodhall Ooze   58  116  71  45  61% (69%/53%)
 20  Deep Blue        14  116  69  47  59% (62%/57%)
 21  Fairie Archmage  44  116  68  48  59% (60%/57%)
 22  Terror           12  116  67  49  58% (59%/57%)
 23  Zuberi's Flock   41  116  66  50  57% (55%/59%)
 24  Dragon           49  116  65  51  56% (60%/52%)
 25  Craven Giant     59  116  65  51  56% (62%/50%)
 26  Howlins          10  116  64  52  55% (62%/48%)
 27  Heartmender      43  116  64  52  55% (55%/55%)
 28  Tsabo            27  116  62  54  53% (59%/48%)
 29  Plateau           4  116  61  55  53% (59%/47%)
 30  Jungle           13  116  58  58  50% (53%/47%)
 31  Depletion        28  116  57  59  49% (50%/48%)
 32  Might Sliver     35  116  57  59  49% (48%/50%)
 33  Inquisitor        8  116  55  61  47% (48%/47%)
 34  Rats!            15  116  52  64  45% (41%/48%)
 35  Ashenmoor Cohort 46  116  52  64  45% (52%/38%)
 36  Vampires         53  116  52  64  45% (43%/47%)
 37  Giants!          22  116  51  65  44% (50%/38%)
 38  Bad Dreams       51  116  49  67  42% (53%/31%)
 39  Magus Coffers    57  116  49  67  42% (43%/41%)
 40  Snake Shamans    45  116  48  68  41% (38%/45%)
 41  Yavimaya          6  116  47  69  41% (36%/45%)
 42  Millage          50  116  45  71  39% (34%/43%)
 43  Sleeper Agent    54  116  45  71  39% (47%/31%)
 44  Alliance         11  116  44  72  38% (48%/28%)
 45  Kobold Overlord  34  116  44  72  38% (43%/33%)
 46  Wild Jhovall     55  116  42  74  36% (28%/45%)
 47  Savannah          3  116  41  75  35% (33%/38%)
 48  Noble Panther    47  116  40  76  34% (36%/33%)
 49  Taiga             2  116  39  77  34% (31%/36%)
 50  Viashino Warrior 42  116  38  78  33% (38%/28%)
 51  Lafiel           25  116  36  80  31% (34%/28%)
 52  Badlands          5  116  32  84  28% (28%/28%)
 53  Ball Lightning   39  116  32  84  28% (34%/21%)
 54  Shatter           9  116  30  86  26% (28%/24%)
 55  Nightmare         1  116  29  87  25% (19%/31%)
 56  Magnivore        52  116  26  90  22% (28%/17%)
 57  Panda Hive       16  116  25  91  22% (24%/19%)
 58  Terravore Turmoi 37  116  18  98  16% (19%/12%)
 59  Pyromancer       38  116  15 101  13% (21%/ 5%)
So I guess it's time to congratulate "Treefolk" for earning the title of first unofficial Wagic AI deck champion. :)

Some remarks and caveats:

- Obviously, the current numbers have a *huge* margin of error, since only two games were played between any two decks. There's always a chance that the stronger deck loses a game due to bad luck in the draw (mana screw, land flood, etc.). Because of this, I don't recommend reading any actual meaning into score differences of up to 10 percentage points.
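
(A rough back-of-the-envelope estimate supports this rule of thumb. Treating each of a deck's 116 games as an independent coin flip - a simplification, of course - the uncertainty on its measured win rate is roughly:)

Code: Select all

standard error ≈ sqrt( p * (1 - p) / n )
               = sqrt( 0.5 * 0.5 / 116 )
               ≈ 0.046          (about 4.6 percentage points)

95% interval   ≈ ± 2 * 4.6 ≈ ± 9 percentage points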

- Also, some decks are affected by bugs. For example, "Savannah" is probably not living up to its potential in this survey, because the special abilities of its basilisks and cockatrices don't work currently. Other decks benefit from bugs - for example, "Treefolk" profits from the fact that it's currently impossible to trample over indestructible creatures.

- Furthermore, AI-vs-AI games aren't always indicative of how well a deck would perform against a *human* player. Example: the AI doesn't understand the card "The Rack" at all. If AI-1 has Racks in play, AI-2 will nevertheless try to play most cards most of the time, which means that AI-2 will potentially suffer a lot of damage from the Racks. Therefore, AI-1 has a good chance to win this game.

Now pit AI-1 against a human player. This player will still suffer from the Racks, but he'll decide whether he *wants* to incur damage, or whether he keeps some cards in his hand to prevent that damage. The player has much better control over this situation than an AI would have. As a result, an AI deck that scores well against other AIs because of its frequent use of The Rack will probably score less well against human players.

- Finally, a deck that performs badly is not automatically a failure. Wagic actually *needs* low-performing AI decks as stepping stones for beginner players, and not providing any would be bad game design. Nevertheless, it's good to know which decks perform well (and which don't), so that we can improve those that are meant to be *good* performers but currently don't operate at that level.

Have fun, and discuss the results. :) I'll add my own observations shortly; some of the results really surprised me. :)


Edit: A total matchup table has been posted here.
Last edited by Psyringe on Fri Oct 23, 2009 9:33 pm, edited 1 time in total.
John Rohan
Posts: 37
Joined: Sat Jul 18, 2009 2:08 am

Re: AI Deck Comparison

Post by John Rohan »

I did notice one thing with enchantments like Control Magic or Pacifism. If I have two creatures on the battlefield, one very large and one small, very often the AI will cast these enchantments on the small creature. I guess it chooses randomly. But I think by default, it should always automatically choose the larger one - either by casting cost or by Power.
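
(For illustration, a heuristic along the lines described above could look roughly like this - a sketch only, with made-up structures and weights, not Wagic's actual targeting code:)

Code: Select all

#include <vector>

// Hypothetical creature summary used to rank targets for auras such as
// Control Magic or Pacifism.
struct Creature {
    int power;
    int toughness;
    int convertedManaCost;
};

// Score a potential target; higher is better. The weights are arbitrary.
int targetScore(const Creature &c) {
    return 2 * c.power + c.toughness + c.convertedManaCost;
}

// Pick the "biggest" enemy creature instead of a random one.
const Creature *pickBestTarget(const std::vector<Creature> &enemies) {
    const Creature *best = nullptr;
    for (const Creature &c : enemies)
        if (!best || targetScore(c) > targetScore(*best))
            best = &c;
    return best;  // nullptr if there is nothing to enchant
}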
Psyringe
Posts: 1163
Joined: Mon Aug 31, 2009 10:53 am

Re: AI Deck Comparison

Post by Psyringe »

John Rohan wrote:I did notice one thing with enchantments like Control Magic or Pacifism. If I have two creatures on the battlefield, one very large and one small, very often the AI will cast these enchantments on the small creature. I guess it chooses randomly. But I think by default, it should always automatically choose the larger one - either by casting cost or by Power.
Yep. :) I replied to that in the AI Improvement thread.
Raziel017
Posts: 132
Joined: Sat Aug 29, 2009 7:05 am

Re: AI Deck Comparison

Post by Raziel017 »

hahaha that's pretty cool... it's like a Wagic tournament for AI decks... the first ever! Maybe we should make it a weekly thing in the forum: whoever wants to can submit a deck, wololo can have them beat the heck out of each other, and whoever wins will be champ. The winning deck can then be posted in the forum along with the name of its creator. hahahaha... (for fame :lol: )
Daddy32
Posts: 177
Joined: Thu Aug 20, 2009 8:20 am
Location: Slovakia

Re: AI Deck Comparison

Post by Daddy32 »

The Elfball deck was my champion from the start :) Sorry to see it come in second. Guess I'll have to take a look at Treefolk.
Psyringe, the game-play modes you added are great, even for single-player, non-testing purposes. Will they get implemented in the next release? It's probably too late for that...
By the way, approximately how long did it take to play all those games?
Psyringe
Posts: 1163
Joined: Mon Aug 31, 2009 10:53 am

Re: AI Deck Comparison

Post by Psyringe »

Daddy32 wrote:Psyringe, game-play modes added by you are great, even for single-player, non-testing purposes. Will they get implemented into the next release? It's probably too late for that...
It's too late to add them to the next release (especially considering that my code has grown pretty ugly and I'd like to rewrite a good part of it before turning it into something official), but I think there's a good chance that new game modes will be available in the release after that. No promises though. :)
By the way, approximately how long did it take to play all those games?
My machine needs a bit more than one minute per game on average, so it took about 60 hours to play all of them. I'm currently looking into ways of speeding that up. The process ran at 10% CPU load for most of the time, so there seems to be a brake somewhere in the code (which makes sense - usually the player wants to be able to watch the AI's move, and without a brake somewhere, the game might run too fast for that on powerful machines). I'd then like to implement a "turbo" mode that switches this brake off. But I haven't found the relevant piece of code yet. If someone knows where to look, please tell. :)
Psyringe
Posts: 1163
Joined: Mon Aug 31, 2009 10:53 am

Re: AI Deck Comparison

Post by Psyringe »

Okay, here's my analysis of the results ("random observations" would probably be the better term ... ;) )

1. The top five decks were created by 4 different players. This is good; it means that our community has several AI deck developers capable of creating high-performing decks, so we're not depending on a single one.

2. The top three decks are similar in their design philosophy: each card has a high chance of strengthening the other cards already played, and it's practically impossible to play a "wrong" card with these decks. No matter which card the AI plays, its situation has usually improved afterwards. There are hardly any cards in these decks that only work under certain conditions, or that only work well if played in a certain sequence. This confirms that the current AI is not "smart" enough to play difficult decks - basically, each choice that we give the AI is an opportunity for it to screw up. In the future we'll hopefully be able to improve that aspect of the AI, but for now, let's enjoy the fact that several people have nevertheless managed to build some very strong AI decks. :)

3. All top decks are monocolor. The best two-color deck (Selesnya) came in at 15th place, with 65% of matches won. However, very few two-color decks have actually been created, so it's hard to arrive at a definite conclusion. It's only logical that multicolor decks perform worse, though, because the AI cannot use multicolor lands in a goal-oriented fashion. This severely limits the building of strong multicolor decks.

One detail of note is the "Slivers" deck, which is a 4-color deck (2 main colors, 2 splashed). The deck still manages to win about half of its games, which sounds good - however, the Sliver deck uses exactly the same mechanics as the top three decks, which won 75% or more of their matches. I don't see anything that could account for this performance difference apart from the fact that the top decks are monocolor while the Sliver deck uses 4 colors.

4. The top decks are very creature-heavy; however, a spell-based deck with only 12 creatures (Jihad) snatched 4th place.

5. Green and white seem to have the strongest decks. Places 1 and 2 are held by pure green decks, followed by 3 pure white decks, an artifact deck, and another white deck. The best black decks hold places 8, 12, and 13, and all won about 69% of their matches. The best blue deck is "Fairies" (9th place, 69% victories), but it takes until places 20 and 21 for the next blue decks to show up (both of which won 59% of their matches). Red is the color with the worst performance; the best red deck is "Burning" (16th place, 62% of victories). This might be due to the fact that red depends more on targeting than any other color, and targeting is one of the great weaknesses of the current AI. Blue is the most context-sensitive color to play, which also poses problems for the AI.

It's difficult to say how successful artifacts are. Very few artifact-heavy decks have been built, and of those that exist, one performed very well (6th place), one ranked in the middle, and one performed badly. The results are inconclusive. My impression is that the "good" artifact deck benefits from a very specific combo built upon artifact lands, that the "bad" artifact deck tries to do something which the AI currently just can't, and that "normal" artifact decks can be expected to have an average performance. Apart from "Master of Ether", very few artifacts are used by the top decks. This is logical, since artifacts are usually a bit more costly than their colored alternatives, and for monocolored decks it's often more efficient to choose a colored card with a similar effect instead.

6. There was nearly no performance difference regarding whether an AI played as the first or second player. Or, to be precise: for some decks there was a performance difference (though it's hard to tell at this stage whether such a deck really is stronger when playing first or second), but taking all decks together, these differences cancel each other out. In total, 51.75% of all games were won by the first player (and 48.25% by the second player, obviously); that's too small a difference to indicate any important non-random factor.


Also, some more observations on specific decks:

7. "Millage" won 39% of its games, which is pretty impressive considering that the AI doesn't understand the cards it's playing in this deck at all. For the current AI, milling decks will probably never be top performers, but they can be built strong enough to offer an interesting challenge to the player, i.e. a deck that plays radically different than most other decks currently in the game.

8. I'm a bit disappointed by the performance of the "Heartmender" and "Fairy Archmage" decks. I thought that the "persist" mechanic would make these decks stronger. Any suggestions for improving them? A human player would probably add some tutoring cards to get Heartmender (the central card of both decks) out more reliably.

9. "Craven Giant" is a deck that has only creatures with the "can't block" ability. I expected this deck to perform badly, but it won 56& of its games. This shows two things: (a) The *attacking* AI suffers a lot from not being able to see when its attackers would be unblockable and could attack without any risk, and (b) the assignment of *blockers* is often so bad that not blocking at all is a viable tactic. Both show that "assignment of attackers/blockers" is an area where the AI could improve a lot.

10. The deck "Terravore Turmoil" currently doesn't work at all - it relies on fetchlands, and the AI has somehow stopped using them since Wagic 0.8.1 (where it could and did use them). The deck has been taken out of the next release, but I'll re-add it later, hoping that we'll be able to fix it for the release after that. Similarly, the "Pyromancer" deck showed the worst performance of all decks because of AI targeting problems. It has been temporarily removed as well, until the AI's targeting improves.

Final observation: It has been fun conducting this survey, and I'll probably do it again - the results should also be more reliable with more data collected. Let me know if you have ideas or suggestions. :)
Last edited by Psyringe on Fri Oct 23, 2009 2:56 pm, edited 2 times in total.
wololo
Site Admin
Posts: 3728
Joined: Wed Oct 15, 2008 12:42 am
Location: Japan

Re: AI Deck Comparison

Post by wololo »

Psyringe wrote: I'd then like to implement a "turbo" mode that switches this brake off. But I haven't found the relevant piece of code yet. If someone knows where to look, please tell. :)
There is a timer in the AI that prevents it from playing too quickly.
Two things need to be done with this timer:
1) Switch it to a number of "update" calls rather than a time difference
2) Change the limit of this timer to 3.

Look for "timer" in TestingsuiteAI, it has that kind of mechanism that makes it run slowly for the first test and fast for the next ones

Note that this change shouldn't be done when playing against a human player, or even in the standard demo mode. The goal in those modes is to actually have a brake so that we humans can see what's going on.
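
(For readers following along, the change wololo describes could look roughly like this - a sketch with made-up names, not the actual Wagic source:)

Code: Select all

// Hypothetical AI brake: instead of waiting for a real-time delay between
// moves, count update calls and only allow an action every few frames.
class AIBrake {
public:
    // Returns true when the AI is allowed to act on this update call.
    bool allowAction() {
        if (++updatesSinceLastAction < updatesRequired) return false;
        updatesSinceLastAction = 0;
        return true;
    }
    // "Turbo" simply drops the frame limit to zero.
    void setTurbo(bool on) { updatesRequired = on ? 0 : 3; }

private:
    int updatesSinceLastAction = 0;
    int updatesRequired = 3;   // the limit wololo suggests
};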
Psyringe
Posts: 1163
Joined: Mon Aug 31, 2009 10:53 am

Re: AI Deck Comparison

Post by Psyringe »

wololo wrote:Look for "timer" in TestingsuiteAI, it has that kind of mechanism that makes it run slowly for the first test and fast for the next ones
Thanks, I'll have a look at that. :)
wololo wrote:Note that this change shouldn't be done when playing against a human player, or even in the standard demo mode. The goal in those modes is to actually have a brake so that we humans can see what's going on.
I think the best way is to add a "speed change" item to the game menu. This menu can be called in human-vs-AI, AI-vs-AI, and (probably) even in TestSuite mode. If it has items for "Set speed to turbo", "Set speed to normal", and "Set speed to slow", it will give the spectator total control over the speed the game runs at in any of these situations, and he can switch speeds whenever he wants to. (There are situations where I could use a *slower* speed when testing AI decks.)
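
(A sketch of what I have in mind - the enum and the numbers are placeholders, not real menu code:)

Code: Select all

// Hypothetical game-speed setting driven by an in-game menu entry.
enum class GameSpeed { Slow, Normal, Turbo };

// Number of update calls the AI waits between actions for each speed.
int brakeFramesFor(GameSpeed speed) {
    switch (speed) {
        case GameSpeed::Slow:   return 30;  // easy to follow when testing
        case GameSpeed::Normal: return 3;   // wololo's suggested limit
        case GameSpeed::Turbo:  return 0;   // as fast as the machine allows
    }
    return 3;
}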
Niegen
Posts: 200
Joined: Sun Jul 12, 2009 9:46 am

Re: AI Deck Comparison

Post by Niegen »

I also think another reason for their success is that those decks put out so many targets for the opponent's removal that the opposing AI doesn't know which creature it should REALLY try to remove (most of the time, it would be the lords). Plus, the fact that the Treefolk lord has high toughness means only white and black removal is likely to work.
Rereading my statement, I think that when looking for a target for its removal, the AI should value creatures with lord effects higher than the others, so that it aims at those creatures first. It won't always be the wisest choice; however, it will greatly improve its game.

Of course, aiming a single Lightning Bolt at a Timber Protector probably isn't the smartest move either.
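
(A sketch of the target scoring suggested above - hypothetical structures and weights, not existing Wagic code. It also covers the Lightning Bolt caveat by checking whether a burn spell can actually kill its target:)

Code: Select all

// Hypothetical removal-target scoring: creatures that pump the rest of the
// board ("lords", e.g. Timber Protector) get a big bonus so that removal is
// aimed at them first.
struct Threat {
    int power;
    int toughness;
    bool isLord;   // grants a bonus to the controller's other creatures
};

int removalPriority(const Threat &t) {
    int score = t.power + t.toughness;
    if (t.isLord) score += 10;   // arbitrary weight, tune as needed
    return score;
}

// The caveat above: don't waste a burn spell on something it can't kill
// (e.g. a single Lightning Bolt against a high-toughness Timber Protector).
bool worthBurning(const Threat &t, int damage) {
    return damage >= t.toughness;
}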