Technical Machine: A Pokemon bot

aurora · Jan 14, 2012

That's actually the best project I've ever seen.

obi · Jan 19, 2012

I updated the supported compilers list to correct a few minor errors.

Technical Machine now fully supports the next stage of learning / fully automated play: stealing teams that beat it. To minimize the risk of file name collisions, TM saves each team under a randomly generated 8-character file name drawn from A-Z and 0-9. It also checks whether the file exists before it attempts to save, so a collision could only happen if TM were made multi-threaded and two threads generated the same file name at the same time, and even then, the damage would be low. TM still needs to run some "improve this team" algorithms on the stolen teams.
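A minimal Python sketch of the naming scheme described above, not TM's actual code. The function name, the `.txt` extension, and the directory argument are assumptions made for illustration.

```python
import os
import random
import string

ALPHABET = string.ascii_uppercase + string.digits  # A-Z and 0-9


def random_team_file_name(directory, length=8, extension=".txt", rng=random):
    """Return a path in `directory` whose random name is not yet in use.

    The extension and the function name are hypothetical; the post only
    says the name is 8 characters drawn from A-Z and 0-9.
    """
    while True:
        name = "".join(rng.choice(ALPHABET) for _ in range(length))
        path = os.path.join(directory, name + extension)
        # Check for an existing file before saving, as the post describes.
        if not os.path.exists(path):
            return path


print(random_team_file_name("."))
```

The check and the later save are two separate steps, which is the multi-threaded window mentioned above. In Python, opening the file with mode "x" (exclusive creation) would be one way to close that window.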

Technical Machine automatically registers PO user names if the user name does not exist.

I improved the installation process to be much more straightforward, as well. It now creates all necessary files / folders and checks settings to make sure they are valid. The only thing that it does not automatically fill in defaults for is the user name and password. When Technical Machine runs, it checks to see if there is a user name filled in. If not, it prompts the user to go to settings/settings.txt and fill in the user name / password field. I think I'll also add a password randomization feature, to give TM a strong random password on installation.

In other words, typing scons and then running TM is all the preparation you need to do. If there are any necessary fields that you have not filled out, TM either pre-fills in a sensible default or else tells the user exactly what to do next. This could be improved to allow the user to type in their desired user name at the prompt instead of telling them to go modify some file somewhere.

There is some PO bug related to TM getting confused about the active Pokemon. I am currently working on fixing this, but it is difficult to trigger so I am reviewing large portions of the code to try and track it down. While doing so, I noticed a place where TM could get a major speed up (the primary algorithm could theoretically gain a 20-30% speed increase at the minimum, as well as having lower memory requirements). This change will also make some secondary algorithms more efficient and reduce the probability of other bugs.

That difficult-to-trigger bug seems to occur in about half of my battles in which I use phazing moves. I think this is the last major bug that TM cannot recover from that regularly occurs in my current testing.

WinstonShnozwick · Jan 20, 2012

I think that the person activating TM should be able to and be required to enter all info for the bot. Name and such I mean. When it's randomly generated, you might lose track of it, and you don't want that. Also, I'm a bit worried about it registering the username, I think it's not a great idea, because imagine TM being used constantly. It might generate hundreds of registered accounts and that could have negative effects like blocking off real people from registering usernames or putting weight in the server cache or something.

obi · Jan 20, 2012

The password is what would be randomly generated. The user wouldn't need to keep track of this; TM would do that (and it would be saved in a plain text file). There is no more risk of TM registering user names than of some random person doing it manually.

obi · Feb 9, 2012

I've recently added some code that should fix virtually every crash I've encountered so far. Well, not quite 'fix', but stop the program from crashing, report the error, and keep going.

I noticed that most crashes occur in the core of the program: deciding which move to make. This is because TM evaluates so many positions that it exercises most of its code each battle. The deeper the search, the more likely a crash is, because it's evaluating more and more obscure positions. However, to allow a particular optimization to work, TM already evaluated to a depth of 1 first, then 2, then 3, etc., until it hit the maximum depth it was told to search. My crashes could almost entirely be traced back to two lines of code (but with over a million ways to arrive there, I think). Therefore, I put in some code around those two lines that detects an error and aborts the current search. The program recognizes that its search was not completed successfully, and thus only uses the results from the previous depth of search.

To make this work in all cases (what happens if it encounters a bug at depth = 1?), I added a new mode of play for TM: depth = 0. At depth = 0, TM will make a random legal move. This should also prove useful later as part of a stress test for TM, once I get the obvious bugs worked out.
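The fallback scheme described above can be sketched roughly as follows. This is an illustration in Python, not TM's code: `evaluate_at_depth` is a hypothetical callback standing in for the real search, and the error report is just a `print`.

```python
import random


def choose_move(legal_moves, evaluate_at_depth, max_depth, rng=random):
    """Iterative deepening that keeps the last successfully completed result.

    `evaluate_at_depth(depth)` is a hypothetical callback that returns the
    best move found by a search of that depth and may raise if it hits a bug.
    """
    # Depth 0: a random legal move, so there is always a result to fall back on.
    best = rng.choice(legal_moves)
    for depth in range(1, max_depth + 1):
        try:
            best = evaluate_at_depth(depth)
        except Exception as error:
            # Report the error, abandon this depth, and keep the answer
            # from the previous depth instead of crashing.
            print(f"search failed at depth {depth}: {error}")
            break
    return best


def flaky_search(depth):
    # Stand-in for a search that breaks once it gets deep enough.
    if depth >= 3:
        raise RuntimeError("obscure position")
    return f"best move at depth {depth}"


print(choose_move(["Earthquake", "Roar"], flaky_search, max_depth=4))
```

Here the depth-3 search fails, so the move found at depth 2 is the one that gets played.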

Due to these changes, Technical Machine is much more stable than it has ever been.

Recently, I was also fixing up some old code simply to organize it more logically ("refactoring"). The goal of this was not to make TM run any faster, but while doing so, I noticed a few cases where TM was needlessly wasting memory. Most of TM's time is spent writing to memory (it has to make a lot of copies of Pokemon in its search), which means that for me, memory efficiency is speed efficiency -- there is very little of a trade off in this department. Those minor changes proved to make TM work much faster.

How much faster? Previously, a depth=3 search took about 20-30 seconds in most cases, and some of the time would take around a minute per turn. Now it takes approximately 5 seconds or less in nearly all cases (and the rest of the time should be well under 10 seconds). A depth=4 search took so long I was never able to get data to measure it (it couldn't even finish evaluating a single response to a single move before I gave up or it crashed). Now depth=4 takes about 15-20 minutes to complete.

This is obviously too long for most 'real' play. However, it opens up a few options. First of all, I can play TM as a sort of background process. I can be reading something, and PO alerts me when TM has made a move. This will allow me to see how TM plays at its highest level so far (I have not yet done more than about 7 turns vs. TM on depth=4, so I can't tell how much stronger it is). I could also run a TM vs. TM battle overnight, which should make for an interesting replay. A 40 turn battle would take about 10 hours for a single TM; two TMs would take longer, because I only have one fast computer and they would have to share memory. However, the final turns of a match go much faster than the first, because there are fewer options, so 10 hours is probably still an accurate estimate.

Most importantly, I know I can do better. While I was refactoring, I had an idea for how to dramatically improve TM's speed. The speed ups I made recently are minor in comparison. This change may be able to give me something along the lines of a 1 minute turn. From there, a few incremental changes should bring depth=4 down into the reasonable realm of 20-30 seconds per turn, which I think is tolerable in live play.

Unfortunately, TM appears to be unable to play in battles in which it is the challenger. I'll have to look into why this is the case.

obi · Feb 10, 2012

Thanks to these recent improvements, I was able to have a very good battle with TM in gen 4. You can read the warstory here.

PK Gaming · Feb 10, 2012

I read the whole thing.

Jaw dropping, I'm patiently waiting for the day when your AI goes gold. It could revolutionize Smogon; it could be the perfect tool for users who want to practice on the go w/o internet and it would be a huge help in tutoring.

You rock david stone.

Ithilanor · Feb 12, 2012

Absolutely amazing. This is going to revolutionize the Pokemon world, and I wouldn't be surprised if this ends up getting written about in AI textbooks...it's a stunning achievement.

WinstonShnozwick · Feb 13, 2012

Finally, something I can yell obscenities at while battling without the risk of offending anyone.

Cooky · Feb 13, 2012

this is going to be so useful. what an incredible contribution

Unbreakable · Feb 15, 2012

WinstonShnozwick said:
Finally, something I can yell obscenities at while battling without the risk of offending anyone.

OH MY GOD THIS

The Truth · Feb 15, 2012

The Truth said:
Wow, this is some really amazing stuff. Do you have to change its values based on the team type that it's using, and if so, do you foresee TM being able to analyze a team and set its own priorities in the future?

Lol i posted this on the warstory instead of this thread. The question still stands though :3

Nova · Feb 15, 2012

holy shit, was just jokingly talking about this idea with a friend

obi · Feb 15, 2012

It uses a single evaluation function regardless of the team it's using, which is how I feel it should be.

The evaluation function isn't even really that good. I just kind of invented a few values, and the only change I've made so far was to reduce the value of Stealth Rock from 200 to 150 (or something along those lines). One of my future improvements will be an evolutionary algorithm to have TM decide what those evaluation constants should be. TM would run with a variety of teams and fine-tune its own constants. These constants should then allow TM to effectively play with an even wider variety of teams in the future.

My current plan is to have TM run in two modes: learning and tournament.

In learning mode, it would always slightly modify its old values for these constants. It would play itself in learning mode always, and mix-in self-play with ladder play. Tournament mode would turn off this modification for the current battle, and have it just use the values that it has currently found to be best.
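As a rough illustration of the two planned modes, here is a Python sketch. Everything in it is an assumption: the function name, the 5% step size, and the constant name are invented, and the 150 for Stealth Rock is only the approximate figure mentioned above. It covers just the "slightly modify the old values" step, not how an evolutionary algorithm would decide which values to keep.

```python
import random


def constants_for_battle(best_constants, mode, rng=random, step=0.05):
    """Return the evaluation constants to use for a single battle.

    "tournament" uses the best known values unchanged; "learning" nudges
    each value by up to +/- `step` (a made-up 5% here).
    """
    if mode == "tournament":
        return dict(best_constants)
    if mode != "learning":
        raise ValueError(f"unknown mode: {mode}")
    return {
        name: value * (1 + rng.uniform(-step, step))
        for name, value in best_constants.items()
    }


best = {"stealth_rock": 150.0}  # approximate value quoted in the post
print(constants_for_battle(best, "tournament"))
print(constants_for_battle(best, "learning"))
```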

One thing that is related to your question that TM does not do yet is team improvement. I intend for TM to analyze its own teams and modify the team itself to be better.

WinstonShnozwick · Feb 15, 2012

obi said:
It uses a single evaluation function regardless of the team it's using, which is how I feel it should be.

The evaluation function isn't even really that good. I just kind of invented a few values, and the only change I've made so far was to reduce the value of Stealth Rock from 200 to 150 (or something along those lines). One of my future improvements will be an evolutionary algorithm to have TM decide what those evaluation constants should be. TM would run with a variety of teams and fine-tune its own constants. These constants should then allow TM to effectively play with an even wider variety of teams in the future.

My current plan is to have TM run in two modes: learning and tournament.

In learning mode, it would always slightly modify its old values for these constants. It would play itself in learning mode always, and mix-in self-play with ladder play. Tournament mode would turn off this modification for the current battle, and have it just use the values that it has currently found to be best.

In your last post you talked about the different levels of analyzing moves to pick in battle, with the 4th or something being so in depth that it took 20 minutes. Are these different levels of thought available to change in the TM for battle? What are they for exactly?

obi said:
One thing that is related to your question that TM does not do yet is team improvement. I intend for TM to analyze its own teams and modify the team itself to be better.

Step 1: Give TM your desired team.
Step 2: TM analyzes and modifies your team.
Step 3: ???
Step 4: Profit!

This will be very useful for self tutoring.

DHR-107 · Feb 15, 2012

david stone said:
One thing that is related to your question that TM does not do yet is team improvement. I intend for TM to analyze its own teams and modify the team itself to be better.

Would this mean that it could in theory create a "perfect" team? Capable of winning say 75% or more battles against competent players? (Obviously, it would be much weaker against inexperienced players as random choices come into play/over prediction).

Obviously the network would need hundreds and hundreds of battles' worth of learning data to even comprehend so many strategies.

Asbestospoison · Feb 15, 2012

On one hand, this looks amazing. On the other hand, when the robot uprising finally approaches, this bot will be their strategy expert, and we'll all be screwed.

Joking aside, this is going to be amazing, I wouldn't be surprised if it becomes at least somewhat famous. Especially once it can build amazing teams, then it will be more impressive than your standard chess-computer. I need to try this **** out as soon as possible.

obi · Feb 16, 2012

WinstonShnozwick said:
In your last post you talked about the different levels of analyzing moves to pick in battle, with the 4th or something being so in depth that it took 20 minutes. Are these different levels of thought availible to change in the TM for battle? What are they for exactly?

The level of depth tells Technical Machine how many turns to look ahead. So a depth=3 search means that it considers every possible outcome from any combination of moves for the next 3 turns, and then it looks at the state at the end of those turns and gives that state a score. It properly accounts for the probability of reaching that point, and picks the move that gives it the best score, on average.
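The depth-limited search described above can be sketched like this in Python. It is a simplified illustration, not TM's algorithm: the `game` interface is hypothetical, and the sketch assumes the foe picks the reply that is worst for us on average (as if it knew our move), which is only one way to model simultaneous moves.

```python
def search(state, depth, game):
    """Return (score, best_move) after looking `depth` turns ahead.

    `game` is a hypothetical object providing:
      my_moves(state), foe_moves(state) -> lists of moves
      outcomes(state, mine, foe)        -> list of (probability, next_state)
      evaluate(state)                   -> score from our point of view
    """
    mine_options = game.my_moves(state)
    foe_options = game.foe_moves(state)
    if depth == 0 or not mine_options or not foe_options:
        return game.evaluate(state), None
    best_score, best_move = float("-inf"), None
    for mine in mine_options:
        worst = float("inf")
        for foe in foe_options:
            # Probability-weighted average over every random outcome.
            expected = sum(
                p * search(nxt, depth - 1, game)[0]
                for p, nxt in game.outcomes(state, mine, foe)
            )
            worst = min(worst, expected)  # assume the foe's best reply
        if worst > best_score:
            best_score, best_move = worst, mine
    return best_score, best_move


class ToyGame:
    """Invented toy game: the state is just a number we want to be large."""

    def my_moves(self, state):
        return ["safe", "risky"]

    def foe_moves(self, state):
        return ["wait", "hit"]

    def outcomes(self, state, mine, foe):
        penalty = 1 if foe == "hit" else 0
        if mine == "safe":
            return [(1.0, state + 1 - penalty)]
        return [(0.5, state + 4 - penalty), (0.5, state - 4 - penalty)]

    def evaluate(self, state):
        return state


print(search(0, 3, ToyGame()))
```

The number of positions grows with every extra turn of depth, since each turn multiplies our moves by the foe's moves by the random outcomes.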

DHR-107 said:
Would this mean that it could in theory create a "perfect" team? Capable of winning say 75% or more battles against competent players?

I don't think so, because I don't think that Pokemon has a perfect team.

However, Technical Machine even now is not limited to one team. It already supports multiple teams and randomly selecting from all available teams. This eliminates the threat of counter-teaming, given a suitably large selection of teams in Technical Machine's arsenal.

DHR-107 said:
(Obviously, it would be much weaker against inexperienced players as random choices come into play/over prediction).

I disagree. Technical Machine will generally do better against a weak player. It should almost always beat a random player, for instance. Technical Machine does assume that the foe makes the best move possible. However, it moves such that the opponent should still use that move. Any deviation from the strategy that Technical Machine 'assumes' will lead to Technical Machine being in a better position, not a worse one (as long as Technical Machine can accurately evaluate the position).

DHR-107 said:
Obviously the network would need hundreds and hundreds of battles' worth of learning data to even comprehend so many strategies.

Much more than hundreds. Fortunately, Technical Machine is fast enough to quickly play itself and, eventually, learn from those plays, so it could easily complete thousands of self-battles every day.

At a depth of 2, Technical Machine takes much less than half of a second to move. Obviously, running two Technical Machines at the same time will increase that time. I have a multi-core computer, and Technical Machine is currently a single-threaded application, so they can do a lot of work without slowing each other down, but a lot of the time is spent waiting on memory. In reality there would only be a minor overhead for running another TM, but let's just assume that this slows me down to need a full second for two TMs to evaluate to a depth of 2. If the average battle is about 50 turns, then TM will take 50 seconds to complete a match. That works out to 1728 self-battles every day, per instance of TM, and each instance would be capable of evaluation, so in reality, I would be able to have over 3000 battles per day with TM set to a reasonable strength of depth=2. As you can see, I could very quickly get a large data set of self-play. Technical Machine could then mix-in battles online to inject some variety.
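The 1728 figure follows directly from the assumptions in the post (one second per turn for the pair, 50 turns per battle). A small Python check, with a function name chosen just for this example:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def self_battles_per_day(seconds_per_turn, turns_per_battle):
    """Whole battles that fit in one day at the given pace."""
    return SECONDS_PER_DAY // (seconds_per_turn * turns_per_battle)


print(self_battles_per_day(1, 50))  # 1728
```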

lethminite · Feb 17, 2012

would you be mixing up different depths as well, and give real matches more weight than self matches or something?

otherwise it might get a problem where some strategy it uses works well against itself, and so enormous amounts of data lead to it being a good move, but against a human it's actually a dumb move, or the reverse where it predicts itself, but it's some move that is often overlooked by humans.

something along the lines of what you said with it spamming thunderbolt on ground types to keep skarm out, but not so obvious?

obi · Oct 9, 2012

It seems like it's time to update this.

Technical Machine has made quite a bit of progress since my last post, but it still has far to go.

The big news is partial support for generation 5. I've done most of the grunt work for this. I just have to add support for side effects of moves, as well as new items / abilities. However, it could probably go through a generation 5 battle without crashing. I'm mostly waiting on some stats that TM depends on, so I would expect TM to start playing generation 5 battles next month.

Based on how easy it was to do this (I started on it for real last week), I'm optimistic about adding previous and future generations just as quickly.

Technical Machine primarily supports battling on POv1. It does not yet support Pokemon Showdown or PO2. However, I have made some improvements that should allow this to be fairly easy to fix. I just have to implement their protocols, but the core work is already done.

I also have some exciting news for people who want to use some of my code directly instead of just reading about it.

I am currently building a team builder that uses several of TM's components, plus some new stuff. I'll just copy my post from the stats thread, because it outlines my plan pretty well:

I'm writing a full-fledged team builder that will be a component of Technical Machine (but will also just be a stand-alone application). My plan is for it to generate a team based on a few criteria, but also assist in making changes. In other words, you could build your team with it, filling in as much detail as you like, and Technical Machine would then fill in the rest for you. I'm even thinking about stuff like "What effect would changing this EV spread to that EV spread have?" and it would calculate change in physical / special defensiveness.

I have some stats that I would be interested in and would begin using immediately:

Taking into account EVs, nature, and item, what is the exact Speed each Pokemon has? One thing that would be especially useful is an evidence-based approach to Speed. If there are no Pokemon that sit at 237 or 238, then there is no point in me EVing a Pokemon to 238. I want to be able to do something like "For X more EVs, I can outspeed Y% more Pokemon.". I would probably also make some sort of graph showing Speed distribution and post that, so people can see it visually. Perhaps my program could include the graph and display your location on the curve, along with a percentile.

Move set stats that parallel team mate stats. If I have statistics like "On Pikachu, 29% of Pokemon with Thunderbolt also have Surf. 23% have Hidden Power. 13% have Quick Attack.". From that, I could construct a complete move set given any partial set of moves, just as I do for Pokemon prediction. Of course, more "complete" stats would be "Given that I have seen a Pikachu with Thunderbolt on a team with Geodude with Tackle, 28% of Pikachu also have Surf.", but I imagine that those stats would be much, much larger.
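The Speed question above ("For X more EVs, I can outspeed Y% more Pokemon") could be answered from a usage-weighted Speed distribution roughly like this. A Python sketch only: the function names are hypothetical and the numbers in `speed_usage` are invented, since the real stats were not yet available.

```python
def fraction_outsped(my_speed, speed_usage):
    """Usage-weighted share of Pokemon strictly slower than `my_speed`.

    `speed_usage` maps a final Speed stat to a usage weight.
    """
    total = sum(speed_usage.values())
    slower = sum(w for speed, w in speed_usage.items() if speed < my_speed)
    return slower / total


def gain_from_extra_speed(current, target, speed_usage):
    """Extra share of Pokemon outsped by going from `current` to `target`."""
    return fraction_outsped(target, speed_usage) - fraction_outsped(current, speed_usage)


# Invented distribution with nothing sitting at 237 or 238.
speed_usage = {230: 10, 236: 30, 239: 20, 250: 40}

print(gain_from_extra_speed(237, 238, speed_usage))  # nothing gained
print(gain_from_extra_speed(237, 240, speed_usage))  # passes the group at 239
```

The same `fraction_outsped` value is the percentile that could be marked on a Speed distribution graph.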

Technical Machine also now has the ability to generate a team at random. The random selections, however, are weighted by usage stats and previous picks for the team. As an example of a maximally random team that TM recently generated, it made a team with Dusknoir, Blissey, Skarmory, Celebi, Tentacruel, Swampert.
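A simplified Python sketch of usage-weighted random team generation. It is not TM's code: the usage numbers are invented, the function name is hypothetical, and "previous picks" only prevent duplicates here, whereas the post says they also affect the weights.

```python
import random


def random_team(usage, size=6, chosen=(), rng=random):
    """Fill a team up to `size`, picking each new Pokemon by usage weight.

    `usage` maps a Pokemon name to a usage weight; `chosen` holds Pokemon
    the user has already fixed.
    """
    team = list(chosen)
    while len(team) < size:
        candidates = [name for name in usage if name not in team]
        if not candidates:
            break  # ran out of distinct Pokemon to pick from
        weights = [usage[name] for name in candidates]
        team.append(rng.choices(candidates, weights=weights, k=1)[0])
    return team


# Invented usage weights for illustration only.
usage = {
    "Scizor": 20.0, "Tyranitar": 18.0, "Celebi": 9.0, "Swampert": 8.0,
    "Zapdos": 7.0, "Blissey": 6.0, "Skarmory": 5.0,
}
print(random_team(usage, chosen=["Scyther", "Infernape"]))
```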

The "Team Predictor" app is now really a team generator application. You can type in as many Pokemon as you want (between 0 and 6) as well as the maximum number of random Pokemon you want TM to generate. It will pick a random number between 0 and your number and generate that number of Pokemon randomly (weighted by usage). It will then predict the remainder of the team as usual. So if I want a team that has Scyther and Infernape, I can specify those Pokemon, and then set the "randomness" setting to 4 (it's on a scale of 0 to 6, but because I picked 2 Pokemon, 4 and 6 are the same thing). This is one possible team it can generate:

Scyther (100.0% HP) @ Life Orb ** Scyther
Ability: Technician
- Aerial Ace
- Swords Dance
- Quick Attack
- Bug Bite
Infernape (100.0% HP) @ Life Orb ** Infernape
Ability: Blaze
- Close Combat
- Grass Knot
- Fire Blast
- U-turn
Celebi (100.0% HP) @ Leftovers ** Celebi
Ability: Natural Cure
- Recover
- Grass Knot
- Thunder Wave
- Leaf Storm
Tyranitar (100.0% HP) @ Choice Scarf ** Tyranitar
Ability: Sand Stream
- Crunch
- Stone Edge
- Earthquake
- Pursuit
Swampert (100.0% HP) @ Leftovers ** Swampert
Ability: Torrent
- Earthquake
- Stealth Rock
- Ice Beam
- Roar
Zapdos (100.0% HP) @ Leftovers ** Zapdos
Ability: Pressure
- Thunderbolt
- Roost
- Heat Wave
- Hidden Power

Here is another team

Scyther (100.0% HP) @ Life Orb ** Scyther
Ability: Technician
- Aerial Ace
- Swords Dance
- Quick Attack
- Bug Bite
Infernape (100.0% HP) @ Life Orb ** Infernape
Ability: Blaze
- Close Combat
- Grass Knot
- Fire Blast
- U-turn
Metagross (100.0% HP) @ Lum Berry ** Metagross
Ability: Clear Body
- Earthquake
- Meteor Mash
- Explosion
- Bullet Punch
Feraligatr (100.0% HP) @ Leftovers ** Feraligatr
Ability: Torrent
- Waterfall
- Dragon Dance
- Ice Punch
- Earthquake
Gengar (100.0% HP) @ Life Orb ** Gengar
Ability: Levitate
- Shadow Ball
- Focus Blast
- Thunderbolt
- Substitute
Electivire (100.0% HP) @ Expert Belt ** Electivire
Ability: Motor Drive
- ThunderPunch
- Ice Punch
- Earthquake
- Cross Chop

And another

Scyther (100.0% HP) @ Life Orb ** Scyther
Ability: Technician
- Aerial Ace
- Swords Dance
- Quick Attack
- Bug Bite
Infernape (100.0% HP) @ Life Orb ** Infernape
Ability: Blaze
- Close Combat
- Grass Knot
- Fire Blast
- U-turn
Scizor (100.0% HP) @ Choice Band ** Scizor
Ability: Technician
- Bullet Punch
- Superpower
- U-turn
- Pursuit
Jirachi (100.0% HP) @ Leftovers ** Jirachi
Ability: Serene Grace
- Iron Head
- Fire Punch
- Thunderbolt
- Ice Punch
Breloom (100.0% HP) @ Toxic Orb ** Breloom
Ability: Poison Heal
- Spore
- Focus Punch
- Substitute
- Seed Bomb
Metagross (100.0% HP) @ Lum Berry ** Metagross
Ability: Clear Body
- Earthquake
- Meteor Mash
- Explosion
- Bullet Punch

(I hope nobody uses a Fire move!)

I'm working on adding an EV optimizer to TM as well.

SpecsX · Oct 9, 2012

Forgive me for stating the obvious, but this is amazing stuff!

EDIT: None of those teams have spinners.

PK Gaming · Oct 9, 2012

This isn't really a constructive comment, but... I have to thank you. I was suffering from BOREDUM, saw your thread, and the sheer amount of concentrated awesome protruding from it obliterated my boredom. Great work man, and thanks for helping me out incidentally!

Frozen UK · Oct 10, 2012

Can't wait for this! Upon completion this will be one of the best things to ever happen with Pokemon! I can't even start to think how good this will be and how much it will improve players globally. I have no idea what it takes to do this since I am useless with computers, but from what I've understood it takes a lot. So good luck and keep it up, amazing accomplishment so far.

Maestroke · Oct 11, 2012

You rock, good job man! And good luck in perfecting it! This'll be an awesome tool :). I read the warstory too :o

jagged_angel · Oct 23, 2012

Ahhh david stone this is so exciting - sounds like TM is going to be a strong candidate for my next long-term relationship. Haha, jokes aside, this is already an incredible achievement and I look forward to playing against TM very soon.
