Discussion Alternative Suspect Reqs

cityscapes

Take care of yourself.
is a Tiering Contributor, is a Community Contributor Alumnus
(im not very good at writing these, sorry if this is boring to read through)

Suspect tests are an integral part of Smogon, and represent the influence of competent players on tiering. Suspect reqs are meant to weed out the people with insufficient knowledge of the metagame to leave only an informed pool of voters. The system of making players start with a fresh alt is pretty effective at this job, but it comes with some flaws that I want to fix with this new proposal.

Getting reqs is, for lack of a better word, a slog. Even with a relatively fast team, it can still take two hours to get out of low ladder, and another hour or so of mid ladder is required before the reqs are finished, assuming you didn't get lucked too many times. Combine this with the value placed on consistency over experimentation by the GXE system and the near complete lack of actually good players, and you have a grind that no one wants to sit through.

My proposal is to allow people to use existing alts with a separate requirement, most likely a higher Glicko-1 rating with a limit on how high your RD can be. Here's a quick overview of these terms:

  • Glicko-1 rating functions similarly to the Elo system. It goes up when you win and down when you lose, by an amount that depends on your opponent's rating. If your Glicko-1 rating is much lower than your opponent's, your rating will increase more if you win and decrease less if you lose, and vice versa. Basically, the same intuitive Elo mechanics that we're all used to.
  • RD (rating deviation) is what differentiates Glicko-1 from other rating systems. The higher your RD is, the more your Glicko-1 rating goes up and down when you play games. RD naturally decreases a little bit for each game you play with a lower limit of 25, and it naturally increases a little bit per day when you're inactive. GXE is not calculated when your RD is above 100, instead showing a "-".
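For anyone who wants to see the mechanics behind these two bullets, here is a minimal Python sketch of a single-game Glicko-1 update, following Glickman's published formulas. The function names and the `rd_floor=25` default (taken from the lower limit mentioned above) are illustrative assumptions; this is not the ladder's actual code.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Dampening factor: the less certain the opponent's rating, the less a game counts."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected result (0..1) against an opponent with rating r_opp and deviation rd_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def update(r, rd, r_opp, rd_opp, score, rd_floor=25.0):
    """Return (new_rating, new_rd) after one game; score is 1 for a win, 0 for a loss."""
    e = expected_score(r, r_opp, rd_opp)
    d2_inv = Q**2 * g(rd_opp) ** 2 * e * (1 - e)  # information gained from this game
    denom = 1 / rd**2 + d2_inv
    new_r = r + (Q / denom) * g(rd_opp) * (score - e)
    new_rd = max(math.sqrt(1 / denom), rd_floor)  # assumed floor, per the bullet above
    return new_r, new_rd
```

Calling `update(1500, 100, 1500, 50, 1)` and `update(1500, 30, 1500, 50, 1)` side by side shows the point of the second bullet: the same win moves the high-RD account's rating much further than the low-RD account's.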

So an alternate requirement might look like this:
Existing alt with Glicko-1 rating above 1850 and RD below 40

What this does is reward players who are both skilled and already active on ladder without making them sit through three hours of low-mid ladder. The main argument against the former council reqs system was that "being on council" was an arbitrary status that didn't necessarily indicate skill in the current metagame. This system is different because it explicitly benefits people who already play regularly.

Formerly skilled but inactive players will have RD values too high to vote. Since you're rewarded more for winning and punished more for losing when your RD is high, an inactive player will be at risk of losing too much Glicko-1 rating if they play poorly while lowering their RD through laddering. Meanwhile, a formerly inactive player who performs well in the current metagame will be rewarded. Additionally, players who have been inactive for only a couple months won't need to play that many games, while players with several months of inactivity will need more games.

This is just speculation, but I believe this proposal might also increase ladder quality, because top players are encouraged to be consistent on ladder even outside of suspect tests. If you have the skill to consistently win games, the new system would allow you to play fewer games overall and also ensure that you only play against competent opponents, instead of the 2 hours of facing bad players that really says nothing about your skill. Additionally, there isn't much time constraint on it, so if you're busy IRL during a suspect test then you can just be active on ladder before it starts.

(Side note: GXE behaves similarly to my proposal, but in my opinion it doesn't punish inactivity as much, only decreasing around .1-.3% even for very high RD values from my experience)

Just to clarify, I'm not suggesting we remove the current system of getting reqs. Suspect tests are a great way to get into new metagames, and you shouldn't have to be previously active to participate. But when players have already shown themselves to be more than proficient in a metagame, we shouldn't needlessly waste three hours of their time. Because of this, I think that we should have this new system in addition to the old one, to benefit both types of players while still ensuring that only those who are skilled in the metagame can vote.

Thanks for reading.
 

ShootingStarmie

Bulletproof
is a Team Rater Alumnus, is a Forum Moderator Alumnus, is a Tiering Contributor Alumnus
Hey, was reading through this thread and I had a few things I didn't personally understand or agree with. Hope some clarification can be made on this.

Additionally, players who have been inactive for only a couple months won't need to play that many games
Is this something we really want as a competitive community? For as long as I can remember the metagame tends to shift a lot faster than "a couple of months", and with the new practices of Gamefreak releasing DLC every couple of months, players who have been inactive for this long probably have outdated knowledge. To me the point of a suspect test is to see how a metagame acts with or without a certain Pokemon, and I don't think previous knowledge of an old metagame really cuts it when determining smogon's tierlist.

But when players have already shown themselves to be more than proficient in a metagame, we shouldn't needlessly waste three hours of their time.
Is this really that big of an ask for players who are looking to shape our metagame? Maybe I'm in the minority, but I can afford to spend 3 hours laddering within a 2 week time period, especially if this is my hobby. Even if for some reason 3 hours is too much to ask of people, is this new alternative of acquiring suspect reqs really going to save that much time? I can't imagine this new system saving more than 1 hour at most. My problem with this line of thinking is that in theory you could do away with laddering altogether if the person in question has a good enough reputation. I have no doubt top tour players could obtain voting reqs, but assuming they don't take part in any discussion threads, or any laddering on the suspect ladder, do they really deserve to? As someone who has voted in the past, and has taken part in recent suspect tests and discussion, to me gaining reqs is an achievement, something to be proud of, and something to be earned (there's a reason we have a badge for it after all). This is somewhat of a strawman, I know you're not arguing for certain players to get reqs based on their reputation alone, but I hope the point I'm trying to make still comes across clearly.

tl;dr
Months of inactivity shouldn't be encouraged by our voting playerbase
3 hours of laddering is a small ask for people looking to alter our metagame
I don't believe the new system will save any significant time
 

cityscapes

Take care of yourself.
is a Tiering Contributor, is a Community Contributor Alumnus
Hello, thanks for your response. I'll try to address everything here.

First of all, what I called "variance" in the original post I will now be referring to as "RD" or "Rating Deviation", which is its proper name. I apologize for any confusion this may have caused. (edit: i have updated the op to reflect this)
Is this something we really want as a competitive community? For as long as I can remember the metagame tends to shift a lot faster than "a couple of months", and with the new practices of Gamefreak releasing DLC every couple of months, players who have been inactive for this long probably have outdated knowledge. To me the point of a suspect test is to see how a metagame acts with or without a certain Pokemon, and I don't think previous knowledge of an old metagame really cuts it when determining smogon's tierlist.
I was writing up a response for this involving estimating how long it would take to get the RD down after 2 months (guessed 20-30... I was way off lol) but then realized that it'd make my case a lot more credible if I just did the math.

Assuming RD reqs of 40, and assuming the player has 30 RD before going inactive (I would call this pretty active), this is how many games it would take a player to achieve reqs, based on how long they were previously inactive:

As you can see, players who have been inactive for two months are forced to play tons of games to catch up, while players who have been active in the past three weeks have almost all their time saved. This graph looks pretty good to me.

But I'm a BH kid, and I don't really know how quickly standard tiers develop or how much we should penalize inactivity. Because of this, I made two more graphs for stricter RD reqs that can be used for faster developing tiers.


Even if the most conservative 35 RD one was implemented, punishing even the slightest of inactivity, I would still prefer it to the current system because it saves time for constantly active and competent ladder players.
I made several assumptions in the creation of these graphs due to not being able to find all of the necessary data/information. If I'm wrong on any of it please correct me.
  • I assumed that RD decayed once a day, because that's how Elo works and I thought I saw something here (Line 117).
  • The RD of the players you face on the ladder affects how fast your own RD goes down. I took a random sampling of 36 players from random ladder games on the server (kept clicking refresh on watch a battle) and got a normal distribution for RDs with mean 57.55556 and stdev 36.8855. I thought this data was bad for higher ladder, which tends to have more serious players and decays inactive ones, so I estimated an RD of 45 ± 30 for these (with hard caps of 25 and 130).
  • I assumed that the rating of the ladder players is also on a normal curve of [the player's rating] ± 130. Don't really know how accurate this is, but finding out the correct values sounds like a ton of effort.
  • I assumed that the rating periods end every time the player finishes a game, because you can see your Glicko/GXE update in real-time. I wasn't too sure on this, so I also simulated 5 games per rating period, but the results were pretty much identical (within .002 RD after 50 games) so I don't think this one is as significant.
  • Since the ladder players' ratings and RDs were randomized on normal distributions, none of the estimates (like the graph above) are absolute, but unless one of the other assumptions is way off, it should be pretty close in practice.
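The assumptions listed above can be turned into a rough simulation. The sketch below is not the script used for the graphs; it is an illustration under stated assumptions. The per-day decay constant `c` is a made-up placeholder (the real ladder value is not given in this thread), the player's rating is held fixed so that only RD moves, one game counts as one rating period, and opponents are drawn from the distributions described in the bullets.

```python
import math
import random

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def inflate_rd(rd, days, c, rd_cap=130.0):
    """Glicko-1 inactivity step, one period per idle day. c is an assumed constant."""
    return min(math.sqrt(rd**2 + c**2 * days), rd_cap)


def games_to_reach(rd_target, days_idle, c, start_rd=30.0, rating=1850.0,
                   rng=None, max_games=1000):
    """Count games until RD drops below rd_target after days_idle days away."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so runs are repeatable
    rd = inflate_rd(start_rd, days_idle, c)
    games = 0
    while rd >= rd_target and games < max_games:
        opp_rd = clamp(rng.gauss(45, 30), 25, 130)  # opponent RD: 45 +/- 30, capped
        opp_r = rng.gauss(rating, 130)              # opponent rating: player +/- 130
        e = 1 / (1 + 10 ** (-g(opp_rd) * (rating - opp_r) / 400))
        d2_inv = Q**2 * g(opp_rd) ** 2 * e * (1 - e)
        rd = max(math.sqrt(1 / (1 / rd**2 + d2_inv)), 25.0)
        games += 1
    return games
```

For example, `games_to_reach(40, 60, c=10)` estimates the games needed after two idle months under that placeholder `c`. The absolute numbers depend heavily on `c`, so only the shape (more idle time means more games) should be read from it.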
Is this really that big of an ask for players who are looking to shape our metagame? Maybe I'm in the minority, but I can afford to spend 3 hours laddering within a 2 week time period, especially if this is my hobby. Even if for some reason 3 hours is too much to ask of people, is this new alternative of acquiring suspect reqs really going to save that much time? I can't imagine this new system saving more than 1 hour at most.
I can't speak for everyone here (only one other person in my community is super vocal about this) but for me, laddering for reqs can't really be compared to every other way of playing this game. It has neither the experimentation appeal of using fun teams on ladder nor the intensity of facing good players in prestigious tournaments; the optimal way of getting reqs is to load up a boring but effective team and play to your team's strengths rather than actually adapting to opposing players (unless you're facing the occasional good player). This is not good Pokemon, in my opinion; even if my proposal doesn't save that much time, it is designed to cut down on the amount of bad Pokemon that's played.

Also, I do think that it can save quite a bit of time. Looking at the OU ladder, several players wouldn't need to ladder at all to get reqs, and several more could get through with about 20 games. And this is 3 hours of low-mid ladder saved for everyone who uses this at all.
I have no doubt top tour players could obtain voting reqs, but assuming they don't take part in any discussion threads, or any laddering on the suspect ladder, do they really deserve to? As someone who has voted in the past, and has taken part in recent suspect tests and discussion, to me gaining reqs is an achievement, something to be proud of, and something to be earned (there's a reason we have a badge for it after all).
The way I see it, people who are at the top of the ladder and actively laddering, playing with and against suspect-worthy elements, have not only proven themselves worthy of reqs-- they've already specifically achieved them. It seems silly to me to refuse entry to the same players who have demonstrated more than enough capability in the exact environment being tested.

Hope this covered everything relevant.
 

Wigglytuff

mad @ redacted in redacted
is a Tiering Contributor, is a Dedicated Tournament Host Alumnus
The issue with allowing any alt to qualify under your requirements is that any number of people can share a singular ladder alt easily, and they frequently do (as you mentioned, low ladder is boring and a slog to get through, so people share accounts to skip the grind). I used to run into alts such as user(s) "BOUGHT A BIG K" on the OU ladder pretty frequently, which I think has at least four people on it? While this problem is also present on suspect alts, at least it's explicitly disallowed to share suspect alts. There's no rule against sharing regular accounts. Under your proposal, well connected people that aren't active in a metagame could simply ask a friend that is for access to one of their laddering alts for easy/instant access to a vote. Or for a more innocuous scenario, you're an OU player drafted for UUPL and you ask your manager for a high ladder account to prep/practice. You ladder for 20-30 games with teams given to you by your manager and you maintain a high ranking. 3 weeks later, there's a suspect test. Should you be allowed to qualify under your requirements? Is there a reasonable and feasible way to stop this from occurring?

Hikari's post in the council reqs thread has suggestions for alternative and less grindy ways of getting reqs, such as suspect tours or tournament qualifications. I believe those are more suitable methods that are less prone to abuse. Not to mention that it's generally easier for experienced players to get reqs now than when this thread was posted. I'm not sure when this shift happened, but nowadays you can get reqs in 30 games if you get 82 GXE as opposed to the flat 40 minimum we used to have.
 

Dorron

BLU LOBSTAH
is a Top Social Media Contributor, is a Community Contributor, is a Tiering Contributor, is a Top Contributor, is a Smogon Media Contributor, is a Site Content Manager Alumnus, is a Forum Moderator Alumnus, is a defending World Cup of Pokemon Champion
I think the current system for reqs is broken in tiers with a smaller playerbase such as OMs. Literally any decent player can take a sample team and rush the same 4 people who are playing the tier with unviable teams for like 20/25 wins, almost guaranteeing you get the reqs. You have learnt nothing about the tier after playing the same people, and you might not have even encountered the suspected Pokemon in your whole run. What's the point of suspects if the people voting don't know a thing about the threat? Indeed, in tiers with such a small list of qualified voters, a group of people with malicious purposes could easily change the suspect result. Let's take the Pikachu NFE suspect test I qualified for as an example. 23 people qualified and 17 of them voted Ban (5 No Ban), meaning that if a group of 8 people (not hard to find) decided to mess up the tier, they would have been able to do so by voting No Ban (the Ban threshold was 60%, and 17 out of 30 is about 56.7%), resulting in the tier not being enjoyable for at least another month before council decided to resuspect or quickban it, potentially reducing its small playerbase even more.
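The arithmetic in the Pikachu example can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes a Ban passes when its share of the votes cast is at least the threshold.

```python
def ban_share(ban_votes, no_ban_votes):
    """Fraction of cast votes that are Ban."""
    return ban_votes / (ban_votes + no_ban_votes)


def smallest_blocking_bloc(ban_votes, no_ban_votes, threshold=0.60):
    """Fewest extra No Ban votes that push the Ban share below the threshold."""
    extra = 0
    while ban_share(ban_votes, no_ban_votes + extra) >= threshold:
        extra += 1
    return extra


THRESHOLD = 0.60
actual = ban_share(17, 5)         # the vote as it happened: 17 Ban, 5 No Ban
with_bloc = ban_share(17, 5 + 8)  # the same vote with 8 extra No Ban voters
print(actual >= THRESHOLD, with_bloc >= THRESHOLD)
```

Under that assumption, `smallest_blocking_bloc(17, 5)` shows that 7 extra voters would already have been enough, so the 8 in the example is if anything a conservative figure.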

I fully support the idea of an alternative route to reqs, such as tournaments like the ones used for the RBY UU Dragonite suspect. It is true that these might produce a small voter list (I don't really know, as this was done for an old gen low tier) and the problem I described above might be present here too, but the difference is that in suspect tournaments you don't have unlimited tries and you play against people who at least spent their time signing up for the tour rather than random people on ladder, so you should face both the threat and better players more often. Moreover, these suspect tests would include not only people from the suspect tournaments but also people from official Smogon tournaments who reached advanced rounds or won the tournament, meaning there would be at least a group of people who really know the tier and will vote reasonably. This would also avoid three boring hours of laddering to get reqs (it usually gets very boring after the first 15 games) and would address both the problem of shared alts and the time spent laddering.

Suspect tournaments were run more frequently a few years ago. I would like to know why they were set aside in favor of the current system, so we could try looking for ways to make them feasible. I'm appealing to everybody who was on the forums when that happened to explain why this was changed.
 

AM

is a Community Leader Alumnus, is a Community Contributor Alumnus, is a Tiering Contributor Alumnus, is a Contributor Alumnus, is a Battle Simulator Moderator Alumnus, is a Past WCoP Champion
LCPL Champion
Thing about suspect tours is you have to find people that are willing to host them, and host them well, generally on a much shorter notice due to the 2 week timeframe of these suspects normally. I know one of the RBY UU live tours was all messed up due to a sign up discrepancy and somebody dropping out and it was kind of a mess making what would have been a short tour last forever. Reqs for that suspect specifically were based on those tours and a certain threshold at the time for its inclusion in UUFPL. I think it's okay for tier leaders to kind of decide if they want to include these tours for their suspects on their own instead of just throwing a blanket policy across all suspects. I know other more niche formats do it already due to lower playerbases anyways.

The time it takes to get reqs with the usual 30-50 game requirements isn't that bad unless the tier isn't very active on the ladder, which is a separate issue altogether. The slog, as described above, isn't really going to change with any way of getting reqs through ladder. I think the status quo is okay and tier leaders can just adjust where needed based on activity. These reqs aren't even asking you to be at the top of the ladder within the game limit; in most cases you just have to hit mid-ladder to show that the person on the other side of those reqs has a brain at the very least.
 

Coconut

W
is a Top Social Media Contributor, is a Community Leader, is a Community Contributor, is a Top Tiering Contributor, is a Contributor to Smogon, is a Tutor Alumnus, is a Senior Staff Member Alumnus, is a Battle Simulator Staff Alumnus
LC Leader
I would like to know why they were set aside in favor of the current system, so we could try looking for ways to make them feasible. I'm appealing to everybody who was on the forums when that happened to explain why this was changed.
The main issues with suspect tours are timezone-related issues, amount of entrants, and that chances are you're getting reqs in a significantly smaller amount of games than literally everyone else.
 
Thing about suspect tours is you have to find people that are willing to host them, and host them well, generally on a much shorter notice due to the 2 week timeframe of these suspects normally. I know one of the RBY UU live tours was all messed up due to a sign up discrepancy and somebody dropping out and it was kind of a mess making what would have been a short tour last forever. Reqs for that suspect specifically were based on those tours and a certain threshold at the time for its inclusion in UUFPL.
I think this is a really good point, but I think it’s also a potentially solvable one. Objectively speaking, the live tour hosts exist; Smogon Tour garners enough signups to host larger tours that it could fill the need 10 times over. Even if we assume that Smogon Tour’s prestige makes it enormously more attractive and we assume nobody new would sign up to help out their favorite tier, it ought to be at least usually workable. So that leaves the question of why it isn't currently working.

Speaking from my personal experience, I started signing up to host smogon tour because it was easy. I don't mean this in the sense that the tour itself was a piece of cake, but that it's very simple to get to the point of hosting. There's a very visible thread that clearly outlines what is being looked for and what times you need to be available in order to host. Once you're signed up, you get to know if you're in promptly and get access to a full guide on how to host as well as having others to reach out to.

There's more than one way we could solve this, but to list a few:
1. We could make a general suspect tour host thread, like Smogon Tour has. When a tier posts a suspect test, they make a post about what times they'd like the tour to run at (I'd advise something similar to Smogon Tour weekend times, but there are options), and ask for signups. Then make a general guide for suspect tour hosting; most of the Smogon Tour stuff should apply, but not all. I'd be willing to help with this if needed.
1a. Make a specific thread for each suspect. Probably better if you want to get hosts from the community interested in helping a specific test, probably worse for general hosts.
2. Use the hosts we already have. If you go to the smogtours discord and drop a line in the tour hosts channel, you'll be reaching most of our livetour hosts at once. There's no guarantee, but it would work relatively fast and get responses from people who definitely can do it.
3. Have people on council host it. It's a skill like any other; if you're willing to learn, it isn't too difficult. Probably not every metagame can do this, but that doesn't mean it's unworkable.
 

Fiend

someguy
is a Social Media Contributor, is a Community Contributor, is a Top Contributor, is a Top Team Rater Alumnus, is a Community Leader Alumnus, is a Tiering Contributor Alumnus, is a Smogon Media Contributor Alumnus
I think live suspect tours are too flawed to be used. There's a few big reasons:

They are simply not equitable to all players (timezones, which already affect ladder activity, now have sharper effects on access), and in LC's experience these got somewhat low signups when we attempted them before. These also award suspect votes to a few users, who achieve this with arguably less effort and certainly fewer games compared to pursuing reqs via the ladder. In fact, this seems to be why this suggestion keeps appearing? I do not think that this aligns with the purpose of the suspect test in the first place. The stark difference in effort required is part of why council reqs were discarded. Additionally, there's a big ask for several people's time (host, other players, and then a player who wins reqs) with the tangible benefit only going to one user. It is very hard to want to run a handful of live tours when the outcome is providing a few easier suspect reqs to a selection of users biased by host availability.

LC ran this experiment before. I don't expect the results to be better if we were to do it again. We've instead started to explore other lines for incorporating tournaments into a system of earning reqs. LC ran a ladder tour prior to a suspect test, and used results from the ladder tour to double as the suspect reqs for qualifying players. I think exploring more items like this can be fruitful, but they are circumstantial. I would rather make reqs more obtainable on the ladder or explore other options which reward active players regardless of their availability in a 10 day window.
 

cityscapes

Take care of yourself.
is a Tiering Contributor, is a Community Contributor Alumnus
Seeing meri’s suggestion go through made me want to once again revisit this thread. The idea I had to get around the issue of shared alts brought up by Wigglytuff was to allow people to get reqs only on an alt voiced on the Smogtours server. Looking at the rules, each person can only have one voiced alt that matches their Smogon name. Additionally, voiced alts are used to play in tournaments similar to ladder tour alts, meaning sharing them is, if not against the rules, at least an extremely stupid thing to do.

So to reiterate, the proposal is:
  • Users can qualify to vote in a suspect by having reqs on an account with Smogtours voice.
  • Stricter RD requirements compared to normal reqs can be used to prevent users from qualifying after being inactive. (See the first few posts of this thread.)

What does everyone think of this? Sorry if it’s too incoherent I’m writing it on not very much sleep
 

Wigglytuff

mad @ redacted in redacted
is a Tiering Contributor, is a Dedicated Tournament Host Alumnus
on the TOILET, apologies for brevity
meaning sharing them is, if not against the rules, at least an extremely stupid thing to do.
not against rules, see pinkacross RMTs where its stated that account is shared with Storm Zone to reach elo/gxe peaks (sz's tourban is for something unrelated (rip s*ccer discussion while laddering for olt)). only against the rules when you're in a tournament game

allow people to get reqs only on an alt voiced on the Smogtours server. Looking at the rules, e
same rules also say that smogon account must be 1month+ old, meaning new people that are inclined to join smogon to vote in a suspect test will have to wait at least one suspect test out, or perhaps just never sign up anyway. i do not think this to be wise if we want to incentivize suspect participation. yea they can go through the traditional route but if the point is to let active players skip The Grind, why shouldnt the measure allow established ladderers who just havent signed up to smogon through?


am in favor of tournament reqs/suspect tours, perhaps with shiny tc more ppl will be inclined to participate
 
