Thanks @electrolyte!
Hi. I come with information.
I just talked to a director at MTurk on the phone who wanted to provide some reassurance and answers for us. Here are a few points that should help:
- The new guidelines are not a new direction Amazon's taking. They're meant to be a clarification of the original TOS that was written almost 13 years ago, now with updated language acknowledging how workers work these days.
- There will not be any mass suspension over people using scripts like they do currently. (I asked this specifically.)
- People who will be suspended for script use are the same people who would have been suspended before. He told me directly: "they'll know what they're doing is wrong." It'll be people who are intentionally scamming, cheating, or doing things we all know are wrong.
- I pointed out that some of the language is confusing, such as the parts that say using scripts to sort/search is okay but you can't be hitting the site too frequently. He acknowledged this and will get clarification on what "too frequently" means, but again repeated the above point: the people who will be suspended are those who would have been before, who know they're doing the wrong thing.
- I'm meeting him in person on Friday afternoon to get more information and ask more questions. After that meeting, I should have more details to share.
Reading stuff like that scenario amazes me. I have trouble enough understanding the most basic scripts and stuff and getting it to help me with mturk, so I generally stick to the basic few I understand well enough, like PC now.
I totally agree what you said is possible (likely even being done)... more the matter of being able to grab enough hits through the piranhas/sharks and the throttling to make it worth it.
But can someone explain to me what 'storing data' is exactly? I understand now what it is in that scenario of someone somehow going through a whole batch without being logged in to first gather data of the hits, but it also sounds like even more simple users can somehow store data.
I think it is now fairly obvious to me that there is no way in Hell that Amazon would actually systematically punish things like pounding the mturk server or storing data, since I am certain that at least some people who work for mturk know that the very viability of mturk hinges on these things.
"Storing data" means that you are intentionally downloading the contents of a HIT to your computer to use for some purpose other than completing the HIT.
I think the scenario that has come up the most related to the "storing data" part has been about scripts, such as HIT Database and MTurkSuite, that store information from the site.
Is there a specific HIT type or scenario that you are concerned about?
Here are some simpler examples of storing data that is prohibited...
1) If you are validating grocery store receipts for 411Richmond or Ibotta, and you right-click and save the image of somebody's receipt to your computer... (Technically, the image is already being saved to your computer as a temporary file, but Amazon knows this and doesn't care.)
2) If you are taking a survey with a hypothetical scenario and you copy/paste the contents of the survey into a Word document on your computer to save it for later. (Keep in mind, some surveys discourage even temporarily copy/pasting text... But I've encountered other surveys where they encourage keeping notes, including copying text... However, the understanding is that the notes taken are not saved or shared with others.)
3) If you are working on a survey containing embedded videos (those not hosted on YouTube) and you save those videos to your computer to watch outside of working on the HIT.
Keep in mind that all content that you are seeing on your computer screen during any HIT is technically already stored on your computer, but usually as a temporary file. This is 100% okay. This is like how images, such as logos and headers, from your favorite websites are saved in your browser's cache so that when you view those websites again, the website loads slightly faster. If you've ever completed a HIT with an embedded PDF, then that PDF is automatically stored somewhere on your computer as a temporary file. Again, that's 100% okay because that's how internet browsers work...
Thanks! Also, please do take pictures when you visit MTurk HQ (if they are cool with it), as others have asked you. It would be fun to see some of the wizards behind the curtain.
I will get clarity on this on Friday.
Thank you very much for all you do and for sharing it with us. When I saw the update this morning I was really worried (mostly about things like HDB and MTSuite that I use mostly just for analytics because I cannot remember all hundreds of thousands of HITs I have done). Your update is very reassuring.
Thank you for further clarification. This is always kinda what I gathered from the TOS/guidelines before, but the way you explained it made it all the more clear for me. I don't do things like that (not smart enough to do it even if I wanted to haha) so I am very relieved to see y'all updating with clearer descriptions of acceptable script usage. I was too worried to use any of my scripts today until I read this thread.
Here's a scenario that illustrates just the type of worker that these new policy changes apply to:
1. Malicious worker decides they want to easily make $300-$500/day the dishonest way.
2. Worker makes a script to scrape batches of HITs (e.g. Barcodes) shortly after they drop. This is done while logged out. The script opens each HIT in the batch, extracts the HIT ID and the UPC and saves them, and then skips to the next HIT. This process generates a ton of page calls to Mturk.com (tens to hundreds per minute). The whole batch can be scraped within minutes of it dropping.
3. All UPCs and corresponding MTurk HIT IDs are stored in a custom offline database, and the UPCs are automatically searched via another script against existing online UPC databases. UPCs that result in no products being found are flagged. The UPC search results from the online databases are automatically saved to the worker's offline database (this takes <10 minutes).
4. Dishonest worker now logs into MTurk and runs another script that previews each Barcode HIT in the batch and checks each against their custom database. Once a HIT ID whose UPC corresponded to no product is encountered, the script automatically accepts the HIT, checks the box for "No Product Found", and then auto-submits the HIT. The script moves on to find the next HIT and repeats the process until the batch is exhausted.
Such a malicious worker would violate all three of these key policy points:
1. High frequency page calls
2. Extracting & storing data
3. Replacing their human judgment with a script that automatically searches for and inputs the information into the HIT.
To reiterate what was recently discussed above, this is VERY different from how the average Turker uses scripts and does not apply to something like HitDB.
If this means better communication from Mturk from now on regarding suspensions, this is the best news from them in a while.
That's certainly interesting. Of course won't stop them from stomping off in a huff when you mention it to them but it's a start.
I'm hopeful. It sounds like they really want to do better and be better and I'm willing to give them the chance.
So what happens if a requester doesn't want to follow the rules you just posted? I have a feeling VRapi is not going to follow the rules and will still keep telling you to return the address/addresses that you didn't find.
Yeah, it's going to be interesting to see what he does... if anyone brings it up to him.
Interesting point....
Thanks so much for the info!
Hello! I have a few more clarifications and other things from my meeting today:
- I'll reiterate what I said earlier about concern about using scripts. You're fine unless you're doing something you know is wrong, like scamming or running scripts that are so powerful that they're damaging the website or causing problems for other workers. Me running HIT Scraper to search for HITs I want to do isn't nearly enough to break MTurk or cause other problems, so things like that are fine.
- I did ask about HIT Database and similar scripts that save/store information on the worker's computer. I would not be worried about that.
- If a worker is doing something that raises a red flag at Amazon, you're not going to be outright suspended immediately. They will send a warning email (the one that says "we've noticed unusual activity with your account"). If you send an email back and ask what the problem was, they will tell you exactly what they noticed and what the concern is. They should not be replying with a generic "you just violated the participation agreement" type of email and should be saying exactly what the problem was that they noticed. If you get an automated or unhelpful response to your inquiry, please tell me and I will follow up because that should not be happening now.
- There was a question about whether the part of the new Participation Agreement that says "you will use your human intelligence and independent judgment to perform Tasks" means you can't talk to other workers. It means you can't share answers, as in you can't share completion codes or tell people how to respond in surveys or what to submit as answers. This is the same as before, and stuff we wouldn't allow on the forum, either.
- Some workers have been mentioning over the last few days that they've been getting signed out of the new site when they sign into the old site, or vice versa. If that's happening, it's a bug. If this is still happening for you, please send me a PM with whatever details you're comfortable with me sharing (worker ID, what you were doing when it happened, screenshots, etc.) and I'll get them to the team.
Two other things:
- The new Participation Agreement says "you will not reject Tasks performed by Workers without good cause" which means Requesters who reject without good cause are in violation of the Participation Agreement.
- The new Acceptable Use policy says requesters cannot "knowingly publish HITs that Workers will be required to return after accepting them". This means requesters that post a batch and tell workers they don't accept N/A or that they should return HITs that are not found are in violation of the Acceptable Use policy.
If you find a requester rejecting without good cause or asking workers to return HITs that can't be found, you can point out to the requester that they are acting in violation of the Participation and/or Acceptable Use policies. This will hopefully have more weight than just telling requesters that doing those things sucks for workers.
I hope that helps!
Same as any other requester who violates TOS. We can point it out to them. Some requesters are receptive and say they had no idea and fix whatever they're doing wrong because they don't want to violate TOS of a site they're using. Others just don't and won't care. But at least now we have this to point to. We didn't have anything like that before.
This is really starting to sound like a win to me. No sane person, at least in the US, is going to spend much time working for $2/hr, and the viability of mturk seems to hinge on workers finding ways to do much better than that. I suspect the Amazon side of mturk recognizes that, to some degree or even a considerable degree. But if they are promising "human intelligence" to requesters then they have to make a reasonable effort to ensure that this is what requesters get. This doesn't mean requesters have to get it in exactly the way they anticipate, but I accept that Amazon should strive to assure they get it somehow. My hope is that the Amazon side of mturk will endeavor to use "human intelligence" in making decisions regarding anyone, workers or requesters, who seems to deviate from expectations. It sounds like they might be aiming for that. Maybe we can all win.