Danbooru

Upcoming Changes for Upload and Approval Complaints

Posted under General

Instead of searching for complex algorithms to weigh the score, how about weeding out approvers based on how many of their approvals got subsequently flagged and deleted? In the end, that's exactly what we don't want them to do: approve posts that need to be flagged afterwards.
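As a rough illustration of that idea, here is a minimal sketch that computes, per approver, the share of approvals that were later flagged and deleted. The input format (pairs of approver name and a "was deleted afterwards" boolean) and the sample names are assumptions made for this example, not Danbooru's actual schema.

```python
from collections import defaultdict

def deletion_rates(approvals):
    """Return {approver: fraction of their approvals later deleted}.

    `approvals` is an iterable of (approver, was_deleted_later) pairs;
    this shape is hypothetical and only serves the example.
    """
    totals = defaultdict(int)
    deleted = defaultdict(int)
    for approver, was_deleted_later in approvals:
        totals[approver] += 1
        if was_deleted_later:
            deleted[approver] += 1
    return {name: deleted[name] / totals[name] for name in totals}

# Made-up sample data: "alice" had one of two approvals deleted.
sample = [("alice", False), ("alice", True), ("bob", False), ("bob", False)]
rates = deletion_rates(sample)
```

A low approval count makes the rate noisy, so a real report would probably want a minimum number of approvals before judging anyone by it.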

Oh, so if we take the approver report as an example, you suggest storing percentiles for every peer group per approver, like this?


Approver | Basic percentile | touhou | rating:e | touhou rating:e
Type-kun | 90               | 50     | 50       | 30

I see where you are coming from, but we won't be having five peer groups. We'll have dozens just for popular copyrights, not even counting combinations.

Wouldn't that make the report rather hard to use? I think some sort of average indicator should still be present, just to serve as an initial signal that something is wrong.
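To make the storage idea and the "average indicator" concrete, here is a small sketch. It keeps one percentile per peer group per approver and derives two summary signals from them. The plain mean and the "weakest group" lookup are just one possible choice of indicator, and the dictionary layout is an assumption; the numbers are the example values from the table in this post.

```python
from statistics import mean

# Percentile per peer group, per approver (example values from the table).
percentiles = {
    "Type-kun": {
        "basic": 90,
        "touhou": 50,
        "rating:e": 50,
        "touhou rating:e": 30,
    },
}

def summary_indicator(groups):
    """Plain average of an approver's peer-group percentiles."""
    return mean(groups.values())

def weakest_group(groups):
    """Peer group in which the approver ranks lowest."""
    return min(groups, key=groups.get)

average = summary_indicator(percentiles["Type-kun"])
weakest = weakest_group(percentiles["Type-kun"])
```

With dozens of peer groups, an unweighted mean treats a group with five posts the same as one with five thousand, so weighting by post count may be worth considering.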

NWSiaCB said:

The "Comments" page is actually much more useful as a way to find interesting comics and filter out the random bulk porn.

The number of comments is actually an interesting metric(*), but interpreting it right isn't simple.

A high number of comments could be because of (at least) three reasons:

  • Part of a popular copyright, series/comic or artist etc. Assuming e.g. every 100th viewer will on average leave a comment on an image, the aforementioned images will get a higher number of comments simply because more people look at them.
  • The topic of the images is (very) controversial. Because these images will lead to discussion (with likely multiple comments by the same poster(s)) their number of comments will generally be high. Their quality on the other hand may or may not be high.
  • Not (necessarily) part of a popular copyright / by a popular artist but very interesting, very high quality / very detailed or very original / creative. These images might not have a very high number of views but most people who view the image will leave a comment.(**)

The third category of comments is the most interesting if you want to find high quality images, but I don't know if it is possible to computationally find them. I think their number of comments per view(er)s would be relatively high, but the number of views / viewers that is necessary to compute that value is not recorded I think.

(*) It seems to be currently impossible to search for images with at least n comments. Maybe a comments metatag would make sense so that you can search for comments:>n like you can currently search for score:>n or favcount:>n
(**) the number of favorites per view(er)s might also be higher than average
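If view counts were available (the post above notes they apparently are not recorded), the comments-per-view idea could be sketched as below. The `views` field, the `min_views` cutoff and the sample posts are all hypothetical.

```python
def rank_by_comments_per_view(posts, min_views=100):
    """Return post ids ordered by comments per view, highest first.

    Posts with fewer than `min_views` views are skipped, because a ratio
    built on a handful of views is mostly noise.
    """
    ranked = []
    for post in posts:
        if post["views"] >= min_views:
            ranked.append((post["comments"] / post["views"], post["id"]))
    ranked.sort(reverse=True)
    return [post_id for _, post_id in ranked]

# Made-up posts: id 1 is popular, id 2 is niche but heavily commented,
# id 3 has too few views to be judged.
posts = [
    {"id": 1, "comments": 50, "views": 5000},
    {"id": 2, "comments": 10, "views": 200},
    {"id": 3, "comments": 3, "views": 10},
]
ranking = rank_by_comments_per_view(posts)
```

Post 2 ranks above post 1 despite having fewer comments in total, which is the "third category" effect described above.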

If approvers are judged on the score of their approvals, won't that create an incentive for approvers to vote up every post they approve?

Remember Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure."

I don't know how much of an incentive that would be in this case, since it's a minor 1 or 2 points. I sometimes favorite/vote-up posts I approve because I like them, as well.

Type-kun said:
Oh, so if we take the approver report as an example, you suggest storing percentiles for every peer group per approver, like this?
<snip>
Wouldn't that make the report rather hard to use? I think some sort of average indicator should still be present, just to serve as an initial signal that something is wrong.

Yes, that's what I'm suggesting we store; no, that's not what I had in mind for presentation. Dumping all that information as numbers in a huge table is indeed going to be hard to digest and make sense of. I had something graphical in mind. Let me see if I can find a bit of time this weekend to mock something up.

Saladofstones said:

I don't know how much of an incentive that would be in this case, since it's a minor 1 or 2 points. I sometimes favorite/vote-up posts I approve because I like them, as well.

Among the median scores of posts approved by janitors (in the last 3 months), the median of medians is 4. So 1 or 2 points is actually huge.
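For readers unfamiliar with the statistic, here is a minimal sketch of a "median of medians" over approvers, plus what a uniform +1 self-vote does to it. The scores are invented for the example; only the technique matches what is described above.

```python
from statistics import median

def median_of_medians(scores_by_approver):
    """Median, across approvers, of each approver's median post score."""
    return median(median(scores) for scores in scores_by_approver.values() if scores)

# Invented scores of posts approved by three approvers.
scores_by_approver = {
    "a": [1, 4, 6],
    "b": [2, 3, 9],
    "c": [5, 5, 5],
}
baseline = median_of_medians(scores_by_approver)

# Same data if every approver upvoted each of their own approvals once.
bumped = {name: [s + 1 for s in scores] for name, scores in scores_by_approver.items()}
after_self_votes = median_of_medians(bumped)
```

On this sample the statistic moves from 4 to 5, a 25% shift from a single vote per post.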

lkjh098 said:

Among the median scores of posts approved by janitors (in the last 3 months), the median of medians is 4. So 1 or 2 points is actually huge.

To back this up with an actual example, if I bumped all the 1 and 2 score posts I've approved in the last months I'd jump from 37% score:+3 to 77%.
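The same effect can be sketched for a threshold statistic. The example assumes "score:+3" means "score of 3 or more" and that bumping lifts 1- and 2-score posts to exactly 3; the score list is made up and does not reproduce the 37% and 77% figures above.

```python
def share_at_or_above(scores, threshold=3):
    """Fraction of posts whose score is at least `threshold`."""
    return sum(s >= threshold for s in scores) / len(scores)

def bump_low_scores(scores, low=(1, 2), target=3):
    """Raise every score listed in `low` to `target`, leave the rest alone."""
    return [target if s in low else s for s in scores]

# Invented scores for ten approved posts.
scores = [0, 1, 1, 2, 2, 3, 4, 5, 6, 8]
before = share_at_or_above(scores)
after = share_at_or_above(bump_low_scores(scores))
```

Here the share jumps from 0.5 to 0.9 even though no post was bumped by more than two points.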

Borrator said:

Instead of searching for complex algorithms to weigh the score, how about weeding out approvers based on how many of their approvals got subsequently flagged and deleted? In the end, that's exactly what we don't want them to do: approve posts that need to be flagged afterwards.

So. Very. This.
(of course this should include all posters who have auto-approval rights for their own posts)

Also, I do hope that the process of making roles granular also (or should I say foremost?) includes a clear distinction between the right to approve others' posts and the right to self-approve (auto-approve) one's own posts, with the latter being much more exclusive than the former (because of the highest level of trust required), i.e. the exact reverse of what we have on Danbooru right now.

Finally, I hope there will be some realistic leniency for approvers regarding their current and overall activity. Reading about the monthly reviews makes me worry about a rat race for a monthly approval quota. I assume most of them have some kind of life, so even prolonged drops in activity are only to be expected. Personally, I'd value an approver who approves less, even irregularly, but is flawless over one who is an approving machine yet prone to errors (resulting in flagged and deleted posts).

r0d3n7z said:

I had something graphical in mind. Let me see if I can find a bit of time this weekend to mock something up.

Oh. Didn't think of graphical representation. Now that I have thought about it, though, perhaps something like an improvised gradient bar graph?

1) Get all the percentiles
2) Sort them in ascending order, no-hits last
3) Assign something like color red to 0th, color yellow to 50th and bright green to 100th percentiles, color in-between accordingly. Color no-hits white.
4) Represent each percentile with a narrow bar colored accordingly, so it would look something like this: http://puu.sh/iVCyi.png
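The four steps above can be sketched as follows. Linear RGB interpolation between the three anchor colors and the use of `None` for no-hits are assumptions of this sketch; the actual rendering of the bars is left out.

```python
def percentile_color(p):
    """Map a percentile (0-100) to an (R, G, B) tuple; None means no hits."""
    if p is None:
        return (255, 255, 255)  # white for no-hits
    p = max(0.0, min(100.0, float(p)))
    if p <= 50:
        # red (255, 0, 0) -> yellow (255, 255, 0)
        return (255, round(255 * p / 50), 0)
    # yellow (255, 255, 0) -> bright green (0, 255, 0)
    return (round(255 * (100 - p) / 50), 255, 0)

def gradient_bars(percentiles):
    """Sort percentiles ascending, no-hits last, and attach a color to each."""
    hits = sorted(p for p in percentiles if p is not None)
    misses = [None] * (len(percentiles) - len(hits))
    return [(p, percentile_color(p)) for p in hits + misses]

bars = gradient_bars([90, None, 30, 50])
```

Each tuple in `bars` corresponds to one narrow bar, in left-to-right drawing order.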

Also, I'm interested in how representative that metric actually would be. I'll do some number crunching with obsolete DB dumps if I have time.


Type-kun said:
I'm interested in how representative that metric actually would be. I'll do some number crunching with obsolete DB dumps if I have time.

If you could help give an idea of what the score distributions for some popularity factors / peer groups look like, that'd be fantastic. Histograms with X-axis score, Y-axis number of posts with that score.

I'll try to get score distributions for peer groups done today before midnight (mind you, it's +5 GMT here). @r0d3n7z, do you need those in some particular format? Simplest query will have output in peer-group|score|count format, it should be enough to build a histogram; write here or DMail me if some further processing is necessary.
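For anyone wanting to consume that output, here is a minimal sketch that turns `peer-group|score|count` rows into one histogram per peer group. Only the three-column layout comes from the post; the sample rows and the function name are invented.

```python
from collections import defaultdict

def parse_histograms(lines):
    """Parse 'peer-group|score|count' rows into {peer_group: {score: count}}."""
    histograms = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split from the right so only the last two fields are treated as numbers.
        group, score, count = line.rsplit("|", 2)
        histograms[group][int(score)] = int(count)
    return dict(histograms)

# Invented sample rows in the described format.
rows = [
    "touhou|0|120",
    "touhou|1|300",
    "touhou rating:e|0|40",
    "touhou rating:e|-1|5",
]
histograms = parse_histograms(rows)
```

Each inner dictionary maps score (X axis) to number of posts (Y axis), which is all a plotting tool needs.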

Also, I'm planning to build a list of peer groups for copyrights having 10k+ posts, including `original`: both stand-alone and combined with rating metatags and the `comic` tag. Anything else I should include?

First results are here :3 For now, just peer groups used and post counts per group. Will calculate histograms for peer groups with 500+ posts, others seem rather useless.

E: subsequent analysis moved to topic #11864


I am sorry if I missed something here, but I'd like to clarify. What about being able to delete and approve already deleted posts? I go through ALL deleted posts manually, I still have my list of 600+ decent deleted pics to approve, and I can't go on bugging janitors to do it for me now that there is a real opportunity for me to do something. (Not to mention my amount of flags and people complaining about me flooding the mod queue). Untie my hands, please ;_;

Not sure what you're really asking for, but at this point anyone with the post approval flag can immediately approve and thereby undelete posts that have passed into "standard" deletion (double-deletion removes them from the database completely). Generally when a new approver is added they tend to go undelete a small number of posts right away; I'm not sure going through and approving 600+ deleted posts would be the most appreciated maneuver, though, especially for a user "on trial."

There really is no such thing as "flooding" the mod queue within reason. On an image that's so bad that the thumbnail betrays it as being subpar it takes maybe a second to hide from the individual's queue forever. If the thumbnail warrants inspection of the image, it's the same time plus however long the user's internet connection takes to download the image. Basically if you stay within the 10-posts-per-day that your account currently allows there shouldn't be a problem (and anyone who complains about that level is probably too lazy to have the job in the first place). Now, if you became an approver (thus gaining infinite flag rights) and dumptrucked a few hundred posts into the modqueue people would not be impressed.

It comes down to common sense and courtesy, really.

Well, basically I guess I'm asking for the right to bypass both the 1 appeal and 10 flag per day limitations, because I have so much planned to do and I don't want to wait to help. Of course I wouldn't approve all 600 at once; I just waited and asked for so long to be promoted that I guess I got too excited.

Type-kun said:
@r0d3n7z, do you need those in some particular format? Simplest query will have output in peer-group|score|count format, it should be enough to build a histogram; write here or DMail me if some further processing is necessary.

Ah, sorry, been away for a couple days. peer-group|score|count is fantastic. I'm primarily interested in histograms for preliminary analysis. The shape of the histogram will inform choices of how to set percentiles of interest and/or what granularity at which to take percentiles.

Still gotta get around to doing that mockup of the graphical visualization I talked about last week, sorry
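As a rough sketch of how percentile cut points could be read off such a histogram, the function below walks the cumulative counts of a `{score: count}` mapping. The "first score whose cumulative count reaches the target" rule is one common convention and an assumption here, as are the sample numbers.

```python
def score_at_percentile(histogram, pct):
    """Smallest score whose cumulative post count reaches `pct` percent."""
    if not histogram:
        raise ValueError("histogram is empty")
    total = sum(histogram.values())
    target = total * pct / 100
    running = 0
    for score in sorted(histogram):
        running += histogram[score]
        if running >= target:
            return score
    return max(histogram)

# Invented {score: count} histogram for one peer group (100 posts in total).
histogram = {0: 10, 1: 30, 2: 40, 3: 15, 10: 5}
median_score = score_at_percentile(histogram, 50)
p90_score = score_at_percentile(histogram, 90)
```

On a histogram this concentrated, the 50th and 90th percentiles are only one point apart, which is the kind of shape that would argue for coarse percentile granularity.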

albert said:

I wanted to detail my plan addressing complaints about uploads and approvals. Previous thread on the issue.

Phase One

  • I'm removing the Janitor user level. I will be promoting most of the existing ones to Moderator depending on feedback and how friendly/helpful they have been on the forums and comments.
  • In place of Janitor there will be a new user role determining whether or not you can approve posts. This means just because you are a mod or an admin doesn't mean you can approve posts. This is part of a larger move to make rights more granular. Eventually I will move uploads, tagging, forum and comment moderation into separate roles.
  • Existing janitors, mods, admins will all start out with this role.
  • This will happen within the next week.

Phase Two

  • I will start aggressively recruiting new approvers. Every month 3-5 users with good uploads or favorites will be selected and gain the approval flag. After a month their approvals will be reviewed and they'll either keep or lose their approval right.
  • This will happen towards the end of July.

Phase Three

  • I will start removing the approval flag from older janitors. Specifically I will be targeting low quartile scores in the Janitor Trial Report. If this ends up affecting you please don't take it personally. Consider the next few weeks a grace period to start becoming more selective of your approvals.
  • This will happen towards the end of August.

If you have questions or concerns or ideas not listed here let me know. I'll update this original post as plans change.

albert forgot to add:

Phase Four

  • Everyone and their dog now have Fabulous Secret Powers.
  • I will grab some popcorn while watching Danbooru die a slow and painful death.

Seriously, what's the point in fixing something that isn't broken? We could just fire a single guy, but we decided to hire a hundred new guys like him instead.

D'Eye said:

Seriously, what's the point in fixing something that isn't broken? We could just fire a single guy, but we decided to hire a hundred new guys like him instead.

From my perspective this is actually an attempt to fix a problem. We've already gone through a few rounds of inviting and adding new approvers in the past. The activity of many of those users wanes over time, though, and the population of fairly active approvers goes down until it hits a point where we again have to add more approvers. This new system makes that process a more normal procedure for the site. It allows us to maintain an active population of approvers by establishing a system to determine which users qualify to be approvers, a means of monitoring approver performance, and a system that will retire approvers who underperform.

While this might not seem like something that is broken, it has been a consistent issue in the past. It is definitely an improvement to set up a system that prevents us from reaching the point where we begin to see the symptoms of this problem. Furthermore the problem that you're having will likely be resolved by this solution, as it establishes a system to judge and demote approvers by.