Danbooru

[Prototype] User Report Ver 6.1

Posted under General

@BrokenEagle98
For the Pools: I think a column should be added that counts how many of a user's pool additions were later removed by someone else.
So for example, user A added 100 posts to pools in total, but 25 (25%) were removed by user B afterwards. That makes it easier to spot possible pool vandals, or to judge how much that person's pooling is actually "worth".

Hmmm... what about edit wars, i.e. from your example, user A goes back and adds those 25 posts to the pool, user B then goes back and removes those 25 posts from the pool, and so on and so forth.

Even if User B was "correct" with his edits, he'll have approximately the same amount of Removes as User A. I quoted correct because many of the pools are subjective, so how can one tell what truly belongs to a pool...?

BrokenEagle98 said:

Hmmm... what about edit wars, i.e. from your example, user A goes back and adds those 25 posts to the pool, user B then goes back and removes those 25 posts from the pool, and so on and so forth.

Even if User B was "correct" with his edits, he'll have approximately the same amount of Removes as User A. I quoted correct because many of the pools are subjective, so how can one tell what truly belongs to a pool...?

Only Adds and Removals should be counted. If there are duplicates, then so be it. Of course, we should ALWAYS look behind those numbers; otherwise we might as well discard the reports^^. Take taggers, for example: someone can add 500 tags to posts. A great number, but if over 7/8 of them are misuse, then the number loses its shine.

Here's what I came up with...

Note: Obs Adds==Obsolete Adds, Obs Rems==Obsolete Removes

If it looks good, I'll incorporate it into the next release/update...

BrokenEagle98 said:

Here's what I came up with...

Note: Obs Adds==Obsolete Adds, Obs Rems==Obsolete Removes

If it looks good, I'll incorporate it into the next release/update...

What does "OBS" stand for?

Obs, as in Obsolete. Check the note below the table.
"Note: Obs Adds==Obsolete Adds, Obs Rems==Obsolete Removes"

Basically, they're the columns you asked for, i.e. add a post that gets later removed, or remove a post that later gets added...
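
For anyone who wants to see the counting rule spelled out, here is a rough sketch. It is not the code from the report. The event format (a chronological list of `(user, action, post_id)` tuples for one pool) and the name `count_obsolete` are made up for illustration, and it assumes an add or remove only becomes obsolete when a different user reverses it.

```python
from collections import defaultdict


def count_obsolete(events):
    """Tally adds, removes, obsolete adds and obsolete removes per user.

    events: chronological list of (user, action, post_id) for a single pool,
    where action is "add" or "remove" (hypothetical format).
    """
    stats = defaultdict(
        lambda: {"adds": 0, "removes": 0, "obs_adds": 0, "obs_rems": 0}
    )
    last = {}  # post_id -> (user, action) of the most recent event on that post

    for user, action, post_id in events:
        stats[user]["adds" if action == "add" else "removes"] += 1

        prev = last.get(post_id)
        if prev is not None:
            prev_user, prev_action = prev
            # Assumption: only a reversal by someone else makes an edit obsolete.
            if prev_user != user and prev_action != action:
                key = "obs_adds" if prev_action == "add" else "obs_rems"
                stats[prev_user][key] += 1

        last[post_id] = (user, action)

    return {user: dict(counts) for user, counts in stats.items()}


events = [
    ("A", "add", 1),
    ("A", "add", 2),
    ("B", "remove", 1),  # makes A's add of post 1 obsolete
    ("A", "add", 1),     # makes B's remove of post 1 obsolete
]
result = count_obsolete(events)
```

Note that in an edit war both users rack up obsolete counts, so the numbers still need a human to look behind them.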

Looks like it's an implementation of a "translator report":[https://github.com/r888888888/reportbooru/commit/43d3f6a71a5e7c85bfdf15b4ebfee4a145f1392a], similar to mine in that it counts the adds/removals of all of the translator tags, i.e. translated, check_translation, partially_translated, translation_request, commentary, check_commentary, commentary_request.

One reason it might be empty is that it is being generated right now. When a file is being written to disk, the OS reserves the name in the file system, which is why it shows up as empty.

Created another Supply vs Demand chart like back in forum #120141.
Note: Search numbers pulled from the Popular Searches report.

Will add it to the prototype report at the next iteration, as it gives a helpful indicator to uploaders of what people are looking for.

BrokenEagle98 said:

as it gives a helpful indicator to uploaders of what people are looking for.

Animated futanari bestiality bdsm rape involving Donald Trump, you mean?

Can't see myself uploading any of that any time soon.

I'm not getting what you mean by scores, since median score is based on the scores. (specifically it's the Q2 value)

As for favorite count, there could be a category named median favcount... is that what you're looking for...?

BrokenEagle98 said:

I'm not getting what you mean by scores, since median score is based on the scores. (specifically it's the Q2 value)

As for favorite count, there could be a category named median favcount... is that what you're looking for...?

It's what I already said: make a separate category for median score and/or favorite count, since some users aren't mentioned there although they should lead this category (Ars AND CodeKyuubi, I assume). Or, instead of creating a new category, maybe lower the threshold from 300 to... whatever... 200 maybe? 300 posts in a month is still quite a lot, and more than enough to get a good grasp of the user's uploading habits :o.

May I ask why the reports aren't updated at the same time? Some reports are still from the beginning of November, while the latest Approver report is from November 20th.
I think the reports should be updated simultaneously.

Just some updates since (I think) the reports are beginning to stabilize.

I think including the standard deviation on the tagger report is useful information since the average can be misleading depending on the sample. Median count might be useful too.
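
To illustrate why the average alone can mislead, here is a small sketch with made-up tag counts per edit. The function name `summarize_counts` and the sample numbers are hypothetical and not taken from any report.

```python
import statistics


def summarize_counts(counts):
    """Return mean, median and sample standard deviation of a list of counts."""
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        # stdev needs at least two data points
        "stdev": statistics.stdev(counts) if len(counts) > 1 else 0.0,
    }


# Four small edits and one huge one: the mean is pulled far above the median.
summary = summarize_counts([1, 1, 2, 2, 50])
```

With a sample like this, the large standard deviation is the hint that the average does not describe a typical edit.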

Many of the upload/approver reports already include confidence intervals for deletion and negative scores, but it's straightforward to add confidence intervals for expected score also. To be succinct, this number represents a lower bound on the expected average score. Whether or not score is an accurate indicator of quality will always be up for debate, but it's useful to have this information.
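
A minimal sketch of what such a lower bound could look like. The post does not say which interval the reports actually use, so this assumes a one-sided normal approximation on the sample mean; `score_lower_bound` and its parameters are names invented for the example.

```python
import math
import statistics


def score_lower_bound(scores, confidence=0.95):
    """One-sided lower confidence bound on the mean score.

    Assumption: normal approximation, i.e. mean - z * (stdev / sqrt(n)).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = statistics.mean(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(confidence)
    return mean - z * standard_error


few_uploads = score_lower_bound([5, 7] * 5)    # 10 posts, mean score 6
many_uploads = score_lower_bound([5, 7] * 50)  # 100 posts, same mean score
```

Both samples have the same average, but the uploader with more posts gets the higher lower bound, since there is more evidence behind the average.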

Putting these here for comparison. I did a direct comparison over the same time periods in an attempt to validate the results of both reports; however, this did not yield exactly the same results...¯\_(ツ)_/¯

Despite that, many of the numbers for both versions of the report are still very close to each other, which is good. Not sure why the numbers are so far off for the artist report. Gonna go back and check my code again.

Going through the reports on Isshiki though, it's clear for the artist and wiki page reports that create events are being counted as part of the editing events instead of being mutually exclusive. This was determined by comparing the Total, Creates, and all other editing columns, as the sum of any editing column plus the Creates column should not exceed the Totals column. This issue is demonstrated below.

  • Isshiki Artist Report (1st row): 370 Creates + 371 Name = 741 > 472 Total
  • Isshiki Wiki Page Report (1st row): 90 Creates + 317 Oth Name = 407 > 364 Total
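
The check above can be sketched in a few lines. The row layout (a dict of column name to count) and the name `find_overcounted` are assumptions for illustration; the numbers are the ones quoted from the two Isshiki reports.

```python
def find_overcounted(row, total_key="Total", creates_key="Creates"):
    """Return the edit columns whose count plus Creates exceeds Total.

    If creates and edits were mutually exclusive, no column should show up here.
    """
    total = row[total_key]
    creates = row[creates_key]
    return {
        column: creates + count
        for column, count in row.items()
        if column not in (total_key, creates_key) and creates + count > total
    }


artist_row = {"Total": 472, "Creates": 370, "Name": 371}
wiki_row = {"Total": 364, "Creates": 90, "Oth Name": 317}

artist_issues = find_overcounted(artist_row)  # {"Name": 741}
wiki_issues = find_overcounted(wiki_row)      # {"Oth Name": 407}
```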

Will be submitting an issue on Reportbooru for the above. Also, now that the other reports were updated on 4 Dec, will be running those reports as well as a comparison.

Reportbooru issue # 2
https://github.com/r888888888/reportbooru/issues/2

Data

http://isshiki.donmai.us/reports/contributor_uploads/2016-11-29_v1.html

http://isshiki.donmai.us/reports/post_changes/2016-11-18_v1.html

http://isshiki.donmai.us/reports/artist_commentaries/2016-11-28_v1.html

http://isshiki.donmai.us/reports/artists/2016-11-15_v1.html

http://isshiki.donmai.us/reports/wiki_pages/2016-11-15_v1.html

It will be... most of my computer's cycles went to generating the above comparisons though. I also still have to do comparisons against the other reports that got generated on 4 Dec.

I'll sneak some time in this weekend though to update it, as it's mostly an automated process by now...

Shouldn't the vandalism parts be deleted from the reports? I see some users there, like iphn, who are doing a great job with their tagging. Yet this user is listed under the vandalism section.
I mean, I could understand it if there are a lot of reverted posts, but is this really the case?

All I want to say is that the two vandalism sections could put users in the wrong light.

issue #2690

That report is just a rough first attempt. It's being placed on Isshiki so that others can comment on it to help refine it. Not sure what the metrics are to record an instance as vandalism...? I'll have to take a look through the code later to see how it works...

Edit:

Looks like what it's doing is counting every tag edit combination by Member-level users, and displaying all that have a count >= 5.
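
As I understand it, the logic is roughly the following. This is a sketch and not the Reportbooru code: the edit record layout, the function name `frequent_edit_combinations`, and treating a "combination" as (user, added tags, removed tags) are all my assumptions.

```python
from collections import Counter


def frequent_edit_combinations(edits, min_count=5, level="Member"):
    """Count identical tag edit combinations per user and keep the frequent ones.

    edits: list of dicts with "user", "level", "added" and "removed" keys
    (hypothetical format).
    """
    counts = Counter(
        (edit["user"], frozenset(edit["added"]), frozenset(edit["removed"]))
        for edit in edits
        if edit["level"] == level
    )
    return {combo: n for combo, n in counts.items() if n >= min_count}


edits = (
    [{"user": "u1", "level": "Member", "added": ["a"], "removed": ["b"]}] * 5
    + [{"user": "u2", "level": "Member", "added": ["c"], "removed": []}] * 4
    + [{"user": "u3", "level": "Builder", "added": ["a"], "removed": ["b"]}] * 6
)
flagged = frequent_edit_combinations(edits)
```

Nothing in a count like this checks whether the edits were wrong, so repeated valid edits get listed just the same.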

Perhaps a name change would be appropriate to remove the negative connotation, because as you pointed out, many of those edits are valid.

Thoughts?
