Danbooru

Should we tag every little thing?

Posted under General

If you are looking for bedroom settings, you ought to be searching bedroom, not clock (tag added BTW). If you are looking for something involving clocks, would an image like post #634879 even be useful? Wouldn't you expect something more along the lines of post #276186 or post #582626?

I can understand the motivation behind tagging every little thing simply to recover specific images. In that case you are using the tag list as a more memorable and accessible "fingerprint" than the image's MD5 hash. But that goal is more or less mutually exclusive with ensuring that a given tag returns appropriate images.

Updated

I figured if you're searching for something as generic as clock, you're likely trying to find a specific post that you'd seen before anyway.

Of course, there could be other perfectly valid reasons, but I at least don't know anyone with a clock fetish.

Either way, isn't it better for the user searching clock to decide for himself whether he wants the image or not? If an image has a clock, however insignificant in the tagger's opinion, and isn't tagged as such, there's no practical way for our hypothetical user to even know of its existence.

Soljashy said:
I figured if you're searching for something as generic as clock, you're likely trying to find a specific post that you'd seen before anyway.

Of course, there could be other perfectly valid reasons, but I at least don't know anyone with a clock fetish.

No, but there are more tags than just clock. And I have on many occasions before searched for depictions of X, hoping to get the most representative samples.

Either way, isn't it better for the user searching clock to decide for himself whether he wants the image or not? If an image has a clock, however insignificant in the tagger's opinion, and isn't tagged as such, there's no practical way for our hypothetical user to even know of its existence.

The same logic that goes behind "I want to reduce these 4212 short_bikini posts to something more manageable" holds here. If you're looking for clocks, you won't be too happy to find that for every image with a significant clock depiction you have 100 where the tag is strained. It makes it more or less useless by diluting the content.
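
To put a rough number on that dilution, here is a minimal Python sketch. The 1-to-100 ratio is just the hypothetical from the paragraph above, not a measurement, and `precision` is an illustrative helper name, not anything the site provides.

```python
def precision(relevant: int, strained: int) -> float:
    """Fraction of search results in which the tagged item is actually significant."""
    total = relevant + strained
    return relevant / total if total else 0.0


# Hypothetical ratio from the post above: 1 significant depiction per 100 strained ones.
diluted = precision(relevant=1, strained=100)

# Assumed comparison case: a stricter policy leaving 1 strained post per relevant one.
strict = precision(relevant=1, strained=1)

print(f"diluted: {diluted:.1%}, strict: {strict:.1%}")
```

Under those assumptions, roughly one result in a hundred is what the searcher wanted, which is the sense in which the tag ends up "more or less useless".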

All right, let's forget about clocks, then. Instead, let's look at character tags.

In forum #30437, it was established that we always tag characters, regardless of whether their presence is indicated by a 10x10 pixel silhouette. The logic was that it should be up to the user doing the search to decide whether or not he wants the resulting image, and that potentially relevant data should not be filtered out.

Does the same not apply to general tags?

IMHO character tags are a bit different, as you can't really go above a certain number in a given pic, and even in cameo appearances the character is definitely there and they might be interesting in their own way. OTOH, for generic tags you can come up with more or less an arbitrary number in most pictures.

I too am going to quote 0xCCBA696 from the other thread:

0xCCBA696 said:
it's obviously better to include borderline cases in a tag than to exclude them. When they are included, people who want the borderline cases will get them, and people who don't want them will have to deal with a small* percentage of their search results being extraneous (from their perspective). When borderline cases are excluded, people who don't want the borderline cases [...] but people who do will be totally shut out, because it's normally impossible to find those cases any other way.

Shinjidude > Split the clock results with gentags:<10 and you're most probably okay.

Assuming I'm looking for clocks, it's possible that I'm actually looking for stuff that lies somewhere in the image (yet visible enough, of course) rather than being the main point. Honestly, that's no less likely than looking for clocks in the first place, as Soljashy said.

It's not foolproof, but depending on the focus you want, you can reasonably toy with the gentags and/or chartags thresholds step by step to narrow the results. This works pretty well for foreground/background characters as well.
This should already be enough to make everybody happy I guess.
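
For what it's worth, the step-by-step narrowing described above could be sketched like this. It's a minimal Python sketch over a made-up in-memory list, not Danbooru's actual search code; `Post`, `narrow`, `step_search`, the threshold values, and the sample post ids are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    # Hypothetical stand-in for a post; not Danbooru's data model.
    post_id: int
    general_tags: set = field(default_factory=set)


def narrow(posts, tag, max_general_tags):
    """Keep posts carrying `tag` with fewer general tags than the threshold,
    mimicking a search such as `clock gentags:<10`."""
    return [
        p for p in posts
        if tag in p.general_tags and len(p.general_tags) < max_general_tags
    ]


def step_search(posts, tag, thresholds=(5, 10, 20, 40)):
    """Yield (threshold, newly revealed posts) as the threshold is loosened."""
    seen = set()
    for limit in thresholds:
        fresh = [p for p in narrow(posts, tag, limit) if p.post_id not in seen]
        seen.update(p.post_id for p in fresh)
        yield limit, fresh


# Made-up sample data: ids and tag lists are invented for this example.
posts = [
    Post(1, {"clock", "no_humans", "still_life"}),
    Post(2, {"clock", "bedroom", "bed", "1girl", "long_hair",
             "pajamas", "pillow", "window"}),
    Post(3, {"1girl", "smile"}),
]

steps = list(step_search(posts, "clock"))
for limit, fresh in steps:
    print(f"gentags:<{limit} adds posts {[p.post_id for p in fresh]}")
```

Each looser threshold only shows the posts the previous step had hidden, so the sparsely tagged ones (where the clock is more likely to be the point) come first.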

> Again, when performing a blind search ("let's see what the site has", exploring), a certain degree of noise is acceptable, since the very existence of your ideal image is unreliable, which means:
- having no idea of what you'll find makes you interested in a greater variety of results,
- you're most likely taking time to observe,
- variably relevant stuff that isn't part of your initial goal can happen to be interesting as well.

Noise isn't so annoying here. The more content the better.
And what happens to be noise for you might be the opposite for the next person browsing through the same tag.

But retrieving a specific post is a completely different state of mind, since you're in a hurry. And trying to retrieve a post with no significant identifier means hundreds of undesired search results, all of which are noise.
This is much, much more annoying.

As long as the tag you're about to add to a post is likely to be useful to someone for whatever purpose, I think it's worth it, and "exploring" works fine without needing a priority imho.

Updated

Soljashy said:
In forum #30437, it was established that we always tag characters, regardless of whether their presence is indicated by a 10x10 pixel silhouette...

If you'll notice I was actually against that decision in that thread, and never really reversed my opinion.

Cyberia-Mix said:
Split the clock results with gentags:<10 and you're most probably okay.

I'm not sure how useful gentags:<10 is for screening out over-tagged posts, based on how I normally post, because I actually do tend to be rather thorough.

It's relatively easy to go through 10 general tags on a character's personal appearance much of the time. That's even leaving out ancillary or insignificant items.

See what you get with user:shinjidude gentags:>10 chartags:1. Most of those don't even have background settings.

Cyberia-Mix said:
Assuming I'm looking for clocks, it's possible that I'm actually looking for stuff that lies somewhere in the image (yet visible enough, of course) rather than being the main point. Honestly, that's no less likely than looking for clocks in the first place, as Soljashy said.

I can see being interested in posts where something is somewhat less prominent in addition to those where they are more prominent. That's basically just expanding your scope.

I can't really think of a situation where you wouldn't want your posts to be proportionally more relevant though, or where you would prefer to get a lot of noise where what you are looking for is only barely relevant.

Cyberia-Mix said:
It's not foolproof, but depending on the focus you want, you can reasonably toy with the gentags and/or chartags thresholds step by step to narrow the results. This works pretty well for foreground/background characters as well.
This should already be enough to make everybody happy I guess.

It may be the only way to make both methods function to some degree with the system we have, but I don't really see that it is particularly effective, especially since it depends a lot on how a post is tagged and penalizes well-tagged posts by keeping them from appearing.

Cyberia-Mix said:
Noise isn't so annoying here. The more content the better.
And what happens to be noise for you might be the opposite for the next person browsing through the same tag.

It depends on how much noise you get. Some tags are bound to appear more often as background items, so if every little thing is always tagged, those tags will end up with more noise in proportion to images where the items are featured prominently.

Also as I say above, I can't think of a situation where someone searching with a specific tag would prefer to get more noise than signal.

Some noise is tolerable, especially when it prevents relevant posts from being screened out, but more noise than signal is almost always a bad situation.

Cyberia-Mix said:
But retrieving a specific post is a completely different state of mind, since you're in a hurry. And trying to retrieve a post with no significant identifier means hundreds of undesired search results, all of which are noise.
This is much, much more annoying.

To be honest, I handle this situation with favorites. I don't know that I've ever come up with an image I wanted to find quickly that I couldn't by filtering my favorites with a tag or two.

Updated

Shinjidude said:
To be honest, I handle this situation with favorites. I don't know that I've ever come up with an image I wanted to find quickly that I couldn't by filtering my favorites with a tag or two.

Assuming your number of favourites keeps growing, you will eventually run into the same problems searching them.

Since I have yet to see a convincing argument against the "looking for a specific image" scenario, let's have another look at the "searching for images in general" scenario.

Say two people are interested in crustaceans, for the sake of argument. One of them is only interested in finding posts that feature these creatures prominently. The other one is frickin' nuts over them and would like to see anything vaguely related to crustaceans.

What seems like noise to Person One is the only means Person Two has by which to find what he's looking for.

I'm actually not arguing against using tags for "looking for a specific image". I understand the system, and it's a pretty effective way of putting a handle to a specific image.

What I'm saying, though, is that the method of tagging that optimizes for that system (tagging everything possible even if it's almost irrelevant to the image) necessarily dilutes the posts returned for any given tag and introduces noise (regardless of where your sensitivity to noise might be).

As for the crustaceanophiles, certainly different people have different thresholds for what constitutes noise and what constitutes signal. I'm not even arguing that we not tag things that are only moderately relevant. I'm arguing against tagging things that are almost entirely insignificant or are actually hard to find. Even the crab nut isn't going to go gaga over something he can't make out without a lot of effort.

It's long been held that we don't tag, say, thighhighs if they are only visible by a few pixels at the bottom of the screen. Way back in forum #414, jxh2154 reiterates exactly this. Why should it be any different if an item is only 25 pixels in the background?

Updated

Shinjidude said:
It's long been held that we don't tag, say, thighhighs if they are only visible by a few pixels at the bottom of the screen.

Funny you say that because I came across such a post today for the first time, and when I stumbled upon the thighhighs tag I was like "what the hell? where?" *notices thin black border at bottom* "oh right, dear, good call", and I left it as is.
Too bad I can't find it again now. I've already forgotten everything else about this picture.

Anyway I guess we mostly agree. Replying on some points though.

Shinjidude said:
I'm not sure how useful gentags:<10 is for screening out over-tagged posts, based on how I normally post, because I actually do tend to be rather thorough.

I hear you on that.
It's true that I'd be excluding the kind of posts you mention if I stopped at the first step. I wouldn't risk missing those. The point is more about probability ordering.

It's simply based on the assumption that the main elements on an image are the most likely to be tagged. Therefore the tag you're browsing is more likely to be important on posts with fewer tags.
10 is more of an example than an average, but I don't need to tell you that not everyone on danbooru tags posts as thoroughly as you do anyway.
(Well, from a quick check, 46% of danbooru's posts from 2009 onwards actually fall into the gentags:<10 category, and 58% overall.)
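
The "probability ordering" idea can be sketched as a simple sort rather than a hard cutoff. This is only a rough Python illustration of the assumption stated above (fewer general tags means the searched tag is more likely to be a main element); `rank_by_tag_count` and the sample posts are made up, and the site itself offers no such ordering as far as this thread says.

```python
def rank_by_tag_count(posts, tag):
    """Return ids of posts carrying `tag`, sparsest-tagged first.

    `posts` maps a post id to its set of general tags (hypothetical data).
    Ties are broken by post id so the order is deterministic.
    """
    matching = [(post_id, tags) for post_id, tags in posts.items() if tag in tags]
    matching.sort(key=lambda item: (len(item[1]), item[0]))
    return [post_id for post_id, _ in matching]


# Invented sample: ids and tags are placeholders, not real posts.
posts = {
    101: {"clock", "bedroom", "bed", "1girl", "long_hair",
          "pajamas", "pillow", "window"},
    102: {"clock", "no_humans"},
    103: {"1girl", "smile"},
}

ranked = rank_by_tag_count(posts, "clock")
print(ranked)
```

Nothing gets excluded this way, so thoroughly tagged posts still show up; they just come later in the list.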

As I said, it's far from perfect, but you don't get the same overall kind of results depending on the level of detail you've searched for. It doesn't work too badly on the clock example, at least.

Shinjidude said:
I can see being interested in posts where something is somewhat less prominent in addition to those where they are more prominent. That's basically just expanding your scope.

I can't really think of a situation where you wouldn't want your posts to be proportionally more relevant though, or where you would prefer to get a lot of noise where what you are looking for is only barely relevant.

It was a bit of an abstract situation.
But in my case, I can imagine myself being interested in images presenting highly detailed indoor settings.
These kinds of things can only be reached through approximations since they have no accurate descriptions. The prominent stuff here becomes noise because it's likely to get tagged before what I'm actually looking for, if not in place of it (see post #488026).

I'm most probably not looking for a specific item, but the said item can be a good start for identifying a certain type of place. Item combinations are unlikely to give results, so you're stuck with some gentags:>## then. Noise is kind of difficult to avoid here.
In case of post #634879, the clock tag would be overkill because the global detail level is not significant enough (the only useful case I can think of for this tag would be the combination with bed or bedroom, which is sort of limited).
In cases like post #608663 or post #605707 (yes, something went terribly wrong with Sakuya's contrast here), however, the detail level is such that it still might be interesting. I guess they make good examples of where to stop tagging, actually.

(This discussion made me discover the room and indoors tags while writing this post, too bad they're underused and need some reworking.)

Shinjidude said:
To be honest, I handle this situation with favorites. I don't know that I've ever come up with an image I wanted to find quickly that I couldn't by filtering my favorites with a tag or two.

The whole thing is that I've absolutely no way to predict which posts I might be wanting to see again in the future. I might start saving potential stuff in a giant folder somewhere but it doesn't sound quite reasonable. And, well, I prefer my favorites the way they are now.

Updated

I generally try to add high numbers of tags, but do not take it too seriously. After all, Danbooru has so many hastily tagged images that there will always be some stuff with barely the character names and little, if anything, else.

An example of my ponderings:

I tag Kirisame Marisa's hair ribbon a lot. It is not a very prominent accessory to begin with, but I think it is a relevant part of "generic Marisa" as I know her. Now sometimes I run into a pic where the ribbon is barely visible. I know, people who are after illustrations of girls with hair ribbons would never want that. But on the other hand, I do not want that pic when I occasionally search for Marisa without a ribbon. So I usually add the tag.

Well, I'm for not tagging every little thing, unless it's a character, a reference, or something that the artist intended to be noticed, like pikachu in post #559865 and crab in post #560059.

These are more the kind of tags that people would use to find a specific image. I mean, people would remember the pikachu in post #559865 or the crab in post #560059, but they wouldn't remember the clock in post #634879.

I also agree with Katajanmarja on tagging things characters usually have. When I upload pics of Kasumi/Misty from Pokemon I always use the side_ponytail tag. I think it's useful, because sometimes I do look for images of Kasumi with her hair loose.

Also, how do you type tags so they're links on here?

Roarchu said: Should we tag every little thing?

No. Tagging inconsequential items like the clock in that image does more harm than good in my eyes.

It's better than being a lazy tagger, yes, but I don't think it's actually helpful.

But there's deep-seated disagreement over how much of a role "prominence" should play in tagging. I think it should be taken into consideration very strongly, but others disagree.

Also, just to elaborate on what 0xC said, it's double square brackets and double curly brackets.

See also the dtext help file: http://danbooru.me/help/dtext
