[Fsa-guatemala] Language and translation of rights

Daniel Kahn Gillmor dkg at fifthhorseman.net
Tue Sep 16 15:23:42 EDT 2008


Thanks to Jamie for setting up the list, and thanks to everyone for a
very interesting discussion yesterday.

I do most of my thinking while riding my bike, and my bike ride home
from the meeting was filled with ideas about what translation means
for this project.  Apologies that this e-mail is so long.

I've started writing up a concrete specification of how the workshop
actually happens here:

 https://support.mayfirst.org/wiki/internet_rights_workshop/specification

Please edit it! 

The specification linked above includes definitions of some vocabulary
terms in the context of this exercise.  These might be obvious to most
people, but they aren't always obvious to me.  I hope that by making
the definitions explicit, we can communicate more clearly.

I've also defined "L" as the number of distinct supported languages
for any particular Workshop, for the sake of discussion.

OK, on to the ideas about how language plays into this exercise:

 * what is "native" -- the Scribe from each Group will by default use
   the interface in her preferred language.  Do we want to record this
   information per scribe?  Do we believe that the "native" language
   chosen by the scribe is also the "native" language for the
   associated group?  Does this matter?  Would it be useful to know
   that a group is fluent in one language or another?  Should we
   provide a mechanism (other than the localization of the Scribe
   interface itself) for a group to record this information?  Or
   should groups be able to self-identify a level of fluency in each
   language?  If so, what do we intend "Group fluency in Language X"
   to mean for a heterogeneous Group?  What if, before beginning, the
   Scribe was asked to agree or disagree with a few assertions like
   "My group has members who can read Language X comfortably"?

 * What does "endorsement" mean for a monolingual Group?  When L = 1,
   we assume that each group speaks (some flavor of) the language of
   the workshop.  But when L > 1, a group which is unable to even read
   one of the languages is really flying blind in their endorsements.
   Can you imagine endorsing something in a language that is opaque to
   you?  What does an endorsement like this actually mean?  If we know
   something about the Group itself (e.g. the self-identified fluency
   level idea floated above), does that say something about the nature
   of each endorsement?  What about a group in close physical
   proximity to another Group with different fluency levels?  What
   about a group chatting with another group in an overlapping
   language?

 * An "incomplete" state for Rights: we talked about having a Right
   that does not have Localizations in all L languages be considered
   "incomplete".  Incomplete Rights would not be eligible for
   endorsement, but would be editable.  The group that marks an
   incomplete Right as "completed" would be automatically added as an
   endorser (though like all endorsements, this one could be revoked
   by the endorsing group at any time).  I'm not sure the "incomplete"
   state is a useful one for this exercise, for the following reasons:

   Would we have the "incomplete" flag applied to the Right as a
   whole, or would we apply it to each Localization?  When L = 1, the
   distinction is moot.  When L > 1, if we apply "incomplete" to the
   individual Localizations, the interesting case is when one
   Localization is "complete" and another is "incomplete".  What is
   the functional difference between these Localizations?  In this
   state, do we restrict editing on the "completed" Localizations?  If
   so, that's the only time that we've restricted editing in the whole
   system.  And a group that wants to edit that locked Localization
   simply needs to mark the incomplete Localizations "complete" (even
   if they are nonsense), and then the Right as a whole will be
   unlocked for editing.  This is an incentive for insincere use of
   the "incomplete" flag, and I think it devalues the flag that way.
   If there is no functional difference between "incomplete" and
   "complete", what purpose does the flag serve?  Who is allowed to
   set it?  Who is allowed to clear it?

   Given the above concern, perhaps "incomplete" should apply to the
   current version of the Right as a whole.  But again, what does this
   mean functionally?  If a monolingual group wants to start getting
   endorsements on its right, but it doesn't have a translation, what
   if the group puts in the equivalent of "i don't know" in the
   unknown localization and starts soliciting endorsements anyway?
   Clearly, any legitimate endorsements could only come from other
   speakers of the same language, but maybe the group decides that
   tradeoff is worthwhile, if it means that at least endorsements can
   be gathered?  Consider a situation where L = 3, and the 3rd
   language is Magyar, unknown by most participants in the Workshop.
   This could lead to Rights being widely endorsed *without* an
   acceptable translation, or only endorsed by one language Group or
   another.

   If we're OK with rights being endorsed *without* an acceptable
   translation, then why bother with the incomplete flag in the first
   place?  Granted, this means that the system will accumulate more
   untranslated rights in the dominant language, but since any Group
   can be a "spoiler", there is already an incentive to at least try
   to make sure that all supported languages have a Localization for
   rights that they care about.  This incentive exists regardless of
   any "incomplete" flag.
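
To make the per-Localization variant above concrete, here is a minimal
sketch in Python.  Every name in it (Right, Localization, can_edit,
mark_complete) is hypothetical and not taken from the specification,
and the locking rule is only my reading of the proposal: a completed
Localization is locked while any sibling Localization is still
incomplete.  The usage at the bottom walks through the loophole: a
second Group fills in placeholder text, marks it "complete", and the
whole Right is unlocked again with that Group as an endorser.

```python
from dataclasses import dataclass, field

# Hypothetical model; none of these names come from the workshop specification.

@dataclass
class Localization:
    text: str = ""
    complete: bool = False

@dataclass
class Right:
    languages: tuple  # the L supported languages of the Workshop
    localizations: dict = field(default_factory=dict)
    endorsers: set = field(default_factory=set)

    def __post_init__(self):
        # Every Right starts with an empty, incomplete Localization per language.
        for lang in self.languages:
            self.localizations.setdefault(lang, Localization())

    def is_complete(self):
        # The Right as a whole is complete only when every Localization is.
        return all(self.localizations[lang].complete for lang in self.languages)

    def can_edit(self, lang):
        # Assumed locking rule: a completed Localization is locked while
        # any sibling Localization is still incomplete.
        return self.is_complete() or not self.localizations[lang].complete

    def edit(self, lang, text):
        if not self.can_edit(lang):
            raise PermissionError(f"{lang} Localization is locked")
        self.localizations[lang].text = text

    def mark_complete(self, lang, group):
        self.localizations[lang].complete = True
        if self.is_complete():
            # The Group that completes the Right is added as an endorser.
            self.endorsers.add(group)

    def endorse(self, group):
        if not self.is_complete():
            raise ValueError("incomplete Rights cannot be endorsed")
        self.endorsers.add(group)

    def revoke(self, group):
        # Any endorsement can be revoked by the endorsing Group at any time.
        self.endorsers.discard(group)

# Usage: Group A writes and completes the English text, which locks it.
right = Right(languages=("en", "es"))
right.edit("en", "placeholder text of the Right")
right.mark_complete("en", group="A")
locked_before = not right.can_edit("en")

# Group B cannot translate, but fills in nonsense and marks it complete.
right.edit("es", "???")
right.mark_complete("es", group="B")
unlocked_after = right.can_edit("en")
```

Nothing in this model can tell a sincere "complete" from an insincere
one, which is the concern raised above.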


A side note:

I found an interesting free machine translation utility: Apertium.
This tool specializes in related-language translation
(Spanish-Catalan, Spanish-Portuguese), but the apertium-en-es package
is in Debian lenny.  I have not tested it yet.

OK, this e-mail is way too long already.  I look forward to hearing
other people's suggestions and ideas.

      --dkg