Parlez-vous zwo-null, señor?
Providing Cross-National and Multi-Lingual
Web Communities


                                                October 22nd, 2008
                                                    Andreas Ravn
                                              Senior Software Engineer
                                             namics (deutschland) gmbh
                                                              Hamburg




Worth ladies and Mr.,
   I be pleased that nevertheless
or other one grants too to this early
that away was not afraid and to this
lecture with the title parlez vous two
      zero senor to appear here.



With one another will we in
    the next grants a hopefully
completely stimulating trip the world
     from chances and tücken
  multilingual Web of communities
   to undertake and now geht' s
      also equivalent loosely.



Web 1.0




Web 1.0

»   Content centrally created by editorial staff
»   Internal Quality Assurance for content
»   Content inheritance
»   Control over
    – website structure
    – Languages used
    – Localized content




Web 2.0 applications (and the like)

»   social networks
»   communities
»   content portals
»   grassroots
»   aggregation sites
    (lists, social
    bookmarking)
»   crowdsourcing sites
»   discussion groups
»   weblogs
»   wikis
»   …
In Web 2.0

» User generated content
   –    content elements
   –    reviews
   –    comments
   –    profiles
   –    discussions


» User generated structure
   –    tags
   –    lists
   –    links
   –    rating, reputation


Where does your content come from?




[Figure: a spectrum of content and structure running from "centrally generated" to "user-generated", with weblogs, aggregation, social networks, wikis, communities, content portals and forums placed along it.]




Web 2.0

» Provider holds less control over
   – Content creation
   – Categorization of content
   – Quality of content

» How to organize a multi-lingual community?
» How to organize a community that has members in
   different countries?




Why bother?




Why mix content and users in the first place?

» more members = more content, more exchange
» maybe higher quality level
» community model may require a multi-country context (or
   benefit from one)
» Yes, we’re international!




Now what's so difficult about this?




« Localization
     is a technical problem
           and will be solved
           by my developers
                in no time. »
                              — a customer
                (localization not online yet)




Localization issues.

»   Technical
»   Content
»   Organizational
»   Cultural
»   Marketing




Issue: Pinpoint the user.

» Where does my user come from and what languages
   does he speak?
    –   browser language
    –   chosen domain
    –   IP address tables
    –   profiles
» If the guess is wrong, the user knows better and can change the setting.
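
The signals listed above can be combined in a fixed priority order. The following is a minimal sketch, assuming an explicit profile setting wins over the chosen domain, which in turn wins over the browser's Accept-Language header. `SUPPORTED`, `DOMAIN_LANGUAGE` and both function names are hypothetical; an IP address lookup is left out because it needs an external geolocation table.

```python
from typing import Optional

SUPPORTED = ("en", "de", "fr")                           # hypothetical site languages
DOMAIN_LANGUAGE = {"de": "de", "at": "de", "fr": "fr"}   # hypothetical TLD mapping


def parse_accept_language(header: str) -> list[str]:
    """Return primary language codes from an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        tag, _, params = part.strip().partition(";")
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip().lower().split("-")[0]))
    # list.sort is stable, so entries with equal weight keep header order
    weighted.sort(key=lambda item: item[0], reverse=True)
    return [lang for _, lang in weighted]


def pick_language(profile_lang: Optional[str], domain: str,
                  accept_language: str, default: str = "en") -> str:
    """Guess the interface language: profile first, then domain, then browser."""
    if profile_lang in SUPPORTED:
        return profile_lang
    tld = domain.rsplit(".", 1)[-1].lower()
    if tld in DOMAIN_LANGUAGE:
        return DOMAIN_LANGUAGE[tld]
    for lang in parse_accept_language(accept_language):
        if lang in SUPPORTED:
            return lang
    return default
```

Whatever the guess, the result should only be a default that the user can override.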




Issue: Encoding, character sets, codepages

» Special characters
   – iso-8859-1 = latin
        iso-8859-5 = cyrillic
        iso-8859-7 = greek
        …
» Solution (mostly): UTF-8
   – Now standard; generally not a
        problem anymore
   – Check software components
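
Why a single-byte codepage breaks down for mixed-language content can be shown in a few lines. This is a small sketch using only Python's built-in codecs; the helper name `fits` is hypothetical.

```python
def fits(text: str, codec: str) -> bool:
    """Report whether every character of text can be encoded in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


mixed = "Größe, señor, Привет"

# UTF-8 round-trips text that mixes Latin and Cyrillic characters.
assert mixed.encode("utf-8").decode("utf-8") == mixed

# Each ISO-8859 part only covers its own script.
print(fits("señor", "iso-8859-1"))    # Latin-1 contains ñ
print(fits("Привет", "iso-8859-1"))   # Latin-1 has no Cyrillic letters
print(fits("Привет", "iso-8859-5"))   # the Cyrillic part does
print(fits(mixed, "iso-8859-5"))      # but not ñ, ö or ß

# A component that decodes UTF-8 bytes as Latin-1 produces mojibake.
print("é".encode("utf-8").decode("iso-8859-1"))
```

The last line is the typical symptom of one component in the chain (database, template engine, HTTP header) not being configured for UTF-8.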




Issue: Content and structure

» User Generated Content, sure.
» Structure: e.g. tags




artist bad email gift
 handy schmuck fick
     pain   chef garage mist



Can we translate user generated content?

» Automatically
   – Translation services exist
   – e.g. Google Language API (translation, language
        detection)
   – Problems:
         – General quality
         – special vocabulary
         – fixed terms, technical and functional terms
         – UGC: You speak on behalf of the user!
              – bad translation reduces the author’s credibility




Can we translate user generated content? (2)

» Automatically
» Editorial
   – Lots of work!
   – Featured user articles, focus on „best rated“
» Community
   – Involve community
   – Reward advertising




Localization Is Not Just Translation

» Simpler aspects: Date/Time Formats, Timezones,
   Currencies, Temperature, Measures
» Work & research: Audio, video
» Culture: Meaning, customs and taste (e.g. image types,
   colors) vary in different countries
» Don't forget: legislative provisions, etc.
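
The "simpler aspects" are mostly a matter of looking up per-locale conventions. Here is a minimal sketch with a hand-written table; `DATE_PATTERNS`, `USES_FAHRENHEIT` and the function names are hypothetical, and a real site would more likely rely on a locale data set such as CLDR than on its own table.

```python
from datetime import date

# Hypothetical, hand-maintained conventions for three locales.
DATE_PATTERNS = {"de_DE": "%d.%m.%Y", "en_GB": "%d/%m/%Y", "en_US": "%m/%d/%Y"}
USES_FAHRENHEIT = {"en_US"}


def format_date(day: date, locale: str) -> str:
    """Format a date in the numeric style of the given locale."""
    return day.strftime(DATE_PATTERNS[locale])


def format_temperature(celsius: float, locale: str) -> str:
    """Show a temperature in the unit the locale expects."""
    if locale in USES_FAHRENHEIT:
        return f"{celsius * 9 / 5 + 32:.0f} °F"
    return f"{celsius:.0f} °C"


talk_day = date(2008, 10, 22)
print(format_date(talk_day, "de_DE"))   # 22.10.2008
print(format_date(talk_day, "en_US"))   # 10/22/2008
print(format_temperature(20, "en_US"))  # 68 °F
```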




Issue: Searching (Full Text Search)

» What language are the search terms in?
   – Use context (language, profile, country) or language
        detection
   – Spelling
» How to display the result?
   – Tokenizing, stemming, similarity/proximity
   – Result set: what to show? Mix languages?
   – Result structure: relevance!
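
One possible answer to "mix languages?" is to keep all hits but rank those in the user's languages higher. The sketch below assumes the search engine already delivers a relevance score per hit; the `Hit` type, the `rank` function and the boost factor are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    language: str
    score: float  # relevance as reported by the search engine (assumed scale)


def rank(hits: list[Hit], user_languages: set[str], boost: float = 1.5) -> list[Hit]:
    """Order hits by relevance, favouring languages the user speaks."""
    def weight(hit: Hit) -> float:
        return hit.score * (boost if hit.language in user_languages else 1.0)
    return sorted(hits, key=weight, reverse=True)


hits = [
    Hit("Gift ideas", "en", 0.80),
    Hit("Idées cadeaux", "fr", 0.75),
    Hit("Gift im Essen", "de", 0.70),
]
for hit in rank(hits, {"de"}):
    print(hit.language, hit.title)
```

With a boost of 1.5 the German hit moves to the top for a German-speaking user, while hits in other languages stay visible further down.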




More issues.

» Geocoding: different standards
» Administration: e.g. user support




Some Approaches.

Approach 1
„Web 1.0“




1. „Web1.0“.

» How it works.
   – Separate communities for every language and/or country.
» Advantages
   – No localization problems
» Disadvantages
   – No intl. web community at all
   – Redundancy
   – Each community needs its own critical mass of users




Approach 2
     „Laissez-faire“




2. Laissez-faire.

» How it works.
   – The user is free to use any language he likes.
   – The frontend is localized, thus giving the impression of a
        “web 1.0” site.
» Advantages
   – Easy to implement
   – Higher language quality, since everyone writes in their own language
» Disadvantages
   – Navigation, search and categorization become diffuse; result
        quality suffers




Approach 3
„Common Ground“




3. „Common Ground“

» How it works
   – Offer the community platform only in the “lingua franca”
» Advantages
   – Includes most of your users.
» Disadvantages
   – The most common language still leaves out large relevant
        groups (e.g. French speakers are also a large community)
   – Overall language quality is lower (many write in a foreign language)




Approach 4
„S.O.D.“




4. „S.O.D.“

» How it works.
   – Provider chooses the language to be used.
   – Frontend is available only in this language
» Advantages
   – Easy to implement
» Disadvantages
   – Foreign users are not encouraged to contribute content




Approach 5
„Babelfish“




5. „Babelfish“

» How it works.
   – Contributors use their own language to enter content
   – All content is presented in the user’s own language
» Advantages
   – Maximum coverage of the community
   – No need to learn Esperanto or Volapük! ;-)
» Disadvantages
   – Automatic translation necessary. Quality issues.
   – Cultural problems




What Others Do




How much localization
do you need?




http://der-mo.net/relationBrowser
How much localization do you need?

» Check against your community needs
   – e.g.: local.ch in Switzerland
   – covers all of Switzerland, but
        – CH has strict(-ish) regional language zones
        – services are used mostly locally




Controlled vocabulary




Prefer „controlled vocabulary“.

» Structured vs. unstructured data
» Avoid free text (where you can):
   – Text analysis is complicated and still unsatisfactory
» What are your „core assets“?
» What data can be made structured?
   – data that is language-neutral (e.g. Names)
   – Prefer pre-defined options to free text
         – naturally pre-defined (e.g. male/female)
         – definable by yourself (e.g. categorizations at ebay)
   – easily translatable and probably already available data
        (e.g. City names)
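
A controlled vocabulary boils down to storing a language-neutral key and translating only the label at display time. The following is a minimal sketch; the vocabulary, its translations and the function names are illustrative assumptions.

```python
# Hypothetical vocabulary: language-neutral keys mapped to display labels.
GENDER_LABELS = {
    "female": {"en": "female", "de": "weiblich", "fr": "féminin"},
    "male": {"en": "male", "de": "männlich", "fr": "masculin"},
}


def validate(key: str) -> str:
    """Accept only pre-defined options; free text is rejected."""
    if key not in GENDER_LABELS:
        raise ValueError(f"unknown option: {key!r}")
    return key


def label(key: str, lang: str, fallback: str = "en") -> str:
    """Translate a stored key into the viewer's language."""
    translations = GENDER_LABELS[key]
    return translations.get(lang, translations[fallback])


stored = validate("female")   # this key is what goes into the profile record
print(label(stored, "de"))    # weiblich
print(label(stored, "es"))    # no Spanish label yet, falls back to English
```

Because only the key is stored, a profile entered in one language can be shown in any other without touching user data.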




What data to show to the user?

» Be consistent.
» A good practice:
   – show all structured data in a translated form (e.g. in a
        user profile)
   – Let users enter the languages they speak in their profile (and
        show those)
   – offer to present other-language data
         – either in native language
         – or translated („Would you like to see…“)
         – shows awareness of multi-language problem




Gentle integration of other language/country
content
» Prioritize by relevance (once again)
» You may offer automatic translations…
   – … but ask before doing so…
   – … and clearly label an automatic translation.
   – Check quality. Allow user feedback.
» Decide, according to your community model, whether content
   from different countries can be mixed
   – also cultural issues
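
The "ask first, then label" rule can be sketched as a small display function. Everything here is an assumption for illustration: `present` is a hypothetical name, and `translate` stands for whatever translation service the site uses, passed in as a plain callable rather than a real API.

```python
from typing import Callable, Sequence


def present(text: str, source_lang: str, viewer_langs: Sequence[str],
            accepted_offer: bool,
            translate: Callable[[str, str, str], str]) -> str:
    """Show foreign-language content gently: offer first, label afterwards."""
    if not viewer_langs or source_lang in viewer_langs:
        return text  # the viewer can read the original
    if not accepted_offer:
        return f"{text}\n[Written in {source_lang}. Would you like to see a translation?]"
    translated = translate(text, source_lang, viewer_langs[0])
    return f"{translated}\n[Automatic translation from {source_lang}]"


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in for a real translation service."""
    return f"<{target}> {text}"


print(present("Guten Tag", "de", ["en"], False, fake_translate))
print(present("Guten Tag", "de", ["en"], True, fake_translate))
```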




Internationalization from the beginning


 » Every bit of content has an origin
        – For each tiny content element, save the language and
          region.
 » Some software systems offer multilanguage support
        – e.g. Drupal: user indicates his home country, the primary
          language, other languages he speaks
        – Automatic composition of regionalized pages
 » Use language-neutral page templating
        – Technical issue: Separate user interface and code, keep
          page templates language-neutral
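
Saving language and region with every content element can be as simple as two extra fields. The sketch below applies this to tags, where the same spelling can mean different things in different languages; the `Tag` type and `tag_counts` function are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    text: str
    language: str  # e.g. "de"
    region: str    # e.g. "CH"


def tag_counts(tags: list[Tag]) -> Counter:
    """Count tags per (text, language), so identical spellings stay separate."""
    return Counter((tag.text.lower(), tag.language) for tag in tags)


tags = [
    Tag("gift", "en", "US"),   # a present
    Tag("Gift", "de", "DE"),   # poison
    Tag("gift", "en", "GB"),
]
counts = tag_counts(tags)
print(counts[("gift", "en")], counts[("gift", "de")])
```

Without the language field, all three tags would collapse into a single entry in a tag cloud.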




Translation Files




   MSG_NEW_MAIL_ALERT = "Good morning, {USER_FIRST_NAME}.
   You have {NUM_NEW_MAILS} new mails."
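
A translation file like the one above maps a message key to a template with named placeholders. Here is a minimal sketch of the lookup, assuming one catalog per language held in a dictionary; `MESSAGES`, `render` and the German wording are assumptions, not taken from the talk.

```python
# Hypothetical in-memory catalogs; a real site would load these from files.
MESSAGES = {
    "en": {
        "MSG_NEW_MAIL_ALERT": "Good morning, {USER_FIRST_NAME}. "
                              "You have {NUM_NEW_MAILS} new mails.",
    },
    "de": {
        "MSG_NEW_MAIL_ALERT": "Guten Morgen, {USER_FIRST_NAME}. "
                              "Sie haben {NUM_NEW_MAILS} neue Nachrichten.",
    },
}


def render(key: str, lang: str, **values: object) -> str:
    """Look up a message in the user's language, falling back to English."""
    catalog = MESSAGES.get(lang, MESSAGES["en"])
    template = catalog.get(key, MESSAGES["en"][key])
    return template.format(**values)


print(render("MSG_NEW_MAIL_ALERT", "en", USER_FIRST_NAME="Andreas", NUM_NEW_MAILS=3))
print(render("MSG_NEW_MAIL_ALERT", "de", USER_FIRST_NAME="Andreas", NUM_NEW_MAILS=3))
```

Note that a single template cannot handle plural forms: with one new mail it would still say "1 new mails".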



Grammar and special formats




 {USER_NAME} {START_DATE} {END_DATE}
 {TXT_IS_ON_VACATION}




„Andreas Ravn from 18.10.2008 to
21.10.2008 is on vacation.“
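
The awkward sentence above comes from fixing the placeholder order in the page template and translating only the trailing text fragment. A sketch of one way around it: keep the whole sentence, including word order and date pattern, in the per-language catalog. The `VACATION` table, the function name and the German sentence are assumptions for illustration.

```python
from datetime import date

# Hypothetical catalog: full sentence plus date pattern per language.
VACATION = {
    "en": ("{USER_NAME} is on vacation from {START_DATE} to {END_DATE}.", "%m/%d/%Y"),
    "de": ("{USER_NAME} ist vom {START_DATE} bis {END_DATE} im Urlaub.", "%d.%m.%Y"),
}


def vacation_notice(name: str, start: date, end: date, lang: str) -> str:
    """Build the notice with the word order and date format of the language."""
    template, date_pattern = VACATION[lang]
    return template.format(
        USER_NAME=name,
        START_DATE=start.strftime(date_pattern),
        END_DATE=end.strftime(date_pattern),
    )


start, end = date(2008, 10, 18), date(2008, 10, 21)
print(vacation_notice("Andreas Ravn", start, end, "en"))
print(vacation_notice("Andreas Ravn", start, end, "de"))
```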




Cultural issues




    Welcome, Andreas!
   Welcome, Mr. Ravn!
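
Which of the two greetings is appropriate depends on the audience, so the level of formality is best treated as a per-locale setting rather than hard-coded. This is a minimal sketch; the `FORMAL_BY_LOCALE` values are purely illustrative assumptions, not a statement about any particular culture.

```python
# Hypothetical per-locale setting; the values are placeholders to be
# decided by whoever knows the target community.
FORMAL_BY_LOCALE = {"locale_a": True, "locale_b": False}


def greet(first_name: str, last_name: str, title: str, formal: bool) -> str:
    """Pick the greeting style from a formality flag."""
    if formal:
        return f"Welcome, {title} {last_name}!"
    return f"Welcome, {first_name}!"


for locale, formal in FORMAL_BY_LOCALE.items():
    print(locale, greet("Andreas", "Ravn", "Mr.", formal))
```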



So what should I do?




Thank you.                          Andreas Ravn

Danke.
Mahalo.
Merci villmah.                      namics (deutschland) gmbh
Ookini arigatou.                    andreas.ravn@namics.com
                                    http://www.namics.com
Néá'êshemeno.
Wokol a wala.
Dêkuji.
Bedankt.
Paljon kiitoksia.
Gracias.
Nagyon köszönöm.
Qujanarssuaq.
Rakux u kapamaxemaxes namen dimo.




