🐢 Mood: Delighted that I saw 5 turtles and 2 frogs at the local pond.

🎵 Soundtrack: Brazilian Lounge Playlist

📚 Reading Github Issues

Started today by reading the 25 new issues.

🇳🇴 🇩🇰Matching Preferred Against Supported Languages

Consider an application, such as a web site, with support for multiple languages in its user interface. When a user arrives with a list of preferred languages, the application must decide which language it should use in its presentation to the user. This requires finding the best match between the languages the application supports and those the user prefers. This post explains why this is a difficult decision and how Go can help. - “Language and Locale Matching in Go Blog Post”

From reading the x/text/language pkg docs I learned that each language has a tag and can have a local.

For example

  • en: English
  • en-US: American English
  • en-GB: British English
  • az: Azerbaijani
  • az-Latn: Azerbaijani written in Latin script
  • az-Arab: Azerbaijani written in Arabic script

The matcher has a bunch of complicated information based on each language and the script it is written in. You can compare two languages and see how close they are:

Language A Language B Confidence in Match
French French Exact
French Canadian French High
Simplified Chinese Traditional Chinese Low
Simplified Chinese English No

So the server has list of supported languages, a user reports which languages they speak, and then the matcher will give the best match:

package main

import (
	"fmt"

	"golang.org/x/text/language"
)

func main() {

	supportedLanguages := []language.Tag{
		language.English,
		language.BritishEnglish,
		language.French,
		language.Afrikaans,
		language.BrazilianPortuguese,
		language.EuropeanPortuguese,
		language.SimplifiedChinese,
	}

	languagesUserSpeaks := []language.Tag{
		language.Danish,
		language.MustParse("en-au"),
	}

	m := language.NewMatcher(supportedLanguages)

	lang, _, conf := m.Match(languagesUserSpeaks...)

	fmt.Printf("Chose %v with %v confidence", lang, conf)
}

// This outputs: Chose en-GB-u-rg-auzzzz with High confidence

But wtf is en-GB-u-rg-auzzzz?!

The comments in golang were very unhelpful. I finally found this unicode document that explains that the prefix -u-rg- and the suffix zzzz indicate a region override. This means that the language en-GB was chosen (because it knows that British English is the closest to Australian English). Then au is the region override, which is basically additional information so a server could alter the information regarding default currency, default calendar, default measurement system, etc.

During this investigation I also found this comment listing a bunch of unfixed bugs in the language package. Maybe something to come back to if I ever run out of things to do.

🐞 The bugs

So in researching this language stuff and reproducing the already reported bug I noticed that the example for Matcher was not running properly. I looked at the source code and noticed an odd TODO at the bottom.

When I ran the test locally as is, it passed. When I removed the TODO it failed. That TODO was put there 4 years ago, back before any region overrides were added to the matched languages.

🎉🎉 So I submitted my first two issues

  1. x/text/language: ExampleMatcher test is broken. This one I submitted a fix for.
  2. testing: false positives for examples with comments at end

📎 How are examples rendered?

I think there is a 3rd issue though. Not only was there a false positive when I ran go test, but ALSO the example website couldn’t render the test properly. It was missing the last }.

I think this was related to the TODO as well, but I can’t prove it, because I can’t change it and rerender the website (or I haven’t figured out how to yet).

I started to look into how the rendering is happening. I found this Examples function in the docs package that reads from a test file and then turns all the data into an ast.

Here is my example test file:

package main

import "fmt"

func ExampleThisTestShouldNotPass() {
	fmt.Println("meow")
	// Output:
	// bark

	// TODO
}

Here is my code that parses it:

package main

import (
	"fmt"
	"go/ast"
	"go/doc"
	"go/parser"
	"go/token"
	"log"
)

func main() {
	fset := token.NewFileSet()
	node, err := parser.ParseFile(fset, "examples_test.go", nil, parser.ParseComments)
	if err != nil {
		log.Fatal(err)
	}
	examples := doc.Examples(node)

	first := examples[0]

	fmt.Printf("Name: %v\n Doc: %v\n Code: %v\n Output: %v\n", first.Name, first.Doc, first.Code, first.Output)
	fmt.Printf("Play: %v\n Comments: %v\n\n", first.Play, first.Comments)

	ast.Print(fset, first.Code)
}

Here is the output:

$ go run main.go
Name: ThisTestShouldNotPass
 Doc:
 Code: &{65 [0xc000010130] 119}
 Output:
Play: <nil>
 Comments: [0xc00000c078 0xc00000c0a8]

     0  *ast.BlockStmt {
     1  .  Lbrace: examples_test.go:5:37
     2  .  List: []ast.Stmt (len = 1) {
     3  .  .  0: *ast.ExprStmt {
     4  .  .  .  X: *ast.CallExpr {
     5  .  .  .  .  Fun: *ast.SelectorExpr {
     6  .  .  .  .  .  X: *ast.Ident {
     7  .  .  .  .  .  .  NamePos: examples_test.go:6:2
     8  .  .  .  .  .  .  Name: "fmt"
     9  .  .  .  .  .  }
    10  .  .  .  .  .  Sel: *ast.Ident {
    11  .  .  .  .  .  .  NamePos: examples_test.go:6:6
    12  .  .  .  .  .  .  Name: "Println"
    13  .  .  .  .  .  }
    14  .  .  .  .  }
    15  .  .  .  .  Lparen: examples_test.go:6:13
    16  .  .  .  .  Args: []ast.Expr (len = 1) {
    17  .  .  .  .  .  0: *ast.BasicLit {
    18  .  .  .  .  .  .  ValuePos: examples_test.go:6:14
    19  .  .  .  .  .  .  Kind: STRING
    20  .  .  .  .  .  .  Value: "\"meow\""
    21  .  .  .  .  .  }
    22  .  .  .  .  }
    23  .  .  .  .  Ellipsis: -
    24  .  .  .  .  Rparen: examples_test.go:6:20
    25  .  .  .  }
    26  .  .  }
    27  .  }
    28  .  Rbrace: examples_test.go:11:1
    29  }

From my very untrained eye it looks like it is grabbing that last bracket. So this doesn’t explain why it wasn’t rendering correctly on the website. However it is not grabbing the output properly. Maybe that is the source of the second issue I opened.

🧐 What about all those TODOs from Friday?

I haven’t forgotten them! I swear!