var foo Type = getFoo()
var err error
var bar Type
bar, err = getBar(foo)
Isn’t the latter more explicit? Isn’t explicit better? It’ll be easier to review
because you’ll be able to see all the types.
Well, yes and no.
For one thing, between the name of the function you’re calling and the name of
the variable you’re assigning to, the type is obvious most of the time, at
least at the high level.
userID, err := store.FindUser(username)
Maybe you don’t know if userID is a UUID or some custom ID type, or even just a
numeric id… but you know what it represents, and the compiler will ensure you
don’t send it to some function that doesn’t take that type or call a method on
it that doesn’t exist.
In an editor, you’ll be able to hover over it to see the type. In a code
review, you may be able to as well, and even if not, you can rely on the
compiler to make sure it’s not being used inappropriately.
One big reason to prefer the inferred type version is that it makes refactoring
a lot easier.
If you write this code everywhere:
var foo FooType = getFoo()
doSomethingWithFoo(foo)
Now if you want to change getFoo to return a Foo2, and doSomethingWithFoo to
take a Foo2, you have to go change every place where these two functions are
called and update the explicitly declared variable type.
But if you used inference:
foo := getFoo()
doSomethingWithFoo(foo)
Now when you change both functions to use Foo2, no other code has to change. And
because it’s statically typed, we can know this is safe, because the compiler
will make sure we can’t use Foo2 inappropriately.
Does this code really care what type getFoo returns, or what type
doSomethingWithFoo takes? No, it just wants to pipe the output of one into the
other. If this shouldn’t work, the type system will stop it.
So, yes, please use the short variable declaration form. Heck, if you look at it
sideways, it even looks kinda like a gopher :=
How to “do” enums is a common problem in Go, given that it doesn’t have “real”
enums like other languages. There are basically two common ways to do it; the
first is just typed strings:
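Something like this, using the FlagName type the rest of the post refers to (the specific flag names are just examples):

type FlagName string

const (
	FooBar   FlagName = "foo-bar"
	FizzBuzz FlagName = "fizz-buzz"
)

func IsEnabled(name FlagName) bool {
	// look the flag up wherever flags are stored...
	return false
}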
The problem with this is that string literals (really, string constants) in Go
will get converted to the correct type, so you’d still be able to call
IsEnabled("foo-bar") without the compiler complaining.
A common replacement is to use numeric constants:
type FlagID int

const (
	FooBar FlagID = iota
	FizzBuzz
)

func IsEnabled(id FlagID) bool {
	// look the flag up wherever flags are stored...
	return false
}
This is nice, because it would be pretty odd to see code like IsEnabled(4).
But the problem then becomes that you can’t easily print out the name of the
enum in logs or errors.
To fix this, someone (Rob Pike?) wrote
stringer, which generates
code to print out the name of the flags… but then you have to remember to run
stringer, and it’s a bunch of (really) ugly code.
The solution to this was something I first heard suggested by Dave Cheney
(because of course it was), and is so simple and effective that I can’t believe
I had never thought of it before. Make FlagName into a very simple struct:
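Something like this:

type FlagName struct {
	name string
}

func (f FlagName) String() string {
	return f.name
}

// the flags are now variables instead of constants
var (
	FooBar   = FlagName{"foo-bar"}
	FizzBuzz = FlagName{"fizz-buzz"}
)

func IsEnabled(name FlagName) bool {
	// look the flag up wherever flags are stored...
	return false
}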
Now, you can’t call IsEnabled("nope"), because the constant string can’t be
converted into a struct, so the compiler would complain.
There’s no size difference between a string and a struct{ string } and it’s
just as easy to read as a straight string. Because of the String() method, you
can pass these values to %s etc in format strings and they’ll print out the
name with no extra code or work.
The one tiny drawback is that the globals have to be variables instead of
constants, but that’s one of those problems that really only exists in the
theoretical realm. I’ve never seen a bug caused by someone overwriting a global
variable like this that was intended to be immutable.
I’ll definitely be using this pattern in my projects going forward. I hope this
helps some folks who are looking to avoid typos and accidental bugs from
stringly typed code in Go.
Error wrapping in Go 1.13 solved a major problem gophers have struggled with since v1: how to add context to errors without obscuring the original error, so that code further up the stack can still programmatically inspect the original error. However, this did not – by itself – solve the other common problems with errors: implementation leakage and (more generally) error handling.
Fragile Error Handling
In 2016, Dave Cheney wrote a blog post that includes a section titled “Assert errors for behaviour, not type”. The gist of the section is that you don’t want code to depend on implementation-specific error types that are returned from a package’s API, because then, if the implementation ever changes, the error handling code will break. Even four and a half years later, and with 1.13’s new wrapping, this can still happen very easily.
For example, say you’re in an HTTP handler, far down the stack in your data layer. You’re trying to open a file and you get an os.ErrNotExist from os.Open. As of 1.13, you can add more context to that error without obscuring the fact that it’s an os.ErrNotExist. Cool, now the consumers of that code get a nicer error message, and if they want, they can check os.IsNotExist(err) and maybe return a 404 to the caller.
Right there, your web handler is now tied to the implementation details of how your backend, maybe 4 levels deep in the stack, stores data. If you decide to change your backend to store data in S3, and it starts returning s3.ObjectNotFound errors, your web handler won’t recognize that error, and won’t know to return 404. This is barely better than matching on the error string.
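Concretely, the handler ends up with a check along these lines (the storage call here is just for illustration):

// somewhere in the HTTP handler, far from the data layer
user, err := storage.LoadUser(id) // hypothetical data-layer call
if os.IsNotExist(err) {
	http.NotFound(w, r)
	return
}
// ... use user ...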
Dave’s Solution - Interfaces
Dave proposes creating errors that fulfill interfaces the code can check for, like this:
type notFound interface {
	NotFound() bool
}

// IsNotFound returns true if err indicates the resource doesn’t exist.
func IsNotFound(err error) bool {
	m, ok := err.(notFound)
	return ok && m.NotFound()
}
Cool, so now you can ensure a consistent API without relying on the implementation-specific type of the error. Callers just need to check for IsNotFound, which could be fulfilled by any type. The problem is, it’s missing a piece. How do you take that os.ErrNotExist and give it a NotFound() method? Well, it’s not super hard, but it is kind of annoying. You need to write this code:
// IsNotFound returns true if err indicates the resource doesn’t exist.
func IsNotFound(err error) bool {
	n, ok := err.(notFound)
	return ok && n.NotFound()
}

// MakeNotFound wraps err in an error that reports true from IsNotFound.
func MakeNotFound(err error) error {
	if err == nil {
		return nil
	}
	return notFoundErr{error: err}
}

type notFound interface {
	NotFound() bool
}

type notFoundErr struct {
	error
}

func (notFoundErr) NotFound() bool {
	return true
}

func (n notFoundErr) Unwrap() error {
	return n.error
}
So now we’re at 28 lines of code and two exported functions. Now what if you want the same for NotAuthorized or some other condition? 28 more lines and two more exported functions. Each just to add one boolean of information onto an error. And that’s the thing… this is purely used for flow control - all it needs to be is booleans.
A Better Way - Flags
At Mattel, we had been following Dave’s method for quite some time, and our errors.go file was growing large and unwieldy. I wanted to make a generic version that didn’t require so much boilerplate, but was still strongly typed, to avoid typos and differences of convention.
After thinking it over for a while, I realized it only took a slight modification of the above code to allow for the functions to take the flag they were looking for, instead of baking it into the name of the function and method. It’s of similar size and complexity to IsNotFound above, and can support expansion of the flags to check, with almost no additional work.
Here’s the code:
// ErrorFlag defines a list of flags you can set on errors.
type ErrorFlag int
const (
	NotFound ErrorFlag = iota + 1
	NotAuthorized
	// etc
)

// Flag wraps err with an error that will return true from HasFlag(err, flag).
func Flag(err error, flag ErrorFlag) error {
	if err == nil {
		return nil
	}
	return flagged{error: err, flag: flag}
}

// HasFlag reports if err has been flagged with the given flag.
func HasFlag(err error, flag ErrorFlag) bool {
	for {
		if f, ok := err.(flagged); ok && f.flag == flag {
			return true
		}
		if err = errors.Unwrap(err); err == nil {
			return false
		}
	}
}

type flagged struct {
	error
	flag ErrorFlag
}

func (f flagged) Unwrap() error {
	return f.error
}
To add a new flag, you add a single line to the list of ErrorFlags and you move on. There are only two exported functions, so the API surface is super easy to understand. It plays well with Go 1.13 error wrapping, so you can still get at the underlying error if you really need to (but you probably won’t and shouldn’t!).
Back to our example: the storage code can now keep its implementation private and flag errors from the backend by returning errors.Flag(err, errors.NotFound). Calling code can check for that with this:
if errors.HasFlag(err, errors.NotFound) {
	// handle not found
}
If the storage code changes what it’s doing and returns a different underlying error, it can still flag it with the NotFound flag, and the consuming code can go on its way without knowing or caring about the difference.
Supporting errors.Is and errors.As
This is an update in 2022: I realized that there’s an easier way to do this that properly supports errors.Is and errors.As. In Go 1.20, there will be an errors.Join function that lets you combine two errors into one, where either one will be found by errors.Is and errors.As. Until then, you can use github.com/natefinch/wrap. Then you can just define the flags as straight errors:
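Something like this (the package and flag names are just for illustration):

package flags

import "errors"

// Each flag is just a sentinel error value.
var (
	NotFound      = errors.New("not found")
	NotAuthorized = errors.New("not authorized")
)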
Then, as long as the package wraps its errors with those flags (using a package like Wrap or the upcoming errors.Join), you can check for the flag with the normal functions:
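For example, with the flags package sketched above:

if errors.Is(err, flags.NotFound) {
	// handle not found
}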
The nice thing is that you can still get the behavior of the original error because this is non-destructive wrapping. So if you need some low level detail of the underlying error, you can get it.
Indirect Coupling
Isn’t this just sentinel errors again? Well, yes, but that’s ok. In 2016, we didn’t have error wrapping, so anyone who wanted to add info to the error would obscure the original error, and then your check for err == os.ErrNotExist would fail. I believe that was the major impetus for Dave’s post. Error wrapping in Go 1.13 fixes that problem. The main problem left is tying error checks to a specific implementation, which this solves.
This solution does require both the producer and the consumer of the error to import the error flags package and use these same flags, however in most projects this is probably more of a benefit than a problem. The edges of the application code can easily check for low level errors and flag them appropriately, and then the rest of the stack can just check for flags. Mattel does this when returning errors from calling the database, for example. Keeping the flags in one spot ensures the producers and consumers agree on what flag names exist.
In theory, Dave’s proposal doesn’t require this coordination of importing the same package. However, in practice, you’d want to agree on the definition of IsNotFound, and the only way to do that with compile-time safety is to define it in a common package. This way you know no one’s going to go off and make their own IsMissing() interface that gets overlooked by your check for IsNotFound().
Choosing Flags
In my experience, there are a limited number of possible bits of data your code could care about coming back about an error. Remember, flags are only useful if you want to change the application’s behavior when you detect them. In practice, it’s not a big deal to just make a list of a handful of flags you need, and add more if you find something is missing. Chances are, you’ll think of more flags than you actually end up using in real code.
Conclusion
This solution has worked wonders for us, and really cleaned up our code of messy, leaky error handling code. Now our code that calls the database can parse those inscrutable postgres error codes right next to where they’re generated, flag the returned errors, and the http handlers way up the stack can happily just check for the NotFound flag, and return a 404 appropriately, without having to know anything about the database.
Do you do something similar? Do you have a totally different solution? I’d love to hear about it in the comments.
Pay remote workers the same as you’d pay local workers. Or vice versa if your
local workers are cheap.
That’s it. That’s the blog post.
It’s 2019, folks. Average home internet speeds are more than enough for video
conferencing and every single laptop has a built-in video camera. Conference
room video hardware has come way down in price and gone way up in quality.
Everyone collaborates via Slack and email and Jira and wikis and shared
documents in the cloud anyway. Our code is hosted in the cloud, ci/cd in the
cloud, deployed to the cloud. Why on earth would it matter where your desk is?
The truth is, it doesn’t matter. With extremely low effort, any company can hire
remote folks and have them be productive, collaborative members of a team. I
should know, I’ve done it for the last 8 years.
One thing that always comes up with remote employees is “how much should I pay
them?” I’m not exactly sure why this is even a question…. actually, yes, I am
sure. Because companies are cheap and want to pay employees as little as
possible. They are, after all, a business. So I guess the question is more
accurately asked “How can I justify paying my employees less while still getting
great talent?”
The answer is always the same - cost of living adjustments. The theory is that
you pay everyone equitably, so they all sustain the same standard of living.
i.e. you pay the person in San Francisco enough for rent and food and spending
money for a new XBox every month. You do the same for the person in rural Ohio -
rent, food, Xbox every month.
Just like a meritocracy, on its face, this sounds perfectly fair. But peek under
the surface, and it’s easily dismissed as false equivalence. Why is a SF
apartment four times the cost of the same apartment in rural Ohio? Because of
supply and demand. Because people believe the apartment in SF is worth more,
so they’re willing to pay more. Why do they believe that? Because the apartment
in SF is near awesome restaurants, easy public transportation, lots of great
similar-minded folks, etc. etc.
These are attributes of the apartment that don’t fit on a spreadsheet of square
footage, number of bedrooms, and lot size… but they have a huge effect on the
price of the home. Clearly, that is what you’re paying for when you buy a
$500k studio in SF.
So, if the house in SF is clearly more valuable than an equivalent-sized one in
rural Ohio… why should the company subsidize paying for those invisible
benefits that come with a house in SF? Would you pay someone more who lived in a
bigger house in the same city? Why not? Why is it ok for companies to subsidize
the location-based value of a home, but not the value derived from
square-footage or lot size?
To put a finer point on it… would you pay someone less who lives on the wrong
side of the tracks in the same city? That’s still location-based, isn’t it?
The thing is, the value of money isn’t actually different in SF and rural Ohio.
Buying an XBox from Amazon costs the same in both places. $3000 a month in rent
for a studio or $3000 a month in mortgage for a 4 bedroom house…. still costs
you $3000. If you live in SF, you’re saying that studio’s location is worth
$3000 a month to you. If you live in rural Ohio, you’re saying the extra
bedrooms and big backyard are worth $3000 a month to you.
…so why would you pay the person in Ohio less?
Someone on Twitter mentioned they understood paying people more who live in high
cost of living areas, but thought it would be weird to pay people less who live
in low cost of living areas…. but it’s really the exact same thing. You pay
the person in SF more, and you’re just paying everyone else less. You can’t have
it one way and not the other.
Does this mean you have to compete with Google’s salaries if your company is in
rural Ohio? Yes and no. It’s true that Google and the other big-five tech
companies pay people a lot more. But that’s true even in Silicon Valley. I’ve
interviewed at lots of SF companies that weren’t able to compete with those kind
of salaries either, but they still get to hire a lot of great talent. The big
five may have a lot of devs, but they can’t hire all the devs. And since
hiring is really hard, they don’t even get all the best devs. The big five
mostly pay a lot of money to keep the other four from poaching… i.e. they’re
really only competing with each other.
So, you might not have to compete with the Googles of the world, but you
probably do have to compete with the Salesforces, Stripes, and (previous to
acquisition) Githubs. While those companies generally pay more than some random
tech company, it’s not double or triple. It’s like 30% more. And honestly,
developers are worth that much. Basically every company in existence needs
developers, or needs to pay a service vendor for specialized software.
Hiring managers - the onus is on you to stop this predatory and unfair hiring
practice. Don’t accept it as “just the way things are”. Speak up against it.
Fight to get your remote developers the same salary and benefits your on-site
folks get. Their work is just as valuable to the company as the local folks,
paying them less is unfair, insulting, and wrong.
There’s a new Go proposal in town - try(). The gist is that it adds a builtin function try() that can wrap a call to a function that returns (a, b, c, …, error). If the error is non-nil, try returns from the enclosing function with that error, and if the error is nil, it returns the rest of the return values.
This is how it looks in code:
func doIt() (string, int, error) {
	return "Daisy", 45, io.EOF
}

func tryIt() error {
	name, age := try(doIt())
	// use name, age
	return nil
}
In the above, if doIt returns a non-nil error, tryIt will exit at the point where try is called, and will return that error.
Complications
So here’s my problem with this… it complicates the code. It adds points where your code can exit from inside the right-hand side of a statement. It can make it very easy to miss the fact that there’s an early exit statement in the code.
The above is simplistic; it could instead look like this:
func tryIt() error {
	fmt.Printf("Hi %s, happy %vth birthday!\n", try(doIt())
	// do other stuff
	return nil
}
At first blush, it would be very easy to read that code and think this function
always returns nil, and that would be wrong and it could be catastrophically
wrong.
The Old Way
In my opinion, the old way of writing the original code (below) is a lot more readable. The exit point is clearly called out by the return keyword as well as the indent. The intermediate variables make the print statement a lot more clear.
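Something like this:

func tryIt() error {
	name, age, err := doIt()
	if err != nil {
		return err
	}
	fmt.Printf("Hi %s, happy %vth birthday!\n", name, age)
	// do other stuff
	return nil
}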
Oh, and did you catch the mismatched parens on the Printf statement in the try() version of tryIt() above? Me neither the first time.
Early Returns
Writing Go code involves a LOT of returning early, more than any other popular language except maybe C or Rust. That’s the real meat of all those if err != nil statements… it’s not the if, it’s the return.
The reason early returns are so good is that once you pass that return block, you can ignore that case forever. The case where the file doesn’t exist? Past the line of os.Open’s error return, you can ignore it. It no longer exists as something you have to keep in your head.
However, with try, you now have to worry about both cases in the same line and keep that in your head. Order of operations can come into play: how much work are you actually doing before this try may kick you out of the function?
One idea per line
One of the things I have learned as a Go programmer is to eschew line density. I don’t want a whole ton of logic in one line of code. That makes it harder to understand and harder to debug. This is why I don’t care about the missing ternary operator or map and filter generics. All those do is let you jam more logic into a single line, and I don’t want that. That makes code hard to understand, and easier to misunderstand.
Try does exactly that, though. It encourages you to put a call getting data into a function that then uses that data. For simple cases, this is really nice, like field assignment:
p := Person{
	Name: try(getUserName()),
	Age:  try(getUserAge()),
}
But note how even here, we’re trying to split up the code into multiple lines, one assignment per line.
Would you ever write this code this way?
p := Person{Name: try(getUserName()), Age: try(getUserAge())}
You certainly can, but holy crap, that’s a dense line, and it takes me an order of magnitude longer to understand that line than it does the 4 lines above, even though they’re just differently formatted versions of the exact same code. But this is exactly what will be written if try becomes part of the language. Maybe not struct initialization, but what about struct initialization functions?
p := NewPerson(try(getUserName()), try(getUserAge()))
Nearly the same code. Still hard to read.
Nesting Functions
Nesting functions is bad for readability. I very rarely nest functions in my go code, and looking at other people’s go code, most other people also avoid it. Not only does try() force you to nest functions as its basic use case, but it then encourages you to use that nested function nested in some other function. So we’re going from NewPerson(name, age) to NewPerson(try(getUserName()), try(getUserAge())). And that’s a real tragedy of readability.
I have been working remotely for about 8 years now. I’ve worked at companies
that did it poorly, and companies that did it well. Let me define remote for a
minute. I mean fully remote. Like, I can count on one hand the number of times per
year I see my coworkers in person and have fingers left over.
I was the first remote employee in my division at Mattel, and I helped guide the
culture toward supporting remote employees. Mattel, for its part, has been very
supportive, and honestly did many things right without even really thinking about
them as supporting remote employees.
I have a lot of thoughts about remote work, and I’ll probably turn this into a
series of posts on the subject. For now, I’m going to start at the beginning -
hiring.
I’ve interviewed at over a dozen companies that support remote employees. How
the interviewing process goes tells me a lot about whether or not a company
really supports remote employees.
When I interviewed at Canonical, the whole interview was remote. I never saw
anyone in person until after I got my offer, and then it was just a run to the
nearest office to sign paperwork. I literally never met any of my coworkers in
person until our first offsite about three months in. And that’s totally ok.
When I interviewed at Mattel it was much the same, except I was brought on as a
contractor first, which allowed me to prove myself, and then they were happy to
just continue letting me do my thing as a full time employee 3000 miles away
from the rest of the team. I pushed my boss to hire more remote devs, and we now
have a team that is almost fully remote.
Many places I’ve interviewed want you to come onsite after some number of
interviews to “meet the team”. While this is ok, it tends to make me think those
places aren’t as fully bought into remote culture. There was no office to go
into at Canonical. Meeting the team was getting on a google hangout (and that’s
fine).
If you buy into remote culture, meeting someone in a video chat should be good
enough. After all, that’s how you’re going to interact with them 99% of the
time. Just as the whiteboard is a relic of another time, I believe the onsite
interview is a relic if the job is remote. (If the job is not remote, then I
think it’s pretty important to get a handle on how the person interacts in
person with other people… but that’s not what we’re talking about.)
I don’t code on a whiteboard at work, and I don’t code in a meeting room with
another developer at work either. And honestly, they almost never ask me to code
in that meeting room. It’s all talk and drawing architecture on a whiteboard.
Which, like, seriously, save yourself the plane ticket and hotel charge and just
let me do that over hangouts.
The problem with having me come on site is that it’s a 2 day thing. I have to
leave work early to catch a plane the night before, spend the night in a hotel,
get up mildly jetlagged, interview all day, then take a redeye home unless I
want to spend a second night in the hotel and get home at like 3 in the
afternoon the next day. If I did that for every job I interviewed for last time
I was looking, I would have had to take a full month off… it’s just not
scalable…. and it’s rough on my family.
Speaking of family, let’s talk about onboarding. Some companies will onboard you
remotely. This is great. Paperwork can be tricky, but it’s doable with a notary public
(that’s what I did for Mattel). Otherwise, going onsite for onboarding is
fine… you sign paperwork, get a company laptop, some swag, etc. All that
could be mailed out, but I get that paperwork can be tricky remote.
But that’s like… maybe 3 days if everything goes really slowly. Many places
want a week onsite for onboarding. Buh…. to do what? If there are significant
things I can only do while onsite, we’re probably going to have problems when I
go back to work at my house for months at a time. Also, aren’t many of my
coworkers remote, so won’t most of them not even be there? One place even
mentioned onboarding was two weeks onsite.
Two weeks is an eternity. I have young kids, and there is zero chance I’m going
anywhere onsite for two weeks. I work remote so I can be with my kids. I’m
sure a lot of more senior devs out there are in the same position. Making your
onboarding process long makes you much less desirable to anyone with a family.
Don’t draw it out any longer than necessary. As a mediocre white man who is used
to getting his way, I might feel comfortable asking for a reduced onsite, but I
bet many other developers who are not so privileged might not.
So, to sum up - if you really want to show you support remote developers,
instead of just saying you do, start with the interview. Make as much of your
interview process remote as possible, and then make your onboarding as painless
as possible. It’ll save you time, it’ll save your candidates time, it’ll save
the company money, and it’ll make everyone happier.
I was so happy when I discovered retool. It’s a Go tool that builds and caches Go binaries into a local directory so that your dev tools stay in sync across your team. It fixes all those problems where slight differences in binary versions produce different output and cause code churn. We use it at Mattel for our projects, because we tend to have a large number of external tools that we use for managing code generation, database migrations, release management, etc.
However, retool doesn’t work very well with modules; trying to run it with modules turned off sometimes misbehaves, and some tools just fail to compile that way.
So what to do? Well, it turns out that in the module world, retool can be replaced by a very small mage script:
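Something like this (a sketch; the target name is arbitrary, it assumes the tools list shown below, and it uses os, path/filepath, and github.com/magefile/mage/sh):

// Tools builds all the dev tools into the _tools directory, pinned to the
// versions listed in the tools slice.
func Tools() error {
	// make sure the _tools directory exists
	if err := os.MkdirAll("_tools", 0700); err != nil {
		return err
	}
	wd, err := os.Getwd()
	if err != nil {
		return err
	}
	// GOBIN controls where the go tool puts the binaries it builds
	env := map[string]string{"GOBIN": filepath.Join(wd, "_tools")}
	for _, t := range tools {
		if err := sh.RunWith(env, "go", "get", t); err != nil {
			return err
		}
	}
	return nil
}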
This code is pretty simple — it ensures the _tools directory exists (which is where retool puts its binaries as well, so I just reused that spot since our .gitignore already ignored it). Then it sets GOBIN to the _tools directory, so binaries built by the go tool will go there, and runs go get importpath@<tag|hash>. That’s it. The first time, it’ll take a while to download all the libraries it needs to build the binaries into the modules cache, but after that it’ll figure out it doesn’t need to do anything pretty quick.
Now just use the tool helper function below in your magefile to run the right versions of the binaries (and/or add _tools to your PATH if you use something like direnv).
// tool runs a command using a cached binary.
func tool(cmd string, args ...string) error {
	return sh.Run(filepath.Join("_tools", cmd), args...)
}
Now all the devs on your team will be using the same versions of their (go) dev tools, and you don’t even need a fancy third party tool to do it (aside from mage).
The list of tools then is just a simple slice of strings, thusly:
var tools = []string{
"github.com/jteeuwen/go-bindata/go-bindata@6025e8de665b31fa74ab1a66f2cddd8c0abf887e",
"github.com/golang/protobuf/protoc-gen-go@v1.3.1",
"gnorm.org/gnorm@v1.0.0",
"github.com/goreleaser/goreleaser@v0.106.0",
}
For most maintained libraries, you’ll get a nice semver release number in there, so it’s perfectly clear what you’re running (but for anything without tags, you can use a commit hash).
I’m really happy that this was as straightforward as I was hoping it would be, and it seems just as usable as retool for my use case.
func init() in Go is a weird beast. It’s the only function you can have
multiples of in the same package (yup, that’s right… give it a try). It
gets run when the package is imported. And you should never use it.
Why not? Well, there’s a few reasons. The main one is that init is only useful
for setting global state. I think it’s pretty well accepted that global state is
bad (because it’s hard to test and it makes concurrency dangerous). So, by
association init is bad, because that’s all it can do.
But wait, there’s more that makes it even worse. Init is run when a package is
imported, but when does a package get imported? If a imports b and b imports c
and b and c both have init functions, which one runs first? What if c has two
init functions in different files? You can find out, but it’s non-obvious and it
can change if you import code differently. Not knowing the order in which code
executes is bad. Normal go code executes top to bottom in a very clear and
obvious order. There’s good reason for that.
How do you test init functions? Trick question, you can’t. It’s not possible to
test the state of a package before init and then make sure the state after init
is correct. As soon as your test code runs, it imports the package and runs init
right away. Ok, maybe that’s not 100% true, you can probably do some hackery in
init to check if you’re running under go test and then not run the init logic…
but then your package isn’t set up the way it expects, and you’d have to write a
test specifically named to run first, to test init… and that’s just horrible
(and nobody does that, so it’s basically always untested code).
Ok, so there’s the reasons not to use it… now what do you do instead? If you
want state, use a struct. Instead of global variables on the package, use fields
on a struct. The package-level functions become methods, and the init function
becomes a constructor.
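For example, something like this (a sketch; the package and type names are just for illustration):

package fetch

import (
	"net/http"
	"time"
)

// Fetcher holds what used to be package-level state.
type Fetcher struct {
	client *http.Client
}

// NewFetcher replaces the init function: callers construct the state
// explicitly, so there are no globals and no hidden initialization order.
func NewFetcher() *Fetcher {
	return &Fetcher{
		client: &http.Client{Timeout: 10 * time.Second},
	}
}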
This fixes all the aforementioned problems. You get rid of global variables, so
if you have two different parts of your code using the same package, they don’t
stomp on each other’s settings etc. You can run tests without worrying that a
previous test modifies global state for a later test. It’s clear and obvious how
to test before and after a constructor gets called. And finally, there’s a clear
and normal order to the initialization of things. You don’t have to wonder what
gets called when, because it’s just normal go functions.
As a corollary… this means you shouldn’t use underscore imports either (since
they’re generally only useful for triggering init functions). These imports
(import _ "github.com/foo/db") are used for their side effects, like
registering sql/db drivers. The problem is that these are, by definition,
setting global variables, and those are bad, as we’ve said. So don’t use those
either.
Once you start writing code with structs instead of globals and init, you’ll
find your code is much easier to test, easier to use concurrently, and more
portable between applications. So, don’t use init.
…Axel Wagner mentioned on Twitter that this
looked too dogmatic, and he’s right. This is programming, there are infinite
possible programs, and thus there will always be exceptions to every rule. I
think it’s really rare that init is the right choice, and you should only come
to that decision after trying other options and ensuring you take into
consideration things like startup order, concurrent access, and testing.
Starlight wraps Google’s Go implementation of the starlark python
dialect (most notably found in the Bazel build tool).
Starlight makes it super easy for users to extend your application by writing simple python-like
scripts that interact seamlessly with your current Go code… with no boilerplate on your part.
What is Starlark?
Starlark is a subset of python that removes some of the more advanced features, but keeps the easy to read-and-write feel. For the purposes of this article, to avoid confusion between starlight (my package) and starlark (the language), I’ll be referring to the code as python (since starlark code is a subset of python code), but there are some small differences (described in the previous link).
Parser by Google
The parser and runner are maintained by Google’s Bazel team, which writes starlark-go. Starlight is
a wrapper on top of that, which makes it so much easier to use starlark-go. The problem with the
starlark-go API is that it is built more to be used for configuration, so it assumes you want to get
information out of starlark and into Go. It’s actually pretty difficult to get Go information into
a starlark script…. unless you use starlight.
Easy two-way interaction
Starlight has adapters that use reflection to automatically make any Go value usable in a starlark
script. Passing an *http.Request into a starlark script? Sure, you can do name =
r.URL.Query()["name"][0] in the python without any work on your part.
Starlight is built to just work the way you hope it’ll work. You can access any Go methods or
fields, basic types get converted back and forth seamlessly… and even though it uses reflection,
it’s not as slow as you’d think. A basic benchmark wrapping a couple values and running a starlark
script to work with them runs in a tiny fraction of a millisecond.
The great thing is that the changes made by the python code are reflected in your go objects,
just as if it had been written in Go. So, set a field on a pointer to a struct? Your go code will
see the change, no additional work needed.
100% Safe
The great thing about starlark and starlight is that the scripts are 100% safe to run. By default
they have no access to other parts of your project or system - they can’t write to disk or connect
to the internet. The only access they have to the outside is what you give them. Because of this,
it’s safe to run untrusted scripts (as long as you’re not giving them dangerous functions to run,
like os.RemoveAll). But at the same time, if you’re only running trusted scripts, you can give
them whatever you want (http.Get? Sure, why not?)
Example
Below is an example of a webserver that changes its output depending on the python script it runs. This is the full code, it’s not truncated for readability… this is all it takes.
First the go web server code. Super standard stuff, except a few lines to run starlight…
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/starlight-go/starlight"
)

func main() {
	http.HandleFunc("/", handle)
	port := ":8080"
	fmt.Printf("running web server on http://localhost%v?name=starlight&repeat=3\n", port)
	if err := http.ListenAndServe(port, nil); err != nil {
		log.Fatal(err)
	}
}

func handle(w http.ResponseWriter, r *http.Request) {
	fmt.Println("handling request", r.URL)
	// here we define the global variables and functions we're making available
	// to the script. These will define how the script can interact with our Go
	// code and the outside world.
	globals := map[string]interface{}{
		"r":       r,
		"w":       w,
		"Fprintf": fmt.Fprintf,
	}
	_, err := starlight.Eval("handle.star", globals, nil)
	if err != nil {
		fmt.Println(err)
	}
}
And the python handle.star:
# Globals are:
# w: the http.ResponseWriter for the request
# r: the *http.Request
# Fprintf: fmt.Fprintf

# for loops and if statements need to be in functions in starlark
def main():
    # Query returns a map[string][]string
    # this gets a value from a map, with a default if it doesn't exist
    # and then takes the first value in the list.
    repeat = r.URL.Query().get("repeat", ["1"])[0]
    name = r.URL.Query().get("name", ["starlight"])[0]
    for x in range(int(repeat)):
        Fprintf(w, "hello %s\n", name)
    # we can use pythonic truthy statements on the slices returned from the map to
    # check if they're empty.
    if not r.URL.Query().get("repeat") and not r.URL.Query().get("name"):
        w.Write("\nadd ?repeat=<int>&name=<string> to the URL to customize this output\n")
        w.Write("\ntry modifying the contents of handle.star and see what happens.\n")

main()
You can run this example by running go get github.com/starlight-go/starlight and using go run
main.go in the example folder.
You can then update the python and watch the changes the next time you hit the server. This just
uses starlight.Eval, which rereads and reparses the script every time.
Caching
In a production environment, you probably want to only read a script once and parse it once. You
can do that with starlight’s Cache. This cache takes a list of directories to look in for
scripts, which it will read and parse on-demand, and then store the parsed object in memory for
later use. It also uses a cache for any load() calls the scripts use to load scripts they depend
on.
Work Ongoing
Starlight is still a work in progress, so don’t expect the API to be perfectly stable quite yet.
But it’s getting pretty close, and there shouldn’t be any earth-shattering changes; definitely
pin your imports, though. Right now it’s more about finding corner cases where the starlight wrappers don’t
work quite like you’d expect, and supporting the last few things that aren’t implemented yet (like
channels).
A question came up at the Framingham Go meetup a while back about why something
like Gradle hasn’t taken hold in the Go community. I can’t say that I know for
sure what the answer is - I don’t speak for the community - but, I have some
guesses. I think part of it is that many projects don’t need a full-fledged
build tool - for your typical Go networked server or CLI tool, a single binary
built with go build is probably fine.
For more complex builds, which may require more steps than just compile and
link, like for bundling static assets in a web server or generating code from
protobufs, for example, many people in the Go community reach for Make.
Personally, I find that unfortunate. Makefiles are clearly pretty cool for a
number of reasons (built-in CLI, dependencies, file targets). However, Make is
not Windows friendly, and it has its own language and conventions that you need
to learn on top of the oddity that is Bash scripting. Finally, it doesn’t let
you leverage the Go community’s two greatest resources - Go programmers and Go
code.
Maybe go run? Maybe not
The above is the start of a blog post I’ve had half written for two years. I
started to go on to recommend using go run make.go with a go file that does
the build for you. But in practice, this is problematic. If you want your
script to be useful for doing more than one thing, you need to implement a CLI
and subcommands. This ends up being a significant amount of work that then
obscures what the actual code is doing… and no one wants to maintain yet
another CLI just for development tasks. In addition, there’s a lot of chaff you
have to handle, like printing out errors, setting up logging etc.
The Last Straw
Last summer there were a couple questions on
r/golang about best practices for using Makefiles
with Go… and I finally decided I’d had enough.
I looked around at what existed for alternatives -
rake was the obvious pattern to follow, being
very popular in the Ruby community. pyinvoke was the
closest equivalent I saw in python. Was there something similar in Go? Well,
sort of, but not exactly. go-task is
written in Go, but tasks are actually defined in YAML. Not my
cup of tea. Mark Bates wrote grift which
has tasks written in Go, but I didn’t really like the ergonomics… I wanted
just a little more magic.
I decided that I could write a tool that behaved pretty similarly to Make, but
allowed you to write Go instead of Bash, and didn’t need any special syntax if
I did a little code parsing and generation on the fly. Thus, Mage was born.
Mage is conceptually just like Make, except you write Go instead of Bash. Of
course, there’s a little more to it than that. In Mage, like in Make, you write
targets that can be accessed via a simple CLI. In Mage, exported functions
become targets. Any of these exported functions are then runnable by running
mage <func_name> in the directory where the magefile lives, just like you’d run
make <target_name> for a make target.
What is a Magefile?
A magefile is simply a .go file with the mage build tag in it. All you need for
a magefile is this:
//+build mage
package main
Mage looks for all go files in the current directory with the mage build tag,
and compiles them all together with a generated CLI.
There are a few nice properties that result from using a build tag to mark
magefiles - one is that you can use as many files as you like named whatever you
like. Just like in normal go code, the files all work together to create a
package.
Another really nice feature is that your magefiles can live side by side with
your regular go code. Mage only builds the files with the mage tag, and your
normal go build only builds the files without the mage tag.
Targets
A function in a magefile is a target if it is exported and has a signature of
func(), func()error, func(context.Context), or
func(context.Context)error. If the target has an error return and you return
an error, Mage will automatically print out the error to its own stderr, and
exit with a non-zero error code.
Doc comments on each target become CLI docs for the magefile, doc comments on
the package become top-level help docs.
//+build mage
// Mostly this is used for building the website and some dev tasks.
package main
// Builds the website. If needed, it will compact the js as well.
func Build() error {
	// do your stuff here
	return nil
}
Running mage with no arguments (or mage -l if you have a default target
declared) will print out help text for the magefiles in the current directory.
$ mage
Mostly this is used for building the website and some dev tasks.
Targets:
  build    Builds the website.
The first sentence is used as short help text; the rest is available via mage
-h <target>:
$ mage -h build
mage build:
Builds the website. If needed, it will compact the js as well.
This makes it very easy to add a new target to your magefile with proper
documentation so others know what it’s supposed to do.
You can declare a default target to run when you run mage without a target very
easily:
var Default = Build
And just like Make, you can run multiple targets from a single command… mage
build deploy clean will do the right thing.
Dependencies
One of the great things about Make is that it lets you set up a tree of
dependencies/prerequisites that must execute and succeed before the current
target runs. This is easily done in Mage as well. The
github.com/magefile/mage/mg library has a Deps function that takes a list of
dependencies, and runs them in parallel (and any dependencies they have), and
ensures that each dependency is run exactly once and succeeds before continuing.
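For example (a sketch, with the real work elided; it uses the mg library mentioned above):

func Build() error {
	mg.Deps(Generate, Protos)
	// build the binary, etc.
	return nil
}

func Generate() error {
	mg.Deps(Protos)
	// run code generation
	return nil
}

func Protos() error {
	// compile the protobufs
	return nil
}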
In this example, build depends on generate and protos, and generate depends on
protos as well. Running build will ensure that protos runs exactly once, before
generate, and generate will run before build continues. The functions sent to
Deps don’t have to be exported targets, but do have to match the same signature
as targets have (i.e. optional context arg, and optional error return).
Shell Helpers
Running commands via os/exec.Command is cumbersome if you want to capture
outputs and return nice errors. github.com/magefile/mage/sh has helper
methods that do all that for you. Instead of errors you get from exec.Command
(e.g. “command exited with code 1”), sh uses the stderr from the command as
the error text.
Combine this with the automatic error reporting of targets, and you easily get
helpful error messages from your CLI with minimal work:
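For example (the protoc invocation here is just an illustration):

func Protos() error {
	return sh.Run("protoc", "--go_out=.", "service.proto")
}

If protoc fails, the error mage prints is protoc’s own stderr output, which is usually exactly what you want to see.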
Another nice thing about the sh package is that if you run mage with -v to
turn on verbose mode, the sh package will print out the args of what commands
it runs. In addition, mage sets up the stdlib log package to default to
discard log messages, but if you run mage with -v, the default logger will
output to stderr. This makes it trivial to turn on and off verbose logging in
your magefiles.
How it Works
Mage parses your magefiles, generates a main function in a new file (which
contains code for a generated CLI), and then shoves a compiled binary off in a
corner of your hard drive. The first time it does this for a set of magefiles,
it takes about 600ms. Using the go tool’s ability to check if a binary needs to
be rebuilt or not, further runs of the magefile avoid the compilation overhead
and only take about 300ms to execute. Any changes to the magefiles or their
dependencies cause the cached binary to be rebuilt automatically, so you’re
always running the newest correct code.
Mage is built 100% with the standard library, so you don’t need to install a
package manager or anything other than go to build it (and there are binary
releases if you just want to curl it into CI).
Conclusion
I’ve been using Mage for all my personal projects for almost a year and for
several projects at Mattel for 6 months, and I’ve been extremely happy with it.
It’s easy to understand, the code is plain old Go code, and it has just enough
helpers for the kinds of things I generally need to get done, taking all the
peripheral annoyances out of my way and letting me focus on the logic that needs
to be right.
Give it a try, file some issues if you run into anything. Pull requests more
than welcome.
There’s a new error handling design proposed here. It’s…. not great.
Handle is a new keyword that basically defines a translation that can be applied
to errors returned from the current function:
func printSum(a, b string) error {
	handle err { return fmt.Errorf("error summing %v and %v: %v", a, b, err ) }
	x := check strconv.Atoi(a)
	y := check strconv.Atoi(b)
	fmt.Println("result:", x + y)
	return nil
}
Check applies the handler and returns from the enclosing function if the error passed into it is non-nil; otherwise it returns the non-error value.
Handle, in my opinion, is kind of useless. We can already do this today with functions, thusly:
func printSum(a, b string) (err error) {
	check := func(err error) error {
		return fmt.Errorf("error summing %v and %v: %v", a, b, err )
	}
	x, err := strconv.Atoi(a)
	if err != nil {
		return check(err)
	}
	y, err := strconv.Atoi(b)
	if err != nil {
		return check(err)
	}
	fmt.Println("result:", x + y)
	return nil
}
That does literally the same thing as check and handle above.
The stated reason for adding check and handle is that too many people just write
“return err” and don’t customize the error at all, which means somewhere at the
top of your program, you get this inscrutable error from deep in the bowels of
your code, and you have no idea what it actually means.
It’s trivial to write code that does most of what check and handle do… and no
one’s doing it today (or at least, not often). So why add this complexity?
Check and handle actually make error handling worse. With the check and handle
code, there’s no required “error handling scope” after the calls to add context
to the error, log it, clean up, etc. With the current code, I always have an
if statement that I can easily slot more lines into, in order to make the error
more useful and do other things on the error path. With check, that space in
the code doesn’t exist. There’s a barrier to making that code handle errors
better - now you have to remove check and swap in an if statement. Yes,
you can add a new handle section, but that applies globally to any further
error returns in the function, not just for this one specific error. Most of
the time I want to add information about one specific error case.
So, for example, in the code above, I would want a different error message for A
failing Atoi vs. B failing Atoi…. because in real code, which one is the
problem may not be obvious if the error message just says “either A or B is a
problem”.
Yes, if err != nil { constitutes a lot of Go code. That’s ok. That’s actually
good. Error handling is extremely important. Check and handle don’t make error
handling better. I suspect they’ll actually make it worse.
A refrain I often state about changes requested for Go is that most of them
just involve avoiding an if statement or a loop. This is one of them. That’s
not a good enough reason to change the language, in my opinion.
So, I don’t really like the contracts defined
here.
They seem complicated to understand, and duplicate a lot of what interfaces
already do, but in a much clunkier fashion.
I think we can do 90% of what the design given can do, with 20% of the added
complexity.
Most of my objection comes from two things:
First the syntax, which adds “type parameters” as yet another overloaded meaning
for stuff in parentheses (we already have: argument lists, return values,
function calls, type conversion, type assertion, and grouping for order of
operations).
Second, the implicit nature of how contracts are defined by a random block of
code that is sorta like go code, but not actually go code.
Syntax
This is a generic function as declared in the contracts code:
func Print(type T)(s []T) {
	for _, v := range s {
		fmt.Println(v)
	}
}
The (type T) here defines a type parameter. In this case it doesn’t tell us
anything about the type, so it’s effectively like interface{}, except that it
magically works with slices the way we all thought interfaces should work with
slices back in the day – i.e. you can pass any slice into this, not just
[]interface{}.
Are we now going to have func(type T)(input T)(output T){}? That’s crazy.
Also, I don’t like that the type parameters precede the arguments… isn’t the
whole reason that we have Go’s unusual ordering that we acknowledge
that the name is more important than the type?
Here’s my fix… since contracts are basically like interfaces, let’s actually
use interfaces. And let’s make the contracty part last, since it’s least
important:
func Print(s []interface{}:T) {
	for _, v := range s {
		fmt.Println(v)
	}
}
So here’s the change in a nutshell. You use a real interface to define the type
of the argument. In this case it’s interface{}. This cuts out the need to
define a contract separately when we already have a way of defining an abstract
type with capabilities. The : tells the compiler that this is a parameterized
type, and T is the name given that type (though it’s not used anywhere).
More Complex Types
added this section to help remove some confusion people had with the proposal
More complex functions with multiple contract types are just as easily done:
func Map(vals []interface{}:X, f func(x X) interface{}:Y) []Y {
	ret := make([]Y, len(vals))
	for i := range vals {
		ret[i] = f(vals[i])
	}
	return ret
}
:X defines a type in this scope which is constrained by the interface that
precedes it (in this case, there’s no constraint). :Y defines a separate type…
then inside the scope you can reference those types.
Contract Definitions as Code Are Hard
Specifying contracts via example code is going to age about as well as
specifying time formats via example output. -me on Twitter
The next example in the design is
contract stringer(x T) {
	var s string = x.String()
}

func Stringify(type T stringer)(s []T) (ret []string) {
	for _, v := range s {
		ret = append(ret, v.String())
	}
	return ret
}
Wait, so we have to redefine the Stringer interface? Why? Why not just use a
Stringer interface? Also, what happens if I screw up the code, like this?
contract stringer(x T) {
	s := x.String()
}
You think the error message from that is going to be good? I don’t.
Also, this allows an arbitrarily large amount of code in contract definitions.
Much of this code could easily imply restrictions that you don’t intend, or be
more general than you expect.
contract slicer(x T) {
	s := x[0]
}
Is that a map of int to something? Or is it a slice? Is that just invalid? What
would the error message say, if so? Would it change if I put a 1 in the index?
Or -1? Or “1”?
Notably… a lot of really smart gophers who have been programming in Go for
years have difficulty defining contracts that are conceptually simple, because
there is so much implied functionality in even simple types.
Take a contract that says you can accept a string or a []byte… what do you
think it would look like?
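Something like this, maybe (a rough guess at the shape of it):

contract stringOrByte(s T) {
	// a sketch; there are several ways you might try to write this
	_ = string(s)
	_ = s[0]
	_ = len(s)
}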
If you guessed this with even your second or third try…
…then I applaud you for being better at Go than I am. And there are still
questions about whether or not this would fail for len(s) == 0 (answer: it
won’t, because it’s just type checked, not actually run… but, see what I mean
about implications?) Also, I’m not even 100% sure this is sufficient to define
everything you need. It doesn’t seem to say that you can range over the type.
It doesn’t say that indexing the value will produce a single byte.
Lack of Names and Documentation
The biggest problem with contracts defined as random blocks of code is their
lack of documentation. As above, what exactly a bit of code means in a contract
is actually quite hard to distill when you’re talking about generic types. And
then how do you talk about it? If you have your function that takes your
locally defined stringOrByte, and someone else has theirs defined as robytes,
but the contents are the same (but maybe in a different order with different
type names)… how can you figure out if they’re compatible?
They may well be compatible, but it’s non-trivial to see that they are (and if
they weren’t, you’d probably have to rely on the compiler to tell you).
Imagine for a moment if there were no io.Reader or io.Writer interfaces. How
would you talk about functions that write to a slice of bytes? Would we all
write exactly the same interface? Probably not. Look at the lack of a Logging
interface, and how that affected logging across the ecosystem. io.Reader and
io.Writer make writing and reading streams of bytes so nice because they’re
standardized, because they are discoverable. The standardization means that
everyone who writes streams of bytes uses the exact same signature, so we can
compose readers and writers trivially, and discover new ways to compose them
just by looking for the terms io.Reader and io.Writer.
Just Use Interfaces, and Make Some New Built-in Ones
My solution is to mainly just use interfaces and tag them with :T to denote
they’re a parameterized type. For contracts that don’t distill to “has a
method”, make built-in contract/interfaces that can be well-documented and
well-known. Most of the examples I’ve seen of “But how would you do X?” boil
down to “You can’t, and moreover, you probably shouldn’t”.
A lot of this boils down to “I trust the stdlib authors to define a good set of
contracts and I don’t want every random coder to throw a bunch of code in a
contract block and expect me to be able to understand it”.
I think most of the useful contracts can be defined in a small finite list that
can live in a new stdlib package, maybe called ct to keep it brief.
ct.Comparable could mean x == x. ct.Stringish could mean “string or []byte or a
named version of either”… etc.
Most of the things that fall outside of this are things that I don’t think you
should be doing. Like, “How do you make a function that can compare two
different types with ==?” Uh… don’t, that’s a bad idea.
One of the uses in the contract design is a way to say that you can convert one
thing to another. This can be useful for generic functions on strings vs []byte
or int vs int64. This could be yet another specialized interface:
package ct

// Convertible defines a type that can be converted into T.
type Convertible:T contract

// elsewhere

func ParseUint64(v ct.Convertible:uint64) {
	i, err := strconv.ParseUint(uint64(v))
}
Conclusion
The contracts design, as written, IMO, will make the language significantly
worse. Wrapping my head around what a random contract actually means for my
code is just too hard if we’re using example code as the means of definition.
Sure, it’s a clever way to ensure that only types that can be used in that way
are viable… but clever isn’t good.
One of my favorite posts about Go is Rob Napier’s Go is a Shop-Built
Jig. In it, he argues that there
are many inelegant parts to the Go language, but that they exist to make the
whole work better for actual users. This is stuff like the built-in functions
append and copy, the fact that slices and maps are generic, but nothing else is.
Little pieces are filed off here, stapled on there, because making usage easy
matters more than looking slick.
This design of contracts as written does not feel like a shop-built jig. It
feels like a combination all-in-one machine that can do anything but is so
complicated that you don’t even know how to approach it or when you should
use it vs the other tools in your shop.
I think we can make a smaller, more incremental addition to the language that
will fix a lot of the problems that many people have with Go - lack of reusable
container types, copy and paste for simple map and filter functions, etc. This
will only add a small amount of complexity to the language, while solving real
problems that people experience.
Notably, I think a lot of the problems generics solve are actually quite minor
in the scheme of major projects. Yes, I have to rewrite a filter function for
every type. But that’s a function I could have written in college and I usually
only need one or two per 20,000 lines of code (and then almost always just
strings).
So… I really don’t want to add a bunch of complexity to solve these problems.
Let’s take the most straightforward fix we can get, with the least impact on the
language. Go has been an amazing success in the last decade. Let’s move slowly
so we don’t screw that up in the next decade.
There’s a disturbing thread that pops up every once in a while where People On
The Internet say that comments are bad and the only reason you need them is
because you and/or your code aren’t good enough. I’m here to say that’s bullshit.
Code Sucks
They’re not entirely wrong… your code isn’t good enough. Neither is mine or
anyone else’s. Code sucks. You know when it sucks the most? When you haven’t
touched it in 6 months. And you look back at the code and wonder “what in the
hell was the author thinking?” (and then you git blame and it’s you… because
it’s always you).
The premise of the anti-commenters is that the only reason you need comments is
because your code isn’t “clean” enough. If it were refactored better, named
better, written better, it wouldn’t need that comment.
But of course, what is clean and obvious and well-written to you, today, while
the entire project and problem space are fully loaded in your brain… might not
be obvious to you, six months from now, or to the poor schmuck that has to debug
your code with their manager breathing down their neck because the CTO just ran
into a critical bug in prod.
Learning to look at a piece of code that you understand and figure out how
someone else might fail to understand it is a difficult skill to master. But
it is incredibly valuable… nearly as important as the
ability to write good code in the first place. In industry, almost no one codes
alone. And even if you do code alone, you’re gonna forget why you wrote some
of your code, or what exactly this gnarly piece of late night “engineering” is
doing. And someday you’re going to leave, and the person they hire to replace
you is going to have to figure out every little quirk that was in your head at
the time.
So, throwing in comments that may seem overly obvious in the moment is not a bad
thing. Sometimes it can be a huge help.
Avoiding Comments Often Makes Your Code Worse
Some people claim that if you remove comments, it makes your code better,
because you have to make your code clearer to compensate. I call BS on this as
well, because I don’t think anyone is realistically writing sub-par code and
then excusing it by slapping a comment on it (aside from // TODO: this is a
temporary hack, I'll fix it later). We all write the best code we know how,
given the various external constraints (usually time).
The problem with refactoring your code to avoid needing comments is that
it often leads to worse code, not better. The canonical example is factoring
out a complicated line of code into a function with a descriptive name. Which
sounds good, except now you’ve introduced a context switch for the person reading
the code… instead of the actual line of code, they have a function call… they
have to scroll to where the function is defined, remember and map the arguments
from the call site to the function declaration, and then map the return value
back to the call site.
In addition, the clarity of a function’s name is only applicable to very trivial
comments. Any comment that is more than a couple words cannot (or should not)
be made into a function name. Thus, you end up with… a function with a
comment above it.
Indeed, even the existence of a very short function may cause confusion and more
complicated code. If I see such a function, I may search to see where else that
function is used. If it’s only used in one place, I then have to wonder if this
is actually a general piece of code that represents global logic… (e.g.
NameToUserID) or if this function is bespoke code that relies heavily on the
specific state and implementation of its call site and may well not do the right
thing elsewhere. By breaking it out into a function, you’re in essence exposing
this implementation detail to the rest of the codebase, and this is not a
decision that should be taken lightly. Even if you know that this is not
actually a function anyone else should call, someone else will call it at some
point, even where not appropriate.
The problems with small functions are better detailed in Cindy Sridharan’s medium post.
We could dive into long variable names vs. short, but I’ll stop and just
say that you can’t save yourself by making variable names longer. Unless your
variable name is the entire comment that you’re avoiding writing, then you’re
still losing information that could have been added to the comment. And I think
we can all agree that usernameStrippedOfSpacesWithDotCSVExtension is a terrible
variable name.
I’m not trying to say that you shouldn’t strive to make your code clear and
obvious. You definitely should. It’s the hallmark of a good developer. But
code clarity is orthogonal to the existence of comments. And good comments are
also the hallmark of a good developer.
There are no bad comments
The examples of bad comments often given in these discussions are trivially
bad, and almost never encountered in code written outside of a programming 101
class.
// instantiate an error
var err error
Yes, clearly, this is not a useful comment. But at the same time, it’s not
really harmful. It’s some noise that is easily ignored when browsing the
code. I would rather see a hundred of the above comments if it means the dev
leaves in one useful comment that saves me hours of head banging on keyboard.
I’m pretty sure I’ve never read any code and said “man, this code would be so
much easier to understand if it weren’t for all these comments.” It’s nearly
100% the opposite.
In fact, I’ll even call out some code that I think is egregious in its lack of
comments - the Go standard library. While the code may be very correct and well
structured… in many cases, if you don’t have a deep understanding of what the
code is doing before you look at it, it can be a challenge to understand
why it’s doing what it’s doing. A sprinkling of comments about what the logic
is doing and why would make a lot of the Go standard library a lot easier to
read. In this I am specifically talking about comments inside the
implementation, not doc comments on exported functions in general (those are
generally pretty good).
Any comment is better than no comment
Another chestnut the anti-commenters like to bring out is a bit of wisdom that
can be illustrated with a pithy image:
Ah, hilarious, someone updated the contents and didn’t update the comment.
But, that was a problem 20 years ago, when code reviews were not (generally) a
thing. But they are a thing now. And if checking that comments match the
implementation isn’t part of your code review process, then you should probably
review your code review process.
Which is not to say that mistakes can’t be made… in fact I filed a “comment
doesn’t match implementation” bug just yesterday. The saying goes something
like “no comment is better than an incorrect comment” which sounds obviously
true, except when you realize that if there is no comment, then devs will just
guess what the code does, and probably be wrong more often than a comment would
be wrong.
Even if this does happen, and the code has changed, you still have valuable
information about what the code used to do. Chances are, the code still does
basically the same thing, just slightly differently. In this world of
versioning and backwards compatibility, how often does the same function get
drastically changed in functionality while maintaining the same name and
signature? Probably not often.
Take the bug I filed yesterday… the place where we were using the function was
calling client.SetKeepAlive(60). The comment on SetKeepAlive was
“SetKeepAlive will set the amount of time (in seconds) that the client should
wait before sending a PING request”. Cool, right? Except I noticed that
SetKeepAlive takes a time.Duration. With no unit specified, the bare 60 becomes
a time.Duration of… 60 nanoseconds. Oops. Someone had updated the function to
take a time.Duration rather than an int. Interestingly, it did still round the
duration down to the nearest second, so the comment was not incorrect per se,
it was just misleading.
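To make the pitfall concrete, here’s a sketch (client and SetKeepAlive stand in
for the real code from that bug, so treat the names as illustrative):

// the untyped constant 60 silently becomes 60 * time.Nanosecond
client.SetKeepAlive(60)
// what the caller almost certainly meant
client.SetKeepAlive(60 * time.Second)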
Why?
The most important comments are the why comments. Why is the code doing what
it’s doing? Why must the ID be less than 24 characters? Why are we hiding this
option on Linux? etc. The reason these are important is that you can’t figure
out the why by looking at the code. They document lessons learned by the devs,
outside constraints imposed by the business, other systems, etc. These comments
are invaluable, and almost impossible to capture in other ways (e.g. function
names should document what the function does, not why).
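For example, a why comment might look something like this (the constraint here
is invented purely for illustration):

// The ID ends up in a legacy fixed-width export downstream, which silently
// truncates anything longer than 24 characters. (Hypothetical constraint,
// for illustration only.)
if len(id) > 24 {
    return fmt.Errorf("id %q is too long: must be 24 characters or fewer", id)
}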
Comments that document what the code is doing are less useful, because you can
generally always figure out what the code is doing, given enough time and
effort. The code tells you what it is doing, by definition. Which is not to
say that you should never write what comments. Definitely strive to write the
clearest code you can, but comments are free, so if you think someone might
misunderstand some code or otherwise have difficulty knowing what’s going on,
throw in a comment. At least, it may save them a half hour of puzzling through
your code, at best it may save them from changing it or using it in incorrect
ways that cause bugs.
Tests
Some people think that tests serve as documentation for functions. And, in a
way, this is true. But they’re generally very low on my list of effective
documentation. Why? Well, because they have to be incredibly precise, and thus
they are verbose, and cover a narrow strip of functionality. Every test tests
exactly one specific input and one specific output. For anything other than the
most simple function, you probably need a bunch of code to set up the inputs and
construct the outputs.
For much of programming, it’s easier to describe briefly what a function does
than to write code to test what it does. Often times my tests will be multiple
times as many lines of code as the function itself… whereas the doc comment on
it may only be a few sentences.
In addition, tests only explain the what of a function. What is it supposed to
do? They don’t explain why, and why is often more important, as stated above.
You should definitely test your code, and tests can be useful in figuring out
the expected behavior of code in some edge cases… but if I have to read tests
to understand your code in general, then that’s red flag that you really need to
write more/better comments.
Conclusion
I feel like the line between what’s a useful comment and what’s not is difficult
to find (outside of trivial examples), so I’d rather people err on the
side of writing too many comments. You never know who may be reading your code
next, so do them the favor you wish was done for you… write a bunch of
comments. Keep writing comments until it feels like too many, then write a few
more. That’s probably about the right amount.
If you tell the truth, you don’t have to remember anything.
—Mark Twain
In a code review recently, I asked the author to change some of their asserts to
requires. Functions in testify’s assert package allow the test to continue,
whereas those in the require package end the test immediately. Thus, you use
require to avoid trying to continue running a test when we know it’ll be in a
bad state. (side note: don’t use an assert package, but that’s another post)
Since testify’s assert and require packages have the same interface, the
author’s solution was to simply change the import thusly:
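The change looked roughly like this, an aliased import that makes the require
package answer to the name assert (a sketch of what was in the review, not the
exact diff):

import (
    // every existing assert.Foo call now ends the test immediately
    assert "github.com/stretchr/testify/require"
)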
Bam, now all the assert.Foo calls would stop the test immediately, and we didn’t
need a big changelist changing every use of assert to require. All good,
right?
No.
Hell No.
Why? Because it makes the code lie. Anyone familiar with the testify package
understands the difference between assert and require. But we’ve now made code
that looks like an assert, but is actually a require. People who are 200
lines down in a test file may well not realize that those asserts are actually
requires. They’ll assume the test function will continue processing after an
assert fails. They’ll be wrong, and they could accidentally write incorrect
tests because of it - tests that fail with confusing error messages.
This is true in general - code must never lie. This is a cardinal sin
amongst programmers. This is an extension of the mantra that code should be
written to be read. If code looks like it’s doing one thing when it’s actually
doing something else, someone down the road will read that code and
misunderstand it, and use it or alter it in a way that causes bugs. If they’re
lucky, the bugs will be immediate and obvious. If they’re unlucky, they’ll be
subtle and only be figured out after a long debugging session and much head
banging on keyboard. That someone might be you, even if it was your code in
the first place.
If, for some reason, you have to make code that lies (to fulfill an interface or
some such), document the hell out of it. Giant yelling comments that can’t be
missed during a 2am debugging session. Because chances are, that’s when you’re
going to look at this code next, and you might forget that saveToMemory()
function actually saves to a database in AWS’s Antarctica region.
So, don’t lie. Furthermore, try not to even mislead. Humans make assumptions
all the time, it’s built into how we perceive the world. As a coder, it’s your
job to anticipate what assumptions a reader may have, and ensure that they are
not incorrect, or if they are, do your best to disabuse them of their incorrect
assumptions.
If possible, don’t resort to comments to inform the reader, but instead,
structure the code itself in such a way as to indicate it’s not going to behave
the way one might expect. For example, if your type has a Write(b []byte)
(int, error) method that is not compatible with io.Writer, consider calling it
something other than Write… because everyone seeing foo.Write is going to
assume that method works like an io.Writer. Instead, maybe call it WriteOut
or PrintOut or anything but Write.
Misleading code can be even more subtle than this. In a recent code review, the
author wrapped a single DB update in a transaction. This set off
alarm bells for me as a reviewer. As a reader, I assumed that the code must be
saving related data in multiple tables, and that’s why a transaction was needed.
Turned out, the code didn’t actually need the transaction, it was just written
that way to be consistent with some other code we had. Unfortunately, in this
case, being consistent was actually confusing… because it caused the reader to
make assumptions that were ultimately incorrect.
Do the poor sap that has to maintain your code 6 months or two years down the
road a favor - don’t lie. Try not to mislead. Because even if that poor sap
isn’t you, they still don’t deserve the 2am headache you’ll likely be
inflicting.
A question came up at Gophercon about using functions as arguments, and what to
do when you have a function that you want to use that doesn’t quite match the
signature. Here’s an example:
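Something like this (a sketch of the shapes the rest of this post assumes; the
exact signatures are a guess, not the original code):

// Hypothetical reconstruction.
type Translator func(s string) string

// RunTwice applies the Translator twice to an input string.
func RunTwice(t Translator, input string) string {
    return t(t(input))
}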
Now, what if you want to use RunTwice with a function that needs more inputs
than just a string?
func Append(orig, suffix string) string {
    return orig + suffix
}

func do() {
    orig := "awesome"
    bang := "!"
    s := RunTwice(Append(orig, )) // wait, that won't work
    fmt.Println(s)
}
The answer is the magic of closures. Closures are anonymous functions that
“close over” or save copies of all local variables so they can be used later.
You can write a closure that captures the bang, and returns a function that’ll
have the Translator signature.
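Something like this (a sketch, using the signatures assumed above):

func do() {
    orig := "awesome"
    bang := "!"
    // the anonymous function closes over bang and matches the Translator signature
    addBang := func(s string) string {
        return Append(s, bang)
    }
    s := RunTwice(addBang, orig)
    fmt.Println(s) // awesome!!
}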
Yay, that works. But it’s not reusable outside the do function. That may be
fine, it may not. If you want to do it in a reusable way (say, a lot of people
may want to adapt Append into a Translator), you can make a dedicated
function for it like this:
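A sketch of that adapter (again assuming the Translator shape above):

func AppendTranslator(suffix string) Translator {
    // the returned closure captures suffix and satisfies Translator
    return func(s string) string {
        return Append(s, suffix)
    }
}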
In AppendTranslator, we return a closure that captures the suffix, and returns a
function that, when called, will append that suffix to the string passed to the
Translator.
And now you can use AppendTranslator with RunTwice.
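For example (still using the assumed RunTwice signature):

s := RunTwice(AppendTranslator("!"), "awesome")
fmt.Println(s) // awesome!!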
January 31st 2017 was my last day at Canonical, after working for 3.5 years on
what is one of the largest open source projects written in Go -
Juju.
As of this writing, the main repo for Juju, http://github.com/juju/juju, is 3542
files, with 540,000 lines of Go code (not included in that number is 65,000
lines of comments). Counting all dependencies except the standard library, Juju
is 9523 files, holding 1,963,000 lines of Go code (not including comments, which
clock in at 331,000 lines).
These are a few of my lessons learned from my roughly 7000 hours working on this
project.
Notably, not everyone on the Juju team would agree with all of these, and the
codebase was so huge that you could work for a year and not see 2/3rds of the
codebase. So take the following with a grain of salt.
About Juju
Juju is a service orchestration tool, akin to Nomad, Kubernetes, and similar
tools. Juju consists (for the most part) of exactly two binaries: a client and
a server. The server can run in a few different modes (it used to be multiple
binaries, but they were 99% the same code, so it was easier to just make one
binary that can be shipped around). The server runs on a machine in the cloud
of your choice, and copies of the binary are installed on new machines in the
cloud so they can be controlled by the central server. The client and the
auxiliary machines talk to the main server via RPC over websockets.
Juju is a monolith. There are no microservices, everything runs in a single
binary. This actually works fairly well, since Go is so highly concurrent,
there’s no need to worry about any one goroutine blocking anything else. It
makes it convenient to have everything in the same process. You avoid
serialization and other interprocess communication overhead. It does lend
itself to making code more interdependent, and separation of concerns was not
always the highest priority. However, in the end, I think it was much easier to
develop and test a monolith than it would have been if it were a bunch of
smaller services, and proper layering of code and encapsulation can help a lot
with spaghetti code.
Package Management
Juju did not use vendoring. I think we should have, but the project was started
before any of the major vendoring tools were out there, and switching never felt
like it was worth the investment of time. Now, we did use Roger Peppe’s
godeps (not the same as godep btw) to pin
revisions. The problem is that it messes with other repos in your GOPATH,
setting them to a specific commit hash, so if you ever go to build something
else that doesn’t use vendoring, you’d be building from a non-master branch.
However, the revision pinning gave us repeatable builds (so long as no one did
anything truly heinous to their repo), and it was basically a non-issue except
that the file that holds the commit hashes was continually a point of merge
conflicts. Since it changed so often, by so many developers, it was bound to
happen that two people change the same or adjacent lines in the file. It became
such a problem I started working on an automatic resolution tool (since godeps
holds the commit date of the hash you’re pinning, you could almost always just
pick the newer hash). This is still a problem with glide and any similar tool
that stores dependency hashes in a single file. I’m not entirely sure how to
fix it.
Overall, I never felt that package management was a huge issue. It was a minor
thing in our day to day work… which is why I always thought it was weird to
read all the stories about people rejecting Go because of lack of package
management solutions. Because most third party repos maintained stable APIs for
the same repo, and we could pin our code to use a specific commit… it just was
not an issue.
Project Organization
Juju is about 80% monorepo (at github.com/juju/juju), with about 20% of the
code in separate repos (under github.com/juju). The monorepo approach has pros
and cons… It is easy to do sweeping changes across the codebase, but it also means
that it doesn’t feel like you need to maintain a stable API in
foo/bar/baz/bat/alt/special … so we didn’t. And that means that it would be
essentially insane for anyone to actually import any package from under the main
monorepo and expect it to continue to exist in any meaningful way at any future
date. Vendoring would save you, but if you ever needed to update, good luck.
The monorepo also meant that we were less careful about APIs, less careful about
separation of concerns, and the code was more interdependent than it possibly
could have been. Not to say we were careless, but I feel like things outside
the main Juju repo were held to a higher standard as far as separation of
concerns and the quality and stability of the APIs. Certainly the documentation
for external repos was better, and that might be enough of a determining factor by
itself.
The problem with external repos was package management and keeping changes
synchronized across repos. If you updated an external repo, you needed to then
check in changes to the monorepo to take advantage of that. Of course, there’s
no way to make that atomic across two github repos. And sometimes the change to
the monorepo would get blocked by code reviews or failing tests or whatever,
then you have potentially incompatible changes sitting in an external repo,
ready to trip up anyone who might decide to make their own changes to the
external repo.
The one thing I will say is that utils repos are nefarious. Many times we’d want to
backport a fix in some subpackage of our utils repo to an earlier version of
Juju, only to realize that many many other unrelated changes get pulled along
with that fix, because we have so much stuff in the same repo. Thus we’d have
to do some heinous branching and cherry picking and copypasta, and it’s bad and don’t do it.
Just say no to utils packages and repos.
Overall Simplicity
Go’s simplicity was definitely a major factor in the success of the Juju
project. Only about one third of the developers we hired had worked with Go
before. The rest were brand new. After a week, most were perfectly proficient.
The size and complexity of the product were a much bigger problem for developers
than the language itself. There were still some times when the more experienced
Go developers on the team would get questions about the best way to do X in Go,
but it was fairly rare. Contrast this to my previous job, working in C#, where I
was constantly explaining different parts of the language, or why something works
one way and not another.
This was a boon to the project in that we could hire good developers in general,
not just those who had experience in the language. And it meant that the
language was never a barrier to jumping into a new part of the code. Juju was
huge enough that no one person could know the fine details of the whole thing.
But just about anyone could jump into a part of the code and figure out what 100
or so lines of code surrounding a bug were supposed to do, and how they were
doing it (more or less). Most of the problems with learning a new part of the
code were the same as it would have been in any language - what is the architecture, how
is information passed around, what are the expectations.
Because Go has so little magic, I think this was easier than it would have
been in other languages. You don’t have the magic that other languages have
that can make seemingly simple lines of code have unexpected functionality. You
never have to ask “how does this work?”, because it’s just plain old Go code.
Which is not to say that there isn’t still a lot of complex code with a lot of
cognitive overhead and hidden expectations and preconditions… but it’s at
least not intentionally hidden behind language features that obscure the basic
workings of the code.
Testing
Test Suites
In Juju we used Gustavo Niemeyer’s gocheck to run
our tests. Gocheck’s test suite style encouraged full stack testing by reducing
the developer overhead for spinning up a full Juju server and mongo database
before each test. Once that code was written, as huge as it was, you could just
embed that “base suite” in your test suite struct, and it would automatically do
all the dirty work for you. This meant that our unit tests took almost 20
minutes to run even on a high end laptop, because they were doing so much for
each test. It also made them brittle (because they were running so much code)
and hard to understand and debug. To understand why a test was passing or
failing, you had to understand all the code that ran before the open brace of
your test function, and because it was easy to embed a suite within a suite,
there was often a LOT that ran before that open brace.
In the future, I would stick with the standard library for testing instead. I
like the fact that tests with the standard library are written just like normal
Go code, and I like how explicit the dependencies have to be. If you want to run
code at the beginning of your test, you can just put a method there… but you
have to put a method there.
time in a bottle
The time package is the bane of tests and testable code. If you have code that
times out after 30 seconds, how do you test it? Do you make a test that takes
30 seconds to run? Do the rest of the tests take 30 seconds to run if something
goes wrong? This isn’t just related to time.Sleep but time.After or
time.Ticker…. it’s all a disaster during tests. And not to mention that test
code (especially when run under -race) can go a lot slower than your code does
in production.
The cure is to mock out time… which of course is non-trivial because the time
package is just a bunch of top level functions. So everywhere that was using
the time package now needs to take your special clock interface that wraps time
and then for tests you pass in a fake clock that you can control. It took us
a long time to pull the trigger on this, and longer still to propagate the changes
throughout our code. For a long time it was a constant source of flakey tests.
Tests that would pass most of the time, but if the CI machine were slow that
day, some random test would fail. And when you have hundreds of thousands of
lines of tests, chances are SOMETHING is going to fail, and chances are it’s not
the same thing as what failed last time. Fixing flakey tests was a constant
game of whack-a-mole.
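Here’s a minimal sketch of the wrap-the-clock idea; the interface below is
illustrative, not Juju’s actual clock package:

package waiter

import (
    "errors"
    "time"
)

// Clock wraps the parts of the time package the code needs, so tests can
// substitute a controllable fake.
type Clock interface {
    Now() time.Time
    After(d time.Duration) <-chan time.Time
}

// realClock satisfies Clock by delegating to the real time package.
type realClock struct{}

func (realClock) Now() time.Time                         { return time.Now() }
func (realClock) After(d time.Duration) <-chan time.Time { return time.After(d) }

// WaitForResult takes a Clock rather than calling time.After directly, so a
// test can use a fake whose After channel it controls.
func WaitForResult(c Clock, results <-chan string) (string, error) {
    select {
    case r := <-results:
        return r, nil
    case <-c.After(30 * time.Second):
        return "", errors.New("timed out waiting for result")
    }
}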
Cross Compilation Bliss
I don’t have the exact number of combinations, but the Juju server was built to
run on Windows and Linux (Centos and Ubuntu), and across many more
architectures than just amd64, including some wacky ones like ppc64le, arm64,
and s390x.
In the beginning, Juju used gccgo for builds that the gc compiler did not
support. This was a source of a few bugs in Juju, where gccgo did something
subtly wacky. When gc was updated to support all architectures, we were very
happy to leave the extra compiler by the wayside and be able to work with just
gc.
Once we switched to gc, there were basically zero architecture-specific bugs.
This is pretty awesome, given the breadth of architectures Juju supported, and
the fact that usually the people using the wackier ones were big companies that
had a lot of leverage with Canonical.
Multi-OS Mistakes
In the beginning when we were ramping up Windows support, there were a few OS
specific bugs (we all developed on Ubuntu, and so Windows bugs often didn’t get
caught until CI ran). They basically boiled down to two common mistakes related
to filesystems.
The first was assuming forward slashes for paths in tests. So, for example, if
you know that a config file should be in the “juju” subfolder and called
“config.yml”, then your test might check that the file’s path is folder +
“/juju/config.yml” - except that on Windows it would be folder +
“\juju\config.yml”.
When making a new path, even in tests, use filepath.Join, not path.Join, and
definitely not concatenated strings and slashes. filepath.Join will do the
right thing with separators for the OS. For comparing paths, use
filepath.ToSlash to convert a path to a canonical forward-slash string that you
can then compare against.
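For example (a sketch, using the config file from above):

// builds folder/juju/config.yml on Linux and folder\juju\config.yml on Windows
cfgPath := filepath.Join(folder, "juju", "config.yml")

// when a test needs to compare against a hard-coded forward-slash path,
// normalize first
got := filepath.ToSlash(cfgPath)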
The other common mistake was for Linux developers to assume you can delete/move
a file while it’s open. This doesn’t work on Windows, because Windows locks the
file while it’s open. This often came in the form of a deferred delete (e.g.
defer os.Remove(path)) registered after the deferred file.Close(); since defers
run LIFO, the delete would fire first and try to delete the file while it was
still open. Oops. One fix is to just
always call file.Close() before doing a move or delete. Note that you can call
Close multiple times on a file, so this is safe to do even if you also have a
defer file.Close() that’ll fire at the end of the function.
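Something like this (a sketch, with error handling mostly elided):

f, err := os.Create(path)
if err != nil {
    return err
}
// safety net; calling Close again later just returns an error we can ignore
defer f.Close()

// ... write to f ...

// explicitly close before removing, so Windows releases its lock on the file
f.Close()
return os.Remove(path)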
None of these were difficult bugs, and I credit the strong cross platform
support of the stdlib for making it so easy to write cross platform code.
Error Handling
Go’s error handling has definitely been a boon to the stability of Juju. The
fact that you can tell where any specific function may fail makes it a lot
easier to write code that expects to fail and does so gracefully.
For a long time, Juju just used the standard errors package from the stdlib.
However, we felt like we really wanted more context to better trace the path of
the code that caused the error, and we thought it would be nice to keep more
detail about an error while being able to add context to it (for example,
wrapping with fmt.Errorf loses the information from the original error, like
whether it would satisfy os.IsNotExist).
A couple years ago we went about designing an errors package to capture more
context without losing the original error information. After a lot of
bikeshedding and back and forth, we consolidated our ideas in
https://github.com/juju/errors. It’s not a perfect library, and it has grown
bloated with functions over the years, but it was a good start.
The main problem is that it requires you to always call errors.Trace(err) when
returning an error to grab the current file and line number to produce a
stack-trace-like thing. These days I would choose Dave Cheney’s
github.com/pkg/errors, which grabs a stack
trace at creation time and avoids all the manual tracing. To be honest, I haven’t found
stack traces in errors to be super useful. In practice, unforeseen errors still
have enough context just from fmt.Errorf(“while doing foo: %v”, err) that you
don’t really need a stack trace most of the time. Being able to investigate
properties of the original error can sometimes come in handy, though probably
not as often as you think. If foobar.Init() returns something that’s an
os.IsNotFound, is there really anything your code can do about it? Most of the
time, no.
Stability
For a huge project, Juju is very stable (which is not to say that it didn’t have
plenty of bugs… I just mean it almost never crashed or grossly
malfunctioned). I think a lot of that comes from the language. The company
where I worked before Canonical had a million line C# codebase, and it would
crash with null reference exceptions and unhandled exceptions of various sorts
fairly often. I honestly don’t think I ever saw a nil pointer panic from
production Juju code, and only occasionally when I was doing something really
dumb in brand new code during development.
I credit this to go’s pattern of using multiple returns to indicate errors. The
foo, err := pattern and always always checking errors really makes for very
few nil pointers being passed around. Checking an error before accessing the
other variable(s) returned is a basic tenet of Go, so much so that we document
the exceptions to the rule. The extra error return value cannot be ignored or
forgotten thanks to unused variable checks at compile time. This makes the
problem of nil pointers in Go fairly well mitigated, compared to other similar
languages.
Generics
I’m going to make this section short, because, well, you know. Only once or
twice did I ever personally feel like I missed having generics while working on
Juju. I don’t remember ever doing a code review and wishing for generics for
someone else’s code. I was mostly happy not to have to grok the cognitive
complexity I’d come to be familiar with in C# with generics. Interfaces are
good enough 99% of the time. And I don’t mean interface{}. We used
interface{} rarely in Juju, and almost always it was because some sort of
serialization was going on.
Next Time
This is already a pretty long post, so I think I’ll cap it here. I have a lot
of more specific things that I can talk about… about APIs, versioning, the
database, refactoring, logging, idioms, code reviews, etc.
Writing libraries in Go is a relatively well-covered topic, I think… but I see
a lot fewer posts about writing commands. When it comes down to it, all Go code
ends up in a command. So let’s talk about it! This will be the first in a
series, since I ended up having a lot more to say than I realized.
Today I’m going to focus on basic project layout, with the aims of optimizing
for reusability and testability.
There are three unique bits about commands that influence how I structure my
code when writing a command rather than a library:
Package main
This is the only package a go program must have. However, aside from telling
the go tool to produce a binary, there’s one other unique thing about package
main - no one can import code from it. That means that any code you put in
package main can not be used directly by another project, and that makes the OSS
gods sad. Since one of the main reasons I write open source code is so that
other developers may use it, this goes directly against my desires.
There have been many times when I’ve thought “I’d love to use the logic behind X
Go binary as a part of my code”. If that logic is in package main, you can’t.
os.Exit
If you care about producing a binary that does what users expect, then you
should care about what exit code your binary exits with. The only way to do
that is to call os.Exit (or call something that calls os.Exit, like log.Fatal).
However, you can’t test a function that calls os.Exit. Why? Because calling
os.Exit during a test exits the test executable. This is quite hard to figure
out if you end up doing it by accident (which I know from personal experience).
When running tests, no tests actually fail, the tests just exit sooner than they
should, and you’re left scratching your head.
The easiest thing to do is don’t call os.Exit. Most of your code shouldn’t be
calling os.Exit anyway… someone’s going to get real mad if they import your
library and it randomly causes their application to terminate under some
conditions.
So, only call os.Exit in exactly one place, as near to the “exterior” of your
application as you can get, with minimal entry points. Speaking of which…
func main()
It’s the one function all Go commands must have. You’d think that
everyone’s func main would be different, after all, everyone’s application is
different, right? Well, it turns out, if you really want to make your code
testable and reusable, there’s really only approximately one right answer to
“what’s in your main function?”
In fact, I’ll go one step further, I think there’s only approximately one right
answer to “what’s in your package main?” and that’s this:
// command main documentation here.
package main

import (
    "os"

    "github.com/you/proj/cli"
)

func main() {
    os.Exit(cli.Run())
}
That’s it. This is approximately the most minimal code you can have in a useful
package main, thereby wasting no effort on code that others can’t reuse. We
isolated os.Exit to a single line function that is the very exterior of our
project, and effectively needs no testing.
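The layout the next few paragraphs walk through looks roughly like this (a
sketch; the file names are illustrative):

github.com/you/proj/
    LICENSE
    README.md
    main.go
    cli/
        cli.go
    run/
        run.go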
We know what’s in main.go… and in fact, main.go is the only go file in the
main package. LICENSE and README.md should be self-explanatory. (Always
use a license! Otherwise many people won’t be able to use your code.)
Now we come to the two subdirectories, run and cli.
CLI
The cli package contains the command line parsing logic. This is where you
define the UI for your binary. It contains flag parsing, arg parsing, help
text, etc.
It also contains the code that returns the exit code to func main (which gets
sent to os.Exit). Thus, you can test exit codes returned from those functions,
instead of trying to test exit codes your binary as a whole produces.
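A sketch of what that might look like (run.Do is a hypothetical entry point
into the run package, not a prescribed API):

package cli

import (
    "flag"
    "fmt"
    "os"

    "github.com/you/proj/run"
)

// Run parses the command line and returns the code that func main hands to
// os.Exit.
func Run() int {
    verbose := flag.Bool("v", false, "verbose output")
    flag.Parse()

    if err := run.Do(flag.Args(), *verbose); err != nil {
        fmt.Fprintln(os.Stderr, err)
        return 1
    }
    return 0
}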
Run
The run package contains the meat of the logic of your binary. You should write
this package as if it were a standalone library. It should be far removed from
any thoughts of CLI, flags, etc. It should take in structured data and return
errors. Pretend it might get called by some other library, or a web service, or
someone else’s binary. Make as few assumptions as possible about how it’ll be
used, just as you would a generic library.
Now, obviously, larger projects will require more than one directory. In fact,
you may want to split out your logic into a separate repo. This kind of depends
on how likely you think it’ll be that people want to reuse your logic. If you
think it’s highly likely, I recommend making the logic a separate directory. In
my mind, a separate directory for the logic shows a stronger commitment to
quality and stability than some random directory nestled deep in a repo
somewhere.
Putting it together
The cli package forms a command line frontend for the logic in the run package.
If someone else comes along, sees your binary, and wants to use the logic behind
it for a web API, they can just import the run package and use that logic
directly. Likewise, if they don’t like your CLI options, they can easily write
their own CLI parser and use it as a frontend to the run package.
This is what I mean about reusable code. I never want someone to have to hack
apart my code to get more use out of it. And the best way to do that is to
separate the UI from the logic. This is the key part. Don’t let your UI
(CLI) concepts leak into your logic. This is the best way to keep your logic
generic, and your UI manageable.
Larger Projects
This layout is good for small to medium projects. There’s a single binary that
is in the root of the repo, so it’s easier to go-get than if it’s under multiple
subdirectories. Larger projects pretty much throw everything out the window.
They may have multiple binaries, in which case they can’t all be in the root of
the repo. However, such projects usually also have custom build steps and
require more than just go-get (which I’ll talk about later).
When working on Gorram, I decided I
wanted to release it via a vanity import path. After all, that’s half the
reason I got npf.io in the first place (an idea blatantly stolen from Russ Cox’s
rsc.io).
What is a vanity import path? It is explained in the go get
documentation. If you’re not hosted on one
of the well known hosting sites (github, bitbucket, etc), go get has to figure
out how to get your code. How it does this is fairly ingenious - it performs an
http GET of the import path (first https then http) and looks for specific meta
elements in the page’s header. The header elements tell go get what type of
VCS is being used and what address to use to get the code.
The great thing about this is that it removes the dependency of your code on any
one code hosting site. If you want to move your code from github to bitbucket,
you can do that without breaking anyone.
So, the first thing you need to host your own vanity imports is something that
will respond to those GET requests with the right response. You could do
something complicated like a special web application running on a VM in the
cloud, but that costs money and needs maintenance. Since I already had a Hugo
website (running for free on github pages), I wanted to see if I could use that.
It’s a slightly more manual process, but the barrier of entry is a lot lower and
it works on any free static hosting (like github pages).
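The key is the go-import meta tag, which (per the go get documentation) looks
like this:

<meta name="go-import" content="import-prefix vcs repo-root">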
Where import-prefix is a string that matches a prefix of the import statement
used in your code, vcs is the type of source control used, and repo-root is the
root of the VCS repo where your code lives.
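For this blog, for example, the tag for the gorram repo would be:

<meta name="go-import" content="npf.io/gorram git https://github.com/natefinch/gorram">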
What’s important to note here is that these should be set this way for packages
in subdirectories as well. So, for npf.io/gorram/run, the meta tag should still
be as above, since it matches a prefix of the import path, and the root of the
repo is still github.com/natefinch/gorram. (We’ll get to how to handle
subdirectories later.)
You need a page serving that meta tag to live at the exact same place as the import
statement… that generally will mean it needs to be in the root of your domain
(I know that I, personally don’t want to see go get npf.io/code/gorram when I
could have go get npf.io/gorram).
The easiest way to do this and keep your code organized is to put all your pages
for code into a new directory under content called “code”. Then you just need
to set the “permalink” for the code type in your site’s config file thusly:
[Permalinks]
code = "/:filename/"
Then your content’s filename (minus extension) will be used as its url relative
to your site’s base URL. Following the same example as above, I have
content/code/gorram.md which will make that page now appear at npf.io/gorram.
Now, for the content. I don’t actually want to have to populate this page with
content… I’d rather people just get forwarded on to github, so that’s what
we’ll do, by using a refresh header. So here’s our template, that’ll live under layouts/code/single.html:
This will generate a page that will auto-forward anyone who hits it on to your
github account. Now, there’s one more (optional but recommended) piece - the
go-source meta header. This is only relevant to godoc.org, and tells godoc how
to link to the source code for your package (so links on godoc.org will go
straight to github and not back to your vanity url, see more details here).
Now all you need is to put a value of vanity = https://github.com/you/yourrepo
in the frontmatter of the correct page, and the template does the rest. If your
repo has multiple directories, you’ll need a page for each directory (such as
npf.io/gorram/run). This would be kind of a drag, making the whole directory
structure with content docs in each, except there’s a trick you can do here to
make that easier.
I recently landed a change in Hugo that lets you customize the rendering of
alias pages. Alias pages are pages that are mainly used to redirect people from
an old URL to the new URL of the same content. But in our case, they can serve
up the go-import and go-source meta headers for subdirectories of the main code
document. To do this, make an alias.html template in the root of your layouts
directory, and make it look like this:
Other than the stuff in the if statement, the rest is the default alias page
that Hugo creates anyway. The stuff in the if statement is basically the same
as what’s in the code template, just with an extra indirection of specifying
.Page first.
Note that this change to Hugo is in master but not in a release yet. It’ll be
in 0.18, but for now you’ll have to build master to get it.
Now, to produce pages for subpackages, you can just specify aliases in the front
matter of the original document with the alias being the import path under the
domain name:
aliases = [ "gorram/run", "gorram/cli" ]
So your entire content only needs to look like this:
+++
date = 2016-10-02T23:00:00Z
title = "Gorram"
vanity = "https://github.com/natefinch/gorram"
aliases = [
"/gorram/run",
"/gorram/cli",
]
+++
Any time you add a new subdirectory to the package, you’ll need to add a new
alias, and regenerate the site. This is unfortunately manual, but at least it’s
a trivial amount of work.
That’s it. Now go get (and godoc.org) will know how to get your code.
Note that now we can drop the error checking in NewTool because the compiler does it for us. The ToolType still works in all ways like a string, so it’s trivial to convert for printing, serialization, etc.
However, this still lets you do something which is wrong but might not always look wrong:
a := NewTool("drill")
Because of how Go constants work, this will get converted to a ToolType, even though it’s not one of the ones we have defined.
The final revision, which is the one I’d propose, removes even this possibility, by not using a string at all (it also uses a lot less memory and creates less garbage):
package tool

type ToolType int

const (
    Screwdriver ToolType = iota
    Hammer
    // ...
)

type Tool struct {
    typ ToolType
}

func NewTool(tooltype ToolType) Tool {
    return Tool{typ: tooltype}
}
This now prevents passing in a constant string that looks like it might be right. You can pass in a constant number, but NewTool(5) is a hell of a lot more obviously wrong than NewTool("drill"), IMO.
The push back I’ve heard about this is that then you have to manually write the String() function to make human-readable strings… but there are code generators that already do this for you in extremely optimized ways (see https://github.com/golang/tools/blob/master/cmd/stringer/stringer.go)
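For context, the two options being compared below are, roughly, returning the
error untouched versus wrapping it with fmt.Errorf:

// the former: pass the original error up the stack unchanged
if err != nil {
    return err
}

// the latter: add context, but replace the original error with a new one
if err != nil {
    return fmt.Errorf("can't find default config file: %v", err)
}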
The former passes the original error up the stack, but adds no context to it.
Thus, your saveConfig function may end up printing “file not found:
default.cfg” without telling the caller why it was trying to open default.cfg.
The latter allows you to add context to an error, so the above error could
become “can’t find default config file: file not found: default.cfg”.
This gives nice context to the error, but unfortunately, it creates an entirely
new error that only maintains the error string from the original. This is fine
for human-facing output, but is useless for error handling code.
If you use the former code, calling code can then use os.IsNotExist(), figure
out that it was a not found error, and create the file. Using the latter code,
the type of the error is now a different type than the one from os.Open, and
thus will not return true from os.IsNotExist. Using fmt.Errorf effectively
masks the original error from calling code (unless you do ugly string parsing -
please don’t).
Sometimes it’s good to mask the original error, if you don’t want your callers
depending on what should be an implementation detail (thus effectively making it
part of your API contract). However, lots of times you may want to give your
callers the ability to introspect your errors and act on them. This then loses
the opportunity to add context to the error, and so people calling your code
have to do some mental gymnastics (and/or look at the implementation) to
understand what an error really means.
A further problem for both these cases is that when debugging, you lose all
knowledge of where an error came from. There’s no stack trace, there’s not even
a file and line number of where the error originated. This can make debugging
errors fairly difficult, unless you’re careful to make your error messages easy
to grep for. I can’t tell you how often I’ve searched for an error formatting
string, and hoped I was guessing the format correctly.
This is just the way it is in Go, so what’s a developer to do? Why, write an
errors library that does smarter things of course! And there are a ton of these
things out there. Many add a stack trace at error creation time. Most wrap an
original error in some way, so you can add some context while keeping the
original error for checks like os.IsNotExist. At Canonical, the Juju team wrote
just such a library (actually we wrote 3 and then had them fight until only one
was standing), and the result is https://github.com/juju/errors.
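The usage being described next looks roughly like this (errors.Annotate and
errors.Cause are real juju/errors functions, but the surrounding code is a
sketch):

if err := checkDefault(); err != nil {
    return errors.Annotate(err, "can't find default config file")
}

// later, calling code can still get at the original error:
if os.IsNotExist(errors.Cause(err)) {
    // create the file
}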
This returns a new error created by the errors package which adds the given
string to the front of the original error’s message (just like
fmt.Errorf), but you can introspect it using errors.Cause(err) to access the
original error returned by checkDefault. Thus you can use
os.IsNotExist(errors.Cause(err)) and it’ll do the right thing.
However, this and every other special error library suffer from the same problem
- your library can only understand its own special errors. And no one else’s
code can understand your errors (because they won’t know to use errors.Cause
before checking the error). Now you’re back to square one - your errors are
just as opaque to third party code as if they were created by fmt.Errorf.
I don’t really have an answer to this problem. It’s inherent in the
functionality (or lack thereof) of the standard Go error type.
Obviously, if you’re writing a standalone package for many other people to use,
don’t use a third party error wrapping library. Your callers are likely not
going to be using the same library, so they won’t get use out of it, and it adds
unnecessary dependencies to your code. To decide between returning the original
error and an annotated error using fmt.Errorf is harder. It’s hard to know when
the information in the original error might be useful to your caller. On the
other hand, the additional context added by fmt.Errorf can often change an
inscrutable error into an obvious one.
If you’re writing an application where you’ll be controlling most of the
packages being written, then an errors package may make sense… but you still
run the risk of giving your custom errors to third party code that can’t
understand them. Plus, any errors library adds some complexity to the code (for
example, you always have to remember to call os.IsNotExist(errors.Cause(err))
rather than just os.IsNotExist(err)).
You have to choose one of the three options every time you return an error.
Choose carefully. Sometimes you’re going to make a choice that makes your life
more difficult down the road.
True story. The idea was this package would be a lieutenant commander (get
it?)… but I also knew I didn’t want to have to try to spell lieutenant
correctly every time I used the package. So that’s why it’s called deputy.
He’s the guy who’s not in charge, but does all the work.
Errors
At Juju, we run a lot of external processes
using os/exec. However, the default functionality of an exec.Cmd object is kind
of lacking. The most obvious problem is those “exit status 1” error returns.
Fantastic. Have you ever wished you could just have the stderr from the command
as the error text? Well, now you can, with deputy.
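The usage looks something like this (a sketch from memory, so treat the field
names as assumptions rather than deputy’s exact API):

d := deputy.Deputy{
    // turn whatever the command writes to stderr into the returned error text
    Errors: deputy.FromStderr,
}
err := d.Run(exec.Command("docker", "rm", "bar"))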
In the above code, if the command run by Deputy exits with a non-zero exit
status, deputy will capture the text output to stderr and convert that into the
error text. e.g. if the command returned exit status 1 and output “Error: No
such image or container: bar” to stderr, then the error’s Error() text would
look like “exit status 1: Error: No such image or container: bar”. Bam, the
errors from commands you run are infinitely more useful.
Logging
Another idiom we use is to pipe some of the output from a command to our logs. This can be super useful for debugging purposes. With deputy, this is again easy:
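Again, a sketch from memory of what that looks like (the StdoutLog field name
is recalled, not verified):

d := deputy.Deputy{
    // pipe each chunk of the command's stdout into our logger
    StdoutLog: func(b []byte) { log.Print(string(b)) },
}
err := d.Run(exec.Command("foo", "bar"))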
That’s it. Now every line written to stdout by the process will be piped as a
log message to your log.
Timeouts
Finally, an idiom we don’t use often enough, but should, is to add a timeout to
command execution. What happens if you run a command as part of your pipeline
and that command hangs for 30 seconds, or 30 minutes, or forever? Do you just
assume it’ll always finish in a reasonable time? Adding a timeout to running
commands requires some tricky coding with goroutines, channels, selects, and
killing the process… and deputy wraps all that up for you in a simple API:
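Roughly like this (again from memory; the timeout field in particular is an
assumption about the API):

d := deputy.Deputy{
    // give up if the command hasn't finished within 30 seconds
    Timeout: 30 * time.Second,
}
err := d.Run(exec.Command("foo", "bar"))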
In Juju, we often have code that needs to run external
executables. Testing this code is a nightmare… because you really don’t want
to run those files on the dev’s machine or the CI machine. But mocking out
os/exec is really hard. There’s no interface to replace, there’s no function to
mock out and replace. In the end, your code calls the Run method on the
exec.Cmd struct.
There’s a bunch of bad ways you can mock this out - you can write out scripts to
disk with the right name and structure their contents to write out the correct
data to stdout, stderr and return the right return code… but then you’re
writing platform-specific code in your tests, which means you need a Windows
version and a Linux version… It also means you’re writing shell scripts or
Windows batch files or whatever, instead of writing Go. And we all know that we
want our tests to be in Go, not shell scripts.
So what’s the answer? Well, it turns out, if you want to mock out exec.Command,
the best place to look is in the exec package’s tests themselves. Lo and
behold, it’s right there in the first function of exec_test.go
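The function looks roughly like this (lightly adapted to the fakeExecCommand
name used below; the pattern is the one from the stdlib’s exec tests):

func fakeExecCommand(command string, args ...string) *exec.Cmd {
    // run the test binary itself, telling it to run only TestHelperProcess
    cs := []string{"-test.run=TestHelperProcess", "--", command}
    cs = append(cs, args...)
    cmd := exec.Command(os.Args[0], cs...)
    // the helper test does nothing unless this variable is set
    cmd.Env = []string{"GO_WANT_HELPER_PROCESS=1"}
    return cmd
}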
What the heck is that doing? It’s pretty slick, so I’ll explain it.
First off, you have to understand how tests in Go work. When running go test,
the go tool compiles an executable from your code, runs it, and passes it the
flags you passed to go test. It’s that executable which actually handles the
flags and runs the tests. Thus, while your tests are running, os.Args[0] is the
name of the test executable.
This function is making an exec.Command that runs the test executable, and
passes it the flag to tell the executable just to run a single test. It then
terminates the argument list with -- and appends the command and arguments
that would have been given to exec.Command to run your command.
The end result is that when you run the exec.Cmd that is returned, it will run
the single test from this package called “TestHelperProcess” and os.Args will
contain (after the --) the command and arguments from the original call.
The environment variable is there so that the test can know to do nothing unless
that environment variable is set.
This is awesome for a few reasons:
It’s all Go code. No more needing to write shell scripts.
The code run in the executable is compiled with the rest of your test code. No more needing to worry about typos in the strings you’re writing to disk.
No need to create new files on disk - the executable is already there and runnable, by definition.
So, let’s use this in a real example to make it more clear.
In your production code, you can do something like this:
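For instance (a sketch; the docker command and the function name are just
illustrative, and the indirection through a variable is the point):

// execCommand is a package-level hook so tests can substitute fakeExecCommand.
var execCommand = exec.Command

func dockerPS() ([]byte, error) {
    cmd := execCommand("docker", "ps", "-a")
    return cmd.CombinedOutput()
}

In the test, you set execCommand = fakeExecCommand (and restore it afterward),
and implement TestHelperProcess to print whatever output the test needs.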
Of course, you can do a lot more interesting things. The environment variables
on the command that fakeExecCommand returns make a nice side channel for telling
the executable what you want it to do. I use one to tell the process to exit
with a non-zero error code, which is great for testing your error handling code.
You can see how the standard library uses its TestHelperProcess test
here.
Hopefully this will help you avoid writing really gnarly testing code (or even worse,
not testing your code at all).
I had a problem yesterday - I wanted to use the excellent godoc.org to show
coworkers the godoc for the feature I was working on. However, the feature was
on a branch of the main code in Github, and go get Does Not Work That Way™.
So, what to do? Well, I figured out a hack to make it work.
https://gopkg.in is a super handy service that lets you point go get at
branches of your repo named vN (e.g. v0, v1, etc). It also happens to work on
tags. So, we can leverage this to get godoc.org to render the godoc for our WIP
branch.
From your WIP branch, simply do
git tag v0
git push myremote v0
This creates a lightweight tag that only affects your repo (not upstream from
whence you forked).
Then point godoc.org at the gopkg.in path for your repo (e.g.
godoc.org/gopkg.in/you/yourrepo.v0). This will tell godoc to ‘go get’ your code
from gopkg.in, and gopkg.in will redirect that to your v0 tag, which is
currently on your branch. Bam, now you have godoc for your WIP branch on
godoc.org.
Later, the tag can easily be removed (and reused if needed) thusly:
git tag -d v0
git push myremote :refs/tags/v0
So, there you go, go forth and share your godoc. I find it’s a great way to get
feedback on architecture before I dive into the reeds of the implementation.
When people hear that Go only supports static linking, one of the things they
eventually realize is that they can’t have traditional plugins via dlls/libs (in
compiled languages) or scripts (in interpreted languages). However, that
doesn’t mean that you can’t have plugins. Some people suggest doing “compiled-
in” plugins - but to me, that’s not a plugin, that’s just code. Some people
suggest just running sub processes and sending messages via their CLI, but that
runs into CLI parsing issues and requires runnnig a new process for every
request. The last option people think of is using RPC to an external process,
which may also seem cumbersome, but it doesn’t have to be.
Serving up some pie
I’d like to introduce you to https://github.com/natefinch/pie - this is a Go
package which contains a toolkit for writing plugins in Go. It uses processes
external to the main program as the plugins, and communicates with them via RPC
over the plugin’s stdin and stout. Having the plugin as an external process can
actually has several benefits:
If the plugin crashes, it won’t crash your process.
The plugin is not in your process’ memory space, so it can’t do anything nasty.
The plugin can be written in any language, not just Go.
I think this last point is actually the most valuable. One of the nicest things
about Go applications is that they’re just copy-and-run. No one even needs to
know they were written in Go. With plugins as external processes, this remains
true. People wanting to extend your application can do so in the language of
their choice, so long as it supports the codec your application has chosen for
RPC.
The fact that the communication occurs over stdin and stdout means that there is
no need to worry about negotiating ports, it’s easily cross platform compatible,
and it’s very secure.
Orthogonality
Pie is written to be a very simple set of functions that help you set up
communication between your process and a plugin process. Once you make a couple
calls to pie, you then need to work out your own way to use the RPC connection
created. Pie does not attempt to be an all-in-one plugin framework, though you
could certainly use it as the basis for one.
Why is it called pie?
Because if you pronounce API like “a pie”, then all this consuming and serving
of APIs becomes a lot more palatable. Also, pies are the ultimate pluggable
interface - depending on what’s inside, you can get dinner, dessert, a snack, or
even breakfast. Plus, then I get to say that plugins in Go are as easy as…
well, you know.
Conclusion
I plan to be using pie in one of my own side projects. Take it out for a spin
in one of your projects and let me know what you think. Happy eating!
I figured I’d answer it here about Go. Luckily, Go is a very small language, so there’s not a lot of surface area to dislike. However, there’s definitely some things I wish were different. Most of these are nitpicks, thus the title.
#1 Bare Returns
func foo() (i int, err error) {
i, err = strconv.ParseInt("5")
return // wha??
}
For all that Go promotes readable and immediately understandable code, this seems like a ridiculous outlier. The way it works is that if you don’t declare what the function is returning, it’ll return the values stored in the named return variables. Which seems logical and handy, until you see a 100 line function with multiple branches and a single bare return at the bottom, with no idea what is actually getting returned.
To all gophers out there: don’t use bare returns. Ever.
#2 New
a := new(MyStruct)
New means “Create a zero value of the given type and return a pointer to it”. It’s sorta like the C++ new, which is probably why it exists. The problem is that it’s nearly useless. It’s mostly redundant with simply returning the address of a value thusly:
a := &MyStruct{}
The above is a lot easier to read, it also gives you the ability to populate the value you’re constructing (if you wish). The only time new is “useful” is if you want to initialize a pointer to a builtin (like a string or an int), because you can’t do this:
a := &int
but you can do this:
a := new(int)
Of course, you could always just do it in (gasp) two lines:
a := 0
b := &a
To all the gophers out there: don’t use new. Always use &Foo{} with structs, maps, and slices. Use the two line version for numbers and strings.
#3 Close
The close built-in function closes a channel. If the channel is already closed, close will panic. This pisses me off, because most of the time when I call close, I don’t actually care if it’s already closed. I just want to ensure that it’s closed. I’d much prefer if close returned a boolean that said whether or not it did anything, and then if I choose to panic, I can. Or, you know, not.
#4 There is no 4
That’s basically it. There’s some things I think are necessary evils, like goto and panic. There’s some things that are necessary ugliness, like the built-in functions append, make, delete, etc. I sorta wish x := range foo returned the value in x and not the index, but I get that it’s to be consistent between maps and slices, and returning the value in maps would be odd, I think.
All these are even below the level of nitpicks, though. They don’t bug me, really. I understand that everything in programming is a tradeoff, and I think the decisions made for Go were the right ones in these cases. Sometimes you need goto. Sometimes you need to panic. Making those functions built-ins rather than methods on the types means you don’t need any methods on the types, which keeps them simpler, and means they’re “just data”. It also means you don’t lose any functionality if you make new named types based on them.
So that’s my list for Go.
Postscript
Someone on the twitter discussion mentioned he couldn’t think of anything he disliked about C#, which just about made me spit my coffee across the room. I programmed in C# for ~9 years, starting out porting some 1.1 code to 2.0, and leaving as 5.0 came out. The list of features in C# as of 5.0 is gigantic. Even being a developer writing in it 40+ hours a week for 9 years, there was still stuff I had to look up to remember how it worked.
I feel like my mastery of Go after a year of side projects was about equivalent to my mastery of C# after 9 years of full time development. If we assume 1:1 correlation between time to master and size of the language, an order of magnitude sounds about right.
Obviously, not everyone hates Go. But there was a quora
question recently
about why everyone criticizes Go so much. (sorry, I don’t normally post links to
Quora, but it was the motivator for this post) Even before I saw the answers to
the question, I knew what they’d consist of:
Go is a language stuck in the 70’s.
Go ignores 40 years of programming language research.
Go is a language for blue collar (mediocre) developers.
Gophers are ok with working in Java 1.0.
Unfortunately, the answers to the questions were more concerned with explaining
why Go is “bad”, rather than why this gets under so many people’s skin.
When reading the answers I had a eureka moment, and I realized why it is. So
here’s my answer to the same question. This is why Go is so heavily criticized,
not why Go is “bad”.
There’s two awesome posts that inform my answer: Paul Graham’s
post about keeping your identity
small, and Kathy Sierra’s post about the Koolaid point. I encourage you to read those two posts, as
they’re both very informative. I hesitate to compare the horrific things that
happen to women online with the pedantry of flamewars about programming
languages, but the Koolaid Point is such a valid metaphor that I wanted to link
to the article.
Paul says
people can never have a fruitful argument about
something that’s part of their identity
i.e. the subject hits too close to home,
and their response becomes emotional rather than logical.
Kathy says
the hate wasn’t so much about the product/brand but that other people were falling for it.
i.e. they’d drunk the kool-aid.
Go is the only recent language that takes the aforementioned 40 years of
programming language research and tosses it out the window. Other new languages
at least try to keep up with the Jones - Clojure, Scala, Rust - all try to
incorporate “modern programming theory” into their design. Go actively tries
not to. There is no pattern matching, there’s no borrowing, there’s no pure
functional programming, there’s no immutable variables, there’s no option types,
there’s no exceptions, there’s no classes, there’s no generics…. there’s a lot
Go doesn’t have. And in the beginning this was enough to merely earn it scorn.
Even I am guilty of this. When I first heard about Go, I thought “What? No
exceptions? Pass.”
But then something happened - people started using it. And liking it. And
building big projects with it. This is the Koolaid-point - where people have
started to drink the Koolaid and get fooled into thinking Go is a good
language. And this is where the scorn turns into derision and attacks on the
character of the people using it.
The most vocal Go detractors are those developers who write in ML-derived
languages (Haskell, Rust, Scala, et al) who have tied their preferred
programming language into their identity. The mere existence of Go says
“your views on what makes a good programming language are wrong”. And the more
people that use and like Go, the more strongly they feel that they’re being told
their choice of programming language - and therefore their identity - is wrong.
Note that basically no one in the Go community actually says this. But the Go
philosophy of simplicity and pragmatism above all else is the polar opposite of
what those languages espouse (in which complexity in the language is ok because
it enforces correctness in the code). This is insulting to the people who tie
their identity to that language. Whenever a post on Go makes it to the front
page of Hacker News, it is an affront to everything they hold dear, and so you
get comments like Go developers are stuck in the 70’s, or is only for blue-collar devs.
So, this is why I think people are so much more vocal about their dislike of Go:
because it challenges their identity, and other people are falling for it. This
is also why these posts so often mention Google and how the language would have
died without them. Google is now the koolaid dispenser. The fact that they
are otherwise generally thought of as a very talented pool of developers means
that it is simultaneously more outrageous that they are fooling people and more
insulting that their language flies in the face of ML-derived languages.
Steve Francia asked me to help him get
Discourse deployed as a place for people to discuss
Hugo, his static site generator (which is what I use to
build this blog). If you don’t know Discourse, it’s pretty amazing forum
software with community-driven moderation, all the modern features you expect
(@mentions, SSO integration, deep email integration, realtime async updates, and
a whole lot more). What I ended up deploying is now at
discuss.gohugo.io.
I’d already played around with deploying Discourse about six months ago, so I
already had an idea of what was involved. Given that I work on
Juju as my day job, of course I decided to use Juju to
deploy Discourse for Steve. This involved writing a Juju charm which is sort
of like an install script, but with hooks for updating configuration and hooks
for interacting with other services. I’ll talk about the process of writing the
charm in a later post, but for now, all you need to know is that it follows the
official install guide for installing Discourse.
The install guide says that you can install Discourse in 30 minutes. Following
it took me a lot longer than that, due to some confusion about what the
install guide really wanted you to do, and what the install really required.
But you don’t need to know any of that to use Juju to install Discourse, and you
can get it done in 8 minutes, not 30. Here’s how:
Now, Juju does not yet have a provider for Digital Ocean, so we have to use a
plugin to get the machine created. We’re in the process of writing a provider
for Digital Ocean, so soon the plugin won’t be necessary. If you use another
cloud provider, such as AWS, Azure, HP Cloud, Joyent, or run your own Openstack
or MAAS, you can easily configure Juju to use that service, and a couple of these steps will
not be necessary. I’ll post separate steps for that later. But for now, let’s
assume you’re using Digital Ocean.
Get your Digital Ocean access info
and set the client id in an environment variable called DO_CLIENT_ID and the API
key in an environment variable called DO_API_KEY.
Juju requires access with an SSH key to the machines, so make sure you have one
set up in your Digital Ocean account.
Now, let’s create a simple configuration so juju knows where you want to deploy
your new environment.
juju init
Running juju init will create a boilerplate configuration file at
~/.juju/environments.yaml. We’ll append our digital ocean config at the bottom:
(obviously replace the region with whatever one you want)
Now, it’ll take about a minute for the machine to come up.
Discourse requires email to function, so you need an account at
mandrill, mailgun, etc. They’re free, so
don’t worry. From that account you need to get some information to properly set
up Discourse. You can do this after installing discourse, but it’s faster if
you do it before and give the configuration at deploy time. (changing settings
later will take a couple minutes while discourse reconfigures itself)
When you deploy discourse, you’re going to give it a configuration file, which
will look something like this:
The first line must be the same as the name of the service you’re deploying. By
default it’s “discourse”, so you don’t need to change it unless you’re deploying
multiple copies of discourse to the same Juju environment. And remember, this
is yaml, so those spaces at the beginning of the rest of the lines are
important.
The rest should be pretty obvious. Hostname is the domain name where your site
will be hosted. This is important, because discourse will send account
activation emails, and the links will use that hostname. Developer emails are
the email addresses of accounts that should get automatically promoted to admin
when created. The rest is email-related stuff from your mail service account.
Finally, unicorn workers should just stay 3 unless you’re deploying to a machine
with less than 2GB of RAM, in which case set it to 2.
Ok, so now that you have this file somewhere on disk, we can deploy discourse.
Don’t worry, it’s really easy. Just do this:
That’s it. If you’re deploying to a 2GB Digital Ocean droplet, it’ll take about
7 minutes.
To check on the status of the charm deployment, you can do juju status, which
will show, among other things “agent-state: pending” while the charm is being
deployed. Or, if you want to watch the logs roll by, you can do juju debug-
log.
Eventually juju status will show agent-state: started. Now grab the ip
address listed at public address: in the same output and drop that into your
browser. Bam! Welcome to Discourse.
If you ever need to change the configuration you set in the config file above,
you can do that by editing the file and doing
juju set discourse --config=/path/to/config
Or, if you just want to tweak a few values, you can do
juju set discourse foo=bar baz=bat ...
Note that every time you call juju set, it’ll take a couple minutes for
Discourse to reconfigure itself, so you don’t want to be doing this over and
over if you can hep it.
Now you’re on your own, and will have to consult the gurus at
discourse.org if you have any problems. But don’t worry, since
you deployed using Juju, which uses their official install instructions, your
discourse install is just like the ones people deploy manually (albeit with a
lot less time and trouble).
Good Luck!
Please let me know if you find any errors in this page, and I will fix them
immediately.
TOML stands for Tom’s Own Minimal Language. It is a configuration language
vaguely similar to YAML or property lists, but far, far better. But before we
get into it in detail, let’s look back at what came before.
Long Ago, In A Galaxy Far, Far Away
Since the beginning of computing, people have needed a way to configure
their software. On Linux, this generally is done in text files. For simple
configurations, good old foo = bar works pretty well. One setting per line,
name on the left, value on the right, separated by an equals. Great. But when
your configuration gets more complicated, this quickly breaks down. What if you
need a value that is more than one line? How do you indicate a value should be
parsed as a number instead of a string? How do you namespace related
configuration values so you don’t need ridiculously long names to prevent
collisions?
The Dark Ages
In the 90’s, we used XML. And it sucked. XML is verbose, it’s hard for humans
to read and write, and it still doesn’t solve a lot of the problems above (like
how to specify the type of a value). In addition, the XML spec is huge,
processing is very complicated, and all the extra features invite abuse and
overcomplication.
Enlightenment
In the mid 2000’s, JSON came to popularity as a data exchange format, and it was
so much better than XML. It had real types, it was easy for programs to
process, and you didn’t have to write a spec on what values should get processed
in what way (well, mostly). It was sigificantly less verbose than XML. But it
is a format intended for computers to read and write, not humans. It is a pain
to write by hand, and even pretty-printed, it can be hard to read and the
compact data format turns into a nested mess of curly braces. Also, JSON is not
without its problems… for example, there’s no date type, there’s no support
for comments, and all numbers are floats.
A False Start
YAML came to popularity some time after JSON as a more human-readable format,
and its key: value syntax and pretty indentation is definitely a lot easier on
the eyes than JSON’s nested curly-braces. However, YAML trades ease of reading
for difficulty in writing. Indentation as delimiters is fraught with error…
figuring out how to get multiple lines of data into any random value is an
exercise in googling and trial & error.
The YAML spec is also ridiculously long. 100% compatible parsers are very
difficult to write. Writing YAML by hand is a ridden with landmines of corner
cases where your choice of names or values happens to hit a reserved word or
special marker. It does support comments, though.
The Savior
On February 23, 2013, Tom Preston-Werner (former CEO of GitHub) made his first
commit to https://github.com/toml-lang/toml. TOML stands for Tom’s Obvious,
Minimal Language. It is a language designed for configuring software. Finally.
TOML takes inspiration from all of the above (well, except XML) and even gets
some of its syntax from Microsoft’s INI files. It is easy to write by hand and
easy to read. The spec is short and understandable by mere humans, and it’s
fairly easy for computers to parse. It supports comments, has first class
dates, and supports both integers and floats. It is generally insensitive to
whitespace, without requiring a ton of delimiters.
Let’s dive in.
The Basics
The basic form is key = value
# Comments start with hash
foo = "strings are in quotes and are always UTF8 with escape codes: \n \u00E9"
bar = """multi-line strings
use three quotes"""
baz = 'literal\strings\use\single\quotes'
bat = '''multiline\literals\use
three\quotes'''
int = 5 # integers are just numbers
float = 5.0 # floats have a decimal point with numbers on both sides
date = 2006-05-27T07:32:00Z # dates are ISO 8601 full zulu form
bool = true # good old true and false
One cool point: If the first line of a multiline string (either literal or not)
is a line return, it will be trimmed. So you can make your big blocks of text
start on the line after the name of the value and not need to worry about the
extraneous newline at the beginning of your text:
preabmle = """
We the people of the United States, in order to form a more perfect union,
establish justice, insure domestic tranquility, provide for the common defense,
promote the general welfare, and secure the blessings of liberty to ourselves
and our posterity, do ordain and establish this Constitution for the United
States of America."""
Lists
Lists (arrays) are signified with brackets and delimited with commas. Only
primitives are allowed in this form, though you may have nested lists. The
format is forgiving, ignoring whitespace and newlines, and yes, the last comma
is optional (thank you!):
I love that the format is forgiving of whitespace and that last comma. I like
that the arrays are all of a single type, but allowing mixed types of sub-arrays
bugs the heck out of me.
Now we get crazy
What’s left? In JSON there are objects, in YAML there are associative arrays…
in common parlance they are maps or dictionaries or hash tables. Named
collections of key/value pairs.
In TOML they are called tables and look like this:
# some config above
[table_name]
foo = 1
bar = 2
Foo and bar are keys in the table called table_name. Tables have to be at the
end of the config file. Why? because there’s no end delimiter. All keys under
a table declaration are associated with that table, until a new table is
declared or the end of the file. So declaring two tables looks like this:
# some config above
[table1]
foo = 1
bar = 2
[table2]
foo = 1
baz = 2
The declaration of table2 defines where table1 ends. Note that you can indent
the values if you want, or not. TOML doesn’t care.
If you want nested tables, you can do that, too. It looks like this:
nested_table is defined as a value in table1 because its name starts with
table1.. Again, the table goes until the next table definition, so baz="bat"
is a value in table1.nested_table. You can indent the nested table to make it
more obvious, but again, all whitespace is optional:
Having to retype the parent table name for each sub-table is kind of annoying,
but I do like that it is very explicit. It also means that ordering and
indenting and delimiters don’t matter. You don’t have to declare parent tables
if they’re empty, so you can do something like this:
Arrays of tables inside another table get combined in the way you’d expect, like
[[table1.array]].
TOML is very permissive here. Because all tables have very explicitly defined
parentage, the order they’re defined in doesn’t matter. You can have tables (and
entries in an array of tables) in whatever order you want. This is totally
acceptable:
[[comments]]
author = "Anonymous"
text = "Love it!"
[foo.bar.baz]
bat = "hi"
[foo.bar]
howdy = "neighbor"
[[comments]]
author = "Anonymous"
text = "Love it!"
Of course, it generally makes sense to actually order things in a more organized
fashion, but it’s nice that you can’t shoot yourself in the foot if you reorder
things “incorrectly”.
Conclusion
That’s TOML. It’s pretty awesome.
There’s a list of parsers
on the TOML page on github for pretty much whatever language you want. I
recommend BurntSushi’s for Go, since it
works just like the built-in parsers.
It is now my default configuration language for all the applications I write.
The next time you write an application that needs some configuration, take a
look at TOML. I think your users will thank you.
I obviously have a lot to talk about with Hugo, so I decided I wanted to make
this into a series of posts, and have links at the bottom of each post
automatically populated with the other posts in the series. This turned out to
be somewhat of a challenge, but doable with some effort… hopefully someone
else can learn from my work.
This now brings us to Taxonomies.
Taxonomies are basically just like tags, except that you can have any number of
different types of tags. So you might have “Tags” as a taxonomy, and thus you
can give a content tags with values of “go” and “programming”. You can also
have a taxonomy of “series” and give content a series of “Hugo 101”.
Taxonomy is sort of like relatable metadata to gather multiple pieces of content
together in a structured way… it’s almost like a minimal relational database.
Taxonomies are listed in your site’s metadata, and consist of a list of keys.
Each piece of content can specify one or more values for those keys (the Hugo
documentation calls the values “Terms”). The values are completely ad-hoc, and
don’t need to be pre-defined anywhere. Hugo automatically creates pages where
you can view all content based on Taxonomies and see how the various values are
cross-referenced against other content. This is a way to implement tags on
posts, or series of posts.
So, for my example, we add a Taxonomy to my site config called “series”. Then
in this post, the “Hugo: Beyond the Defaults” post, and the “Hugo is Friggin’
Awesome” post, I just add series = ["Hugo 101"] (note the brackets - the
values for the taxonomy are actually a list, even if you only have one value).
Now all these posts are magically related together under a taxonomy called
“series”. And Hugo automatically generates a listing for this taxonomy value
at /series/hugo-101 (the taxonomy value gets
url-ized). Any other series I make will be under a similar directory.
This is fine and dandy and pretty aweomse out of the box… but I really want to
automatically generate a list of posts in the series at the bottom of each post
in the series. This is where things get tricky, but that’s also where things
get interesting.
The examples for displaying
Taxonomies all “hard code” the
taxonomy value in the template… this works great if you know ahead of time
what value you want to display, like “all posts with tag = ‘featured’”.
However, it doesn’t work if you don’t know ahead of time what the taxonomy value
will be (like the series on the current post).
This is doable, but it’s a little more complicated.
I’ll give you a dump of the relevant portion of my post template and then talk
about how I got there:
{{ if .Params.series }}
{{ $name := index .Params.series 0 }}
<hr/>
<p><a href="" id="series"></a>This is a post in the
<b>{{$name}}</b> series.<br/>
Other posts in this series:</p>
{{ $name := $name | urlize }}
{{ $series := index .Site.Taxonomies.series $name }}
<ul class="series">
{{ range $series.Pages }}
<li>{{.Date.Format "Jan 02, 2006"}} -
<a href="{{.Permalink}}">{{.LinkTitle}}</a></li>
{{end}}
</ul>
{{end}}
So we start off defining this part of the template to only be used if the post
has a series. Right, sure, move on.
Now, the tricky part… the taxonomy values for the current page resides in the
.Params values, just like any other custom metadata you assign to the page.
Taxonomy values are always a list (so you can give things multiple tags etc),
but I know that I’ll never give something more than one series, so I can just
grab the first item from the list. To do that, I use the index function, which
is just like calling series[0] and assign it to the $name variable.
Now another tricky part… the series in the metadata is in the pretty form you
put into the metadata, but the list of Taxonomies in .Site.Taxonomies is in the
urlized form… How did I figure that out? Printf
debugging. Hugo’s auto-reloading makes it really easy to use the template
itself to figure out what’s going on with the template and the data.
When I started writing this template, I just put {{$name}} in my post template
after the line where I got the name, and I could see it rendered on webpage of
my post that the name was “Hugo 101”. Then I put {{.Site.Taxonomies.series}}
and I saw something like map[hugo-101:[{0 0xc20823e000} {0 0xc208048580} {0
0xc208372000}]] which is ugly, but it showed me that the value in the map is
“hugo-101”… and I realized it was using the urlized version, so I used the
pre-defined hugo function urlize to convert the pretty series.
And from there it’s just a matter of using index again, this time to use
$name as a key in the map of series…. .Site.Taxonomies is a map
(dictionary) of Taxonomy names (like “series”) to maps of Taxonomy values (like
“hugo-101”) to lists of pages. So, .Site.Taxonomies.series reutrns a map of
series names to lists of pages… index that by the current series names, and
bam, list of pages.
And then it’s just a matter of iterating over the pages and displaying them
nicely. And what’s great is that this is now all automatic… all old posts get
updated with links to the new posts in the series, and any new series I make,
regardless of the name, will get the nice list of posts at the bottom for that
series.
In my last post, I had deployed what is almost the most basic Hugo site
possible. The only reason it took more than 10 minutes is because I wanted to
tweak the theme. However, there were a few things that immediately annoyed me.
I didn’t like having to type hugo -t hyde all the time. Well, turns out
that’s not necessary. You can just put theme = "hyde" in your site
config, and never need to type it again. Sweet. Now to run the local server, I
can just run hugo server -w, and for final generation, I can just run hugo.
Next is that my posts were under npf.io/post/postname … which is not the end
of the world, but I really like seeing the date in post URLs, so that it’s easy
to tell if I’m looking at something really, really old. So, I went about
looking at how to do that. Turns out, it’s trivial. Hugo has a feature called
permalinks, where you can define the
format of the url for a section (a section is a top level division of your site,
denoted by a top level folder under content/). So, all you have to do is, in
your site’s config file, put some config that looks like this:
[permalinks]
post = "/:year/:month/:filename/"
code = "/:filename/"
While we’re at it, I had been putting my code in the top level content
directory, because I wanted it available at npf.io/projectname …. however
there’s no need to do that, I can put the code under the code directory and just
give it a permalink to show at the top level of the site. Bam, awesome, done.
One note: Don’t forget the slash at the end of the permalink.
But wait, this will move my “Hugo is Friggin’ Awesome” post to a different URL,
and Steve Francia already tweeted about it with the old URL. I don’t want that
url to send people to a 404 page!
Aliases to the rescue. Aliases are just
a way to make redirects from old URLs to new ones. So I just put aliases =
["/post/hugo-is-awesome/"] in the metadata at the top of that post, and now
links to there will redirect to the new location. Awesome.
Ok, so cool… except that I don’t really want the content for my blog posts
under content/post/ … I’d prefer them under content/blog, but still be of type
“post”. So let’s change that too. This is pretty easy, just rename the folder
from post to blog, and then set up an
archetype to default the metadata
under /blog/ to type = “post”. Archetypes are default metadata for a section,
so in this case, I make a file archetypes/blog.md and add type= “post” to the
archetype’s metadata, and now all my content created with hugo new
blog/foo.md will be prepopulated as type “post”. (does it matter if the type
is post vs. blog? no. But it matters to me ;)
@mlafeldt on Twitter pointed out my RSS feed was
wonky…. wait, I have an RSS feed? Yes, Hugo has that
too. There are feed XML files
automatically output for most listing directories… and the base feed for the
site is a list of recent content. So, I looked at what Hugo had made for me
(index.xml in the root output directory)… this is not too bad, but I don’t
really like the title, and it’s including my code content in the feed as well as
posts, which I don’t really want. Luckily, this is trivial to fix. The RSS xml
file is output using a Go template just like everything else in the output.
It’s trivial to adjust the template so that it only lists content of type
“post”, and tweak the feed name, etc.
I was going to write about how I got the series stuff at the bottom of this
page, but this post is long enough already, so I’ll just make that into its own
post, as the next post in the series! :)
This blog is powered by Hugo, a static site generator
written by Steve Francia (aka spf13). It is, of course, written in Go. It is
pretty similar to Jekyll, in that you write markdown, run a
little program (hugo) and html pages come out the other end in the form of a
full static site. What’s different is that Jekyll is written in ruby and is
relatively slow, and Hugo is written in Go and is super fast… only taking a
few milliseconds to render each page.
Hugo includes a webserver to serve the content, which will regenerate the site
automatically when you change your content. Your browser will update with the
changes immediately, making your development cycle for a site a very tight
loop.
The basic premise of Hugo is that your content is organized in a specific way on
purpose. Folders of content and the name of the files combine to turn into the
url at which they are hosted. For example, content/foo/bar/baz.md will be hosted
at <site>/foo/bar/baz.
Every content file has a section of metadata at the top that allows you to
specify information about the content, like the title, date, even arbitrary data
for your specific site (for example, I have lists of badges that are shown on
pages for code projects).
All the data in a content file is just that - data. Other than markdown
specifying a rough view of your page, the actual way the content is viewed is
completely separated from the data. Views are written in Go’s templating
language, which is quick to pick up and easy to use if you’ve used other
templating languages (or even if, like me, you haven’t). This lets you do
things like iterate over all the entries in a menu and print them out in a ul/li
block, or iterate over all the posts in your blog and display them on the main
page.
You can learn more about Hugo by going to its site,
which, of course, is built using Hugo.
The static content for this site is hosted on github pages at
https://github.com/natefinch/natefinch.github.io. But the static content is
relatively boring… that’s what you’re looking at in your browser right now.
What’s interesting is the code behind it. That lives in a separate repo on
github at https://github.com/natefinch/npf. This is where the markdown content
and templates live.
Here’s how I have things set up locally… all open source code on my machine
lives in my GOPATH (which is set to my HOME). So, it’s easy to find anything I
have ever downloaded. Thus, the static site lives at
$GOPATH/src/github.com/natefinch/natefinch.github.io and the markdown +
templates lives in $GOPATH/src/github.com/natefinch/npf. I created a symbolic
link under npf called public that points to the natefinch.github.io directory.
This is the directory that hugo outputs the static site to by default… that
way Hugo dumps the static content right into the correct directory for me to
commit and push to github. I just had to add public to my .gitignore so
everyone wouldn’t get confused.
Then, all I do is go to the npf directory, and run
hugo new post/urlofpost.md
hugo server --buildDrafts --watch -t hyde
That generates a new content item that’ll show up on my site under
/post/urlofpost. Then it runs the local webserver so I can watch the content by
pointing a browser at localhost:1313 on a second monitor as I edit the post in a
text editor. hyde is the name of the theme I’m using, though I have modified
it. Note that hugo will mark the content as a draft by default, so you need
–buildDrafts for it to get rendered locally, and remember to delete the draft =
true line in the page’s metadata when you’re ready to publish, or it won’t show
up on your site.
When I’m satisfied, kill the server, and run
hugo -t hyde
to generate the final site output, switch into the public directory, and
git commit -am "some new post"
That’s it. Super easy, super fast, and no muss. Coming from Blogger, this is
an amazingly better workflow with no wrestling with the WYSIWYG editor to make
it display stuff in a reasonable fashion. Plus I can write posts 100% offline
and publish them when I get back to civilization.
There’s a lot more to Hugo, and a lot more I want to do with the site, but that
will come in time and with more posts :)
This is the first post of my new blog. You may (eventually) see old posts
showing up behind here, those have been pulled in from my personal blog at
blog.natefinch.com. I’ve decided to split off my
programming posts so that people who only want to see the coding stuff don’t
have to see my personal posts, and people that only want to see my personal
stuff don’t have to get inundated with programming posts.
Right now the site is pretty basic, but I will add more features to it, such as post history etc.
I recently needed to update my npipe package, and since I want it to be production quality, that means setting up CI, so that people using my package can know it’s passing tests. Normally I’d use Travis CI or Drone.io for that, but npipe is a Windows-only Go package, and neither of the aforementioned services support running tests on Windows.
With some googling, I saw that Nathan Youngman had worked with AppVeyor to add Go support to their CI system. The example on the blog talks about making a build.cmd file in your repo to enable Go builds, but I found that you can easily set up a Go build without having to put CI-specific files in your repo.
To get started with AppVeyor, just log into their site and tell it where to get your code (I logged in with Github, and it was easy to specify what repo of mine to test). Once you choose the repo, go to the Settings page on AppVeyor for that repo. Under the Environment tab on the left, set the clone directory to C:\GOPATH\src<your import path> and set an environment variable called GOPATH to C:\GOPATH. Under the build tab, set the build type to “SCRIPT” and the script type to “CMD”, and make the contents of the script
go get -v -d -t <your import path>/…
(this will download the dependencies for your package). In the test tab, set the test type to “SCRIPT”, the script type to “CMD” and the script contents to
go test -v -cover ./…
(this will run all the tests in verbose mode and also output the test coverage).
That’s pretty much it. AppVeyor will automatically run a build on commits, like you’d expect. You can watch the progress on a console output on their page, and get a pretty little badge from the badges page. It’s free for open source projects, and seems relatively responsive from my admittedly limited experience.
This is a great boon for Go developers, so you can be sure your code builds and passes tests on Windows, with very little work to set it up. I’m probably going to add this to all my production repos, even the ones that aren’t Windows-only, to ensure my code works well on Windows as well as Linux.
BoltDB is a pure Go persistence solution that saves data to a memory mapped file. I call it a persistence solution and not a database, because the word database has a lot of baggage associated with it that doesn’t apply to bolt. And that lack of baggage is what makes bolt so awesome.
Bolt is just a Go package. There’s nothing you need to install on the system, no configuration to figure out before you can start coding, nothing. You just go get github.com/boltdb/bolt and then import “github.com/boltdb/bolt”.
All you need to fully use bolt as storage is a file name. This is fantastic from both a developer’s point of view, and a user’s point of view. I don’t know about you, but I’ve spent months of work time over my career configuring and setting up databases and debugging configuration problems, users and permissions and all the other crap you get from more traditional databases like Postgres and Mongo. There’s none of that with bolt. No users, no setup, just a file name. This is also a boon for users of your application, because they don’t have to futz with all that crap either.
Bolt is not a relational database. It’s not even a document store, though you can sort of use it that way. It’s really just a key/value store… but don’t worry if you don’t really know what that means or how you’d use that for storage. It’s super simple and it’s incredibly flexible. Let’s take a look.
Storage in bolt is divided into buckets. A bucket is simply a named collection of key/value pairs, just like Go’s map. The name of the bucket, the keys, and the values are all of type []byte. Buckets can contain other buckets, also keyed by a []byte name.
… that’s it. No, really, that’s it. Bolt is basically a bunch of nested maps. And this simplicity is what makes it so easy to use. There’s no tables to set up, no schemas, no complex querying language to struggle with. Let’s look at a bolt hello world:
// retrieve the data err = db.View(func(tx *bolt.Tx) error { bucket := tx.Bucket(world) if bucket == nil { return fmt.Errorf(“Bucket %q not found!”, world) }
val := bucket.Get(key) fmt.Println(string(val))
return nil })
if err != nil { log.Fatal(err) } }
// output: // Hello World!
I know what you’re thinking - that seems kinda long. But keep in mind, I fully handled all errors in at least a semi-proper way, and we’re doing all this:
1.) creating a database 2.) creating some structure (the “world” bucket) 3.) storing data to the structure 4.) retrieving data from the structure.
I think that’s not too bad in 54 lines of code.
So let’s look at what that example is really doing. First we call bolt.Open to get the database. This will create the file if necessary, or open it if it exists.
All reads from or writes to the bolt database must be done within a transaction. You can have as many Readers in read-only transactions at the same time as you want, but only one Writer in a writable transaction at a time (readers maintain a consistent view of the DB while writers are writing).
To begin, we call db.Update, which takes a function to which it’ll pass a bolt.Tx - bolt’s transaction object. We then create a Bucket (since all data in bolt lives in buckets), and add our key/value pair to it. After the write transaction finishes, we start a read- only transaction with DB.View, and get the values back out.
What’s great about bolt’s transaction mechanism is that it’s super simple - the scope of the function is the scope of the transaction. If the function passed to Update returns nil, all updates from the transaction are atomically stored to the database. If the function passed to Update returns an error, the transaction is rolled back. This makes bolt’s transactions completely intuitive from a Go developer’s point of view. You just exit early out of your function by returning an error as usual, and bolt Does The Right Thing. No need to worry about manually rolling back updates or anything, just return an error.
The only other basic thing you may need is to iterate over key/value pairs in a Bucket, in which case, you just call bucket.Cursor(), which returns a Cursor value, which has functions like Next(), Prev() etc that return a key/value pair and work like you’d expect.
There’s a lot more to the bolt API, but most of the rest of it is more about database statistics and some stuff for more advanced usage scenarios… but the above is all you really need to know to start storing data in a bolt database.
For a more complex application, just storing strings in the database may not be sufficient, but that’s ok, Go has your back there, too. You can easily use encoding/json or encoding/gob to serialize structs into the database, keyed by a unique name or id. This is what makes it easy for bolt to go from a key/value store to a document store - just have one bucket per document type. Again, the benefit of bolt is low barrier of entry. You don’t have to figure out a whole database schema or install anything to be able to just start dumping data to disk in a performant and manageable way.
The main drawback of bolt is that there are no queries. You can’t say “give me all foo objects with a name that starts with bar”. You could make your own index in the database and keep it up to date manually. This could be as easy as a slice of IDs serialized into an “indices” bucket for a particular query. Obviously, this is where you start getting into the realm of developing your own relational database, but if you don’t go overboard, it can be nice that all this code is just that - code. It’s not queries in some external DSL, it’s just code like you’d write for an in-memory data store.
Bolt is not for every application. You must understand your application’s needs and if bolt’s key/value style will be sufficient to fulfill those needs. If it is, I think you’ll be very happy to use such a simple data store with so little mental overhead.
[edited to clarify reader/writer relationship] Bonus Gob vs. Json benchmark for storing structs in Bolt:
Yesterday, I was trying to think of a way of automating some doc generation for my go packages. The specific task I wanted to automate was updating a badge in my package’s README to show the test coverage. What I wanted was a way to run go test -cover, parse the results, and put the result in the correct spot of my README. My first thought was to write an application that would do that for me … but then I’d have to run that instead of go test. What I realized I wanted was something that was “compatible with go test” - i.e. I want to run go test and not have to remember to run some special other command.
And that’s when it hit me: What is a test in Go? A test is a Go function that gets run when you run “go test”. Nothing says your test has to actually test anything. And nothing prevents your test from doing something permanent on your machine (in fact we usually have to bend over backwards to make sure our tests don’t do anything permanent. You can just write a test function that updates the docs for you.
I actually quite like this technique. I often have some manual tasks after updating my code - usually updating the docs in the README with changes to the API, or changing the docs to show new CLI flags, etc. And there’s one thing I always do after I update my code - and that’s run “go test”. If that also updates my docs, all the better.
Covergen is a particularly heinous example of a test that updates your docs. The heinous part is that it actually doubles the time it takes to run your tests… this is because that one test re-runs all the tests with -cover to get the coverage percent. I’m not sure I’d actually release real code that used such a thing - doubling the time it takes to run your tests just to save a few seconds of copy and paste is pretty terrible.
However, it’s a valid example of what you can do when you throw away testing convention and decide you want to write some code in a test that doesn’t actually test anything, and instead just runs some automated tasks that you want run whenever anyone runs go test. Just make sure the result is idempotent so you’re not continually causing things to look modified to version control.
I love Beyond Compare, it’s an awesome visual diff/merge tool. It’s not free, but I don’t care, because it’s awesome. However, there’s no built-in configuration for Go code, so I made one. Not sure what the venn diagram of Beyond Compare users and Go users looks like, it might be that I’m the one point of crossover, but just in case I’m not, here’s the configuration file for Beyond Compare 3 for the Go programming language: http://play.golang.org/p/G6NWE0z1GC (please forgive the abuse of the Go playground)
Just copy the text into a file and in Beyond Compare, go to Tools->Import Settings… and choose the file. Please let me know if you have any troubles or suggested improvements.
Go’s interfaces are one of it’s best features, but they’re also one of the most confusing for newbies. This post will try to give you the understanding you need to use Go’s interfaces and not get frustrated when things don’t work the way you expect. It’s a little long, but a bunch of that is just code examples.
Go’s interfaces are different than interfaces in other languages, they are implicitly fulfilled. This means that you never need to mark your type as explicitly implementing the interface (like class CFoo implements IFoo). Instead, your type just needs to have the methods defined in the interface, and the compiler does the rest.
For example:
type Walker interface {
Walk(miles int)
}
type Camel struct {
Name string
}
func (c Camel) Walk(miles int) {
fmt.Printf(“%s is walking %v miles\n”, c.Name, miles)
}
func LongWalk(w Walker) {
w.Walk(500)
w.Walk(500)
}
func main() {
c := Camel{“Bill”}
LongWalk(c)
}
// prints
// Bill is walking 500 miles.
// Bill is walking 500 miles.
Camel implements the Walker interface, because it has a method named Walk that
takes an int and doesn’t return anything. This means you can pass it into the
LongWalk function, even though you never specified that your Camel is a Walker.
In fact, Camel and Walker can live in totally different packages and never know
about one another, and this will still work if a third package decides to make a
Camel and pass it into LongWalk.
Non-Standard Continuation
This is where most tutorials stop, and where most questions and problems begin.
The problem is that you still don’t know how the interfaces actually work, and
since it’s not actually that complicated, let’s talk about that.
What actually happens when you pass Camel into LongWalk?
So, first off, you’re not passing Camel into LongWalk. You’re actually
assigning c, a value of type Camel to a value w of type Walker, and w is what
you operate on in LongWalk.
Under the covers, the Walker interface (like all interfaces), would look more or
less like this if it were in Go (the actual code is in C, so this is just a
really rough approximation that is easier to read).
type Walker struct {
type InterfaceType
data *void
}
type InterfaceType struct {
valtype *gotype
func0 *func
func1 *func
...
}
All interfaces values are just two pointers - one pointer to information about
the interface type, and one pointer to the data from the value you passed into
the interface (a void in C-like languages… this should probably be Go’s
unsafe.Pointer, but I liked the explicitness of two actual *’s in the struct to
show it’s just two pointers).
The InterfaceType contains a pointer to information about the type of the value
that you passed into the interface (valtype). It also contains pointers to the
methods that are available on the interface.
When you assign c to w, the compiler generates instructions that looks more or
less like this (it’s not actually generating Go, this is just an easier-to-read
approximation):
data := c
w := Walker{
type: &InterfaceType{
valtype: &typeof(c),
func0: &Camel.Walk
}
data: &data
}
When you assign your Camel value c to the Walker value w, the Camel type is
copied into the interface value’s Type.valtype field. The actual data in the
value of c is copied into a new place in memory, and w’s Data field points at
that memory location.
Implications of the Implementation
Now, let’s look at the implications of this code. First, interface values are
very small - just two pointers. When you assign a value to an interface, that
value gets copied once, into the interface, but after that, it’s held in a
pointer, so it doesn’t get copied again if you pass the interface around.
So now you know why you don’t need to pass around pointers to interfaces -
they’re small anyway, so you don’t have to worry about copying the memory, plus
they hold your data in a pointer, so changes to the data will travel with the
interface.
Interfaces Are Types
Let’s look at Walker again, this is important:
type Walker interface
Note that first word there: type. Interfaces are types, just like string is a
type or Camel is a type. They aren’t aliases, they’re not magic hand-waving,
they’re real types and real values which are distinct from the type and value
that gets assigned to them.
Now, let’s assume you have this function:
func LongWalkAll(walkers []Walker) {
for _, w := range walkers {
LongWalk(w)
}
}
And let’s say you have a caravan of Camels that you want to send on a long walk:
You want to pass caravan into LongWalkAll, will the compiler let you? Nope.
Why is that? Well, []Walker is a specific type, it’s a slice of values of type
Walker. It’s not shorthand for “a slice of anything that matches the Walker
interface”. It’s an actual distinct type, the way []string is different from
[]int. The Go compiler will output code to assign a single value of Camel to a
single value of Walker. That’s the only place it’ll help you out. So, with
slices, you have to do it yourself:
walkers := make([]Walker, len(caravan))
for n, c := range caravan {
walkers[n] = c
}
LongWalkAll(walkers)
However, there’s a better way if you know you’ll just need the caravan for
passing into LongWalkAll:
Note that this goes for any type which includes an interface as part of its
definition: there’s no automatic conversion of your func(Camel) into
func(Walker) or map[string]Camel into map[string]Walker. Again, they’re totally
different types, they’re not shorthand, and they’re not aliases, and they’re not
just a pattern for the compiler to match.
Interfaces and the Pointers That Satisfy Them
What if Camel’s Walk method had this signature instead?
func (c *Camel) Walk(miles int)
This line says that the type *Camel has a function called Walk. This is
important: *Camel is a type. It’s the “pointer to a Camel” type. It’s a
distinct type from (non-pointer) Camel. The part about it being a pointer is
part of its type. The Walk method is on the type *Camel. The Walk method (in
this new incarnation) is not on the type Camel. This becomes important when you
try to assign it to an interface.
c := Camel{“Bill”}
LongWalk(c)
// compiler output:
cannot use c (type Camel) as type Walker in function argument:
Camel does not implement Walker (Walk method has pointer receiver)
To pass a Camel into LongWalk now, you need to pass in a pointer to a Camel:
c := &Camel{“Bill”}
LongWalk(c)
or
c := Camel{“Bill”}
LongWalk(&c)
Note that this true even though you can still call Walk directly on Camel:
c := Camel{“Bill”}
c.Walk(500) // this works
The reason you can do that is that the Go compiler automatically converts this
line to (&c).Walk(500) for you. However, that doesn’t work for passing the
value into an interface. The reason is that the value in an interface is in a
hidden memory location, and so the compiler can’t automatically get a pointer to
that memory for you (in Go parlance, this is known as being “not addressable”).
Nil Pointers and Nil Interfaces
The interaction between nil interfaces and nil pointers is where nearly everyone
gets tripped up when they first start with Go.
Let’s say we have our Camel type with the Walk method defined on *Camel as
above, and we want to make a function that returns a Walker that is actually a
Camel (note that you don’t need a function to do this, you can just assign a
*Camel to a Walker, but the function is a good illustrative example):
func MakeWalker() Walker {
return &Camel{“Bill”}
}
w := MakeWalker()
if w != nil {
w.Walk(500) // we will hit this
}
This works fine. But now, what if we do something a little different:
func MakeWalker(c *Camel) Walker {
return c
}
var c *Camel
w := MakeWalker(c)
if w != nil {
// we’ll get in here, but why?
w.Walk(500)
}
This code will also get inside the if statement (and then panic, which we’ll
talk about in a bit) because the returned Walker value is not nil. How is that
possible, if we returned a nil pointer? Well, let’s go look back to the
instructions that get generated when we assign a value to an interface.
data := c
w := Walker{
type: &InterfaceType{
valtype: &typeof(c),
func0: &Camel.Walk
}
data: &data
}
In this case, c is a nil pointer. However, that’s a perfectly valid value to
assign to the Walker’s Data value, so it works just fine. What you return is a
non-nil Walker value, that has a pointer to a nil *Camel as its data. So, of
course, if you check w == nil, the answer is false, w is not nil… but then
inside the if statement, we try to call Camel’s walk:
And when we try to do c.Name, Go automatically turns that into (*c).Name, and
the code panics with a nil pointer dereference error.
Hopefully this makes sense, given our new understanding of how interfaces wrap
values, but then how do you account for nil pointers? Assume you want
MakeWalker to return a nil interface if it gets passed a nil Camel. You have to
explicitly assign nil to the interface:
func MakeWalker(c *Camel) Walker {
if c == nil {
return nil
}
return c
}
var c *Camel
w := MakeWalker(c)
if w != nil {
// Yay, we don’t get here!
w.Walk(500)
}
And now, finally, the code is doing what we expect. When you pass in a nil
*Camel, we return a nil interface. Here’s an alternate way to write the
function:
func MakeWalker(c *Camel) Walker {
var w Walker
if c != nil {
w = c
}
return w
}
This is slightly less optimal, but it shows the other way to get a nil
interface, which is to use the zero value for the interface, which is nil.
Note that you can have a nil pointer value that satisfies an interface. You
just need to be careful not to dereference the pointer in your methods. For
example, if *Camel’s Walk method looked like this:
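A sketch of such a nil-safe method (again assuming a Name field and using fmt just for output):

func (c *Camel) Walk(miles int) {
	if c == nil {
		fmt.Println("a nil camel walks nowhere")
		return
	}
	fmt.Printf("%s is walking %d miles\n", c.Name, miles)
}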
I hope this article helps you better understand how interfaces work, and helps
you avoid some of the common pitfalls and misconceptions newbies have about
them. If you want more information about the internals of interfaces and some
of the optimizations that I didn’t cover here, read Russ Cox’s article on Go
interfaces; I highly recommend it.
Functions in Go are first-class citizens, which means you can store a function value in a variable and call it like a regular function.
printf := fmt.Printf
printf("This will output %d line.\n", 1)
This ability can come in very handy for testing code that calls a function which is hard to properly test while testing the surrounding code. In Juju, we occasionally use function variables to allow us to stub out a difficult function during tests, in order to more easily test the code that calls it. Here’s a simplified example:
// in install/mongodb.go
package install

func SetupMongodb(path string) error {
	// suppose the code in this method modifies files in root
	// directories, mucks with the environment, etc…
	// Actions you actively don’t want to do during most tests.
}
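The post’s snippet for the calling package isn’t reproduced above; here’s a minimal sketch of what it might look like, inferred from the test below (the setup package name, the getPath helper, and Bootstrap’s body are assumptions):

// in setup/bootstrap.go
package setup

// setupMongo is a package-level function variable, so tests can swap in a stub.
var setupMongo = install.SetupMongodb

func Bootstrap() error {
	path := getPath()
	if err := setupMongo(path); err != nil {
		return err
	}
	// ... the rest of bootstrapping ...
	return nil
}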
So, suppose you want to write a test for Bootstrap, but you know SetupMongodb won’t work, because the tests don’t run with root privileges (and you don’t want to setup mongodb on the dev’s machine anyway). What can you do? This is where mocking comes in.
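The fakeSetup stub records the path it was given and returns a canned error; a minimal sketch, with field and method names taken from the test below:

// in setup/bootstrap_test.go
type fakeSetup struct {
	path string
	err  error
}

// setup has the same signature as SetupMongodb, so it can be assigned to setupMongo.
func (f *fakeSetup) setup(path string) error {
	f.path = path
	return f.err
}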
func TestBootstrap(t *testing.T) {
	f := &fakeSetup{err: errors.New("Failed!")}
	// this mocks out the function that Bootstrap() calls
	setupMongo = f.setup

	err := Bootstrap()
	if err != f.err {
		t.Errorf("Error from setupMongo not returned. Expected %v, got %v", f.err, err)
	}
	expPath := getPath()
	if f.path != expPath {
		t.Errorf("Path not correctly passed into setupMongo. Expected %q, got %q", expPath, f.path)
	}

	// and then try again with f.err == nil, you get the idea
}
Now we have full control over what happens in the setupMongo function: we can record the parameters passed into it, control what it returns, and test that Bootstrap at least uses the function’s API correctly.
Obviously, we need tests elsewhere for install.SetupMongodb to make sure it does the right thing, but those can be tests internal to the install package, which can use non-exported fields and functions to effectively test the logic that would be impossible from an external package (like the setup package). Using this mocking means that we don’t have to worry about setting up an environment that allows us to test SetupMongodb when we really only want to test Bootstrap. We can just stub out the function and test that Bootstrap does everything correctly, and trust that SetupMongodb works because it’s tested in its own package.
I started to write a blog post about how to get the most out of godoc, with examples in a repo, and then realized I could just write the whole post as godoc on the repo, so that’s what I did. Feel free to send pull requests if there’s anything you see that could be improved.
I actually learned quite a lot writing this article, by exploring all the nooks and crannies of Go’s documentation generation. Hopefully you’ll learn something too.
The Go compiler treats unused variables as a compilation error. This causes much
annoyance to some newbie Gophers, especially those used to writing in languages
that aren’t compiled, who want to play fast and loose with their code while
doing exploratory hacking.
The thing is, an unused variable is often a bug in your code, so pointing it out
early can save you a lot of heartache.
Here’s an example:
50 func Connect(name, port string) error {
51 hostport := ""
52 if port == "" {
53 hostport := makeHost(name)
54 logger.Infof("No port specified, connecting on port 8080.")
55 } else {
56 hostport := makeHostPort(name, port)
57 logger.Infof("Connecting on port %s.", port)
58 }
59 // ... use hostport down here
60 }
Where’s the bug in the above? Without the compiler error, you’d run the code
and have to figure out why hostport was always an empty string. Did we pass in
empty strings by accident? Is there a bug in makeHost and makeHostPort?
With the compiler error, it will say “53, hostport declared and not used” and
“56, hostport declared and not used”
This makes it a lot more obvious what the problem is… inside the scope of the
if statement, := declares new variables called hostport. These hide the
variable from the outer scope, thus, the outer hostport never gets modified,
which is what gets used further on in the function.
50 func Connect(name, port string) error {
51 hostport := ""
52 if port == "" {
53 hostport = makeHost(name)
54 logger.Infof("No port specified, connecting on port 8080.")
55 } else {
56 hostport = makeHostPort(name, port)
57 logger.Infof("Connecting on port %s.", port)
58 }
59 // ... use hostport down here
60 }
The above is the corrected code. It took only a few seconds to fix, thanks to
the unused variable error from the compiler. If you’d been testing this by
running it or even with unit tests… you’d probably end up spending a
non-trivial amount of time trying to figure it out. And this is just a very simple
example. This kind of problem can be a lot more elaborate and hard to find.
And that’s why the unused variable declaration error is actually a good thing.
If a value is important enough to be assigned to a variable, it’s probably a bug
if you’re not actually using that variable.
Bonus tip:
Note that if you don’t care about the variable, you can just assign it to the
empty identifier directly:
_, err := computeMyVar()
This is the normal way to avoid the compiler error in cases where a function
returns more than you need.
If you really want to silence the unused variable error and not remove the
variable for some reason, this is the way to do it:
v, err := computeMyVar()
_ = v // this counts as using the variable
Just don’t forget to clean it up before committing.
All of the above also goes for unused packages. And a similar tip for silencing
that error:
_ = fmt.Printf // this counts as using the package
Francesc Campoy recently posted about how to work on someone else’s Go repo from github. His description was correct, but I think there’s an easier way, and also one that might be slightly less confusing. Let’s say you want to work on your own branch of github.com/natefinch/gocog - here’s the easiest way to do it:
1. Fork github.com/natefinch/gocog on github
2. mkdir -p $GOPATH/src/github.com/natefinch/gocog
3. cd $GOPATH/src/github.com/natefinch/gocog
4. git clone https://github.com/YOURNAME/gocog .
5. (optional) go get github.com/natefinch/gocog
That’s it. Now you can work on the code, push/pull etc from your github repo as normal, and submit a pull request when you’re done.
go get is useful for getting code that you want to use, but it’s not very useful for getting code that you want to work on. It doesn’t set up source control. git clone does. What go get is handy for is getting the dependencies of a project, which is what step 5 does (only needed if the project relies on outside repos you don’t already have). (thanks to a post on G+ for reminding me that git clone won’t get the dependencies)
Also note, the path on disk is the same as the original repo’s URL, not your branch’s URL. That’s intentional, and it’s the key to making this work. go get is the only thing that actually cares if the repo URL is the same as the path on disk. Once the code is on disk, go build etc just expects import paths to be directories under $GOPATH. The code expects to be under $GOPATH/src/github.com/natefinch/gocog because that’s what the import statements say it should be. There’s no need to change import paths or anything wacky like that (though it does mean that you can’t have both the original version of the code and your branch coexisting in the same $GOPATH).
Note that this is actually the same procedure that you’d use to work on your own code from github, you just change step 1 to “create the repo in github”. I prefer making the repo in github first because it lets me set up the license, the readme, and the .gitignore with just a few checkboxes, though obviously that’s optional if you want to hack locally first. In that case, just make sure to set up the path under gopath where it would go if you used go get, so that go get will work correctly when you decide to push up to github. (updated to mention using go get after git clone)
This is just a collection of tips that would have saved me a lot of time if I had known about them when I was a newbie:
Build or test everything under the current directory and subdirectories:
go build ./...
go test ./...
Technically, both commands take a pattern to match the name of one or more packages, and the ... specifier is a wildcard, so you could do .../foo/... to match all packages under GOPATH with foo in their path.
Have an io.Writer that writes to an in-memory data structure:
b := &bytes.Buffer{}
thing.WriteTo(b)
Have an io.Reader read from a string (useful when you want to use a string as the input data for something):
r := strings.NewReader(myString)
thing.ReadFrom(r)
Copy data from a reader to a writer:
io.Copy(toWriter, fromReader)
Timeout waiting on a channel:
select {
case val := <-ch:
	// use val
case <-time.After(time.Second * 5):
	// timed out
}
Convert a slice of bytes to a string:
var b []byte = getData()
s := string(b)
Passing a nil pointer into an interface does not result in a nil interface:
func isNil(i interface{}) bool {
	return i == nil
}

var f *foo = nil
fmt.Println(isNil(f)) // prints false
The only way to get a nil interface is to pass the keyword nil:
var f *foo = nil
if f == nil {
	fmt.Println(isNil(nil)) // prints true
}
How to remember where the arrow goes for channels:
The arrow points in the direction of data flow, either into or out of the channel, and always points left.
The above is generalizable to anything where you have a source and destination, or reading and writing, or assigning.
Data is taken from the right and assigned to the left, just as it is with a := b. So, like io.Copy, you know that the reader (source) is on the right, the writer (destination) is on the left: io.Copy(dest, src).
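For example, here’s a small illustration of the left-pointing rule for channels, with io.Copy’s destination/source order shown alongside:

package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	ch := make(chan string, 1)
	ch <- "hello"    // send: data flows left, into the channel
	greeting := <-ch // receive: data flows left, out of the channel and into greeting
	fmt.Println(greeting)

	// Same rule as assignment: destination on the left, source on the right.
	io.Copy(os.Stdout, strings.NewReader("hello again\n"))
}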
If you ever think “man, someone should have made a helper function to do this!”, chances are they have, and it’s in the std lib somewhere.
I’ve been a developer at Canonical (working on Juju) for a little over 3 months, and I have to say, this is the best job I have ever had, bar none.
Let me tell you why.
1.) 100% work from home (minus ~2 one-week trips per year)
2.) Get paid to write cool open source software.
3.) Work with smart people from all over the globe.
#1 can’t be overstated. This isn’t just “flex time” or “work from home when you want to”. There is literally no office to go to for most people at Canonical. Working at home is the default. The difference is huge. My last company let us work from home as much as we wanted, but most of the company worked from San Francisco… which means when there were meetings, 90% of the people were in the room, and the rest of us were on a crappy speakerphone straining to hear and having our questions ignored. At Canonical, everyone is remote, so everyone works to make meetings and interactions work well online… and these days it’s easy with stuff like Google Hangouts and IRC and email and online bug tracking etc.
Canonical’s benefits don’t match Google’s or Facebook’s (you get the standard stuff, health insurance, 401k etc, just not the crazy stuff like caviar at lunch… unless of course you have caviar in the fridge at home). However, I’m pretty sure the salaries are pretty comparable… and Google and Facebook don’t let you work 100% from home. I’m pretty sure they barely let you work from home at all. And that is a huge quality of life issue for me. I don’t have to slog through traffic and public transportation to get to work. I just roll out of bed, make some coffee, and sit down at my desk. I get to see my family more, and I save money on transportation.
#2 makes a bigger difference than I expected. Working on open source is like entering a whole different world. I’d only worked on closed source before, and the difference is awesome. There’s purposeful openness and inclusion of the community in our development. Bug lists are public, and anyone can file one. Mailing lists are public (for the most part) and anyone can get on them. IRC channels are public, and anyone can ask questions directly to the developers. It’s a really great feeling, and puts us so much closer to the community - the people that have perhaps an even bigger stake in the products we make than we do. Not only that, but we write software for people like us. Developers. I am the target market, in most cases. And that makes it easy to get excited about the work and easy to be proud of and show off what I do.
#3 The people. I have people on my team from Germany, the UK, Malta, the UAE, Australia, and New Zealand. It’s amazing working with people of such different backgrounds. And when you don’t have to tie yourself down to hiring people within a 30 mile radius, you can afford to be more picky. Canonical doesn’t skimp on the people, either. I was surprised that nearly everyone on my team was 30+ (possibly all of them, I don’t actually know how old everyone is ;) That’s a lot of experience to have on one team, and it’s so refreshing not to have to try to train the scrappy 20-somethings to value the things that come with experience (no offense to my old colleagues, you guys were great).
Put it all together, and it’s an amazing opportunity that I am exceedingly pleased to have been given.
At the end of July, I started a new job at Canonical, the makers of Ubuntu Linux. Canonical employees mostly work from home, and use their own computer for work. Thus, I would need to switch to Ubuntu from Windows on my personal laptop. Windows has been my primary operating system for most of my 14-year career. I’ve played around with Linux on the side a few times, running a mail server on Mandrake for a while… and I’ve worked with CentOS as a server for the software at my last job… but I wouldn’t say I was comfortable spending more than a few minutes on a Linux terminal before I yearned to friggin’ click something already…. and I certainly hadn’t used it as my day-to-day machine.
Enter Ubuntu 13.04 Raring Ringtail, the latest and greatest Ubuntu release (pro-tip: the major version number is the year of release and the minor version number is the month; Canonical does two releases a year, in April and October, so they’re all .04 and .10, and the release names are alphabetical).
Installation on my 2 year old HP laptop was super easy. Pop in the CD I had burned with Ubuntu on it, and boot up… installation is fully graphical, not too different from a Windows installation. There were no problems installing, and only one cryptic prompt… do I want to use Logical Volume Management (LVM) for my drives? This is the kind of question I hate. There was no information about what in the heck LVM was, what the benefits or drawbacks are, and since it sounded like it could be a Big Deal, I wanted to make sure I didn’t pick the wrong thing and screw myself later. Luckily I could ask a friend with Linux experience… but it really could have done with a “(Recommended)” tag, and a link for more information.
After installation, a dialog pops up asking if I want to use proprietary third party drivers for my video card (Nvidia) or open source drivers. I’m given a list of several proprietary drivers and an open source driver. Again, I don’t know what the right answer is, I just want a driver that works, I don’t care if it’s proprietary or not (sorry, OSS folks, it’s true). However, trying to be a good citizen, I pick the open source one and…. well, it doesn’t work well at all. I honestly forget exactly what problems I had, but they were severe enough that I had to go figure out how to reopen that dialog and choose the Nvidia proprietary drivers.
Honestly, the most major hurdle in using Ubuntu has been getting used to having the minimize, maximize, and close buttons in the upper left of the window, instead of the upper right.
In the first week of using Ubuntu I realized something - 99% of my home use of a computer is in a web browser… the OS doesn’t matter at all. There’s actually very little I use native applications for outside of work. So, the transition was exceedingly painless. I installed Chrome, and that was it, I was back in my comfortable world of the browser.
Linux has come a long way in the decade since I last used it. It’s no longer the OS that requires you to drop into a terminal to do everyday things. There are UIs for pretty much everything that are just as easy to use as the ones in Windows, so things like configuring monitors, networking, and printers all work pretty much like they do in Windows.
So what problems did I have? Well, my scanner doesn’t work. I went to get drivers for it, and there are third party scanner drivers, but they didn’t work. But honestly, scanners are pretty touch and go in Windows, too, so I’m not terribly surprised. All my peripherals worked (monitors, mouse, keyboard, etc), and even my wireless printer worked right away. However, later on, my printer stopped working. I don’t know exactly why, I had been messing with the firewall in Linux, and so it may have been my fault. I’m talking to Canonical tech support about it, so hopefully they’ll be able to help me fix it.
Overall, I am very happy using Linux as my everyday operating system. There are very few drawbacks for me. Most Windows software has a Linux counterpart, and now even Steam games are coming to Linux, so there’s really very little reason not to make the switch if you’re interested.
I gave a talk at the Go Boston meetup last night and figured I should write it up and put it here.
The second thing everyone says when they read up on Go is “There are no generics!”.
(The first thing people say is “There are no exceptions!”)
Both are only mostly true, but we’re only going to talk about generics today.
Go has generic built-in data structures - arrays, slices, maps, and channels. You just can’t create your own new generic types, and you can’t write generic functions. So, what’s a programmer to do? Find another language?
No. Many, possibly even most, problems can be solved with the built-in data structures. You can write pretty huge applications just using maps and slices and the occasional channel. There may be a tiny bit of code duplication, but probably not much, and certainly not any tricky code.
However, there definitely are times when you need more complicated data structures. Most people writing Go solve this problem by using interface{}, the empty interface, which is basically like Object in C# or Java or void * in C/C++. It’s a thing that can hold any type… but then you need to type cast it to get at the actual type. This breaks static typing, since the compiler can’t tell if you make a mistake and pass the wrong type into something that takes an interface{}, and it can’t tell until runtime whether a cast will succeed or not.
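Here’s a small illustration (hypothetical code, not from any particular project) of the kind of mistake the compiler can’t catch once everything is an interface{}:

package main

import "fmt"

// first returns the first element; the compiler has no idea what’s inside.
func first(vals []interface{}) interface{} { return vals[0] }

func main() {
	vals := []interface{}{"a string"}
	v := first(vals)

	// The mistake only surfaces at runtime, via the type assertion.
	n, ok := v.(int)
	fmt.Println(n, ok) // 0 false
}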
So, is there any solution? Yes. The inspiration comes from the standard library’s sort package. Package sort can sort a slice of any type; it can even sort things that aren’t slices, if you’ve made your own custom data structure. How does it do that? To sort something, it must support the methods on sort.Interface. Most interesting is Less(i, j int) bool. Less returns true if the item at index i in your data structure is less than the item at index j in your data structure. Your code has to implement what “less” means… and by only using indices, sort doesn’t need to know the types of objects held in your data structure.
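As a reminder of what that looks like in practice, here’s a small sort.Interface implementation (the standard library API; the person type is just for illustration):

package main

import (
	"fmt"
	"sort"
)

type person struct {
	name string
	age  int
}

// byAge implements sort.Interface; sort only ever sees indices.
type byAge []person

func (p byAge) Len() int           { return len(p) }
func (p byAge) Less(i, j int) bool { return p[i].age < p[j].age }
func (p byAge) Swap(i, j int)      { p[i], p[j] = p[j], p[i] }

func main() {
	people := byAge{{"Ann", 40}, {"Bob", 25}, {"Cyd", 33}}
	sort.Sort(people)
	fmt.Println(people)
}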
This use of indices to blindly access data in a separate data structure is how we’ll implement our strongly typed tree. The tree structure will hold an index as its data value in each node, and the indices will index into a data structure that holds the actual objects. To make a tree of a new type, you simply implement a Compare function that the tree can use to compare the values at two indices in your data structure. You can use whatever data structure you like, probably a slice or a map, as long as you can use integers to reference values in the data structure.
In this way we separate the organization of the data from the storage of the data. The tree structure holds the organization, a slice or map (or something custom) stores the data. The indices are the generic pointers into the storage that holds the actual strongly typed values.
This does require a little code for each new tree type, just as using package sort requires a little code for each type. However, it’s only a few lines for a few functions, wrapping a tree and your data.
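A rough sketch of the idea (a hypothetical, simplified API, not the code in the repository linked below): the tree stores only int indices, and a user-supplied less function compares the values those indices refer to.

package main

import "fmt"

type node struct {
	idx         int
	left, right *node
}

// Tree organizes indices; the actual data lives elsewhere.
type Tree struct {
	less func(i, j int) bool
	root *node
}

func New(less func(i, j int) bool) *Tree { return &Tree{less: less} }

func (t *Tree) Insert(idx int) { t.root = insert(t.root, idx, t.less) }

func insert(n *node, idx int, less func(i, j int) bool) *node {
	if n == nil {
		return &node{idx: idx}
	}
	if less(idx, n.idx) {
		n.left = insert(n.left, idx, less)
	} else {
		n.right = insert(n.right, idx, less)
	}
	return n
}

// InOrder visits the stored indices in sorted order.
func (t *Tree) InOrder(visit func(idx int)) {
	var walk func(*node)
	walk = func(n *node) {
		if n == nil {
			return
		}
		walk(n.left)
		visit(n.idx)
		walk(n.right)
	}
	walk(t.root)
}

func main() {
	// The strongly typed data lives in a plain slice...
	names := []string{"delta", "alpha", "charlie", "bravo"}

	// ...and the tree only ever sees indices into it.
	t := New(func(i, j int) bool { return names[i] < names[j] })
	for i := range names {
		t.Insert(i)
	}
	t.InOrder(func(idx int) { fmt.Println(names[idx]) })
}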
You can check out an example binary search tree I wrote that uses this technique in my github account
This required only 36 lines of code to make the actual tree structure (including empty lines and comments).
In some simple benchmarks, this implementation of a tree is about 25% faster than using the same code with interface{} as the values and casting at runtime… plus it’s strongly typed.
The Go programming language is built from the ground up to implicitly encourage Go projects to be open source. If you want your project not only to contribute to open source, but to encourage other people to write open source code, Go is a great language to choose.
Let’s look at how Go does this. These first two points are overly obvious, but we should get them out of the way.
The language is open source
You can go look at the source code for the language, the compilers, and the build tools for the language. It’s a fully open source project. Even though a lot of the work is being done by Google engineers, there are hundreds of names on the list of contributors of people who are not Google employees.
The standard library is open source
Want to see high quality example code? Look at the code in the standard library. It has been carefully reviewed to be of the best quality, and in canonical Go style. Reading the standard library is a great way to learn the best ways to use and write Go.
Ok, that’s great, but what about all the code that isn’t part of Go itself?
The design of Go really shows its embrace of open source in how third party code is used in day to day projects.
Go makes it trivial to use someone else’s code in your project
Go has distributed version control built-in from the ground up. If you want to use a package from github, for example, you just specify the URL in the imports, as if it were a local package:
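Presumably something like this (fake/foo is the placeholder package used in the next paragraph):

import "github.com/fake/foo"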
You don’t have to go find and download fake/foo from github and put it in a special directory or anything. Just run “go get github.com/fake/foo”. Go will then download, build, and install the code, so that you can reference it… nicely stored in a directory defined by the URL, in this case $GOPATH/src/github.com/fake/foo. Go will even figure out what source control system is used on the other side so you don’t have to (support for git, svn, mercurial, and bazaar).
What’s even better is that the auto-download happens for anyone who calls “go get” on your code repository. No more giving long drawn-out installation instructions about getting half a dozen 3rd party libraries first. If someone wants your code, they type “go get path.to/your/code”, and Go will download your code, and any remote imports you have (like the one for github above), any remote imports that code has, etc, and then builds everything.
The fact that this is available from the command line tools that come with the language makes it the de facto standard for how all Go code is written. There’s no fragmentation in the community about how packages are stored, accessed, used, etc. This means zero overhead for using third party code, it’s as easy to use as if it were built into the Go standard library.
Sharing code is the default
Like most scripting languages (and unlike many compiled languages), using source code from another project is the default way to use third party code in Go. Go creates a monolithic executable during its build, so there are no DLLs to create and distribute in the way you often see with other compiled languages. In theory you could distribute the compiled .a files from your project for other people to link to in their project, but this is not encouraged by the tooling, and I’ve personally never seen anyone do it.
All Go code uses the same style
Have you ever gone to read the source for a project you’d like to contribute to, and had your eyes cross over at the bizarre formatting the authors used? That almost never happens with Go. Go comes with a code formatting tool called gofmt that automatically formats Go code to the same style. The use of gofmt is strongly encouraged in the Go community, and nearly everyone uses it. Most text editors have an extension to automatically format your code with gofmt on save, so you don’t even have to think about it. You never have to worry about having a poorly formatted library to work with… and in the very rare situation where you do, you can just run it through gofmt and you’re good to go.
Easy cross platform support
Go makes it easy to support multiple platforms. The tooling can create native binaries for any popular operating system from the same source on a single machine. If you need platform-specific code, it’s easy to specify code that only gets compiled for a single platform, simply by appending _<os> to a file name, e.g. path_windows.go will only be compiled for builds targeting Windows.
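For instance, a hypothetical platform-specific file might look like this (the package name and constant are made up for illustration):

// path_windows.go: compiled only when building for Windows
package mypath

// Separator is the platform’s path separator; path_linux.go, path_darwin.go,
// etc. would each provide their own value.
const Separator = `\`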
Built-in documentation and testing
Go comes with a documentation generator that produces HTML or plain text from minimally formatted comments in the code. It also comes with a standard testing package that can run unit tests, performance benchmarks, and runnable example code. Because this is all available in the standard library and with the standard tools, nearly everyone uses it… which means it’s easy to look at the documentation for any random Go package, and easy to check whether the tests pass, without having to install some third party support tool. Because it’s all standardized, several popular websites have popped up to automate generating (and hosting) the documentation for your project, and you can easily run continuous integration on your package with only a single line in the setup script - “language: go”.
Conclusion
Everything about Go encourages standardization and openness… which not only makes it possible to use other people’s code, it makes it easy to use other people’s code. I hope to see Go blossom as a language embraced by the open source community, as they discover the strengths that make it uniquely qualified for open source projects.
The best things about Go have nothing to do with the language.
Single Executable Output
Go compiles into a single executable that runs natively on the target OS. No more needing to install java, .net, mono, python, ruby, whatever. Here’s your executable, feel free to run it like a normal person. And you can target builds for any major OS (windows, linux, OSX, BSD).
One True Coding Style
GoFmt is a build tool that formats your source code in the standard Go format. No more arguing about spacing or brace matching or whatever. There is one true format, and now we can all move on… and even better, many editors integrate GoFmt so that your code can be automatically formatted whenever you save.
Integrated Testing
Testing is integrated into the language. Name a file with the suffix _test.go and it’ll only build under test. You run tests simply by running “go test” in the directory. You can also define runnable example code with output that is checked at test time. This example code is then included in the documentation (see below)… now you’ll never have examples in documentation with errors in them. Finally, you can have built-in benchmarks that are controlled by the go tool to automatically run enough iterations to get a significant result, reported as the time per operation.
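As a quick illustration, here’s a hypothetical, self-contained test file (Double is defined here only to keep the sketch complete; it would normally live in a non-test file):

// mathutil_test.go
package mathutil

import (
	"fmt"
	"testing"
)

func Double(n int) int { return n * 2 }

func TestDouble(t *testing.T) {
	if got := Double(2); got != 4 {
		t.Errorf("Double(2) = %d, want 4", got)
	}
}

// Runnable example; go test checks the Output comment.
func ExampleDouble() {
	fmt.Println(Double(3))
	// Output: 6
}

func BenchmarkDouble(b *testing.B) {
	for i := 0; i < b.N; i++ {
		Double(i)
	}
}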
Integrated Documentation
HTML documentation is built into the language. No need for ugly HTML in your source or weirdly formatted comments. Plaintext comments are turned into very legible documentation, and see above for examples that actually run and can have their output tested as a part of the tests.
DVCS
Support for distributed version control is built into the language. Want to reference code from a project on github? Just use the url of the project as the import path in your code, e.g. import “github.com/jsmith/foo” When you build your code it’ll get downloaded and built automatically.
Want to get a tool written in go? From the command line type “go get github.com/jsmith/bar” - go will download the source, build it, and install the executable in your path. Now you can run bar.
Any git, SVN, mercurial, or bazaar repository will work, but all the major public source code sites are supported out of the box - github, bitbucket, google code, and launchpad.
Other Cool Stuff
Debugging with gdb
Integrated profiling tools
Easy to define custom includes per targeted OS/architecture (a simple _windows suffix will only build when targeting Windows)
Integrated code parsers and lexers
Do you even care about the actual language anymore? I wouldn’t. But just in case:
C-like
Garbage Collected
Statically typed
…but with type inference so you’re not typing boilerplate all the time: a := "my string"
Implicit interfaces - if a type has the methods of an interface, it implements the interface
Pointers but no pointer arithmetic (thank god)
First class functions
No exceptions
…but multiple returns from a single function so you don’t have to overload return types
Everything is UTF8 (both strings and source code.. yes you can have Θ as a variable name now)
Highly performant asynchronous code that is trivial to write
A deep standard library that does most of the boring stuff for you
I recently got very enamored with Go, and decided that I needed to write a real program with it to properly get up to speed. One thing came to mind after reading a lot on the Go mailing list: a code generator.
I had worked with Ned Batchelder at a now-defunct startup, where he developed cog.py. I figured I could do something pretty similar with Go, except, I could do one better - Go generates native executables, which means you can run it without needing any specific programming framework installed, and you can run it on any major operating system. Also, I could construct it so that gocog supports any programming language embedded in the file, so long as it can be run via command line.
Gocog runs very similarly to cog.py - you give it files to look at, and it reads the files looking for specially tagged embedded code (generally in comments of the actual text). Gocog extracts the code, runs it, and rewrites the file with the output of the code embedded.
Thus you can do something like this in a file called test.html:
if you run gocog over the file, specifying python as the command to run:
gocog test.html -cmd python -args %s -ext .py
This tells gocog to extract the code from test.html into a file with the .py extension, and then run python <filename> and pipe the output back into the file.
This is what test.html looks like after running gocog:
Note that the generator code still exists in the file, so you can always rerun gocog to update the generated text.
By default gocog assumes you’re running embedded Go in the file (hey, I wrote it, I’m allowed to be biased), but you can specify any command line tool to run the code - python, ruby, perl, even compiled languages if you have a command line tool to compile and run them in a single step (I know of one for C# at least).
“Ok”, you’re saying to yourself, “but what would I really do with it?” Well, it can be really useful for reducing copy and paste or recreating boilerplate. Ned and I used it to keep a schema of properties in sync over several different projects. Someone on Golang-nuts emailed me and is using it to generate boilerplate for CGo enum properties in Go.
Gocog’s source code actually uses gocog - I embed the usage text into three different spots for documentation purposes - two in regular Go comments and one in a markdown file. I also use gocog to generate a timestamp in the code that gets displayed with the version information.
You don’t need to know Go to run Gocog, it’s just an executable that anyone can run, without any prerequisites. You can download the binaries of the latest build from the gocog wiki here: https://github.com/natefinch/gocog/wiki
Feel free to submit an issue if you find a bug or would like to request a feature.
No, not contests; Go as in golang (the programming language), and Win as in Windows.
Quick background - Recently I started writing a MUD in Go for the purposes of learning Go, and writing something that is non-trivial to code. MUDs are particularly suited to Go, since they are entirely server based, are text-based, and are highly concurrent and parallel problems (which is to say, you have a whole bunch of people doing stuff all at the same time on the server).
Anyway, after getting a pretty good prototype of the MUD up and running (which was quite fun), I started thinking about using Go for some scripty things that I want to do at work. There’s a bit of a hitch, though… the docs on working in Windows are not very good. In fact, if you look at golang.org, they’re actually non-existent. This is because the syscall package changes based on what OS you’re running on, and (not surprisingly) Google’s public golang site is not running on Windows.
So, anyway, a couple notes here on Windowy things that you (I) might want to do with Go: