Michael Nielsen joins the Recurse Center to help build a research lab

We’re thrilled to announce that Michael Nielsen will be spending the next year at the Recurse Center to help us launch a research lab focused on discovering better ways of making software.

As part of the launch of the research lab, we’ll also be hosting a symposium later this year, and we will be announcing new ways for researchers, industry professionals, and other kindred spirits to become part of the RC community in the coming months.

Research at RC

Our sole focus since we started RC four years ago has been running an educational retreat: A self-directed program in New York which brings together people from around the world for three-month stints to learn from each other and become better programmers, regardless of whether they’ve been programming for a few months or a few decades. We see a research lab as a natural outgrowth of this work. The Recurse Center has always been a place where people come to learn new things. Now, we hope it will also become a place where people discover new things about the world.

Our research lab and educational retreat will be linked and mutually reinforcing. The lab will benefit from the diverse stream of programmers who come through our educational retreat, as well as the supportive, intellectual, and energizing culture we’ve grown. Our retreat will benefit from the expanded group of people who will now be able to participate in our community and the energy, ideas, and expertise they will bring with them.

Michael will be joining us for one year as a Recurse Center Research Fellow. Michael has previously worked on and written about a wide range of topics, from Lisp as the Maxwell’s equations of software to the future of science. He coauthored the canonical quantum computing textbook, and wrote the books Reinventing Discovery and Neural Networks and Deep Learning. He has also been a Research Fellow at Caltech and a senior faculty member at the Perimeter Institute.

We met Michael in 2012, when he attended the first half of our summer batch, and ever since he has impressed us with his deep intellect, unassuming demeanor, and extraordinary knack for looking at things from different perspectives and seeing connections others don’t. We are delighted to have Michael coming back to RC for a full year.

In the future, we hope to fund multiple research fellows for multiple years. In the short term, we will be focusing on finding new ways to bring together and exchange ideas with people doing related work outside of RC. We will be sharing more about these efforts and our symposium in the coming weeks.

RC’s approach to research

In the process of figuring out how we should start this experiment, my cofounder, David Albert, has had extensive discussions with many researchers and other wise people.1 We’ve gotten a lot of conflicting feedback, but we’ve also heard some consistent themes, which have greatly informed our approach.

Fund people, not projects. Most research today is funded by writing proposals for specific projects. But what if you find a better problem to work on while doing your work? Worse, project funding is frequently tied to deliverables, which can encourage short-term thinking and discourage high-risk and potentially high-reward explorations.

Look to the edges. Much mainstream research work focuses on what is currently fashionable. Yet we believe that much of the most exciting work, what will ultimately be understood as the truly luminous ideas, is at the edges of our knowledge, currently barely visible, not yet in the mainstream. In the words of Stewart Brand, we need to “look to the edges to see where the center is going.” The fashionable fields are often important and worth funding. But if you want to make a difference with limited funding, you need to be pushing the boundary, doing edgy things, things that are not yet part of the mainstream, but with luck and imagination and daring will help create the mainstream of the future.

What we aim to produce. Much academic research is focused on publishing papers. That makes sense for some types of research, including some of the work we hope to support at RC. But in software it may make just as much sense or more to write an interactive essay, release a demo or prototype, or share the source code for a full-fledged piece of software.

Research takes time, and there aren’t enough long-term funders. Some people said they thought the lower bound for useful work was two or three years, and others said it was at least 10. Regardless, we know we need to fund people for more than a year at a time, and we will be doing this as soon as it’s possible and responsible for us to do so.

We’d like to thank Sam Altman, Greg Brockman, Will Byrd, David Dalrymple (aka Davidad), Patrick Dubroy, Evelyn Eastmond, Jonathan Edwards, Matthias Felleisen, Dan Friedman, Chaim Gingold, Adele Goldberg, Philip Guo, Laura Hill, Ken Kahn, Alan Kay, Lindsey Kuper, Robert Lefkowitz (aka r0ml), Chris Martens, Matt Might, Henry Minsky, Margaret Minsky, Marvin Minsky, David Nolen, Peter Norvig, Ken Perlin, Cynthia Solomon, Oliver Steele, Bret Victor, and Jean Yang for providing their time, expertise, and advice for starting a research lab.

Research focus

More than a decade ago, Alan Kay wrote:

There is nothing in [software engineering] that is like the construction of the Empire State building in less than a year by less than 3000 people: they used powerful ideas and power tools that we don’t yet have in software development. If software does “engineering” at all, it is too often at the same level as the ancient Egyptians before the invention of the arch (literally before the making of arches: architecture), who made large structures with hundreds of thousands of slaves toiling for decades to pile stone upon stone: they used weak ideas and weak tools, pretty much like most software development today.

We believe this is still true, and over the long run RC’s lab will aim to expand the set of “powerful ideas and power tools” we have in the world of software. To use Alan Kay’s metaphor, we will aspire to discover, understand, and build new types of arches.

That goal is intentionally broad. Many people have advised us that a research focus should be specific enough to attract people to work together towards a shared vision and also broadly interpretable, so promising paths of inquiry aren’t off-limits.

To give some flavor for the types of work we’re excited by, here’s a small sampling of things that inspire us: VPRI’s STEPS project, Plan 9, Ken Perlin’s experiments with visual grammars, Vi Hart’s videos, Mirage, Livegrep, Growing a Language, functional composition, Systems Software Research is Irrelevant, and how Eve is making programming more declarative and Elm is making it more observable. Additionally, we’ve been inspired and informed by a number of books, most notably Seymour Papert’s Mindstorms, Jane Jacobs’s The Death and Life of Great American Cities, Richard Hamming’s The Art of Doing Science and Engineering, Mitchell Waldrop’s The Dream Machine, and Jon Gertner’s The Idea Factory.

Funding

RC has been sustained the past four years by recruiting fees from partner companies who hire our alumni. Late last year, we reached a tipping point, and finally became convinced that we could sustainably run our educational retreats for free to all participants based purely off recruiting revenue.

One of the many open questions about this experiment is whether it’s possible to sustainably fund research off recruiting fees alone. We believe it is. Our bottom line, and thus ability to fund the lab, is ultimately dependent on our ability to attract great people who someday choose to take a job through us. We believe our research lab will have a halo effect, attracting even more great people to our educational retreat, and that some of them will choose to take jobs through us, thereby funding both the retreat and research.

Only the first step

We know that in the world of research, a year is the blink of an eye. We see today as a small step in a long path to building a full lab and doing meaningful research. Fortunately or not, we have limited capital and a tight cash flow: We are simply not yet able to fund, for example, multiple people for decade-long fellowships. This limits our short-term options but not our long-term ambition. Our constraints also bring some benefits: They force us to focus, and ensure that if we do build a successful lab, it will be financially sustainable.

Like most things we do, this will be an experiment, and we expect to make many mistakes and adjustments as we go. Nevertheless, we are committed to supporting research at RC, both because we think it will make RC significantly better, and because we think it is a good and interesting thing to attempt.

  1. Some of the notes from these discussions are so good that I’m convinced they should be edited and compiled into a long blog post or a short book.

What people do at the Recurse Center (April 2015)

Last summer and fall, Nick wrote two blog posts to answer one of the most common questions we hear from applicants: What do people actually do at the Recurse Center? We still don’t have a simple answer to that question; in fact, we hope we never will. Recursers have diverse backgrounds (we’ve had biologists, musicians, lawyers and CS grads, among many others), and work on an incredible variety of projects.

To give folks interested in applying to the Recurse Center a better sense of what to expect, we thought we’d check in with the current batches to see what they’re currently working on.

A point from Nick’s post that bears repeating is that everything written at the Recurse Center is open source, and everyone works on projects they choose for themselves based on their interests and what they want to learn.

So, without further ado, here are twelve things people are currently doing at the Recurse Center:

  • Develop new ways for programmers to communicate with their machines. Pam built MacVimSpeak, an OS X app that executes spoken Vim commands.

  • Write a language. Sarah wrote Data Monster, a domain-specific language that transpiles to d3.js.

  • Write a Regex Engine. Geoffrey built a simple Regex engine in Scala, and is now adding support for visualizing state machines.

  • Build a view engine. Michelle has been working on Prismo, a front-end templating system for JavaScript which automatically keeps track of variable dependencies and only refreshes the part of the page that’s changed.

  • Write a game. Noella is building a Python implementation of 2048, Aishwarya is writing Pacman in JavaScript, and Nat has been writing a variety of games in JavaScript.

  • Make bots (and make it easier to make bots). Zulip (our internal chat system) bots are a popular project for Recursers. Nikki and Eric built DelayBot, Agustín has built PingBot, and Andrew has built robotbotbot, a platform to make it easier to create bots.

  • Contribute to larger open source projects. Karthik has been working on a native remote desktop client for Guacamole, a clientless remote desktop gateway that supports standard protocols like VNC and RDP. Aditya and Ken contributed to Mozilla’s Servo browser engine project.

  • Find new ways to analyze and visualize data. Alex is working on a Twitter word association project which allows users to search for words and find the terms most strongly correlated with them in recent tweets. Agustín is working on a Python package that scrapes data from Excel spreadsheets.

  • Use programming to explore and share other interests. Cory is working on ComicGator, a webcomic aggregator. Mykola is building a live light sequencer that uses MIDI instruments to trigger LED animations, and Gonçalo is building Music Gist.

  • Learn a new skill or language, or deepen knowledge of an old one. Yuta read The Little Schemer, and then wrote a Scheme interpreter. Mudit took a course on Compilers, and Pietro has been working through Learning Clojure. Anthony has been taking algorithms courses and created a datalogger Android app which uploads to Amazon S3.

  • Work with residents. Several Recursers started implementing the Paxos algorithm for solving consensus in a network of unreliable nodes after resident Neha Narula gave a talk about her work on databases and consistency. Resident Mark Dominus worked with Aditya on his Go implementation of Git and with Alex on her synthetic implementation of hashes in Python.

  • Reflect on and share their experiences. Lots of Recursers keep journals and blog about their experiences, like Pam, Nat, Luna and Ahmed.

If you got excited reading about the projects above or have been daydreaming about having enough time to learn a new language, design a game or work on your open source project, apply to the Recurse Center.

Rachel Vincent

Announcing four new residents for summer 2015

We’re excited to announce that Frank Wang, Allison Parrish, Raquel Vélez and Jonathan Edwards will be in residence at the Recurse Center this summer!

If you’d like to work with Frank, Allison, Raquel, Jonathan or residents like them, apply to the Recurse Center.

Frank Wang will be in residence from 6/1 – 6/4. Frank is a PhD student at MIT focusing on building secure systems. He did his undergraduate degree at Stanford, focusing on applied cryptography. He runs the MIT security seminar where top academics come and talk about their most recent research. He is also a member of Roughdraft Ventures, which provides small amounts of capital to early-stage student startups. He is currently running a summer program for early-stage security companies called Cybersecurity Factory. He has interned on the security teams at Google and Facebook as well as consulted for security companies like Qualys. When he is not busy worrying about your security, he enjoys going to art museums and being outdoors.

Allison Parrish will be in residence from 6/8 – 6/19. Allison is a computer programmer, poet, educator and game designer who lives in Brooklyn. Her teaching and practice address the unusual phenomena that blossom when language and computers meet. Allison is currently the Digital Creative Writer-in-Residence at Fordham University and an adjunct professor and “something-in-residence” at NYU’s Interactive Telecommunications Program, where she teaches a course on writing computer programs that generate poetry.

Raquel Vélez will be in residence from 7/20 – 7/23. Raquel is a Senior Software Developer at npm, Inc. in Oakland, CA. She has previously worked at Caltech, NASA JPL, the MIT Lincoln Laboratory, and various universities in Europe. In her off time, you can find her baking, teaching NodeBots not to fall off of tables, and speaking. Also, hanging out with her hilarious husband and two cats dressed in dog suits.

Jonathan Edwards will be in residence from 8/24 – 8/28. Jonathan has been programming for 45 years. He was cofounder and CTO of IntraNet, Inc. where he built a document-oriented transactional replicated database in the ’80s. He learned the most about programming by having to carry a beeper for 15 years. He is currently a Research Fellow at MIT CSAIL and a member of the Communications Design Group at SAP Labs. He blogs at alarmingdevelopment.org. He specializes in being tragically ahead of his time.

Rachel Vincent

A guide to RC at PyCon 2015

PyCon 2015 begins this week, and we’re excited to have many Recurse Center employees, alumni, and residents attending and presenting. We thought it would be nice to collect all the talks by Recursers in one place as a reference for our community and others interested in learning more about it.

In addition to the talks below, the Recurse Center will be at booth #718, where you will find RC employees Zach Allaun and Thomas Ballinger, as well as an ever-changing line-up of past Recursers who will be happy to answer any questions you have about RC. Come say hi!

Friday, April 10th

To kick things off with a healthy dose of energy and excitement, Julia Evans (RC Fall 2013) will be giving the opening statements on Friday at 9am.

Later in the morning, long-time RC resident Jessica McKellar will be going on “a weird and wonderful compiler journey from RPython to C to JavaScript” in her talk about Python in the browser.

In the afternoon, Miriam Lauter (RC Summer 2, 2014) will be talking about how to make your own smart air conditioner using Python and a Raspberry Pi.

Saturday, April 11th

Amy Hanlon (RC Winter 2014) will investigate a series of Python Wats related to identity, mutability, and scope at 10:50am.

At the same time, Julia Evans will return to show how learning about systems programming and kernels can help you become more effective with your everyday Python debugging.

Directly after that, Andreas Dewes (RC Winter 2014) will talk about learning from others’ mistakes and the benefits and pitfalls of statically analyzing Python code.

Next up, Nina Zakharenko (RC Summer 2013) will talk about technical debt and review some case studies and ways to pay it down.

After lunch, Allison Kaptur (facilitator emeritus and RC Summer 2012) will dive into the CPython interpreter to track down a mysterious bug in Byterun, a Python interpreter written in Python. Allison will also be doing a second talk immediately following about understanding CPython without reading the code.

Simultaneously, past RC Resident Glyph will discuss the ethical consequences of our collective activities.

Past RC resident Brandon Rhodes will explore bytearrays and whether or not their performance gains are worth their added complexity.

Sasha Laundy (RC Winter 2013) will share advice for developing two complementary and perennially useful skills: giving and getting technical help.

Sunday, April 12th

Past RC resident Jacob Kaplan-Moss will help kick off the day with a keynote talk at 9:20am.

Decky Coss (RC Winter 2014) will be presenting a poster about building a Python MIDI controller during the poster session from 10am to 1:10pm.

Finally, Thomas Ballinger (facilitator and RC Winter 2012) will do some terminal whispering Sunday afternoon and will show how you can build and modify terminal-based tools and talk to your terminal from scratch.

Founded in 2011, the Recurse Center is a free, self-directed, educational retreat for people who want to get better at programming, whether they’ve been coding for three decades or three months. The retreat is free for everyone, and offers need-based living-expense grants up to $7,000 to women and people from groups traditionally underrepresented in programming.

Mark Dominus and Ben Orenstein are Recurse Center residents

We’re excited to announce that Mark Dominus and Ben Orenstein will be in residence at the Recurse Center this spring and summer!

If you’d like to work with Ben, Mark or residents like them, apply to the Recurse Center.

Mark Dominus will be in residence from 4/20 – 4/21 and 4/27 – 4/28. Mark has been programming in various capacities since around 1976. He is best known for writing the 2005 book Higher-Order Perl, in which he adapted higher-order programming techniques widely used in Lisp, Haskell, and SML for use in Perl. His other achievements include setting up Time-Warner’s first corporate web site, developing an online catalog, recommendation, and shopping system for Estée Lauder, and bringing “The Dysfunctional Family Circus” to the Web. Mark also loves Unix system programming, mathematics, and crocuses.

Ben Orenstein will be in residence from 7/20 – 7/24. Ben hosts the Giant Robots Podcast, runs Upcase, and co-created Trailmix. He is a frequent teacher and speaker, and works at thoughtbot in Boston.

Rachel Vincent

Code Words Issue Two

We’re excited to announce Issue Two of Code Words, our quarterly publication about programming.

In Not everything is an expression, Michael Robert Arntzenius (RC Summer 1, 2014) presents a look at syntax classes and their importance. We especially love the ability to dynamically change the language used in the examples, as well as the careful syntax-highlighting to help explain the structure of the code.

Recurse Center facilitator Mary Rose Cook takes a deep dive into the inner workings of Git in Git from the inside out. Be sure to check out Gitlet, her JavaScript implementation of Git, after you’re done reading.

Current Recurser Jim Shields shares his experience digging into HTTP in How I learned to (stop worrying and) love HTTP, setting a great example of how to be rigorous and curious as a new programmer.

Last but not least, Nemanja Stanarevic (RC Summer 2, 2014) provides a thoughtful Introduction to reactive programming, inspired in part by Mary’s Introduction to functional programming from Code Words Issue One.

In addition to all of the writers, we’d like to thank Aki Yamada (RC Summer 2013), Rose Ames (RC Winter 2014), Erik Taubeneck (RC Summer 2013), Danielle Pham (RC Summer 1, 2014), and Jari Takkala (RC Fall 2013) for all their careful editing.

Code Words is a quarterly publication written and edited by the Recurse Center community. Like the Recurse Center itself, we aim to make Code Words accessible and useful to both new and seasoned programmers, and to share the joyful approach to programming and learning that typifies Recursers.

Rachel Vincent

Hacker School is now the Recurse Center

Today, we’re correcting one of our oldest and biggest mistakes: We’re changing our name from Hacker School to the Recurse Center.

While catchy, “Hacker School” has always been an actively bad name for us. Both words are problematic and misleading. “Hacker” is bad because so much of the world thinks of hackers as computer criminals and not clever programmers, which is the meaning we intended. And even for many people familiar with our use of the word, “hacker” can feel exclusionary. (“Hacker” was also not exactly helpful to the roughly 30% of each batch who cross the U.S. border to get here.)

“School” is bad for us because it implies the trappings of traditional schools – teachers, classes, and curricula – instead of simply a place where people learn, which is all we intended by it.

Taken together, “Hacker School” is even worse: It sounds like the name of a coding bootcamp. This was a problem we didn’t anticipate, because bootcamps weren’t a thing in 2011. But today, bootcamps are everywhere, and I can’t begin to count the number of times I’ve explained to people that we are not a bootcamp.

Despite our best efforts, the problems with our name have grown worse over time. The media and others have taken to using “hacker school” generically to refer to bootcamps, and despite our many protestations, we’ve failed to stop this. Having our name co-opted and used generically for something so different has been the source of seemingly endless confusion.

There are several downsides to changing our name. To the people familiar with us, “Hacker School” has many positive connotations. It’s memorable, playful, and easily pronounced. We own the .com. We’ve spent years building up our reputation. And even though the name has so many problems, we’re fond of it. Giving up the name “Hacker School” feels a bit like losing an old friend.

A fundamental challenge when running a business is figuring out when you should try to change the world, and when you need to change yourself instead. We believe this is a case of the latter. We concluded this by taking a long-term view. May will mark the five-year anniversary of when we quit our jobs to start building this company, and this summer is the four-year anniversary of Hacker School (or I should say, the Recurse Center). In many ways, nearly half a decade is a long time, but if we hope to build an institution that will last, our history to date will be a blip. Seen this way, it’s obvious we should change our name.

After too much deliberation, we’ve chosen the Recurse Center as our new name, primarily because:

  • “Recurse” gives a friendly nod to programming without the baggage of “hacker” (and we like the connection to going deeper).
  • “Center” doesn’t have the misleading connotations “school” has.
  • We were able to get recurse.com, which is short, pronounceable, and easy to spell.

While our name is changing, who we are and what we do is not. We hope that by calling ourselves the Recurse Center we can focus on doing the work we care about and sharing it with the world, and not explaining why our name doesn’t mean what people think it does.

Onwards!

Alumni interviewers

Starting this month we’ll be paying alumni to interview Hacker School applicants on a part-time basis. Hacker School facilitators and founders will also continue to do interviews, but our goal is to transition to 100% alumni interviewers as soon as practical.

The only change for applicants should be a greater and more varied number of interview times available. We’ll now have interviewers across nine hours of time zones, and we hope to have more interview times available on weeknights and during weekends. We think this will make it easier for applicants to find times that fit their schedules.

Our admissions process will otherwise remain the same: A short written application, a conversational Skype interview, and a brief remote pair programming session. During the conversational interview, we typically ask questions about why people want to do Hacker School, what they hope to learn and get out of it, how they want to improve as programmers, and questions about their programming backgrounds. For the pairing interview, we use screen-sharing software to write code together, with the applicant as the primary driver. Applicants may choose to work on a task suggested by us ahead of time or their own project, and may use whatever language they’re most comfortable in.

We don’t have any trick questions or gimmicks in our admissions process. Our goal is to make our process as low-stress as possible, and ideally enjoyable. We also don’t want to surprise anyone, which is why we’re blogging about this so that applicants know ahead of time they may interview with alumni.

Since we moved to rolling admissions early last year, interview shifts have eaten up more and more facilitator time. Having alumni do interviews will free up facilitators to focus on what they do best: Making Hacker School a better experience for current Hacker Schoolers. As such, we hope that this change will both make our admissions process more applicant-friendly and improve Hacker School itself.

Goodbye Paper of the Week

This is the last post in our “Paper of the Week” series. For more info, check out the introductory blog post.

We’ve decided to stop publishing Paper of the Week. Paper of the Week has been fun to write, but we don’t think it’s worth continuing given the limited response we’ve gotten and the amount of time it takes to put together. Instead, we’re going to focus on writing other things for our blog1.

For posterity (and because our blog doesn’t have categories), here are links to all past Papers of the Week collected in one place.

Thanks to everyone who submitted a paper, a Read Along, a suggestion, or a comment. We hope that Paper of the Week has been enjoyable, and maybe even a bit enlightening.

Happy reading!

  1. If you want to get email updates when we publish a new blog post or issue of Code Words, we have a new email list that you should subscribe to.

David Albert

A string of unexpected lengths

When you start learning to program, or working in a new language, it’s often suggested that you build a simple program like Battleship or Tic-tac-toe. The games’ rules are well-defined and easy to grasp, and you only need to read and print text to get started. This frees you up to focus on the mechanics and ideas of the programming language you’re learning.

To create the game’s interface in the terminal, you end up doing a lot of string formatting: board layout, progress bars, announcements to the user. The length of a string is useful when formatting for terminals, since they usually use monospaced fonts. For example, while writing a game of Battleship in Python we might use the len() function explicitly for formatting math or implicitly in convenient built-in methods like center() to make exciting messages like the following:

>>> msg = 'battleship sunk!'
>>> len(msg)
16
>>> def underlined(msg):
...     return msg + '\n' + '-' * len(msg)
...
>>> print underlined(msg)
battleship sunk!
----------------
>>> print msg.center(30, '*')
*******battleship sunk!*******

However, the code above won’t always work as we expect because the len() of text isn’t necessarily the same as its width when displayed in a terminal. Let’s explore three ways these numbers can differ.

Multiple bytes for one character

Byte strings (known as “strings” in Python 2) have formatting methods like center() which assume that the displayed width of a string is equal to the number of bytes it contains. But this assumption doesn’t always hold! The single visible character Ä might be encoded as several bytes in a source file or terminal.

>>> shipname = 'Ägir'
>>> shipname
'\xc3\x84gir'
>>> len(shipname)
5
>>> print shipname.center(10, '=')
==Ägir===
>>> print shipname + '\n' + '-' * len(shipname)
Ägir
-----

The number of bytes in this byte string doesn’t match the number of characters, so built-in formatting operations don’t behave correctly.

Fortunately, using Unicode strings instead of byte strings solves this problem because they usually report a length equal to the number of Unicode code points they contain.1

>>> shipname = u'Ägir'
>>> len(shipname)
4
>>> shipname.center(10, u'=')
u'===\xc4gir==='
>>> print shipname.center(10, u'=')
===Ägir===
>>> print shipname + '\n' + '-' * len(shipname)
Ägir
----
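
The same distinction can be checked in Python 3, where str is Unicode by default and bytes is a separate type. The following is a minimal sketch rather than part of the original session; it assumes the text is UTF-8 encoded and reuses the underlined() helper from the first example.

```python
# Python 3 sketch: measure text after decoding, not the raw bytes.
# Assumption: the bytes are UTF-8 encoded.
raw = '\u00c4gir'.encode('utf-8')  # bytes, as they might arrive from a file
text = raw.decode('utf-8')         # Unicode string (str in Python 3)


def underlined(msg):
    """Return msg followed by a row of dashes, one per code point."""
    return msg + '\n' + '-' * len(msg)


print(len(raw))          # number of bytes: 5
print(len(text))         # number of code points: 4
print(underlined(text))
```

Decoding at the boundary (when reading input) and formatting only Unicode strings keeps the byte count out of the layout math entirely.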

ANSI escape code formatting

ANSI escape codes let us format text by writing bytes like '\x1b[31m' to start writing in red, and '\x1b[39m' (reset the foreground color) or '\x1b[0m' (reset all formatting) to stop. If we build a string containing these sequences, the calculated length of our string won’t match its displayed width in a terminal:

>>> s = '\x1b[31mhit!\x1b[0m'
>>> print s
hit!
>>> len(s)
13
>>> print s + '\n' + '-' * len(s)
hit!
-------------
>>> print s.center(14, '*')
hit!*

The colored string reports a length larger than its displayed width, causing problems for built-in text-alignment methods. Fortunately, there are several Python libraries that make it easier to work with colored string-like objects that don’t include formatting characters in their length calculations.

Clint’s colored strings have formatting methods that produce the output you expect:

>>> from clint.textui.colored import green
>>> len(green(u'ship'))
4
>>> green(u'ship').center(10)
<GREEN-string: u'   ship   '>
>>> print green(u'ship').center(10)
   ship   

but this no longer works once two colored strings are combined into a new colored string:

>>> from clint.textui.colored import blue, green
>>> len(green('ship') + blue('ocean'))
39
>>> green('ship') + blue('ocean')
'\x1b[32m\x1b[22mship\x1b[39m\x1b[34m\x1b[22mocean\x1b[39m'
>>> print (green('ship') + blue('ocean')).center(10, '*')
shipocean

My own attempt at solving this problem uses smart string objects which know how to concatenate:

>>> from curtsies.fmtfuncs import green, blue
>>> len(green(u'ship'))
4
>>> green(u'ship').center(10)
green("   ship   ")
>>> print green(u'ship').center(10)
   ship   
>>> s = green(u'ship') + blue(u'ocean')
>>> len(s)
9
>>> print s.center(13, '*')
**shipocean**

but doesn’t correctly implement every formatting method yet: above, **shipocean** has lost its color information because a fallback implementation of center() was used.2
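
If pulling in a library isn’t an option, one rough alternative is to strip the color codes before measuring. The sketch below is written for Python 3 and is not how Clint or curtsies work internally; the names SGR_RE, visible_len and center_colored are made up for illustration, and only color/style ("SGR") sequences ending in m are handled.

```python
import re

# Matches SGR sequences such as '\x1b[31m' or '\x1b[0m'.
# Assumption: other escape sequences (cursor movement etc.) never appear.
SGR_RE = re.compile(r'\x1b\[[0-9;]*m')


def visible_len(s):
    """Length of s once SGR color codes are removed."""
    return len(SGR_RE.sub('', s))


def center_colored(s, width, fillchar=' '):
    """Center s in width columns, ignoring color codes when measuring."""
    pad = max(0, width - visible_len(s))
    left = pad // 2
    # The padding is added outside the escape codes, so it is not colored.
    return fillchar * left + s + fillchar * (pad - left)


s = '\x1b[31mhit!\x1b[0m'
print(visible_len(s))              # 4
print(center_colored(s, 14, '*'))  # five stars on each side of a red 'hit!'
```

This keeps the string a plain str, at the cost of re-scanning it on every measurement; a smart string object avoids that by tracking the formatting separately from the text.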

The Unicode jungle

Formatting methods of Python Unicode strings like center() assume that the display width of a string is equal to its character count. But this assumption doesn’t always hold!

What if we use fullwidth Unicode characters?

>>> battleship = u'扶桑'
>>> len(battleship)
2
>>> print battleship + '\n' + '-' * len(battleship)
扶桑
--

What about multiple Unicode code points that combine to display a single character?3

>>> battleship = u'Fuso\u0304'
>>> print battleship
Fusō
>>> len(battleship)
5
>>> print battleship.center(6, u'*')
Fusō*

The width of a Unicode string differs from the number of characters in it. Fortunately, we can use the POSIX standard function wcswidth to calculate the display width of a Unicode string. We can use this function to rebuild our basic formatting functionality.4

>>> from wcwidth import wcswidth
>>> wcswidth(battleship)
4
>>> def center(s, n, fillchar=' '):
...     pad = max(0, n - wcswidth(s))
...     lpad, rpad = (pad + 1) // 2, pad // 2
...     return lpad * fillchar + s + rpad * fillchar
...
>>> print center(battleship, 6, '*')
*Fusō*
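
When the wcwidth package isn’t available, a rough approximation can be built from the standard library’s unicodedata module. This Python 3 sketch is a simplification, not a replacement for wcswidth: display_width is a hypothetical helper that only accounts for combining marks and East Asian wide/fullwidth characters, and it ignores control characters and other zero-width code points.

```python
import unicodedata


def display_width(s):
    """Approximate the number of terminal columns s occupies."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        # 'W' (wide) and 'F' (fullwidth) characters take two columns
        width += 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
    return width


def underlined(msg):
    """Underline msg using its approximate display width."""
    return msg + '\n' + '-' * display_width(msg)


print(display_width('Fuso\u0304'))  # 4: the macron adds no width
print(display_width('扶桑'))         # 4: two wide characters
print(underlined('扶桑'))
```
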
  1. Unfortunately, for versions of Python earlier than 3.3 it’s still possible that the len() of a Unicode character like u'\U00010123' will be 2 if your Python was built to use the “narrow” internal representation of Unicode. You can check this with sys.maxunicode - if it’s a number less than the total number of Unicode code points, some Unicode characters are going to have a len() other than 1.

  2. Want to fix this? Pull requests are welcome! The fix would be pretty similar to the fix for this issue about .ljust and .rjust.

  3. The Unicode spec calls this an extended grapheme cluster. Interestingly, the Character class in the Swift programming language represents an extended grapheme cluster and may be composed of multiple Unicode code points.

  4. Here we’re using a pure Python implementation for compatibility and readability.

Thomas Ballinger