Ranges in Vim

Posted on Feb 9, 2020

VimExplorer vim vimscript tricks

Although I briefly mentioned the concept of range in my previous article about substitutions, I simply described them as a “set of lines”, on each of which the :substitute command would be called. But amongst the (much appreciated) feedback I received on reddit , some mentioned the fact that this somewhat hand-wavy explanation was not enough, and that I should elaborate.

This in turn made me realize that everything about ranges was not clear for me. What exactly is a range? How is it different from an address? I decided to do some research, leading to the article you are reading right now!

Range or Address?

As with any article of the VimExplorer series, I started by looking at the help. This time however, I quickly ran into a problem: :help address and :help range refer to the exact same location! Does this mean that ranges and addresses are the same thing?

As you might expect, the answer to the previous question is no. Consider the % commonly used in :%s/pat/repl/ to replace a given pat with repl over the whole file: because the syntax of the substitute command is :[range]s[ubstitute]/{pattern}/{string}/[flags] [count], it is easy to understand that % is a range. But is it also an address? Using it in a command that takes an address like :move demonstrates the opposite, as :m % will result in E14: Invalid address (which is not too surprising - “move to the whole file” does not make much sense!).

So what is an address then? It turns out that it is simply a line number. Nothing more, nothing less. This also means that any positive number smaller or equal to the number of lines in the current buffer is an address! But dealing only in terms of absolute numbers is not really convenient, so Vim introduces several special characters that (when used as addresses) will be translated to a given line number. Here are some examples:

. is the current line,
$ is the last line in the buffer,
1 is the first line in the buffer,
0 is a special “virtual” line before the first line in the buffer (some commands, like :move will interpret it as “before the first line”, while others like :substitute will treat it as the first line),
't is the line in which mark t is placed,
'< and '> are, respectively, the line on which the previous selection begins and ends,
/pat/ (note the second / !) is the next line where pat matches,
\/ is the next line in which the previous search pattern matches, and
\& is the next line in which the previous substitute pattern matches (if the difference with search pattern is not clear for you, check out my previous article !)

In addition, for any address, {address}+{number} and {address}-{number} are also valid addresses, pointing respectively to the line {number} lines after, and the line {number} lines before address (for example, .+5 means “5 lines after the current line”).

There is nothing like a little practice to remember, so let’s try a few common commands that use addresses.

This is the first line.

This is the third line.
And a fourth one.

Given the above file, with the cursor on the first line, :m 3 will move it to below the third line:


This is the third line.
This is the first line.
And a fourth one.

Notice how your cursor also got moved with the line! Now, try :co /fourth/:


This is the third line.
This is the first line.
And a fourth one.
This is the first line.

Once again, your cursor moved. Finally, do :m .-5:

This is the first line.

This is the third line.
This is the first line.
And a fourth one.

Your cursor was on the line number five, so :m .-5 becomes :m 0, and the line is moved to the first line of the buffer.

All of this is nice, but fairly limited: what if you wanted to move all the lines below the second one? For this, and for a lot of more advanced operations, you’ll need a range.

A Short History Lesson

Before we move on to ranges, let’s take a moment to think about why addresses came to be defined this way. As some of you might have noticed, while '< (first line of the last selected visual area) is a valid address, `< (first character of the last selected visual area) is not. Why is that so? After all, Vim supports character-oriented marks, so why keep addresses line-oriented?

To understand the origins of this limitation, and many other Vim quirks (why would you use :wq to leave?), it is necessary to take a look at UNIX history and the evolution of text editors in general. For those of you who have no interest in this, you can directly skip to the next section .

Back in the 1960’s, computers were enormous, room-filling machines without words, images, sounds nor music: just pure number crushing. Of course, they were also prohibitively expensive, so only universities, government agencies and big corporations could afford them.

Ranges in Vim /img/ibm-7094.jpeg An IBM 7094 computer, released in 1962. The magnetic tapes in the background were used for storage!

This is fairly known. What is not as much however, is that those computers could only run one program at a time! Although techniques like batch-processing existed to reduce downtimes, this still implied that only one user could use the computer at a time. People used to write their code on paper tape then give it to an operator that would enqueue them on the physical queue of code the computer had to process. It could take days, or weeks, before your code would be executed and the result returned to you (usually printed).

Researchers and computer scientists at the time quickly realized this could be optimized by sharing resources across multiple users (giving CPU time to process A while process B is waiting for disk, for instance), which lead to the development of time-sharing systems. Project Genie is a great example of such research efforts, funded by the US Defense Advanced Research Program Agency and carried by the University of Berkeley in 1963-1965. This program ended up pioneering many techniques, culminating in the creation of the SDS-940 , one of the first commercial machines using time-sharing and memory paging¹.

What we are more interested in today however, is that the SDS-940 also came with a built-in text editor, called QED, for “Quick EDitor” (to be exact, QED was not at its first iteration as it was originally developed on the Berkeley Timesharing System, precursor of the SDS-940). Let me reproduce here one excerpt of the original QED manual :

[In QED] lines can be referred to by number, although line numbers are subject to change if carriage returns are added or deleted within the string during the course of editing.

QED has three modes of operations:

The command mode

The text mode

The line edit mode

Does that sound familiar? It gets better in the appendix:

Line Addressing Methods

By position (line number).

By . (symbol for current line).

By $ (symbol for last line).

By :label: (where label is the first word or string of characters in a line).

By [text] (where text is a unique word or string of characters in a line).

By adding or substracting lines from any of the above addresses, e.g., [text]+2

Wait. Isn’t that exactly the same as Vim? Almost, but with one big difference: regular expressions are not supported. And the missing link to explain the similarities between QED and Vim, as well as the usage of regular expressions in addresses, is a computer scientist named Ken Thompson.

The Age of Unix: ed, sed and grep

Having just graduated from the University of California Berkeley with a Master’s degree in Electrical Engineering and Computer Science, Ken Thompson was hired by Bell Labs in 1966 and promptly started working on another time-sharing system called Multics . Along the way, he proceeded to reimplement the QED editor he worked on during his time as a student for the Multics operating system, adding regular expression searching and substitution².

Although Bell Labs pulled out of the Multics project in 1969, Ken Thompson and some fellow researchers still felt the need for a time-sharing operating system, and started developing their own version on a spare PDP-7 : this new operating system would end up becoming Unix³. The first things that Thompson implemented on this new system were the shell, an assembler… and an editor, called ed.

ed was yet another take on QED, and while it ditched many features for the sake of simplicity, it kept its parent’s core philosophy, at least regarding line addressing. It had, amongst other things, a global command, similar to the one in current day vim, that allowed to run commands like substitutes on all lines in a file. For a while, Bell researchers were happy with ed, but it was not long before more specific needs arise.

ed was a heavy beast: it required loading each file one by one in a buffer of limited size, meaning that one often had to split a file in multiple chunks and load them in succession in order to edit them, a very tedious and time-consuming process. The need for a simpler, quicker program that simply searches a file for a regular expression and prints every line that matches lead Thompson to implement grep in 1973, short for the g/re/p (global regular expression print), the ed command that it emulated⁴.

grep was great, but the demand for g/re/s (global regular expression substitution) soon arose, and Lee McMahon, another Unix pioneer working on it realized that it was just a matter of time before people would ask him for g/re/a, g/re/d, etc. Anticipating that the greX family would have no end, he instead chose to develop the utility now know as sed.

The Dawn of Visual Editors: vi and vim

One thing you have to keep in mind is that early Unix software like ed, grep and the likes were written for hardware that had no screens. You read that right: the only way to know what you were editing was to output the current line on paper, using a teletype printer. With ink. This, amongst other things, explain why addressing text in terms of characters was not really convenient: most of the time, you could not see where in the line you were, and if you changed part of that line, the entire line would have to be printed out anyway.

Ranges in Vim /img/teletype-asr33.jpg The ASR-33 teletype, from the 1960s. If you were a Unix pioneer, then this was your screen.

As you would guess, while such a system worked fine for Thompson and friends, a lot of people found it hard to use, if not outright hostile to the novice⁵. As video displays started to appear, a British man named George Coulouris wrote an improved version of ed, called em, or “editor for mortals”. Unlike ed, em came with a mode allowing users to edit a single line in place on screen (pretty similar to Vim, but one line at a time).

In 1976, Coulouris brought his code back to Berkeley, where a graduate student named Bill Joy (who later carried on to co-found Sun Microsystems in 1982) decided to build upon it. The resulting editor became known as ex, for “extended ed”. While largely compatible with ed, ex shipped with an “open” mode for line by line editing like em, along with a “visual” mode which used the whole screen to enable live editing of an entire file like we are used to today.

vi was added to BSD Unix in May 1979, being a shortcut that took users directly into ex’s visual mode. A lot of Vim’s normal and visual modes come directly from vi, such as navigation with h, j, k, l, and the use of the escape key to exit modes. Oh, and before I forget: the reason why h, j, k and l were picked is because Joy used an ADM-3A , which had arrows painted on these keys. Perhaps this little fact will help hjkl maniacs to be a little more understanding towards people using arrow keys :)

Ranges in Vim /img/adm-3a-keyboard.jpg The ADM-3A keyboard, that Bill Joy used to develop vi in the late 1970s.

vi quickly became enormously popular, but suffered from two main limitations: it was a direct descendant of ed, and thus any modification required an AT&T license, and it was only available on Unix. The first motivated the creation of many vi clones like STEVIE or nvi, and the second imitation is what pushed Bram Moolenar to write Vim.

In 1988, Moolenar was working on an Amiga 2000, but was disappointed by the fact that it did not include vi. Using STEVIE as a starting point, he started creating vim, for “Vi Imitation”, ensuring full compatibility with vi before adding new features such as multi-level undo. Released in 1993, Vim 2.0 saw the name change to “Vi Improved”, and from this point the editor grew at a steady pace, adding support for buffers in Vim 3.0, and Vimscript and syntax highlighting in Vim 5.0. Quickly becoming popular, Vim was ported to many platforms, including Unix where it competed with vi for a while before overtaking it completely, to the point that vi is now a simple simlink to vim on many distributions.

Computer science history is fascinating, and there would be a lot more to talk about, but I’ll keep it to that for now. I personally find it pretty cool to think that by using Vim everyday, I am perpetuating a tradition of more than 50 years, older than Unix itself! It is also satisfying to finally understand why sed commands like sed -i 's/pat/replacement/' <file> seemed weirdly familiar: they stem from the same common ancestor.

Besides, I am convinced that, if some features like line addressing survived for such a long time (in computer science, 50 years is pretty much like going back to the Paleolithic), then they were probably good ideas.

Ranges

Now that we’ve got the definition of address out of the way, understanding ranges is a lot easier, as it is simply all the lines between two addresses (including the first and last line!). In other words, true to their ed and ex heritage, ranges only operate on lines. Even if you were to select two words out of a line in visual mode then press :, the resulting range would still be the entire line!

Syntax

There are not one, but two ways to write a range:

with a , (example: 1,$ for “first line to end of buffer”)
with a ; (example: 1;$ for “first line to end of buffer”)

Wait, what is the difference? The only thing that changes is where your cursor will go. With , it will stay where it was, while ; will make it go to whatever address came before it. For example,1;$ will move to the address given by 1 (so the first line of the buffer).

You might think that it does not impact range definition, and that the lines included in a range do not change if you use , or ;… In fact, it can. Of course, absolute addresses (5, $ and the like) will behave the same way, but for addresses defined from the cursor position (such as searches and .), the result may be different! The reason for this is that with ;, the cursor will be moved during the command, just after Vim parses the first address (although it is not visible in the UI until when the command completes).

To understand the implications, consider this file (reminder from the previous section: the /pat/ address is defined as “the next line where pat matches!):

banana apple
apple banana
apple apricot banana

Running :/apple/,/apple/s/banana/pear/ with your cursor on the first line will do the following:

from the current line (the first one), find the first match on the line below (so the second line),
from the current line (the first one), find the first match on the line below (so the second line),
perform the replacement on the range 2,2

However, running :/apple/;/apple/s/banana/pear/ with your cursor on the first line will do:

from the current line (the first one), find the first match on the line below (so the second line),
move the cursor to that first address (the second line),
from the current line (the second one), find the first match on the line below (so the third line),
Perform the replacement on the range 2,3

Shortcuts

Manually typing the first and second addresses each time you want to create a range can quickly prove annoying. Fortunately, Vim offers several shortcuts to make the process smoother:

to create a range of one single line, writing one address is enough (:5 p is equivalent to :5,5 p),
the . is implied when using modifiers such a + or - (so :.,+2 p is like :.,.+2 p and prints three lines starting from the current one),
when in visual mode, pressing : will append '<,'> which means that the command will operate on that selection
using :* is a shorthand for :'<,'>, which can be convenient to run a second command on the same selection without re-selecting it (It is like using gv to re-select then :, but with one key less!)
pressing any number then : will automatically append .,.+(number - 1), and the command will affect a total of number lines starting at the cursor.

Do remember however that ex commands can only affect complete lines, and visual mode is no exception: as you now should know, '<,'> is the range starting at the “first line of the previous selection” and ending at the “last line of the last selection”, which means that even if your selection only includes one character on line 2 and about half of line 5, the resulting range will include the full lines 2 and 5!

I want to end this section with a tip for people who like me who like using set relativenumber: because relative lines start at 0, it means that to include up to the line number X, I have to remember to type (X+1):. That’s annoying, so I’ve remapped : using nnoremap : :<C-R>=v:count ==# 0 ? "" : "+1"<CR>, to add a +1 to the second address whenever I use a count before : (if you’ve never seen it before, have a read at :help c_CTRL-R_=, it’s pretty interesting).

A few useful range commands

You probably think, like I used to, that substitute is the range command you use the most day to day. In fact, I am pretty sure that it is :w. The formal syntax is indeed :[range]w[rite], meaning that you can control what part of the file you write (although the default is the entire file)!

There are a lot of other range commands. Some useful examples include:

:[range]m[ove] {address} to move text contained in range below address, like :,/pat/m $ to move the text from the current line to the next line containing pat (inclusive) to the end of the buffer,
:[range]co[py] {address} to copy text contained in range below address,
:[range]d[elete] [x] to delete text contained in range into register x, like :1,-1d to delete all lines before the current one, and more!

It is also interesting to note that although most range commands default to .,. (current line only) when no range is given, some behave differently, like :w defaulting to :%w.

Final Words

There are more things that you can do with ranges and addresses, such as pre-pending an address in front of a search to say “find the first line containing this pattern, starting from this address” (/pat1//pat2/ is in fact the line number of the first line containing pat2 after the first line containing pat1), but I’ll leave them for you to explore!

I hope you learned one thing or two with me today! May the hjkl be with you, and please shoot me a message or tweet @nicol4s_c if you want to chat about any of this, if you spotted any mistakes or typos, or if you want to show me other cool vim tricks! Have a great day :)

Paul Spinrad and Patti Meagher, “Project Genie: Berkeley’s piece of the computer revolution” , archived from the original on July 19, 2011, and accessed on February 6, 2020. ↩︎
Tom Van Vleck, “Glossary of Multics acronyms and terms” , accessed on February 7, 2020. ↩︎
Ritchie, Dennis M. “The Evolution of the Unix Time-sharing System” , archived from the original on April 3, 2017, and accessed on February 7, 2020. ↩︎
Michael and Ronda Hauben, “Netizens: On the History and Impact of Usenet and the Internet”, “Chapter 9 - On the Early History and Impact of UNIX: Tools to Build the Tools for a New Millennium” , accessed on February 5, 2020. ↩︎
Donald A. Norman, “The truth about Unix: The user interface is horrid” , accessed on February 7, 2020. ↩︎