Category Archives: Grammar

The Data Debate

So, there’s a bit of a debate amongst language geeks and professionals about the word “data.”

You see, one of the ways to split up nouns is between singular nouns (which have separate singular and plural forms; linguists usually call these count nouns), and collective nouns (which don’t; the usual term is mass nouns). “Cat” is a singular noun. It’s possible to have one cat, or many cats. The unit of cats is a cat. “Milk” is a collective noun; you can’t have one milk or many milks, you just have less milk or more milk. The unit of milk is hard to define, but I suppose it would be one each of the molecules that make up milk? Or one of the least common molecule, and however many of the others you need to get the right ratio?

“Data” used to be the plural of a singular noun, “datum.” A datum is the information contained in a single point on a graph or a single cell in a table. It’s a clearly defined unit, and when you have a bunch of them, that’s data. You can have one datum, or many data; the unit of data is a datum.

Except… then computers happened. Now data is a thing your hard drive is full of. You don’t have one data or many data, you have less data or more data. The unit of data isn’t a datum, it’s a bit, or possibly a byte depending on how you look at it.

Why does this matter? Well, some of us write for a living, and we might end up having to write about data. Grammar is important, not only for clarity of communication, but also as a matter of professional pride and a measure of quality. We don’t want our bosses or coworkers telling us we made a mistake on something as simple as subject-verb agreement, and the possible collectivity of data creates an issue there. If “data” is the plural of “datum,” then “The data are reliable” is correct grammar. But if “data” is a collective noun, then “The data is reliable” is correct.

Cue years of debate.

I have, generally speaking, come down heavily on the collective noun side of the debate. I think data, in the modern world, behaves more like a fluid than a collection of solid objects (because the Internet isn’t a truck, it’s a series of tubes–that’s what he was trying to say!).

But I recently started a new job, and a lot of the writing there involves communicating with statisticians and statistical tables, and most of what I’ve seen consistently treats “data” as the plural of “datum.” It bugged me at first, and I chalked it up to the typical lag of government standards behind the times.

But then I thought about it, and I realized that my milk example is incomplete. It’s not true that no one talks about “one milk” or “five milks.” In a restaurant, “one milk” is a glass of milk. It’s meaningful and sensible, in that context, to treat milk as a singular noun, to say “The milks are ready.”

And in the context of making statistical tables, well, isn’t that exactly what a datum is? One cell of a table? So wouldn’t the contents of many cells be many data? It sounds weird to me because I’m not used to the context, but that’s my problem, not theirs. So rather than try to force this community I’ve just entered to adapt to my ways, maybe I should try to see the sense behind theirs.

I dunno, just felt like something that ended up having a wider applicability than I expected.

The Dark Divinity of Passivity

Which of these three sentences makes you saddest?

Your favorite show has been canceled.
Your favorite show has been canceled by Republicans.
Republicans canceled your favorite show.

Which makes you angriest?

Behold the power of the divine passive. Let’s talk about its abuse.

Before we get there, though, let’s run quickly through what the divine passive is. Most of us learn two grammatical voices in school: active and passive. Consider these sentences:

The dog bites the person.

“Dog” is the grammatical subject (that is, the sentence is constructed to be about the dog) and “person” is the grammatical object (that is, the target of the verb). “Dog” is also the agent–the entity that causes the action of the verb–while “person” is the patient–the entity that is acted upon. A sentence where the agent is also the subject is said to use the active voice–it is a sentence about agents doing things, essentially.

The person is bitten by the dog.

In this case, even though the sentence describes the same action, and therefore the agent and patient are the same, “person” is now the subject. The agent, “dog,” has moved to a prepositional phrase. This sentence is in the passive voice–it is a sentence about things happening to a patient, incidentally caused by an agent.

The person is bitten.

I said “incidentally” because, grammatically speaking, “by the dog” in the second sentence was optional. You can drop it without rendering the sentence ungrammatical, and the result is the divine passive: the agent vanishes entirely. We are now in a world where things happen to passive patients, not because some agent causes them to, but because that is the nature of existence.

I hate the divine passive. Oh, sometimes it’s okay–I used it myself to define “patient” a couple of paragraphs ago–but it should be used very sparingly. The passive voice is for when you want to focus attention on the patient, but the divine passive erases the agent entirely. Thus, it should only be used when the agent is truly irrelevant, which is pretty rare.

Why does this matter? “Mistakes were made,” that’s why it matters.

The divine passive allows the speaker to erase the agents that cause action, and thereby erase responsibility. It encourages passivity by asserting that events are the result of impersonal cosmic forces so vast they can’t even be named, as opposed to the actions of agents.

Compare another pair of sentences:

One in ten people is unemployed.
Companies are not employing one in ten people.

Starting to see how it’s possible for the right to blame unemployment on laziness?

One in seven Americans is denied health care.
We deny one in seven Americans health care.

And so on. It’s amazing how many issues get much harder to do nothing about when you restate them in active voice.

And of course, let’s not forget the single most evil phrase in the English language, which derives its power entirely from the divine passive: “supposed to.” We’re so used to that particular instance of the divine passive that most of us never stop to ask, to quote the endlessly brilliant web comic Triangle and Robert, “Who is doing the supposing and what are their qualifications for doing so?”

Consider the vast difference in norm-setting power between these sentences:

Girls are supposed to like shopping.
I suppose girls like shopping.

How vastly better a place would the world be for, well, everyone if we replaced all instances of “X are supposed to Y” with “I suppose X Y”?

Down with the divine passive!

"Long live the- wait. Why is it ‘live’?" — Or how I learned to stop worrying and love the subjunctive

So at this point some of you might be thinking, “‘The board is frozen; long live the board’ – but why is it ‘live’?” We know, on a level that feels almost instinctual but is in fact a result of long-ingrained language acquisition, that it’s live. We’ve heard often enough, “The king is dead; long live the king.” We know, “Live long and prosper,” and we know that these things are correct, but not all of us necessarily know why.

But “long live the king” and “live long and prosper” are both talking about the future. Not the present. And the future is, “Will live.” “He will live a long time,” “She will live a good life,” “Ze will live a life of prosperity.”

Yet you don’t say, “long will live the king,” though you probably could under a different set of circumstances involving augury or time travel, and you would never say, “Will live long and will prosper,” the way one says, “Live long and prosper”; it would be an awkward and improper platitude.

And the reason is a question of mood. You see, sometimes there are good moods, and sometimes there are bad moods, and sometimes there are up moods, and sometimes there are down moods, and … wrong kind of mood.

I mean grammatical mood.

Verbs have tense, voice, mood, person, number, aspect, and other things as well.

Tense is when something happened.

Voice is about who was doing the action:
I hit it. Active. The subject is doing the action.
I was hit by it. Passive. The subject is acted upon.
When the action goes in a loop and so ends where it began (I steadied myself / I was steadied by myself) it’s middle in languages that have it, but English doesn’t so we’re moving on.

Mood is what this article is about, so we’ll get back to it.

Person is the whole first, second, third person thing you’ve doubtless heard about:
I am. First person.
You are. Second person.
He/she/it is. Third person.

“Am”, “are”, and “is” are all versions of the same verb, but which one you use depends on whether the subject is first, second, or third person. In the case of this particular verb it also depends on number.
I am. Singular.
We are. Plural.

Aspect is about how long something takes/whether it’s complete, sort of.
I stood there. Perfect.
I was standing there. Imperfect.

A lot more could be written on aspect, especially if we delve into ancient Greek because the aorist is … well the aorist. But right now we’re blowing through this stuff to get back to mood. Which is why, “and other things as well” from above isn’t even going to be covered.

So, mood.

English has three moods: Happy, Sad and… I’m not funny, am I?

English has three moods: Indicative, Imperative, and Subjunctive.

Most of what we say is in the indicative mood:
He went to the store.
She saved the world.
It does that a lot.
Alot is the name of a species of large cute animals.

The indicative is about the way things are. When you say, “It did happen,” you mean that it really happened in the real world. When you say, “It is happening,” you mean it is really happening in reality. When you say, “It will happen,” then (barring time machines and the like) you’re saying that you believe it will actually happen.

The indicative is not hypothetical. It is about the way things are. Even when you don’t know the way things are, it’s still about that. When you ask, “Did it happen?” you’re asking for information about the non-hypothetical land of reality.

It’s not wishes or counterfactuals or hopes. It’s what was, what is, and what will be, to the best of the knowledge of the speaker. (Unless the speaker is lying.)

Imperative is how we give orders. I used it above.
“Live, damn you. Live!” is Doctor Frankenstein ordering his creation to come to life.
“Go to the store.” is an order that you go to the store.
“Go to Hell.” is a command that you go to the place where Nicolae Carpathia will one day be doomed to reside.

The same phrases, in different contexts, can be different moods. For example, “Will you go to the store?” is an indicative question. It’s not, technically at least, ordering the person to go to the store. It’s asking them for information on whether or not the person will do something in the real world in the future.

We could do the same thing with another of the phrases. “I did go to Hell” contains the same words, in the same order, as the last example I gave of an imperative command, but it is instead a dubious claim about reality. Thus indicative.

And now, finally, we come to “Live long and prosper.”

We’re about to talk about the subjunctive mood. Use it wisely; use it well.

Wikipedia has a page on the subjunctive mood in general, and the version of it that exists in English-

Right here, right as I was getting to the point of the article, I had to stop writing and move on to something else. Then life hit me with the worst weekend ever. There’s no need to get into that here, I’ve talked about it elsewhere, but I do want to apologize if there’s an abrupt change in tone or anything.

Wikipedia has a page on the subjunctive mood in general, and the version of it that exists in English; they’re reasonable places to go if you find me confusing or would just like another source. There are other languages with more moods than English, which means that English generally has to cram the same amount of meaning into a smaller number of slots, and that can make the subjunctive seem somewhat overloaded at times, so let’s start from a simple place:

“Long live the King” expresses a wish. It’s not saying that the king will live long, we don’t know that, merely that we wish him to. “Live long and prosper,” likewise is a language of wishing or hoping. It’s talking about something that may or may not come to be, but the speaker hopes will.

These are not talking about this world, but the world that we hope this world will come to be. They’re talking about hypotheticals and, in pretty much all cases, that’s what the subjunctive does.

When you look at the world and see it not just as it is, but also as it ought to be, you’re in a subjunctive frame of mind.

But it’s not all about hopes and dreams, it’s also about other ways things might have gone, it’s about possibility and counterfactuals and fears and stuff. It’s even about orders.

And before I can talk about any of that, I have to talk about tense lest we be confused.

The subjunctive tenses are, by convention, named stupidly. They’re named for what they resemble, not when they happen. A “present subjunctive” isn’t a subjunctive verb referring to the present tense, it’s a subjunctive verb that looks vaguely like it belongs in the present tense to people who don’t know about the subjunctive. As I said, they’re named stupidly.

So, consider “Live long and prosper.” It’s talking about the future. It is hoping that your future life will be both long and prosperous. Is it called a future subjunctive? No. Of course not. That would be too easy. It’s called a “present subjunctive” because “live” looks like present tense and “prosper” looks like present tense. Well, to me they look like imperatives, which generally don’t have tense, but that’s another story. They can also be considered bare infinitives (to live, to prosper, but without the “to”).

So here we are calling something present when it’s talking about the future, and we’ll call something past when it’s talking about the present, and pluperfect when it’s talking about the past. As I said, it’s badly named. I said it three times, it must be true.

OK, so let’s get to talking about some stuff.

If I was there…
If I were there…

These mean very different things. The first one, which is indicative, indicates that I may well have been there. It might be the start of the sentence, “If I was there why don’t I remember?” There is uncertainty about the past.

The second one means that I am not there, but I’m about to talk about the hypothetical world in which, at this present moment, I am there. “If I were there I’d feel a lot better than I do being stuck here,” would be a completely legitimate sentence to start that way.

But what if I wanted to be in the same actual tense (as opposed to grammatical tense) as the first one? What if I wanted to talk about the past instead of the present? Well then I’d need to push the apparent tense back one notch:

If I had been there…

Say someone says I did something somewhere last night. If I respond with:
“If I was there I wouldn’t have done that,”
I’m leaving open the possibility that I really was there. If I instead respond with:
“If I had been there I wouldn’t have done that,”
I’m indicating that I was not there in the first place.

If, on the other hand, I say:
“If I were there I wouldn’t have done that,”
I’m confusing my tenses all to hell, because I’m indicating that if I were at that place now, I wouldn’t have done something in the past. That only makes sense coming from a time traveler because the causality is backwards. Present events are being said to influence past outcomes.

So, hopefully at this point you know how to wish someone a long and prosperous life and how to deny even being somewhere while still claiming that even had you been you wouldn’t have done the thing you’re accused of doing, and you understand why the song is, “If I were a rich man” (emphasis added).

There’s also something named the “future subjunctive” which, oddly enough considering the names of the other subjunctive tenses, is about things taking place in the future. “If we were to go to the four o’clock showing we wouldn’t get out until six.” We are not currently going, we have not previously gone, so the possibility really is in the future.

Of course, I can also say, “If we go to the four o’clock showing, we won’t get out until six.” The difference is that the future subjunctive is used of things deemed unlikely by the speaker. “If I were to go outside I’d get wet, so I will stay inside.” The person “were to”ing isn’t expecting it to happen.

When one says “If you were to go through with this plan the result would be disaster,” the expectation is that the person being spoken to will be talked out of it.

Now that we’ve talked about the future, let’s talk about the past. Remember when I said, “lest we be confused,” above? That’s passive (“be confused” instead of “confuse”) and subjunctive. Just as, “Lest we forget” is subjunctive. Why is it subjunctive? Well, because lest and subjunctive go together really well. “Lest” indicates that the speaker is trying to stop something from happening; that is, they’re trying to make the thing that follows the “lest” remain hypothetical and not become actual. And hypothetical is what the subjunctive is all about.

Now that we’ve talked about things people don’t want to happen with “lest”, let’s talk about things that people do want to happen. There’s a thing called the jussive subjunctive, which I always mispronounce so don’t ask me how to say it. It’s like the imperative but not. It can be used to express orders, and requests and stuff.

Screamed at the top of one’s lungs, “Get out now!” – Imperative.
Low voice, menacing in its lack of emotion, “I recommend you get out now.” Subjunctive.

“I insist that the revolution be televised.” – Subjunctive.
“Televise the revolution.” – Imperative.

You might notice that the subjunctive here has more words; it lets you choose a verb of ordering/requesting where the imperative just spits out the order. That’s accurate. Using the imperative is basically using a bare order, so it doesn’t leave a lot of room for nuance among order/insist/recommend/request/and so on. That’s not to say that you need the subjunctive to get it; you can pull it off without the subjunctive, it’s just a bit more awkward. Consider:

“Get out now. I recommend it.” Imperative, indicative.
“Allow me to recommend something. Get out now.” Imperative, imperative. (The recommend is an infinitive functioning as an object.)

“Televise the revolution. I insist.” Imperative, indicative.

So you can still get that, but it becomes more awkward.

OK, I’m just going to close on using the subjunctive to get rid of “If” by fiddling with the word order.

“If I were a rich man…” Perfectly good subjunctive usage.
“Were I a rich man…” Means the exact same thing as the above.

“If I had been there I’d’ve smacked that dragon.” Bad for the stacking of contractions and the unnecessary violence toward dragons, but otherwise fine use of the subjunctive to express a contrary to fact conditional in the past.
“Had I been there I’d’ve smacked that dragon.” Means exactly the same thing.

So, anyway, hopefully someone learned something. I’d like this to make everyone start paying more attention and using the subjunctive correctly because when I hear it used wrong, or not used at all, it just rings wrong. Sometimes I don’t catch that it was screwing up the subjunctive; I’m just left with this lingering sense of wrongness.

So I’d like this to solve all grammar problems. Would that it were that simple.

Some notes:

  1. One of my proofreaders noted that I never actually used the word “confounded” until it appeared in a parenthetical where I treated it as if I had already used it in place of the word I had actually used. That’s true. It has been corrected. But “confounded” is a fun word and as soon as I took it out I wanted to work it back in somehow. Now I have.
  2. There are some things I have done above grammatically that will make some people pull out their hair for their wrongness, yet I think them right. Such is the way of things. The way of the force.
  3. An argument can be made that there are more than three moods in English.
  4. I mentioned the aorist. In ancient Greek (I know nothing of modern Greek) the aorist has tense and aspect. Sometimes it’s all about tense, sometimes it’s all about aspect, a lot of the time it’s about both. Tense is when something happened. Aspect is how long it took/takes/will take.

    So, for example, you would never have a past tense imperative because, barring time travel, you can’t order someone to have already done something. (Even if you could, it would be in their personal future.) You can have an imperative aorist. In that case it is entirely about aspect. It’s about how long it takes. “Close the door (quickly)” would be an aorist imperative. “Close the door (slowly over a period of hours)” would not be an aorist imperative because the aspect is different.

    It’s more complicated than that, it usually is, but the basic thing is that aspect is about duration.

    I like to think of it as ancient Greek allowing you to say, “Leave me alone!” and have the listener know whether you mean, “Leave me alone right now,” or, “Leave me alone forever,” but that too oversimplifies because most things with longer-than-aorist duration don’t last forever.

  5. I was going to add asterisks and have this be a footnote, but since I have an entire notes section anyway I am going to just put it here. I used the word “hopefully” above in a way that is fairly fraught. I knew it when I did it, but I didn’t make any note about it. Hopefully is an adverb and thus generally expected to modify something, usually the verb. Yet, I use it apparently modifying nothing.

    I see it as modifying the sense of the sentence. Consider, “Hopefully it will happen.” It is not saying that the happening will be filled with hope, rather that the speaking of the sentence is filled with hope. The speaker is full of hope that it will happen.

    The online Merriam-Webster dictionary notes other adverbs that the speaker/writer uses to comment directly to the hearer/reader on the sentence to which the adverbs are attached, listing a few examples (interestingly, frankly, clearly, luckily, and unfortunately) in the process. Why was I at the online Merriam-Webster dictionary? To check that the definition of “fraught” can be stretched to mean what I wanted it to mean two paragraphs ago.

    Anyway all of this is to say that while I defend this usage of hopefully, at least one of my proofreaders finds it ugly. These disagreements happen.

By special request I shall now address “may” and “might”. The will/shall dichotomy, on the other hand, I’m not even going to touch. Anyone who does not like how I used “shall” above, please pretend I used “will”.

These examples courtesy of the requester:
“He wasn’t wearing a helmet, which might have saved his life.” He died. If he had been wearing a helmet it is possible that he would have lived. (That’s probably what it means, I leave a bit of wiggle room.)
“He wasn’t wearing a helmet, which may have saved his life.” He lived. If he had been wearing a helmet it is possible that he would have died.

As you can see, these two things mean the exact opposite of one another, so confusing them is a bad idea.

Both refer in some way to what could have been had things been different, but they do it in different ways. The difference here is actually tense: “may” is in the present here, “might” is in the past. So when we say that something may have been the cause of X where X is a past event, we know that X already happened. Because if X didn’t happen, then it didn’t have causes, which means that nothing may have been the cause of X.

In the examples we’re talking about “saved his life,” a past event. So if it’s, “may have saved his life,” then his life definitely was saved. (Or the person writing/speaking doesn’t understand English.)

Might is a little trickier. If you’re reading a headline you can probably assume he died. But the fact is that might is the past tense of may, so it can theoretically mean the same thing provided we’re speaking entirely in past tense which, in spite of my best attempts, I have been unable to do in a way that doesn’t sound completely awkward and painful and unnatural.

Maybe it can be done with sequence of tenses.

“He wasn’t wearing a helmet, which might have saved his life, and so from that day forth he was an anti-helmet crusader.” But it still doesn’t really work. You just don’t talk that way and it still reads as if he died and was an anti-helmet crusader as a zombie.

In fact, the only way I can see to do it is to take the usages they have in the future, and yank them back into the past while adding words to flesh out what I’m saying.

If X is in the future, then may and might become different. “You may have caused X” and “You might have caused X” both mean that X could happen. Not will happen, not won’t happen, just could happen. The difference is in likelihood. When discussing future events “may” is more likely than “might”. So, “The test results are in and the world might end tomorrow,” is a much better thing to hear than, “The test results are in and the world may end tomorrow.”

So now, let me try to drag that into the past and see if I can use “might” to save the guy’s life.

“He wasn’t wearing a helmet. That might have been what saved his life, or it may have had something to do with the wizard casting a protection spell on him.”

I think that works. But I had to explicitly state that his life was indeed saved to do it.

“He wasn’t wearing a helmet. That might have saved his life.”

I don’t know. It’s still ambiguous as to whether the “that” refers to not wearing a helmet, or wearing a helmet.

Regardless, the big deal is this:
“Might” is the past tense of “may”.
In the future tense both words mean something has the potential to happen, but might means it’s less likely than may.

Both words are about uncertainty. If the uncertainty is in the past tense then you use “might”:
“They searched the plane because they thought it might contain dinosaurs.” Correct.
“They searched the plane because they thought it may contain dinosaurs.” Incorrect.

That’s the reason for the whole helmet thing above:
“He wasn’t wearing a helmet, which might have saved his life.” Since might is past tense it’s bringing us back to the event, it means we’re talking about what could possibly have been if things were different in the past. But things weren’t different, so we know he died. (Again, maybe. There’s potential for wiggle room, even if I haven’t found it.)
“He wasn’t wearing a helmet, which may have saved his life.” Since may is present tense (certainly not future and it is never past tense) it means that we’re talking about present uncertainty. Which means that the uncertainty isn’t about what might have happened if things had been different in the past (except in rather indirect ways). It is clearly not about what could possibly have been if he had been wearing a helmet in the past. It can only mean that we’re talking about present uncertainty about the reason his life was saved. He definitely lived.

In the future tense, they’re about different levels of uncertainty.
“We may go to the movies.” There is a certain possibility that we will go to the movies.
“We might go to the movies.” There is a certain possibility that we will go to the movies, but it’s definitely a smaller possibility than in the above sentence.

So finally I leave you with this one simple observation:
If you use “might” when you should have used “may”, you can probably make some kind of argument that it was legitimate. It might not be a good argument, but the argument can probably be made.
If you use “may” when you should have used “might” in any tense other than future, you’re screwed. There’s no defense you can offer.

It’s probably better to err on the side of “might”.