29 June 2014

Abstract

Seminar and conference organizers ask speakers to submit abstracts before they give a presentation. They do it because abstracts have the potential to convince people who could benefit from attending your talk to actually show up. Unfortunately, most abstracts are only effective at keeping attendees away.

Take the first paragraph from this typical abstract:

Target structure-based “hit” optimization in a drug discovery project is challenging from the computational point of view. Scoring functions cannot predict binding affinity, thus, computational chemists must use their intuition or prior knowledge about the target class to prioritize compounds for synthesis.

Would you like to attend this talk?

Potential audience members who read this abstract will fall into one of three camps:

  • indifferent bystanders, people who are completely uninterested in the subject, even if the presenter were the Nobel Prize winner in “Target Structure-Based Hit Optimization.”

  • teachable newbies, people who could potentially enjoy TSBHO if somebody explained it to them properly.

  • driven experts, people who think that TSBHO is truly important and fundamentally interesting.

How you write your abstract will influence how many people from each group show up.

Let’s analyze the structure of the previous paragraph by breaking it up into sentences and abstracting away the details:

  • A is challenging from the computational point of view. (Target structure-based “hit” optimization in a drug discovery project is challenging from the computational point of view.)

  • B cannot predict C. (Scoring functions cannot predict binding affinity.)

  • Thus, computational chemists must use D to prioritize E. (Thus, computational chemists must use their intuition or prior knowledge about the target class to prioritize compounds for synthesis.)

Looking at the bare-bones structure makes it easier to determine why some sentences work and others don’t. The problem with this paragraph is that it contains technical terms that lack context (A, B, C and E). Driven experts might be able to fill in the gaps, but teachable newbies will either miss the point or stop reading (if they attend the talk, it will be for the free pizza).

Compare the previous example with this one:

The chemotaxis network of bacteria such as E. coli is remarkable for its sensitivity to minute relative changes in chemical concentrations in the environment. Indeed, E. coli cells can detect concentration changes corresponding to only ~3 molecules in the volume of a cell. Much of this acute sensitivity can be traced to the collective behavior of teams of chemoreceptors on the cell surface.

Would you attend this talk?

I didn’t know much about chemotaxis before I read it, but it got me interested.

Let’s do the same split-and-simplify breakdown:

  • A is remarkable for its sensitivity to B. (The chemotaxis network of bacteria such as E. coli is remarkable for its sensitivity to minute relative changes in chemical concentrations in the environment.)

  • Indeed, E. coli cells can detect C. (Indeed, E. coli cells can detect concentration changes corresponding to only ~3 molecules in the volume of a cell.)

  • Much of this acute sensitivity can be traced to D. (Much of this acute sensitivity can be traced to the collective behavior of teams of chemoreceptors on the cell surface.)

This paragraph doesn’t dumb the science down; it presents information in a way that’s easy to assimilate. It opens with an interesting fact, it minimizes the use of technical terms, and it links new ideas with previous ones. Let’s explore each of these strategies in turn.

Open with interesting

What does the chemotaxis network of bacteria do? It detects changes in concentration. Why is that interesting? Because it can detect very small changes. How small? As small as a 3-molecule change! OK, that’s interesting. How about this one?

Heparan sulfate (HS) is a type of linear oligosaccharide in the glycosaminoglycan (GAG) family. HS consists of repeating disaccharide units, with acetate groups and sulfate groups modified at different regions along the sequence. HS chains are attached to proteoglycan core proteins, and bind to a variety of protein ligands, which as a whole, mediate tissue-specific physiological functions, including embryogenesis, angiogenesis, tumor metastasis, neuro-degeneration and host-pathogen recognition.

The interesting part of this paragraph is that heparan sulfate is a critical component in many cellular functions. But teachable newbies need to jump over 8 technical terms to assimilate this insight. Exhaustive definitions never made anyone walk down to the conference room. Only interesting facts have the potential to do that.

This is my attempt at a rewrite:

Heparan sulfate proteoglycans are present in almost every cell type, where they play crucial roles in a diverse array of cellular processes, from embryo formation to immune cell activation. The most important feature of these molecules is their GAG sequence, the string of repeating pairs of sugar molecules that define them.

Minimize technical terms

My abstract makeover still has technical terms (HS proteoglycans, GAG sequence), but you no longer have to be familiar with them to follow the story. I can get away with not defining what proteoglycan means or what GAG stands for because driven experts don’t need to be reminded, and teachable newbies can mentally replace them with “this thing” without losing any crucial information.

On the other hand, I replaced “disaccharides” with the more intuitive “pairs of sugar molecules” because I wanted to emphasize the importance of the GAG sequence. These compromises depend heavily on the audience, specifically on what the least knowledgeable attendees are familiar with.

This is a counter-example from a jargon-happy abstract:

Computer-assisted motion analysis coupled to flash photolysis of caged chemoeffectors provides a means for time-resolved analysis of bacterial chemotaxis.

Technical terms are useful because they provide mental shortcuts, but they’re not free: they interrupt the reading flow of non-experts. If you must use them, do so sparingly, and only after you come up with a real-world alternative. Once you come up with multiple options for each technical term, you are in a better position to choose between being precise and being clear. Here are a few alternatives that mix precise technical terms and clear approximations:

  • Our method (clarity) provides a way to analyze bacterial chemotaxis (precision) over time.
  • Analyzing the motion (clarity) of each bacterium using caged chemoeffectors (precision) makes it possible to track their response to external stimuli (clarity).
  • Flash photolysis (precision) allows us to record bacterial trajectories in high resolution (clarity).

Link ideas together

Progressions are more satisfying than lists. Here is an abstract that lists five different ideas:

The new high-throughput and low cost sequencing technologies offer a wealth of opportunities to the scientific community. For example, in 2011, on average, 1.37 fully sequenced genomes became available every day. On the other hand, enzymatic processes have been studied by biochemists for many decades. But surprisingly, the knowledge gap between biochemistry and genetics is much more than what people expected: at least 36% of enzymatic activities have no representative sequence in any organism.

  1. Sequencing offers opportunities to scientists.
  2. In 2011, a lot of genomes were sequenced.
  3. Biochemists have spent a long time studying enzymatic processes.
  4. There is a knowledge gap between biochemistry and genetics.
  5. A big percentage of enzymatic activities don’t have representative sequences.

The problem with this paragraph is that the ideas are not linked. Does sequencing genomes imply scientific opportunities? What’s the connection between biochemists and genetics? What links enzymatic activities with representative sequences?

Compare that to the way ideas progress in part 2 of the chemotaxis abstract:

Instead of receptors switching individually between active and inactive configurations, teams of 6-18 receptors switch on and off, and bind or unbind ligand, collectively. […] The advantage for chemotaxis is gain—[where] small relative changes in chemical concentrations are transduced into large relative changes in signaling activity […]. However, something is troubling about this simple explanation: in addition to providing gain, the coupling of receptors into teams also increases noise, and the net result is a decrease in the signal-to-noise ratio of the network. Why then are chemoreceptors observed to form cooperative teams? We present a novel hypothesis that the run-and-tumble chemotactic strategy of bacteria leads to a “noise threshold”, below which noise does not significantly decrease chemotactic velocity, but above which noise dramatically decreases this velocity.

  1. Teams of receptors cause chemotaxis by switching on and off collectively.
  2. For chemotaxis, switching collectively provides gain.
  3. Besides gain, it also increases noise.
  4. Increased noise decreases the signal-to-noise ratio.
  5. Why would decreasing the signal-to-noise ratio be useful?
  6. We think it’s because the run-and-tumble behavior creates a noise threshold.

Each new idea uses the previous one as a stepping stone, building satisfying anticipation and leading up to the conclusion (also known as telling a story).


These are just tips. The best advice I can give is to write the kind of abstracts you would like to read, prioritize the newbies over the experts, and iterate. Obscure scientific writing is not fun to write or read. Better science writing starts with our abstract.