Received November 21, 2002
Accepted February 13, 2003
Genome-Wide Analysis of NBS-LRR-Encoding Genes in Arabidopsis
Blake C. Meyers 1, Alexander Kozik 2, Alyssa Griego 2, Hanhui Kuang 2, and Richard W. Michelmore 2*
1 Department of Vegetable Crops, University of California, Davis, California 95616; Department of Plant and Soil Sciences, University of Delaware, Newark, Delaware 19711
2 Department of Vegetable Crops, University of California, Davis, California 95616
* To whom correspondence should be addressed. E-mail: rwmichelmore{at}ucdavis.edu.
The Arabidopsis genome contains approximately 200 genes that encode proteins with similarity
to the nucleotide binding site and other domains characteristic of plant resistance
proteins. Through a reiterative process of sequence analysis and reannotation, we
identified 149 NBS-LRR-encoding genes in the Arabidopsis (ecotype Columbia)
genomic sequence. Fifty-six of these genes were corrected from earlier annotations.
At least 12 are predicted to be pseudogenes. As described previously, two distinct
groups of sequences were identified: those that encoded an N-terminal domain with
Toll/Interleukin-1 Receptor homology (TIR-NBS-LRR, or TNL), and those that encoded
an N-terminal coiled-coil motif (CC-NBS-LRR, or CNL). The encoded proteins are distinct
from the 58 predicted adapter proteins in the previously described TIR-X, TIR-NBS,
and CC-NBS groups. Classification based on protein domains, intron positions, sequence
conservation, and genome distribution defined four subgroups of CNL proteins, eight
subgroups of TNL proteins, and a pair of divergent NL proteins that lack a defined
N-terminal motif. CNL proteins generally were encoded in single exons, although two
subclasses were identified that contained introns in unique positions. TNL proteins
were encoded in modular exons, with conserved intron positions separating distinct
protein domains. Conserved motifs were identified in the LRRs of both CNL and TNL
proteins. In contrast to CNL proteins, TNL proteins contained large and variable
C-terminal domains. The extant distribution and diversity of the NBS-LRR sequences
have been generated by extensive duplication and ectopic rearrangements that involved
segmental duplications as well as microscale events. The observed diversity of these
NBS-LRR proteins indicates the variety of recognition molecules available in an individual
genotype to detect diverse biotic challenges.
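
The domain-based part of the classification described above can be illustrated with a minimal sketch. It assumes each protein has already been annotated with an ordered list of domain labels. The label strings ("TIR", "CC", "NBS", "LRR") and the function name `classify_nbs_lrr` are hypothetical, chosen only for illustration, and this is not the authors' pipeline. It covers only the top-level classes, not the subgroups defined by intron positions, sequence conservation, and genome distribution.

```python
from typing import Iterable


def classify_nbs_lrr(domains: Iterable[str]) -> str:
    """Assign a top-level class from ordered domain labels (N- to C-terminus).

    Labels are hypothetical placeholders for the output of a domain
    annotation step that is not shown here.
    """
    doms = list(domains)
    has_nbs = "NBS" in doms
    has_lrr = "LRR" in doms

    # Proteins without an NBS are outside the NBS-LRR family; those with
    # a TIR domain correspond to the TIR-X adapter group.
    if not has_nbs:
        return "TIR-X" if "TIR" in doms else "other"

    # Only a TIR or CC label in the first position counts as an
    # N-terminal motif in this sketch.
    n_term = doms[0] if doms[0] in ("TIR", "CC") else None

    if has_lrr:
        if n_term == "TIR":
            return "TNL"  # TIR-NBS-LRR
        if n_term == "CC":
            return "CNL"  # CC-NBS-LRR
        return "NL"       # NBS-LRR without a defined N-terminal motif

    # NBS present but no LRR: the adapter-like groups.
    if n_term == "TIR":
        return "TIR-NBS"
    if n_term == "CC":
        return "CC-NBS"
    return "NBS"


if __name__ == "__main__":
    examples = {
        "protein_a": ["TIR", "NBS", "LRR"],
        "protein_b": ["CC", "NBS", "LRR"],
        "protein_c": ["NBS", "LRR"],
    }
    for name, doms in examples.items():
        print(name, classify_nbs_lrr(doms))
```

The example protein names are placeholders rather than Arabidopsis gene identifiers. A real analysis would derive the domain labels from sequence searches and manual reannotation, as the abstract describes.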