Table 1. Genomic Features of Essential Genes in A. thaliana
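| Category | Feature | Data Type | Sign of Lethal Association^a | P Value^b | Seq.-Based Feature^c | Rice^d | Yeast^d |
|---|---|---|---|---|---|---|---|
| Gene duplication | α WGD duplicate retained | Binary | − | 3.17E-10 | No | No | No |
| | βγ WGD duplicate retained | Binary | − | 3.07E-10 | No | No | No |
| | Pseudogene present | Binary | + | 0.035 | Yes | Yes | No |
| | Tandem duplicate | Binary | − | 7.93E-06 | Yes | Yes | No |
| | Paralog Ks | Numeric | + | 2.17E-08 | Yes | Yes | Yes |
| | Gene family size | Numeric | − | 1.20E-24 | Yes | Yes | Yes |
| Expression | Median expression | Numeric | + | 1.60E-08 | No | Yes | Yes |
| | Expression variation | Numeric | − | 0.002 | No | Yes | Yes |
| | Expression breadth | Numeric | + | 5.47E-20 | No | Yes | No |
| | Expression correlation | Numeric | NA | 0.072 | No | No | No |
| | Expression correlation (Ks < 2) | Numeric | − | 0.004 | No | No | No |
| Evolution and conservation | Core eukaryotic gene | Binary | + | 2.44E-08 | No | No | Yes |
| | Homolog not found in rice | Binary | − | 4.04E-10 | Yes | No | No |
| | Percentage identity in plants | Numeric | + | 2.73E-06 | Yes | No | No |
| | Percentage identity in metazoans | Numeric | NA | 0.254 | Yes | No | No |
| | Percentage identity in fungi | Numeric | NA | 0.077 | Yes | No | No |
| | A. lyrata homolog Ka/Ks | Numeric | − | 0.012 | Yes | No | No |
| | P. trichocarpa homolog Ka/Ks | Numeric | − | 0.008 | Yes | No | No |
| | V. vinifera homolog Ka/Ks | Numeric | − | 0.003 | Yes | No | No |
| | Rice homolog Ka/Ks | Numeric | − | 0.012 | Yes | No | No |
| | P. patens homolog Ka/Ks | Numeric | − | 0.038 | Yes | No | No |
| | Nucleotide diversity | Numeric | − | 0.001 | No | No | No |
| | Paralog Ka/Ks | Numeric | + | 2.51E-14 | Yes | Yes | Yes |
| Networks | Expression module size | Numeric | + | 1.94E-34 | No | No | Yes |
| | Gene network connections | Numeric | + | 9.84E-11 | No | No | Yes |
| | Protein-protein interactions | Numeric | NA | 0.72 | No | No | No |
| Miscellaneous | Gene body methylated | Binary | + | 3.46E-10 | No | No | No |
| | Paralog percentage identity | Numeric | − | 2.75E-33 | Yes | Yes | Yes |
| | Protein length | Numeric | + | 1.22E-06 | Yes | Yes | Yes |
| | Domain number | Numeric | + | 0.023 | Yes | Yes | Yes |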
  • a For each binary feature, + and − indicate that the proportion of lethal genes with the feature is significantly higher (overrepresentation) or lower (underrepresentation) than that of nonlethal genes, respectively. For each numeric feature, + and − indicate that lethal genes have significantly higher or lower feature values than nonlethal genes, respectively. NA indicates that there is no significant difference between lethal and nonlethal genes.

  • b P values from Fisher’s exact tests (used for binary data) or Kolmogorov-Smirnov tests (used for numeric data).

  • c Sequence-based features, where “Yes” indicates that a feature can be derived from genome sequence data.

  • d Feature used (“Yes”) or not used (“No”) in rice or yeast lethal phenotype gene predictions.
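
As an illustration of footnotes a and b, the following minimal Python sketch shows one way a sign and P value could be derived for a binary feature (two-sided Fisher's exact test) and for a numeric feature (two-sample Kolmogorov-Smirnov test). It is not the authors' code. The 0.05 significance threshold, the use of medians to decide the direction for numeric features, the asymptotic approximation of the Kolmogorov-Smirnov P value, the function names, and the toy data are all assumptions made for this example.

```python
from math import comb, exp, sqrt
from statistics import median

ALPHA = 0.05  # assumed significance threshold; not stated in the table


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


def ks_two_sample(xs, ys):
    """Two-sample KS statistic D with an approximate (asymptotic) P value."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        # Largest gap between the two empirical distribution functions.
        d = max(d, abs(i / n - j / m))
    en = sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return d, 1.0  # the series below is unreliable here; P is essentially 1
    p = 2 * sum((-1) ** (k - 1) * exp(-2 * k * k * lam * lam) for k in range(1, 101))
    return d, max(0.0, min(1.0, p))


def association_sign(p_value, lethal_stat, nonlethal_stat):
    """Return '+', '-', or 'NA' in the spirit of footnote a."""
    if p_value >= ALPHA:
        return "NA"
    return "+" if lethal_stat > nonlethal_stat else "-"


# Hypothetical binary feature: 30 of 100 lethal and 10 of 100 nonlethal genes have it.
p_bin = fisher_exact_two_sided(30, 70, 10, 90)
sign_bin = association_sign(p_bin, 30 / 100, 10 / 100)

# Hypothetical numeric feature values for lethal and nonlethal genes.
lethal = [float(v) for v in range(20, 60)]
nonlethal = [float(v) for v in range(0, 40)]
d_stat, p_num = ks_two_sample(lethal, nonlethal)
sign_num = association_sign(p_num, median(lethal), median(nonlethal))

print("binary feature:", sign_bin, p_bin)
print("numeric feature:", sign_num, d_stat, p_num)
```

For real analyses, an established statistics library would be preferable to these hand-written tests, which are kept dependency-free here only so the sketch is self-contained.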