Cacao

Theobroma cacao

More genomic tools and resources at: Cocoa Genome Hub, Phytozome

Theobroma cacao is a diploid tree species (2n = 20) native to the tropical rainforests of the Americas, primarily the Amazon basin. A member of the Malvaceae family, cacao thrives in warm, humid climates and is cultivated extensively from Central and South America to West Africa.

  • Botanical Classification: Cacao trees are grouped into three principal types and specific research cultivars based on genetic origin and cultivation characteristics:
    • T. cacao cv. Criollo: Celebrated for its fine, complex flavor profile and lower bitterness, this traditional variety is historically esteemed but is less common commercially due to its sensitivity to pests and diseases.
    • T. cacao cv. Forastero: Representing the vast majority of global cocoa production, Forastero is known for its robustness, higher yield, and greater disease resistance, making it the backbone of most commercial plantations. This group includes the Matina 1-6 cultivar, which is a cornerstone of cacao genomic research, serving as the basis for major genome assemblies.
    • T. cacao cv. Trinitario: A natural hybrid of Criollo and Forastero, Trinitario blends the superior flavor traits of Criollo with the hardiness and productivity of Forastero, a combination that has broadened its cultivation across diverse environments.
  • Tree Characteristics: Cacao trees are small, evergreen, and adapted to understory conditions. They typically reach heights of 4 to 8 meters under managed shade environments. In commercial plantations, trees are pruned to around 4 to 5 meters to facilitate harvesting and maintain optimal canopy conditions for growth.
  • Floral Biology: Cacao flowers are minute, delicate, and borne directly on the trunk and older branches, a phenomenon known as cauliflory. These clustered, pale blooms have a short lifespan and depend primarily on tiny midges and other small insects for cross-pollination, a process that ultimately governs fruit set and yield.
  • Fruit Characteristics: The cocoa fruit, commonly referred to as a pod, is an elongated, thick-walled berry 15 to 30 centimeters long and weighing between 300 and 600 grams. Each pod contains 20 to 60 seeds (beans) suspended in a sweet, mucilaginous pulp. Unlike many fruits, whose flavor develops as they ripen on the tree, cocoa develops its rich flavor during post-harvest fermentation, a vital step in transforming raw beans into the foundation of chocolate.
  • Germplasm Conservation: The Institute of Subtropical and Mediterranean Horticulture (IHSM) conserves accessions in its germplasm collection.

Genome assembly stats

Summary of two Theobroma cacao genome assemblies: cv. Matina (Motamayor et al., 2013; 2018 v2 version at Phytozome) and cv. Criollo (Argout et al., 2017; Cocoa Genome Hub).

Statistic                       Matina   Criollo
Total assembly size (Gb)        0.346    0.325
Total assembled sequences       711      554
Longest sequence length (Mb)    34.4     6.5
Average sequence length (Mb)    0.487    0.586
L50 (number of sequences)       5        17
N50 (Mb)                        34.4     6.5
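
As a rough illustration of how the contiguity figures in this kind of table are usually derived, the sketch below computes them from a list of sequence lengths. It assumes the common convention in which sequences are sorted from longest to shortest and accumulated until they cover at least half of the total assembly; the length of the last sequence added and the number of sequences needed are then reported. The function name `assembly_stats` and the toy lengths are hypothetical and are not taken from either cacao assembly.

```python
def assembly_stats(lengths):
    """Return basic contiguity statistics for a list of sequence lengths (bp).

    Assumed convention: N50 is a length, L50 is a count of sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    ordered = sorted(lengths, reverse=True)  # longest sequence first
    total = sum(ordered)

    # Accumulate sequences until at least half of the assembly is covered.
    running = 0
    n50 = l50 = None
    for count, length in enumerate(ordered, start=1):
        running += length
        if running * 2 >= total:
            n50, l50 = length, count
            break

    return {
        "total": total,
        "count": len(ordered),
        "longest": ordered[0],
        "mean": total / len(ordered),
        "n50": n50,
        "l50": l50,
    }


# Hypothetical toy lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
stats = assembly_stats(toy_lengths)
print(stats)
```

The average sequence length is simply the total size divided by the number of sequences; for example, 0.346 Gb spread over 711 sequences gives roughly 0.487 Mb per sequence.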

Taxonomy

Kingdom Plantae
Phylum Magnoliophyta
Class Magnoliopsida
Order Malvales
Family Malvaceae
Genus Theobroma L.
Species Theobroma cacao L.


Other interesting links: