I want to make a simple phylogenetic tree for a marine biology course as an educational example. I have a list of species with their taxonomic ranks:
Group <- c("Benthos","Benthos","Benthos","Benthos","Benthos","Benthos",
           "Zooplankton","Zooplankton","Zooplankton","Zooplankton","Zooplankton","Zooplankton",
           "Fish","Fish","Fish","Fish","Fish","Fish",
           "Phytoplankton","Phytoplankton","Phytoplankton","Phytoplankton")
Domain <- rep("Eukaryota", length(Group))
Kingdom <- c(rep("Animalia", 18), rep("Chromalveolata", 4))
Phylum <- c("Annelida","Annelida","Arthropoda","Arthropoda","Porifera","Sipunculida",
            "Arthropoda","Arthropoda","Arthropoda","Arthropoda","Echinodermata","Chordata",
            "Chordata","Chordata","Chordata","Chordata","Chordata","Chordata",
            "Heterokontophyta","Heterokontophyta","Heterokontophyta","Dinoflagellata")
Class <- c("Polychaeta","Polychaeta","Malacostraca","Malacostraca","Demospongiae","NA",
           "Malacostraca","Malacostraca","Malacostraca","Maxillopoda","Ophiuroidea","Actinopterygii",
           "Chondrichthyes","Chondrichthyes","Chondrichthyes","Actinopterygii","Actinopterygii","Actinopterygii",
           "Bacillariophyceae","Bacillariophyceae","Prymnesiophyceae","NA")
Order <- c("NA","NA","Amphipoda","Cumacea","NA","NA",
           "Amphipoda","Decapoda","Euphausiacea","Calanoida","NA","Gadiformes",
           "NA","NA","NA","NA","Gadiformes","Gadiformes",
           "NA","NA","NA","NA")
Species <- c("Nephtys sp.","Nereis sp.","Gammarus sp.","Diastylis sp.","Axinella sp.","Ph. Sipunculida",
             "Themisto abyssorum","Decapod larvae (Zoea)","Thysanoessa sp.","Centropages typicus",
             "Ophiuroidea larvae","Gadus morhua eggs / larvae",
             "Etmopterus spinax","Amblyraja radiata","Chimaera monstrosa","Clupea harengus",
             "Melanogrammus aeglefinus","Gadus morhua",
             "Thalassiosira sp.","Cylindrotheca closterium","Phaeocystis pouchetii","Ph. Dinoflagellata")
dat <- data.frame(Group, Domain, Kingdom, Phylum, Class, Order, Species)
dat
I would like to get a dendrogram (cluster analysis) that uses Domain as the first cutting point, Kingdom as the second, Phylum as the third, and so on. Missing values should be ignored: where a rank is missing there should be no cutting point, just a straight line. Group should be used as the coloring category for the labels.
I am not sure how to build a distance matrix from this data frame. There are many phylogenetic tree packages for R, but they seem to want new data / DNA / other extended information. Any help with this would be appreciated.
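One possible way to approach this with base R only is sketched below. It is an illustration under stated assumptions, not a definitive method: the helper `taxo_dist()` and the color palette `group_cols` are hypothetical names introduced here, the data is a six-species subset of the table above, and missing ranks are assumed to be stored as real `NA` values rather than the string "NA". The distance between two species is taken as the number of ranks from the first disagreement down to Species, and a rank that is missing in either species is skipped so that it adds no cutting point.

```r
# Toy subset of the question's data; real NA marks a missing rank (assumption).
dat <- data.frame(
  Group   = c("Benthos", "Benthos", "Zooplankton", "Fish", "Fish", "Phytoplankton"),
  Domain  = "Eukaryota",
  Kingdom = c(rep("Animalia", 5), "Chromalveolata"),
  Phylum  = c("Annelida", "Annelida", "Arthropoda", "Chordata", "Chordata",
              "Heterokontophyta"),
  Class   = c("Polychaeta", "Polychaeta", "Malacostraca", "Chondrichthyes",
              "Actinopterygii", "Bacillariophyceae"),
  Order   = c(NA, NA, "Amphipoda", NA, "Gadiformes", NA),
  Species = c("Nephtys sp.", "Nereis sp.", "Themisto abyssorum",
              "Etmopterus spinax", "Gadus morhua", "Thalassiosira sp."),
  stringsAsFactors = FALSE
)
ranks <- c("Domain", "Kingdom", "Phylum", "Class", "Order", "Species")

# Hypothetical helper: pairwise taxonomic distance.
# Distance = number of ranks from the first disagreement down to the last rank.
# Ranks that are NA in either row are skipped (no cutting point).
taxo_dist <- function(dat, ranks) {
  n <- nrow(dat)
  d <- matrix(0, n, n, dimnames = list(dat$Species, dat$Species))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      a <- unlist(dat[i, ranks])
      b <- unlist(dat[j, ranks])
      differs <- !is.na(a) & !is.na(b) & a != b
      if (any(differs)) d[i, j] <- length(ranks) - which(differs)[1] + 1
    }
  }
  as.dist(d)
}

# Cluster the distance matrix and convert the result to a dendrogram.
hc   <- hclust(taxo_dist(dat, ranks), method = "average")
dend <- as.dendrogram(hc)

# Color each leaf label by its Group (the palette is an arbitrary choice).
group_cols <- c(Benthos = "brown", Zooplankton = "blue",
                Fish = "darkgreen", Phytoplankton = "orange")
lab_group <- setNames(dat$Group, dat$Species)
dend <- dendrapply(dend, function(node) {
  if (is.leaf(node)) {
    grp <- lab_group[[attr(node, "label")]]
    attr(node, "nodePar") <- list(lab.col = group_cols[[grp]], pch = NA)
  }
  node
})

plot(dend, horiz = TRUE)
```

To try this on the full data frame, the "NA" strings would first need to become real missing values, for example with `dat[dat == "NA"] <- NA`. Species that share every known rank end up at the same height, so they show as a group of merges at one level rather than a single multi-way split.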