To my mind...
Approach 2
It is rather self-defeating to adopt a formal class system and then create a class that contains ill-defined slots ("A" or "NULL"). At a minimum, I would try to give DataClass1 a NULL-like default. As a simple example, the default here is a zero-length numeric vector.
    setClass("DataClass1", representation=representation(x="numeric"))
    DataClass1 <- function(x=numeric(), ...) {
        new("DataClass1", x=x, ...)
    }
Then
    setClass("MasterClass1", representation=representation(dataClass1="DataClass1"))
    MasterClass1 <- function(dataClass1=DataClass1(), ...) {
        new("MasterClass1", dataClass1=dataClass1, ...)
    }
One advantage of this is that methods do not have to check whether the instance in the slot is NULL or a DataClass1:
    setMethod(length, "DataClass1", function(x) length(x@x))
    setMethod(length, "MasterClass1", function(x) length(x@dataClass1))

    > length(MasterClass1())
    [1] 0
    > length(MasterClass1(DataClass1(1:5)))
    [1] 5
In response to your comment about warning users when they access "empty" slots, and remembering that users usually want functions to do something rather than tell them they are doing something wrong, I would probably return the empty object DataClass1(), which accurately reflects the state of the object. Perhaps a show method could provide an overview that reinforces the status of the slot, for example "DataClass1: none". This seems particularly appropriate if MasterClass1 is a way of coordinating several different analyses, of which the user may perform only some.
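As a rough illustration of the show method idea, here is a minimal sketch. The classes are repeated from above so the snippet runs on its own; the output layout and the "value(s)" wording are my own assumptions, not anything prescribed by S4.

```r
library(methods)

# Classes and constructors as defined earlier in this answer
setClass("DataClass1", representation(x = "numeric"))
DataClass1 <- function(x = numeric(), ...) new("DataClass1", x = x, ...)

setClass("MasterClass1", representation(dataClass1 = "DataClass1"))
MasterClass1 <- function(dataClass1 = DataClass1(), ...)
    new("MasterClass1", dataClass1 = dataClass1, ...)

setMethod("length", "DataClass1", function(x) length(x@x))

# Hypothetical show method: report "none" when the slot holds an empty
# DataClass1, otherwise report how many values it holds
setMethod("show", "MasterClass1", function(object) {
    n <- length(object@dataClass1)
    status <- if (n == 0L) "none" else paste(n, "value(s)")
    cat("class: ", class(object), "\n", sep = "")
    cat("DataClass1: ", status, "\n", sep = "")
})
```

Printing MasterClass1() at the prompt would then show the "DataClass1: none" line instead of a dump of empty slots.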
The limitation of this approach (or of your Approach 2) is that you do not get method dispatch: you cannot write methods that apply only to a MasterClass1 whose DataClass1 instance has non-zero length, so you are forced to do some kind of manual dispatch (for example, with if or switch). This may look like a limitation only for the developer, but it affects the user too: the user gets no sense of which operations are specific to MasterClass1 instances that hold non-zero-length DataClass1 instances.
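To make the manual dispatch concrete, here is a small sketch. The generic dataMean is hypothetical (it is not part of your code or of any package), and returning NA for the empty case is just one possible choice.

```r
library(methods)

# Classes and constructors as defined earlier in this answer
setClass("DataClass1", representation(x = "numeric"))
DataClass1 <- function(x = numeric(), ...) new("DataClass1", x = x, ...)

setClass("MasterClass1", representation(dataClass1 = "DataClass1"))
MasterClass1 <- function(dataClass1 = DataClass1(), ...)
    new("MasterClass1", dataClass1 = dataClass1, ...)

setMethod("length", "DataClass1", function(x) length(x@x))

# Hypothetical generic, introduced only for illustration
setGeneric("dataMean", function(object, ...) standardGeneric("dataMean"))

# S4 dispatch only sees "MasterClass1"; it cannot tell an empty slot from
# a populated one, so the method has to branch by hand
setMethod("dataMean", "MasterClass1", function(object, ...) {
    if (length(object@dataClass1) == 0L) {
        NA_real_                       # empty slot: nothing to average
    } else {
        mean(object@dataClass1@x)      # populated slot
    }
})
```

Every method that cares about the distinction needs the same if branch, and nothing in the class definition tells the user that dataMean is only meaningful for populated objects.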
Approach 1
When you say that the class names in the hierarchy will confuse your user, it suggests there may be a more fundamental problem: you are trying too hard to make the representation of data types comprehensive. A user will never be able to keep track of ClassWithMatrixDataFrameAndTree, because it does not reflect the way they view their data. Perhaps this is an opportunity to scale back your ambitions and tackle only the most prominent parts of the field you are working in. Or perhaps it is an opportunity to rethink how the user thinks about and interacts with the data they have collected, and to use the separation of interface (what the user sees) from implementation (how you have chosen to represent the data in classes) that class systems provide, so that you encapsulate more effectively what the user is likely to do.
Putting the naming and the number of classes aside, when you say "it is difficult to extend for additional data types in the future", it makes me wonder whether some of the nuances of S4 classes are tripping you up. The short solution is to avoid writing your own initialize methods and to rely on constructors to do the complicated work, along the lines of
    setClass("A", representation(x="numeric"))
    setClass("B", representation(y="numeric"), contains="A")

    A <- function(x = numeric(), ...) new("A", x=x, ...)
    B <- function(a = A(), y = numeric(), ...) new("B", a, y=y, ...)
and then
    > B(A(1:5), 10)
    An object of class "B"
    Slot "y":
    [1] 10

    Slot "x":
    [1] 1 2 3 4 5
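To show why this pattern extends easily, here is a sketch that adds a further data type later on. Class C and its slot z are hypothetical, made up for this illustration. It relies on the default initialize method, which treats an unnamed argument to new() as an instance of a superclass and copies its slots.

```r
library(methods)

# Classes and constructors from the example above
setClass("A", representation(x = "numeric"))
setClass("B", representation(y = "numeric"), contains = "A")

A <- function(x = numeric(), ...) new("A", x = x, ...)
B <- function(a = A(), y = numeric(), ...) new("B", a, y = y, ...)

# Hypothetical later extension: no initialize method required, and
# neither A nor B has to change
setClass("C", representation(z = "character"), contains = "B")
C <- function(b = B(), z = character(), ...) new("C", b, z = z, ...)

obj <- C(B(A(1:5), 10), "label")
```

Here obj carries the x and y slots inherited through B as well as its own z, and C() with no arguments still returns a valid, empty object.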