How do I jitter the node split labels in ctree plot output from partykit?

I have a problem when I use mostly categorical data, stored as factors, in ctree. I use the partykit package in R rather than party, because previous answers here suggested that partykit is better suited to handling graphics output.

I don't have many nodes (about 7) in my real data set, but some variables have quite a few factor levels, and the levels listed on the left side of a split collide with those listed on the right side. This happens because the lists of factor levels are laid out horizontally and the level names are long.

I can reproduce the problem using the Aids2 dataset from the MASS package. The model is nonsense, but it reproduces the behavior that I want to fix.

    library("partykit")
    data("Aids2", package = "MASS")  # Aids2 ships with MASS

    SexTest <- ctree(sex ~ ., data = Aids2)
    plot(SexTest)

If you look at the node split information for node 1, you will see the behavior that I am describing:

In my real data frame, reducing the font size only helps once I get down to 4 points, which is unreadable.

Is there a way to define a text box for this label and enable text wrapping? I looked at par and gpar for a solution, but without success. Another acceptable option would be to stagger the vertical position of the factor information at each node, so that the two labels sit one under the other.

2 answers

Hmmm. I've been there. Without changing the internals of the partykit package, I don't know of a way to improve the output at this size. (I often have a similar problem with x-axis labels that are too long in the charts drawn under a tree with a polychotomous dependent variable.)

This is an ugly workaround, but you can print the tree to find out which categories go where, and then use something like GIMP to annotate the image for your PowerPoint / report / whatever.

    Model formula:
    sex ~ state + diag + death + status + T.categ + age

    Fitted party:
    [1] root
    |   [2] T.categ in hs, hsid, haem, other
    |   |   [3] T.categ in hs, hsid, haem
    |   |   |   [4] state in NSW, Other, VIC: M (n = 2386, err = 0.0%)
    |   |   |   [5] state in QLD: M (n = 197, err = 0.5%)
    |   |   [6] T.categ in other: M (n = 70, err = 10.0%)
    |   [7] T.categ in id, het, blood, mother: M (n = 190, err = 42.6%)

    Number of inner nodes:    3
    Number of terminal nodes: 4
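The listing above is what printing the fitted tree gives. As a minimal sketch, it can be captured as text and saved next to the image. The names `tree_txt`, `node_lines` and the file name `tree_rules.txt` are just placeholders, and the exact splits may differ with your partykit version.

```r
library("partykit")
data("Aids2", package = "MASS")

SexTest <- ctree(sex ~ ., data = Aids2)

# Capture the same text that print(SexTest) shows, one element per line
tree_txt <- capture.output(print(SexTest))

# Keep only the lines that describe nodes, e.g. "[2] T.categ in ..."
node_lines <- grep("\\[[0-9]+\\]", tree_txt, value = TRUE)
cat(node_lines, sep = "\n")

# Placeholder output file; holds the full listing
out_file <- file.path(tempdir(), "tree_rules.txt")
writeLines(tree_txt, out_file)
```

The saved listing gives you the complete level lists for each split, even where they overlap in the plot.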

You can also make the output larger, say with png():

    png('tmp.png', width = 1024, height = 768)
    plot(SexTest)
    dev.off()

(Image: larger-resolution output from plot)
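If a bigger canvas alone is not enough, it can be combined with a moderately smaller font. The sketch below assumes that the `gp` argument of the plot method is passed on to the split labels; the pixel size, resolution and font size are arbitrary starting points to tune, not recommended values.

```r
library("partykit")
library("grid")  # gpar() comes from grid
data("Aids2", package = "MASS")

SexTest <- ctree(sex ~ ., data = Aids2)

# Placeholder file name; a wide canvas leaves more room between sibling labels
out_file <- file.path(tempdir(), "tree_wide.png")
png(out_file, width = 1600, height = 900, res = 100)

# Shrink the text a little instead of all the way down to 4 points
plot(SexTest, gp = gpar(fontsize = 9))
dev.off()
```

Widening the device matters more than its height here, because the colliding labels sit side by side.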


An alternative workaround is to break the lists manually at appropriate points. You can do this by renaming the levels after which you want a line break so that they include "\n", for example "haem\n". It looks a little ugly because the line partially overlaps the factor levels, but it is the only real workaround I have found so far.
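A minimal sketch of that renaming on the Aids2 example, working on a copy of the data. The names `aids_wrapped` and `SexTestWrapped` are made up for illustration, and which level should carry the break depends on how your own splits come out.

```r
library("partykit")
data("Aids2", package = "MASS")

# Work on a copy so the original factor levels stay untouched
aids_wrapped <- Aids2

# Append a newline to the level after which the label should break
lev <- levels(aids_wrapped$T.categ)
lev[lev == "haem"] <- "haem\n"
levels(aids_wrapped$T.categ) <- lev

# Refit and plot; the renamed level is what gets drawn in the split label
SexTestWrapped <- ctree(sex ~ ., data = aids_wrapped)
plot(SexTestWrapped)
```

Because the newline becomes part of the level name, it also shows up in printed output and tables, so it is best kept to a copy that is used only for plotting.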


Source: https://habr.com/ru/post/1481109/

