Well, it's a bit more complicated than just the number of characters in a noun compared to English. For instance, Japanese also has a different grammatical structure from English, so some sentences will use MORE words in Japanese, while others will use FEWER. I don't really know Japanese, so please forgive me for using Korean as an example.
In Korean, a sentence is often shorter than its English counterpart, mainly because it is abbreviated, using context to fill in the missing words. For example, the phrase "I love you" can be as short as 사랑해 ("saranghae", just the verb "love") or as long as the fully qualified sentence 저는 당신을 사랑해요 (I [topic] you [object] love [verb + polite modifier]). How it is actually written in a text depends on the context, which is usually set by the earlier sentences in the paragraph.
In any case, getting an algorithm to actually UNDERSTAND this kind of thing would be very difficult, so you are probably much better off just using statistics. What you need to do is take random samples of known Japanese and English texts that have the same meaning. The larger the sample (and the more random it is), the better... although if the samples are truly random, it won't make much difference how many you have once you are past a few hundred.
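To make the statistical approach concrete, here is a minimal sketch that draws a random sample of sentence pairs and estimates the Japanese/English character-length ratio along with its standard error. The function names (`length_ratios`, `estimate_ratio`) and the four toy pairs are my own assumptions for illustration; a real estimate would need hundreds of pairs from an actual parallel corpus, and you might prefer to count words or bytes rather than characters.

```python
import random
import statistics

def length_ratios(pairs):
    """Return the character-length ratio (Japanese / English) for each pair."""
    return [len(ja) / len(en) for en, ja in pairs if en]

def estimate_ratio(pairs, sample_size, seed=0):
    """Estimate the mean length ratio and its standard error from a random sample."""
    rng = random.Random(seed)
    sample = rng.sample(pairs, min(sample_size, len(pairs)))
    ratios = length_ratios(sample)
    mean = statistics.mean(ratios)
    # The standard error shrinks with sqrt(n), which is why extra samples
    # stop mattering much once the sample is already large.
    if len(ratios) > 1:
        stderr = statistics.stdev(ratios) / len(ratios) ** 0.5
    else:
        stderr = float("nan")
    return mean, stderr

# Toy data only (hypothetical sample, far too small for a real estimate).
PAIRS = [
    ("I love you.", "愛しています。"),
    ("Thank you very much.", "どうもありがとうございます。"),
    ("Good morning.", "おはようございます。"),
    ("Where is the station?", "駅はどこですか。"),
]

mean, stderr = estimate_ratio(PAIRS, sample_size=4)
print(f"JA/EN length ratio: {mean:.2f} +/- {stderr:.2f}")
```

Watching the standard error as you increase `sample_size` is one way to judge when adding more samples has stopped paying off.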
Another thing: this ratio would change completely depending on the type of text being translated. For example, a highly technical document is likely to have a much higher Japanese/English length ratio than a boring novel.
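If the ratio really does depend on the type of text, it probably makes sense to estimate it separately per type instead of pooling everything. A rough sketch, assuming each sample carries a hand-assigned type label (the labels, the `ratio_by_type` helper, and the toy sentences are all made up for illustration, and the toy numbers prove nothing about real text):

```python
from collections import defaultdict
from statistics import mean

def ratio_by_type(samples):
    """Average the Japanese/English character-length ratio per text type.

    `samples` is an iterable of (text_type, english, japanese) tuples.
    """
    grouped = defaultdict(list)
    for text_type, en, ja in samples:
        if en:  # skip empty English strings to avoid dividing by zero
            grouped[text_type].append(len(ja) / len(en))
    return {text_type: mean(ratios) for text_type, ratios in grouped.items()}

# Toy data only; the type labels are hypothetical.
SAMPLES = [
    ("technical", "Please restart the server.", "サーバーを再起動してください。"),
    ("technical", "Save the file.", "ファイルを保存します。"),
    ("conversation", "Good morning.", "おはようございます。"),
    ("conversation", "Where is the station?", "駅はどこですか。"),
]

for text_type, ratio in ratio_by_type(SAMPLES).items():
    print(f"{text_type}: {ratio:.2f}")
```

You would then apply the ratio for whichever type of text you expect to translate, rather than a single global average.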
As for simply using your dictionary of word-for-word translations: this probably won't work (and will probably give wrong results). The same word does not translate to the same word every time in another language (although it happens much more often in technical discussions). Take the word "beautiful", for example. Not only is there more than one word I could map it to in Korean (i.e. there is a choice), but sometimes I lose that choice, as in the sentence "that food is beautiful", where I don't mean that the food looks good. I mean that it tastes good, and so my translation options for that word change. And this is a VERY common circumstance.
Another big problem is optimal translation. It is something that humans are really bad at, and that computers are much worse at. Whenever I proofread a document translated from another language into English, I can always see various ways to make it much shorter.
So although, with statistics, you could work out a pretty good average length ratio between translations, it would be far from what it would be if all translations were optimal.