In the context of reinforcement learning, where a TensorFlow agent performs a "step" in the environment for each observation it receives: how can variable-length observations be handled most effectively, for example when an observation exceeds the 1024-element placeholder limit (e.g. the 10,000 characters of a long Wikipedia article)? Note that a placeholder's shape is fixed at graph construction and cannot change dynamically during computation:
self.obs = tf.placeholder(tf.float32, shape=(None, 1024), name='obs')
I am familiar with the padding approach, in which a maximum length is fixed and unused positions are filled with a pad token. However, this seems inefficient when inputs vary between 100 and 10,000 characters: the placeholder shape must be (None, 10000) even for a 100-character input, so 9,900 positions per observation are wasted. The memory and compute spent on that padding is essentially useless, and the cost is compounded in reinforcement learning, where learning an effective policy takes millions of steps.
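For concreteness, here is a minimal sketch of the fixed-length padding approach described above (the helper pad_observation and the ord()-based character encoding are illustrative assumptions, not my actual code):

import numpy as np
import tensorflow as tf

MAX_LEN = 10000  # global maximum; every observation is padded to this length
obs = tf.placeholder(tf.float32, shape=(None, MAX_LEN), name='obs')

def pad_observation(text):
    # Encode each character as a float and zero-pad out to MAX_LEN.
    encoded = np.zeros(MAX_LEN, dtype=np.float32)
    encoded[:len(text)] = [float(ord(c)) for c in text[:MAX_LEN]]
    return encoded

# A 100-character observation still occupies a 10,000-float buffer:
batch = np.stack([pad_observation('a' * 100)])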
I am also aware of word embeddings.

However, that approach has several problems, including but not limited to: loss of fine-grained detail, limited extensibility, and no understanding of punctuation or other character-level notation (e.g. symbols in mathematical formulas).
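For reference, a typical embedding lookup over padded token ids might look like the following (VOCAB_SIZE and EMBED_DIM are placeholder values chosen for illustration); the dynamic (None, None) shape avoids a fixed global maximum, but the drawbacks above remain:

import tensorflow as tf

VOCAB_SIZE = 50000  # assumed vocabulary size
EMBED_DIM = 128     # assumed embedding dimension

# Token ids padded per batch; the time dimension is dynamic.
token_ids = tf.placeholder(tf.int32, shape=(None, None), name='token_ids')
embedding_table = tf.get_variable('embeddings', shape=[VOCAB_SIZE, EMBED_DIM])
embedded = tf.nn.embedding_lookup(embedding_table, token_ids)  # (batch, time, EMBED_DIM)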
How can the input pipeline be structured so that variable-sized inputs are consumed efficiently by the TensorFlow agent while avoiding the problems above?