LaTeX - apply an operation to each character in a string

I am using LaTeX and I have a problem with string manipulation. I want an operation to be applied to every character of a string; specifically, I want to replace each character "x" with "\discretionary{}{}{}x". I want to do this because I have a long string (DNA) that I want to be able to break at any point without hyphens.

So I would like to have a command called \myDNA that does this for me, instead of manually inserting \discretionary{}{}{} after each character.

Is this possible? I looked around on the web and there wasn't much helpful information on this topic (at least none that I could understand), and I was hoping you could help.

Edit: to clarify, what I want to see in the finished document is something like this:

     the dna sequence is CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCA
     AATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG
     ATCAGTGGGATACAACAGGCCTTTACAGCTTCTCTGAACAAACCAGGTCTCTTGATGGTCGTCTCCAGGT
     ATCCCATCGAAAAGGATTGCCACATGTTATATATTGCCGATTATGGCGCTGGCCTGATCTTCACAGTCAT
     CATGAACTCAAGGCAATTGAAAACTGCGAATATGCTTTTAATCTTAAAAAGGATGAAGTATGTGTAAACC
     CTTACCACTATCAGAGAGTTGAGACACCAGTTTTGCCTCCAGTATTAGTGCCCCGACACACCGAGATCCT
     AACAGAACTTCCGCCTCTGGATGACTATACTCACTCCATTCCAGAAAACACTATATTCCCAGCAGGAATT

just line breaks, without hyphens. The DNA sequence will be one long string with no spaces or anything else in it, but it may break at any point. That is why my idea was to put a \discretionary{}{}{} before each character, so that it can break at any point without inserting any hyphens.

+4
4 answers

This takes a string as an argument and calls \discretionary{}{}{} after each character. The input string is terminated at the first dollar sign, so the string must not contain one.

    \def\hyphenateWholeString #1{\xHyphenate#1$\wholeString}

    \def\xHyphenate#1#2\wholeString {\if#1$%
    \else\say{#1}\discretionary{}{}{}%
    \takeTheRest#2\ofTheString
    \fi}

    \def\takeTheRest#1\ofTheString\fi
    {\fi \xHyphenate#1\wholeString}

    \def\say#1{#1}

You call it like this: \hyphenateWholeString{CTAAAGAAAACAGGACG}.
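
For reference, here is a minimal sketch of a complete document built around these macros. The article class, the optional \ttfamily switch and the test sentence are assumptions made for the example; the macro definitions are the ones from this answer, with comments added.

```latex
\documentclass{article}

% Entry point: the $ appended here marks the end of the string,
% so the string itself must not contain a $.
\def\hyphenateWholeString #1{\xHyphenate#1$\wholeString}

% #1 = next character, #2 = the remaining characters (including the final $).
\def\xHyphenate#1#2\wholeString {\if#1$%
  \else\say{#1}\discretionary{}{}{}%
  \takeTheRest#2\ofTheString
  \fi}

% Close the pending \if before continuing with the rest of the string.
\def\takeTheRest#1\ofTheString\fi
  {\fi \xHyphenate#1\wholeString}

\def\say#1{#1}

\begin{document}
% Optional (assumption): a fixed-width font makes an uneven right edge less visible.
\ttfamily
The DNA sequence is
\hyphenateWholeString{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG}
\end{document}
```

The sequence is the first part of the one in the question, long enough to wrap over more than one line.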

Instead of \discretionary{}{}{} you can also try \hspace{0pt}, if you prefer (and are in a LaTeX environment). To get the right margin aligned, I think you will need some more fine-tuning (but see below). Of course, the effect is less noticeable with a fixed-width font.

Revision:

    \def\hyphenateWholeString #1{\xHyphenate#1$\wholeString\unskip}

    \def\xHyphenate#1#2\wholeString {\if#1$%
    \else\transform{#1}%
    \takeTheRest#2\ofTheString\fi}

    \def\takeTheRest#1\ofTheString\fi
    {\fi \xHyphenate#1\wholeString}

    \def\transform#1{#1\hskip 0pt plus 1pt}

Steve's suggestion to use \hskip sounds very good to me, so I made a few corrections. Note that I've renamed the \say macro to \transform and made it more useful, since it now actually does the transformation. (However, if you remove the \hskip from \transform, you also need to remove the \unskip in the definition of the main macro.)


Edit:

There is also the seqsplit package, which seems to be made for printing DNA data or long numbers. It also comes with a few options for nicer output, so maybe this is what you are looking for ...
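
A minimal sketch of how the package is typically used, assuming it is installed and loaded with its default settings. Its \seqsplit command takes the sequence as its argument; the surrounding document is only an illustration.

```latex
\documentclass{article}
\usepackage{seqsplit}

\begin{document}
The DNA sequence is
% \seqsplit allows a line break after any character of its argument.
\seqsplit{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG}
\end{document}
```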

+6

Debilski's post is definitely a solid way to do this, although the \say is not necessary. Here is a shorter way that uses some of LaTeX's internal shortcuts (\@gobble and \@ifnextchar):

    \makeatletter
    \def\hyphenatestring#1{\xHyphen@te#1$\unskip}
    \def\xHyphen@te{\@ifnextchar${\@gobble}{\sw@p{\hskip 0pt plus 1pt\xHyphen@te}}}
    \def\sw@p#1#2{#2#1}
    \makeatother

Note the use of \hskip 0pt plus 1pt instead of \discretionary: when I tried your example, I ended up with a ragged right margin because there is no stretchability. The \hskip adds some stretchable glue between each pair of characters (and the \unskip afterwards cancels the extra one added after the last character). Also note the LaTeX style convention that "end user" macros are all lowercase, while internal macros have an @ somewhere in their name so that users don't call them by accident.
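
To see the difference side by side, something like the following sketch could be used. The names \dnadisc, \dnaskip, \dna@walk, \dna@next and \dna@sep are made up for this comparison, and the 6cm box width is arbitrary. The first box uses bare \discretionary{}{}{} break points; the second uses \hskip 0pt plus 1pt, which should give TeX enough stretch to fill each line.

```latex
\documentclass{article}
\makeatletter
% Shared walker: print one character, then whatever \dna@sep holds, and repeat
% until the terminating $ is the next token.
\def\dna@walk{\@ifnextchar${\@gobble}{\dna@next}}
\def\dna@next#1{#1\dna@sep\dna@walk}
% Variant 1 (hypothetical name): a bare break point with no stretch.
\newcommand\dnadisc[1]{{\def\dna@sep{\discretionary{}{}{}}\dna@walk#1$}}
% Variant 2 (hypothetical name): a break point with 1pt of stretchable glue;
% the final \unskip removes the glue added after the last character.
\newcommand\dnaskip[1]{{\def\dna@sep{\hskip 0pt plus 1pt\relax}\dna@walk#1$\unskip}}
\makeatother

\begin{document}
\noindent\parbox{6cm}{\dnadisc{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACC}}

\bigskip
\noindent\parbox{6cm}{\dnaskip{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACC}}
\end{document}
```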

If you want to figure out how this works: \@gobble just eats whatever comes right after it (in this case the $, since that branch only runs when a $ is the next character). The main point is that \sw@p is given only one argument in the "else" branch, so it swaps that argument with the next character (which is not a $). We could just as well have written \def\hyphenate@next#1{#1\hskip...\xHyphen@te} and put that, without arguments, in the else branch, but (in my opinion) \sw@p is more general (and I'm surprised that it is not already in standard LaTeX).
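
A minimal sketch of that alternative, for comparison. The helper name \hyphenate@next is only a placeholder chosen for this example; everything else is the same as in the code above.

```latex
\makeatletter
\def\hyphenatestring#1{\xHyphen@te#1$\unskip}
% If the next token is the terminating $, discard it and stop;
% otherwise hand the next character to the helper.
\def\xHyphen@te{\@ifnextchar${\@gobble}{\hyphenate@next}}
% Print the character, add stretchable glue, then look at the next token.
\def\hyphenate@next#1{#1\hskip 0pt plus 1pt\xHyphen@te}
\makeatother
```

The trade-off is one dedicated helper per use instead of a reusable \sw@p.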

+3
1. Assuming the string never changes, define it once in the preamble with \newcommand{}{}, like this: \newcommand{\myDNA}{blah blah blah}

2. If that does not meet your requirements, split the string into smaller parts, define one command per part with \newcommand, and then call the new commands in sequence: \myDNAa \myDNAb (letters rather than digits, because TeX command names cannot contain digits).

If this still does not work, you may need to write a Perl script to do the string replacement for you.

-2

Source: https://habr.com/ru/post/1304309/

