Unix [SOLVED]: How to split a string depends on a pattern in other column (UNIX environment)

  • #37014

    Anonymous

    Question

    I have a tab-separated file like this:

    V    I      280     6   -   VRSSAI
    N    V      2739    7   -   SAVNATA
    A    R      203     5   -   AEERR
    Q    A      2517    7   -   AQSTPSP
    S    S      1012    5   -   GGGSS
    L    A      281    11   -   AAEPALSAGSL
    

    I would like to check the last column against the letters in the 1st and 2nd columns. If the first and last letters of the last column match the 1st and 2nd columns respectively, the row should stay unchanged. Otherwise, I would like to locate the reversed pattern (the 2nd-column letter followed by the 1st-column letter) in the last column, print the string from the 1st-column letter to the end, and then append the part from the first letter up to the 2nd-column letter. The desired output would be:

    V    I      280     6   -   VRSSAI
    N    V      2739    7   -   NATASAV
    A    R      203     5   -   AEERR
    Q    A      2517    7   -   QSTPSPA
    S    S      1012    5   -   SGGGS
    L    A      281    11   -   LSAGSLAAEPA
    

    I have tried several scripts, but none of them work correctly and I don't know exactly why.

    awk 'BEGIN {FS=OFS="\t"}{gsub(/$2$1/,"\t",$6); print $1$7$6$2}' "input" > "output";
    

    Another attempt:

    awk 'BEGIN {FS=OFS="\t"} {len=split($11,arrseq,"$7$6"); for(i=0;i<len;i++){printf "%s ",arrseq[i],arrseq[i+1]}' "input" > "output";
    

    I also tried the substr function, but none of these attempts work correctly. Is it possible to do this in bash? Thanks in advance.

    Here is an example to make the question clearer.

    $1                 $2                 $6
    L                  A                  AAEPALSAGSL (reverse pattern 'AL' $2$1)
    

    The desired output in $6 runs from the $1 letter within the reversed pattern to the end, followed by the part from the first letter up to the $2 letter within the reversed pattern:

    $1                 $2                 $6
    L                  A                  LSAGSLAAEPA
    

    #37015

    Anonymous

    Accepted Answer

    If I understood the question correctly, this awk should do it:

    awk '( substr($6, 1, 1) != $1 || substr($6, length($6), 1) != $2 ) && i = index($6, $2$1) { $6 = substr($6, i+1) substr($6, 1, i)  }1' OFS=$'\t' data
    

    You basically want to rotate the string so that the beginning of the string matches the char in $1 and the end of the string matches the char in $2. Strings that cannot be rotated to match that condition are left unchanged, for example:

    A    B    3    3    -    BCAAB
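
For readers who find the one-liner dense, the same rotation can be sketched in a longer, commented form. This is only an illustration under a few assumptions: the function name `rotate_col6` is hypothetical, the input arrives on stdin with whitespace-separated fields, and every row is re-emitted with tab separators (the one-liner above leaves untouched rows exactly as they were read).

```shell
# rotate_col6 is a hypothetical helper name, not part of the original answer.
# It reads rows on stdin and writes tab-separated rows on stdout.
rotate_col6() {
  awk -v OFS='\t' '
  {
    first = substr($6, 1, 1)
    last  = substr($6, length($6), 1)

    # Only rows whose ends do not already match $1 and $2 are candidates.
    if (first != $1 || last != $2) {
      # Position of the reversed pair ($2 followed by $1); 0 if it is absent.
      i = index($6, $2 $1)
      if (i > 0) {
        # Everything after the $2 letter, then everything up to and including it.
        $6 = substr($6, i + 1) substr($6, 1, i)
      }
    }

    $1 = $1   # assumption: force a rebuild so every row uses OFS
    print
  }'
}

# One row that can be rotated and one that cannot.
printf 'L\tA\t281\t11\t-\tAAEPALSAGSL\nA\tB\t3\t3\t-\tBCAAB\n' | rotate_col6
```

With the two sample rows, the first has its last column rotated to LSAGSLAAEPA, while the second keeps BCAAB because the pair "BA" never occurs in it.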
    

    Source: https://stackoverflow.com/questions/47996071/how-to-split-a-string-depends-on-a-pattern-in-other-column-unix-environment
    Author: PesaThe
    Creative Commons License
    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
