cells.py 8.5 KB

"""
Implements support for grapheme clusters and cells (columns on screen).
Graphemes are sequences of codepoints, which are interpreted together based on the Unicode
standard. Grapheme clusters are sequences of graphemes, glued together by Zero Width Joiners.
These graphemes may occupy one or two cells on screen, depending on their glyph size.

Support for these cool chars, like Emojis 😃, was so damn hard to implement because:
1. Python doesn't know about chars that occupy two columns on screen, nor grapheme clusters
    that are rendered as a single char (wide or not), it only understands codepoints;
2. Alive-progress needs to visually align all frames, to keep its progress bars' lengths from
    spiking up and down while running. For this I must somehow know which chars are wide and
    counterbalance them;
3. To generate all those cool animations, I need several basic operations, like len, iterating,
    indexing, slicing, concatenating and reversing, which suddenly don't work anymore, since they
    do not know anything about these new concepts of graphemes and cells! Argh.
4. As the first step, I needed to parse the codepoints into Unicode graphemes. I tried to parse them
    myself, but soon realized it was tricky and finicky, in addition to changing every year...
5. Then I looked into some lib dependencies, tested several, created the validate tool to help me
    test some Unicode versions, and chose one lib to use;
6. I finally implemented the operations I needed, to the best of my current knowledge, but it
    still wouldn't work. So I tried several spinners to check their alignments, until I finally
    realized what was wrong: I actually needed to align cells, not lengths nor even graphemes!

    Look at this, for example: Note that in your editor both strings below are perfectly aligned,
    although they have 6 and 16 as their Python lengths!!! How come?
    Graphemes didn't help either, 6 and 3 respectively... Then how does the editor know that they
    align? I'm not sure exactly, but I created this "cell" concept to map this into, and finally
    they both have the same: 6 cells!! 💡😜

        string \\ length  python  graphemes  cells
             nonono          6        6        6
             🏴󠁧󠁢󠁥󠁮󠁧󠁿👉🏾🏴󠁧󠁢󠁥󠁮󠁧󠁿          16        3        6

7. With that knowledge, I implemented "wide" marks on graphemes (so I could know whether a grapheme
    glyph would occupy 1 or 2 cells on screen), and refactored all needed operations. It seemed fine
    but still didn't work... I then realized that my animations made those wide chars dynamically
    enter and leave the frame, which can split strings AT ANY POINT, even between the two cells of
    wide-graphemes, yikes!!! To make the animations as fluid as always, I had to continue moving
    only one cell per tick time, so somehow I would have to draw "half" flags and "half"
    smiling-face-with-smiling-eyes!!
8. So, I had to support printing "half-graphemes", so I could produce frames in an animation with
    always the same sizes!! This has led me to implement a fixer for dynamically broken graphemes,
    which detects whether the head or tail cells are missing, and inserts a space in their place!
9. It worked! But I would have to run that algorithm throughout the whole animation, in each and
    every displayed frame, in real time... I feared for the performance.
    I needed something that could cache and "see" all the frames at once, so I could equalize their
    sizes only once!! So I created the cool spinner compiler, an ingenious piece of software that
    generates the entire animation ahead of time, fixes all the frames, and leverages a super light
    and fast runner, which is able to "play" this compiled artifact!!
10. Finally, I refactored the frame spinner factory, the simplest one to test the idea, and WOW...
    It worked!!! The joy of success filled me..........
11. To make the others work, I created the check tool, another ingenious piece of software, which
    allowed me to "see" a spinner's contents, in a tabular way, directly from the compiled data!
    Then I could visually ensure that ALL generated frames of ALL animations I could think of had
    the exact same size;
12. A lot of time later, everything was working! But look at that, the spinner compiler has enabled
    me to make several improvements in the spinners' code itself, since it ended up gaining
    other cool functionalities like reshaping and transposing data, or randomizing anything playing!
    The concepts of "styling" and "operational" parameters got stronger with new commands, which
    enabled simpler compound animations, without any code duplication!
    And this has culminated in the creation of the newer sequential and alongside spinners, way more
    advanced than before, with configurations like intermixing and pivoting of cycles!
13. Then, it was time I moved on to the missing components in this new Cell Architecture: the bar,
    title, exhibit, and of course the alive_bar rendering itself... All of them needed to learn this
    new architecture: mainly change ordinary strings into tuples of cells (marked graphemes)...
14. And finally... Profit!!! Only no, this project only fills my soul, not my pocket...
    But what a ride! 😅

"""

import unicodedata

from . import sanitize

VS_15 = '\ufe0e'


def print_cells(fragments, cols, term, last_line_len=0):
    """Print a tuple of fragments of tuples of cells on the terminal, until a given number of
    cols is achieved, slicing over cells when needed.

    Spaces used to be inserted automatically between fragments, but not anymore.

    Args:
        fragments (Tuple[Union[str, Tuple[str, ...]], ...]): the fragments of message
        cols (int): maximum columns to use
        term: the terminal to be used
        last_line_len (int): if the fragments fit within the last line, send a clear end line

    Returns:
        the number of actually used cols.

    """
    available = cols
    term.write(term.carriage_return)
    for fragment in filter(None, fragments):
        if fragment == '\n':
            term.clear_end_line(available)
            available = cols
        elif available == 0:
            continue
        else:
            length = len(fragment)
            if length <= available:
                available -= length
            else:
                # slicing may cut a wide grapheme in half, so repair the edges.
                available, fragment = 0, fix_cells(fragment[:available])

        term.write(join_cells(fragment))

    if last_line_len and cols - available < last_line_len:
        term.clear_end_line(available)

    return cols - available


def join_cells(fragment):
    """Beware, this loses the cell information, converting to a simple string again.
    Don't use unless it is a special case."""
    return ''.join(strip_marks(fragment))


def combine_cells(*fragments):
    """Combine several fragments of cells into one.
    Remember that the fragments get a space between them, so this is mainly to avoid it when
    not desired."""
    return sum(fragments, ())  # this is way faster than tuple(chain.from_iterable()).
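

# A minimal sketch of the idiom used by `combine_cells`, kept self-contained on purpose.
# `_sketch_combine` is a hypothetical name, not part of the original module. It only shows
# that summing tuples from an empty tuple gives the same result as chaining them. Note that
# `sum` builds a new tuple on every addition, so it suits a handful of small fragments.
def _sketch_combine(*fragments):
    from itertools import chain

    by_sum = sum(fragments, ())
    by_chain = tuple(chain.from_iterable(fragments))
    assert by_sum == by_chain  # both spellings concatenate the fragments in order.
    return by_sum

# Usage: _sketch_combine(('a', 'b'), (), ('c', None)) returns ('a', 'b', 'c', None).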


def is_wide(g):
    """Try to detect wide chars.

    This is tricky, I've seen several graphemes that have Neutral width (and thus use one
    cell), but actually render as two cells, like shamrock and heart ☘️❤️.
    I've talked to George Nachman, the creator of iTerm2, who explained to me [1] that the fix
    would be to insert a space after these cases, but I can't possibly know if this
    behavior is spread among all terminals; it probably has to do with the Unicode version too,
    so I'm afraid of fixing it.
    Use the `alive_progress.tools.print_chars` tool, and check the section around `0x1f300`
    for more examples.

    [1]: https://gitlab.com/gnachman/iterm2/-/issues/9185

    Args:
        g (str): the grapheme sequence to be tested

    """
    return g[-1] != VS_15 and (len(g) > 1 or unicodedata.east_asian_width(g) in ('W', 'F'))
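

# A small sketch of the raw data `is_wide` relies on, using only the standard library.
# `_sketch_widths` is a hypothetical helper, not part of the original module. It only looks
# at single codepoints, so it ignores the multi-codepoint and VS-15 rules handled above.
def _sketch_widths(chars):
    """Map each single codepoint to its East Asian Width category."""
    import unicodedata

    return {c: unicodedata.east_asian_width(c) for c in chars}

# Usage: _sketch_widths('a漢Ａ') returns {'a': 'Na', '漢': 'W', 'Ａ': 'F'}, i.e. only the
# last two would be reported as wide by the `in ('W', 'F')` test.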


def fix_cells(chars):
    """Fix truncated cells, removing whole clusters when needed."""
    if not chars:
        return chars

    start = (' ',) if chars[0] is None else ()
    end = (' ',) if chars[-1] is not None and is_wide(chars[-1]) else ()
    return (*start, *chars[bool(start):-1 if end else None], *end)  # noqa
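

# A minimal sketch of the "half-grapheme" repair, kept self-contained on purpose.
# Assumptions: `_sketch_fix` is a hypothetical name, and wideness is decided with
# `unicodedata.east_asian_width` on the first codepoint only, which is simpler than `is_wide`.
# In a tuple of cells, a wide grapheme is followed by a `None` marker for its second cell.
def _sketch_fix(cells):
    import unicodedata

    def wide(g):
        return unicodedata.east_asian_width(g[0]) in ('W', 'F')

    if not cells:
        return cells
    # a leading None means the first half of a wide grapheme was sliced away.
    start = (' ',) if cells[0] is None else ()
    # a trailing wide grapheme without its None marker lost its second half.
    end = (' ',) if cells[-1] is not None and wide(cells[-1]) else ()
    return (*start, *cells[bool(start):-1 if end else None], *end)

# Usage, with cells = ('a', '漢', None, 'b'), which spans four columns:
#   _sketch_fix(cells[:2]) returns ('a', ' ')   (the cut wide char becomes a space)
#   _sketch_fix(cells[2:]) returns (' ', 'b')   (the orphan marker becomes a space)
#   _sketch_fix(cells) returns the tuple unchanged.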
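

def to_cells(text):
    """Convert a string into a tuple of cells (marked graphemes)."""
    return mark_graphemes(split_graphemes(sanitize(text)))


def split_graphemes(text):
    """Split a string into a tuple of graphemes, using the `grapheme` lib."""
    from grapheme import graphemes
    return tuple(graphemes(text))


def mark_graphemes(gs):
    """Insert a None mark after each wide grapheme, so it occupies two cells."""
    return sum(((g, *((None,) if is_wide(g) else ())) for g in gs), ())


def strip_marks(chars):
    """Iterate over the graphemes only, dropping the None marks."""
    return (c for c in chars if c)


def has_wide(text):
    """Detect whether any of the given graphemes is wide."""
    return any(is_wide(x) for x in text)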
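

# A rough sketch of the length versus cells distinction from the module docstring, using
# only the standard library. Assumptions: every codepoint is treated as its own grapheme
# (no clustering, unlike `split_graphemes`), and `_sketch_to_cells` is a hypothetical name.
def _sketch_to_cells(text):
    import unicodedata

    cells = []
    for char in text:
        cells.append(char)
        if unicodedata.east_asian_width(char) in ('W', 'F'):
            cells.append(None)  # mark for the second cell of a wide char.
    return tuple(cells)

# Usage: len('日本') is 2, but len(_sketch_to_cells('日本')) is 4, while 'nonono' gives 6
# in both cases, since none of its chars are wide.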