Improve bidi support (better support for RTL)

Hi

As a social network, Mastodon hosts people with different backgrounds, including different languages. Some write in LTR scripts and some in RTL. It gets more complicated when people post a toot that includes both RTL and LTR text. In such situations, enforcing either RTL or LTR on the whole toot cannot solve the issue. The solution is to add bidi support.

Currently, Mastodon uses a simple algorithm to handle this. It checks the content of a post, and if more than 30 percent of the pure content (excluding links, mentions, and hashtags) is RTL, it applies RTL direction to the whole toot. This might work for those who post in only one language, but many geeks whose main language is RTL also want to include some LTR text, such as pieces of code, in their toots, and for them this method doesn't work well.
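To make the limitation concrete, here is a minimal sketch of a ratio-based check of this kind. It is not Mastodon's actual code: the names (`RTL_CHAR`, `stripNonContent`, `wholeTootDirection`), the character ranges, and the exact way the ratio is computed are assumptions for illustration only.

```typescript
// Hypothetical sketch of a ratio-based heuristic; NOT Mastodon's actual code.
// Simplified RTL range: Hebrew, Arabic and neighbouring blocks, plus
// presentation forms (BMP only).
const RTL_CHAR = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFC]/;
const RTL_THRESHOLD = 0.3; // "more than 30 percent"

// Remove links, mentions, hashtags, and whitespace to get the "pure content".
function stripNonContent(text: string): string {
  return text
    .replace(/https?:\/\/\S+/g, "") // links
    .replace(/[@#]\S+/g, "")        // mentions and hashtags
    .replace(/\s+/g, "");           // whitespace
}

// One direction for the whole toot, based on the share of RTL characters.
function wholeTootDirection(text: string): "rtl" | "ltr" {
  const content = stripNonContent(text);
  if (content.length === 0) return "ltr";
  let rtlCount = 0;
  for (let i = 0; i < content.length; i++) {
    if (RTL_CHAR.test(content.charAt(i))) rtlCount++;
  }
  return rtlCount / content.length > RTL_THRESHOLD ? "rtl" : "ltr";
}

// A Persian sentence followed by a longer line of code: the code dominates
// the ratio, so the whole toot, Persian sentence included, is treated as LTR.
console.log(
  wholeTootDirection("سلام دنیا\n\nconsole.log('hello world from my terminal');")
);
```

Whatever the threshold is, a single direction for the whole toot means either the RTL paragraph or the code ends up rendered in the wrong direction.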

Let me explain the situation with some screenshots:

The official Mastodon web UI checks the percentage of characters that are in an RTL language. If it is higher than a certain threshold, it turns the whole toot RTL.

A better way is to check the first letter of each paragraph and set the direction of that paragraph based on it. Sengi, Fedilab and Tusky use this approach and, as demonstrated above, they do a better job.
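The per-paragraph idea can be sketched as follows. This is only an illustration of the technique, not the code used by Sengi, Fedilab, or Tusky. The helper names and the simplified character ranges are assumptions; a real implementation would rely on Unicode bidi character classes.

```typescript
type Direction = "rtl" | "ltr";

// Simplified "strong" character classes (assumption: BMP ranges only).
const STRONG_RTL = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFC]/;
// Latin, Greek, and Cyrillic ranges stand in for strong LTR characters.
const STRONG_LTR = /[A-Za-z\u00C0-\u024F\u0370-\u03FF\u0400-\u04FF]/;

// Direction of one paragraph: decided by its first strongly directional character.
function firstStrongDirection(paragraph: string, fallback: Direction = "ltr"): Direction {
  for (let i = 0; i < paragraph.length; i++) {
    const ch = paragraph.charAt(i);
    if (STRONG_RTL.test(ch)) return "rtl";
    if (STRONG_LTR.test(ch)) return "ltr";
  }
  return fallback; // only digits, punctuation, emoji, etc.
}

// Paragraphs are assumed to be separated by blank lines.
function paragraphDirections(toot: string): Direction[] {
  return toot.split(/\n\s*\n/).map((p) => firstStrongDirection(p));
}

// Each paragraph gets its own direction: the Persian one RTL, the code one LTR.
console.log(paragraphDirections("سلام دنیا\n\nconsole.log('hi');"));
```

Note that with this naive check a paragraph such as "@alice سلام" is classified as LTR, because the first strong character belongs to the mention rather than to the text itself.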

There is one small issue with them: a paragraph that starts with a mention. This can be handled by keeping mentions and hashtags in isolation (for example, in a span with the style unicode-bidi: isolate, with the parent p set to unicode-bidi: plaintext) so that they won't be considered when checking the first letter.
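As a rough sketch of the markup this would produce, the function below wraps mentions and hashtags in an isolating span and puts the paragraph in plaintext mode. The function names and the mention/hashtag pattern are hypothetical, and Mastodon's real HTML for mentions and hashtags (links, classes) is more elaborate. The point is only to show where the two unicode-bidi values go.

```typescript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical renderer: isolates mentions/hashtags and lets the browser
// choose the paragraph direction from the remaining text.
function renderParagraphHtml(paragraph: string): string {
  const escaped = escapeHtml(paragraph);
  // A mention or hashtag is assumed to start at the beginning of the
  // paragraph or right after whitespace.
  const isolated = escaped.replace(
    /(^|\s)([@#][^\s<>&"]+)/g,
    '$1<span style="unicode-bidi: isolate">$2</span>'
  );
  return '<p style="unicode-bidi: plaintext">' + isolated + "</p>";
}

console.log(renderParagraphHtml("@alice سلام"));
```

With unicode-bidi: plaintext, the browser derives the paragraph direction from its first strong character, and isolated content is skipped in that search, so a leading mention no longer forces the paragraph to LTR.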

The easiest way is to set all p tags to use unicode-bidi: plaintext and then wrap all hashtags and mentions in a span with unicode-bidi: isolate. This is the result of removing the effects of the current algorithm and then applying this approach:

image

And here is my sample toot, with screenshots from before and after applying what I suggested here:

To me, it is perfect. Please share your opinion.