
Just a note that the most common Markdown flavor (Commonmark) doesn't actually support frontmatter. The author is presumably using Obsidian-flavored Markdown (which is a mixture of Commonmark, GH-flavored Markdown, and LaTeX).

For file-tagging, I would consider TMSU [0] instead of writing bespoke tools. (ideally we would just use xattrs, but the world isn't ready for that)

[0]: https://tmsu.org/



> Markdown flavor (Commonmark) doesn't actually support frontmatter.

That leads to mixing presentation logic (metadata, ToC) and content. When typesetting the Markdown, the ToC can be derived from headings, and metadata should be isolated to avoid duplication. The following videos demonstrate some of the advantages of this approach:

* https://www.youtube.com/watch?v=cjQ-dle-tAE

* https://www.youtube.com/watch?v=3QpX70O5S30
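
To illustrate the point about deriving the ToC from headings rather than storing it in the document, here is a minimal Python sketch. It is not how my editor does it; `derive_toc` is a hypothetical helper that only understands `#`-style headings and skips fenced code blocks in a simplified way.

```python
import re

# ATX heading: up to 3 spaces of indent, 1-6 '#', whitespace, then the title,
# with an optional closing run of '#'.
ATX_HEADING = re.compile(r"^ {0,3}(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")
# Opening or closing code fence (simplified: the info string is ignored).
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def derive_toc(markdown: str) -> list[tuple[int, str]]:
    """Return (level, title) pairs for every ATX heading outside code fences."""
    toc = []
    fence = None  # the marker that opened the current fenced block, if any
    for line in markdown.splitlines():
        m = FENCE.match(line)
        if m:
            marker = m.group(1)
            if fence is None:
                fence = marker
            elif marker[0] == fence[0] and len(marker) >= len(fence):
                fence = None
            continue
        if fence is not None:
            continue  # inside a code block: '#' lines are not headings
        h = ATX_HEADING.match(line)
        if h:
            toc.append((len(h.group(1)), h.group(2)))
    return toc
```

Because the list is computed from the headings, renaming a section never leaves a stale ToC entry behind.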

See my editor's screenshots for more details:

https://keenwrite.com/screenshots.html

My FOSS editor is a cross-platform CLI and GUI application that replaces the shell scripts developed in my blog series about typesetting Markdown.

https://dave.autonoma.ca/blog/2019/05/22/typesetting-markdow...


I like Commonmark but I wish it had been more opinionated. They chose to allow two ways to do everything[0].

Making * always be used for bold and _ always for italicizing is so much clearer, and some Markdown flavors (notably WhatsApp) do this. So you only have to do *haha* or _haha_, which also makes italic-bold more _*intuitive*_.

Similarly they should have gone with one style of headings, probably with #.

This frees up more visual clarity. Because you are no longer using *** for bold-italic, you can use that for lines, instead of both --- and ***.

This then further frees --- up to be used for tables.

Although I imagine there's a decent subset of people who use the alternate style of doing headings === and the 'normal' way of doing lines ---, which would have killed adoption.

And good luck convincing people to adopt a new variant at this point. "Commonermark"? "Peasantmark"? "Rabblemark" actually sounds decent.

Edit: actually, having checked the discourse around it a bit more, Commonmark wasn't created as "one Markdown to rule them all", but rather as "Venn diagram markdown with the most overlap".

[0]: https://commonmark.org/help/


> Making * always be used for bold

This is probably to support potential ambiguities and intraword emphasis e.g. underscore is a common pseudo-space so doesn't support intraword use but * does e.g.

    is_not_italic

    this*is*italic.

I recently implemented a commonmark parser for emphasis. Holy shit it's painful. I regret doing it but it became a battle I refused to surrender.

It's way harder than I expected because of the ambiguity of * and ** in multi-symbol runs, which support infinite nesting even of the same type of emphasis. A given delimiter run could be many different permutations of plain text `*`, `em` and `strong`, depending on the other delimiter runs that might open and close sections, alongside other context like punctuation, intraword-ness, flanking, and whether the summed lengths of runs are multiples of three!

https://spec.commonmark.org/0.31.2/#emphasis-and-strong-emph...

I never expected "**" could be nested emphasis instead of bold so interpretation requires multiple passes to break down delimiter runs and match them up e.g.

    ***this* and that* -> *<em><em>this</em> and that</em>
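
For anyone curious what "flanking" means in practice, here is a rough Python sketch of just that one step, written from my reading of the spec's left-flanking and right-flanking definitions. It only decides whether a delimiter run may open or close emphasis; the actual matching of openers to closers (and the multiple-of-three rule) is not shown. `classify_run` is a name I made up, and `str.isspace()` is only an approximation of the spec's Unicode whitespace.

```python
import unicodedata


def _is_space(ch: str) -> bool:
    # Start and end of line count as whitespace; they are passed in as "".
    return ch == "" or ch.isspace()


def _is_punct(ch: str) -> bool:
    # Spec 0.31.2 treats Unicode P (punctuation) and S (symbol) as punctuation.
    return ch != "" and unicodedata.category(ch)[0] in ("P", "S")


def classify_run(text: str, start: int) -> tuple[bool, bool]:
    """Return (can_open, can_close) for the run of * or _ beginning at start.

    Assumes text[start] is the FIRST character of the delimiter run.
    """
    delim = text[start]
    end = start
    while end < len(text) and text[end] == delim:
        end += 1
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""

    left = not _is_space(after) and (
        not _is_punct(after) or _is_space(before) or _is_punct(before))
    right = not _is_space(before) and (
        not _is_punct(before) or _is_space(after) or _is_punct(after))

    if delim == "*":
        return left, right
    # "_" is stricter, which is what rules out intraword emphasis.
    return (left and (not right or _is_punct(before)),
            right and (not left or _is_punct(after)))
```

With this, `classify_run("is_not_italic", 2)` gives `(False, False)` while `classify_run("this*is*italic", 4)` gives `(True, True)`, which matches the two examples quoted above.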


> This is probably to support potential ambiguities and intraword emphasis e.g. underscore is a common pseudo-space so doesn't support intraword use but * does e.g.

    is_not_italic

    this*is*italic.

That seems like a legacy spec mistake they had to adhere to. I'd expect

    this_is_italic

to work and for _ literal usage to require

    is\_not\_italic


This is what I would have chosen too as it's natural for programmer sensibilities.

I can see it as a choice from the "plain text first" philosophy i.e. the things you typically write in plain text should not need escaping. My intuition pump is that you can copy-paste an email into .md without edits or surprising rendering.

As such, it's doomed to never satisfy everyone. Personally I never use intraword emphasis and I typically only have underscores in non-code names i.e. `this_is_normally_code`.


If you want strictness, use a linter or a pretty-printer that follows your preferred style. Adopting an opinionated parser means you can't lint or pretty-print input from those with different opinions (I do not like underscores for emphasis), and thus somewhat goes against the goals of TFA here:

> Markdown files are essentially plaintext with some extra syntax for common elements like sections, bullet points, and links. The format deliberately avoids precise control over display details like font selection. Following the rule of least power, I consider this limitation a feature.

One of my biggest ongoing frustrations has been MDX - a sort of markdown-and-JSX mixture whose spec is now in its third release and which has made very little effort to maintain compatibility with either CommonMark or itself. It is fairly strict and fairly elegant, and moving to a new version requires rewriting all previously-written documents to eliminate no-longer-supported syntax and re-training writers. Both of those things are miserable tasks; it has absolutely killed any tolerance I might have had for a stricter parser.


I'm surprised they didn't make a conversion tool to do MDX(old) -> AST -> MDX(new). The library support is there, but it doesn't look like anyone has created a tool to do it.


> ... new version requires rewriting all previously-written documents to eliminate no-longer-supported syntax and re-training writers

Wonder if any of the LLMs could do that for you?


I would just like to see Obsidian adopt MDX. I feel like there is a whole class of interactivity that could be easily implemented that way.


OP here. I'm pretty cavalier about which Markdown features I use. I employ them differently in various contexts - in plain Markdown files and on my blog, for instance.

But primarily, I treat them as plaintext files. If I needed to remove frontmatter at some point, it would be a simple script. For any feature specific to a particular Markdown flavor, preprocessing, or system - I expect it to work only as plain text elsewhere.
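
As a rough idea of what that "simple script" could look like, here is a minimal Python sketch. It assumes the frontmatter is a YAML block at the very top of the file, opened by a `---` line and closed by `---` or `...`; `strip_frontmatter` is a hypothetical helper, not something I actually use.

```python
def strip_frontmatter(text: str) -> str:
    """Remove a leading YAML frontmatter block, if the file has one."""
    lines = text.splitlines(keepends=True)
    # Frontmatter must start on the very first line.
    if not lines or lines[0].rstrip() != "---":
        return text
    for i in range(1, len(lines)):
        if lines[i].rstrip() in ("---", "..."):
            return "".join(lines[i + 1:])
    # No closing delimiter found: leave the file untouched.
    return text
```

Files without a frontmatter block, or with an unterminated one, pass through unchanged, so it should be safe to run over a whole folder.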

Also, thanks for sharing about TMSU! I was thinking about similar issues: for example, a photo can simultaneously be "from 2022," "from a conference," and "emotionally important." This doesn't work well with typical nested filesystems, where we need to decide on a single folder hierarchy rather than being able to filter based on need (as we can in SQL).
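
To make the SQL comparison concrete, here is a small sketch using Python's built-in `sqlite3`. The schema and the helper names (`build_demo_db`, `tag_file`, `files_with_all`) are made up for illustration; this is not TMSU's actual database layout.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a many-to-many file/tag schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE file (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
        CREATE TABLE tag  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE file_tag (
            file_id INTEGER NOT NULL REFERENCES file(id),
            tag_id  INTEGER NOT NULL REFERENCES tag(id),
            PRIMARY KEY (file_id, tag_id)
        );
    """)
    return conn


def tag_file(conn: sqlite3.Connection, path: str, *tags: str) -> None:
    """Attach any number of tags to one file; repeats are ignored."""
    conn.execute("INSERT OR IGNORE INTO file (path) VALUES (?)", (path,))
    for name in tags:
        conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT OR IGNORE INTO file_tag (file_id, tag_id) "
            "SELECT file.id, tag.id FROM file, tag "
            "WHERE file.path = ? AND tag.name = ?",
            (path, name),
        )


def files_with_all(conn: sqlite3.Connection, *tags: str) -> list[str]:
    """Return paths of files that carry every one of the given tags."""
    placeholders = ",".join("?" * len(tags))
    rows = conn.execute(
        "SELECT file.path FROM file "
        "JOIN file_tag ON file_tag.file_id = file.id "
        "JOIN tag ON tag.id = file_tag.tag_id "
        f"WHERE tag.name IN ({placeholders}) "
        "GROUP BY file.id HAVING COUNT(DISTINCT tag.id) = ? "
        "ORDER BY file.path",
        (*tags, len(tags)),
    )
    return [row[0] for row in rows]
```

The same photo can then show up under "2022", under "conference", or under both at once, without having to pick one folder for it.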


Re TMSU, a scan through the bug list turns up this:

https://github.com/oniony/TMSU/issues/264


Thanks for sharing tmsu, I had never seen that before.

Though I wonder what benefits it has over just plain symlinks?



