# #install_printer Yocaml.Path.pp ;;
# #install_printer Yocaml.Deps.pp ;;
# open Yocaml ;;
Since the purpose of YOCaml is, essentially, to read files and to write them, it’s important to have fine-grained control over file and directory paths! The Path module provides a fairly complete API for working with paths.
Why use an abstraction
As mentioned in the key concepts, there are several reasons to rely on an abstraction when working with file paths. In this section, we’ll briefly cover the two main ones.
Platform abstraction
YOCaml abstracts over the platform it runs on (through its runtime), which allows us to push platform-specific logic to the outer edges of a program (typically the execution function). While it’s possible to isolate constants for file path expressions depending on the platform (for example, GNU/Linux vs. Microsoft Windows), we also need to account for structural differences:
- /home/yocaml/my-folder is a Unix-like path
- C:\home\yocaml\my-folder is a Windows-like path
In practice, these differences can often be handled using the Filename module (particularly the dir_sep value). However, relying directly on Filename would make path handling in YOCaml programs less portable, and some runtime contexts are very different from those supported by Filename. For example, in the Yocaml_git runtime, file paths are expressed as keys exposed by Mirage_kv.
Using an intermediate representation therefore makes a lot of sense: we only need to translate the abstract path representation at the runtime level, which lets us target a wide variety of platforms. Additionally, since the Path.t type is abstract, we retain the freedom to change its internal representation in the future if we discover a more efficient (or more ergonomic) encoding.
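To make the idea concrete, here is a minimal sketch of an intermediate representation that is only translated at the edge of the program. The type toy_path and the functions to_unix and to_windows are hypothetical names invented for this illustration; they are not part of YOCaml and say nothing about how Path.t is actually implemented.

```ocaml
(* A toy stand-in for an abstract path. This is NOT the representation
   used by Yocaml.Path.t; it only illustrates the idea. *)
type toy_path =
  | Relative of string list
  | Absolute of string list

(* Hypothetical translation for a Unix-like runtime. *)
let to_unix = function
  | Relative fragments -> "./" ^ String.concat "/" fragments
  | Absolute fragments -> "/" ^ String.concat "/" fragments

(* Hypothetical translation for a Windows-like runtime. The default drive
   letter is an assumption made for the example. *)
let to_windows ?(drive = "C") = function
  | Relative fragments -> ".\\" ^ String.concat "\\" fragments
  | Absolute fragments -> drive ^ ":\\" ^ String.concat "\\" fragments

(* The same abstract description, rendered for two platforms. *)
let folder = Absolute [ "home"; "yocaml"; "my-folder" ]

let () =
  print_endline (to_unix folder);
  print_endline (to_windows folder)
```

The program itself only ever manipulates toy_path values; the choice of separator (or of a completely different target, such as keys in a store) is made once, in the translation function.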
A convenient API
As we’ve seen, the Filename module is extremely minimal. By introducing our own path type, we can provide a richer API while only requiring the ability to convert our representation into the one expected by a given runtime.
Path validity
Like the Filename module, path descriptions do not perform any additional validation (for example, checking that every character used is valid). In practice, this isn’t a real problem, since invalid paths are quickly caught when used, and we didn’t want to make the API unnecessarily complicated.
Creating paths
YOCaml distinguishes between two kinds of paths:
- absolute paths
- relative paths
In practice, when writing a generator, you’ll mostly work with relative paths to describe locations from the directory where the binary is executed.
# Path.rel [] ;;
- : Path.t = ./
# Path.abs [] ;;
- : Path.t = /
The functions rel (for relative) and abs (for absolute) both take a list as their argument. For example, to describe the relative path ./foo/bar/baz, you would write:
# Path.rel ["foo"; "bar"; "baz"] ;;
- : Path.t = ./foo/bar/baz
The module also provides a set of Infix operators that make path handling more convenient.
For example, list concatenation:
# Path.(rel ["foo"; "bar"] ++ ["baz"; "index.html"]) ;;
- : Path.t = ./foo/bar/baz/index.html
Adding an element to a path (an operator we use often, since it’s common to define a directory and then specify a single child of that directory):
# Path.(rel ["foo"; "bar"] / "index.html") ;;
- : Path.t = ./foo/bar/index.html
A shortcut for Path.rel x: since we often describe relative paths, we can use the ~/ operator to define one:
# Path.(~/["foo"; "bar"; "baz"]) ;;
- : Path.t = ./foo/bar/baz
Extensions
In YOCaml, file extensions are often used to indicate how to process a file. For example, if a file has a .md extension, we assume it should be treated as a Markdown file. The Path module provides several functions to work with extensions. For illustration, we’ll use these two paths:
let path_html = Path.(rel ["foo"; "bar"] / "index.html")
let path = Path.(rel ["foo"; "bar"] / "index")
From here on, we’ll generally refer to extensions without the leading dot (.). However, path-handling functions treat .ext and ext as equivalent.
Return the extension of a path:
# Path.extension path_html ;;
- : string = ".html"
# Path.extension path ;;
- : string = ""
You can see that if the path has no extension, the function returns an empty string. There’s also its counterpart, extension_opt, which returns the extension wrapped in an option:
# Path.extension_opt path_html ;;
- : string option = Some ".html"
# Path.extension_opt path ;;
- : string option = None
You can also check whether a path has a given extension:
# Path.has_extension "html" path_html ;;
- : bool = true
# Path.has_extension "md" path ;;
- : bool = false
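As noted earlier, the leading dot is optional when naming an extension. The short sketch below checks this with has_extension; it assumes the equivalence described in the text and reuses only functions shown in this section.

```ocaml
open Yocaml

let path_html = Path.(rel [ "foo"; "bar" ] / "index.html")

(* Both spellings of the extension are expected to give the same answer,
   since ".ext" and "ext" are described as equivalent. *)
let with_dot = Path.has_extension ".html" path_html
let without_dot = Path.has_extension "html" path_html
```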
You can also get a path without its extension:
# Path.remove_extension path_html ;;
- : Path.t = ./foo/bar/index
This has no effect if the path has no extension:
# Path.remove_extension path ;;
- : Path.t = ./foo/bar/index
Since it’s possible to get a path without its extension, we can also return a path with a specific extension:
# Path.add_extension "html" path ;;
- : Path.t = ./foo/bar/index.html
Note: adding an extension simply concatenates it to the last fragment of the path. So if you add the html extension to path_html, the extension will be duplicated:
# Path.add_extension "html" path_html ;;
- : Path.t = ./foo/bar/index.html.html
You can also combine remove_extension and add_extension using the change_extension function, which returns a path with its extension replaced:
# Path.change_extension "md" path_html ;;
- : Path.t = ./foo/bar/index.md
In practice, we’ll mostly use has_extension to include paths in batches. Since it’s common to read Markdown files that we want to convert to HTML, we’ll often use change_extension to switch from Markdown to HTML.
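The batch scenario can be sketched as follows, using only has_extension and change_extension as shown above. The list sources is a hypothetical set of files made up for the example; in a real generator, it would typically come from reading a directory.

```ocaml
open Yocaml

(* Hypothetical source files, hard-coded for the illustration. *)
let sources =
  Path.
    [ rel [ "content"; "index.md" ]
    ; rel [ "content"; "about.md" ]
    ; rel [ "content"; "style.css" ]
    ]

(* Keep only the Markdown files, then compute the matching HTML paths. *)
let html_targets =
  sources
  |> List.filter (Path.has_extension "md")
  |> List.map (Path.change_extension "html")
```

Because both functions take the extension first and the path last, they can be partially applied and passed directly to List.filter and List.map.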
Resolving paths
Using a path to calculate a new one is very common when working with YOCaml. For example, imagine that our content is organized like this:
let content = Path.rel ["content"]
let articles = Path.(content / "articles")
let target = Path.rel ["_www"]
let articles_target = Path.(target / "articles")
We could imagine the following article (in Markdown):
# Path.(articles / "my-first-article.md") ;;
- : Path.t = ./content/articles/my-first-article.md
And from this article, we might want to calculate the following path:
# Path.(articles_target / "my-first-article.html") ;;
- : Path.t = ./_www/articles/my-first-article.html
There are several ways to achieve this result; here, we’ll focus on the following function:
# Path.move ;;
- : into:Path.t -> Path.t -> Path.t = <fun>
The move function replaces the dirname of a given path. So, we could imagine a function like this:
let article_path path =
  path
  |> Path.move ~into:articles_target
# article_path Path.(articles / "my-first-article.md") ;;
- : Path.t = ./_www/articles/my-first-article.md
Now, all that’s left is to change the extension:
let article_path path =
  path
  |> Path.move ~into:articles_target
  |> Path.change_extension "html"
# article_path Path.(articles / "my-first-article.md") ;;
- : Path.t = ./_www/articles/my-first-article.html
In practice, these two functions are sufficient for most scenarios we’d want to handle. Later, we’ll see how to build a resolver to simplify the calculation of potentially complex paths.
Preserving Prefixes
In addition to the move function, there is also the relocate function:
# Path.relocate ;;
- : into:Path.t -> Path.t -> Path.t = <fun>
It is used in exactly the same way, with one subtle difference from move: the function tries to preserve common prefixes. Let’s look at a few examples:
If the two paths are of the same type (absolute or relative) but have no common prefixes, the paths are simply concatenated.
# Path.(relocate
    ~into:(rel ["foo"; "bar"])
    (rel ["baz"; "index.html"])) ;;
- : Path.t = ./foo/bar/baz/index.html
This differs from move, which simply relocates index.html (and not baz/index.html):
# Path.(move
    ~into:(rel ["foo"; "bar"])
    (rel ["baz"; "index.html"])) ;;
- : Path.t = ./foo/bar/index.html
If the path types are different, the ~into argument takes precedence (logically), and the path fragments are concatenated:
# Path.(relocate
    ~into:(abs ["foo"; "bar"])
    (rel ["baz"; "index.html"])) ;;
- : Path.t = /foo/bar/baz/index.html
# Path.(relocate
    ~into:(rel ["foo"; "bar"])
    (abs ["baz"; "index.html"])) ;;
- : Path.t = ./foo/bar/baz/index.html
If, on the other hand, the path types are the same and the paths share a common prefix, that prefix is merged rather than duplicated:
# Path.(relocate
    ~into:(rel ["foo"; "bar"])
    (rel ["foo"; "bar"; "index.html"])) ;;
- : Path.t = ./foo/bar/index.html
In practice, move is sufficient. However, when trying to generalize path calculations (as we’ll see shortly), relocate can be particularly useful.
Conclusion
We’ve quickly gone over how to manipulate file paths! We’ve seen why it’s useful to abstract paths — not only to make them platform-independent, but also to provide useful features (like extension handling, moving, and relocating).
In the next section, we’ll see how to create a resolver, a module that centralizes path manipulations to handle multiple scenarios.